vak.transforms.frame_labels.functional.postprocess

vak.transforms.frame_labels.functional.postprocess(frame_labels: ndarray, timebin_dur: float, unlabeled_label: int = 0, min_segment_dur: float | None = None, majority_vote: bool = False) → ndarray

Apply post-processing transformations to a vector of frame labels.

Optional post-processing consists of two transforms, both of which rely on there being a label that corresponds to the “unlabeled” (or “background”) class. The first removes any segments that are shorter than a specified duration by converting the labels in those segments to the “background” / “unlabeled” class label. The second performs a “majority vote” transform within each run of labels that is bordered on both sides by the “background” label, i.e., it counts the number of times each label occurs in that segment and then assigns the most common label to all bins in the segment.

The function performs those steps in this order (pseudo-code):

if min_segment_dur:
    frame_labels = remove_short_segments(frame_labels, labelmap, min_segment_dur)
if majority_vote:
    frame_labels = majority_vote(frame_labels, labelmap)
return frame_labels
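The two steps above can be illustrated with a minimal NumPy sketch. This is not vak's implementation: the names `find_labeled_runs` and `postprocess_sketch` are hypothetical, and the sketch assumes that a "segment" is a run of consecutive frames whose label differs from `unlabeled_label`, and that a segment's duration is its number of frames times `timebin_dur`.

```python
import numpy as np


def find_labeled_runs(frame_labels, unlabeled_label=0):
    """Return (start, stop) index pairs, stop exclusive, for each run of non-background frames."""
    is_labeled = frame_labels != unlabeled_label
    # Pad with False so that every run has both a rising and a falling edge.
    padded = np.concatenate(([False], is_labeled, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return list(zip(starts, stops))


def postprocess_sketch(frame_labels, timebin_dur, unlabeled_label=0,
                       min_segment_dur=None, majority_vote=False):
    """Illustrative stand-in for the two transforms (hypothetical, not vak's code)."""
    out = np.array(frame_labels, copy=True)  # leave the input untouched

    if min_segment_dur is not None:
        # Step 1: relabel segments that are too short as background.
        for start, stop in find_labeled_runs(out, unlabeled_label):
            if (stop - start) * timebin_dur < min_segment_dur:
                out[start:stop] = unlabeled_label

    if majority_vote:
        # Step 2: within each remaining segment, assign the most common label.
        # On a tie, np.argmax picks the smallest label value.
        for start, stop in find_labeled_runs(out, unlabeled_label):
            values, counts = np.unique(out[start:stop], return_counts=True)
            out[start:stop] = values[np.argmax(counts)]

    return out


labels = np.array([0, 1, 1, 0, 2, 2, 3, 2, 2, 0])
cleaned = postprocess_sketch(labels, timebin_dur=0.002,
                             min_segment_dur=0.005, majority_vote=True)
print(cleaned.tolist())  # [0, 0, 0, 0, 2, 2, 2, 2, 2, 0]
```

In this example the two-frame segment labeled 1 lasts about 0.004 s, which is shorter than the 0.005 s minimum, so it becomes background. The five-frame segment is kept, and its single frame labeled 3 is outvoted by the four frames labeled 2.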
Parameters:
  • frame_labels (numpy.ndarray) – A vector where each element represents a label for a frame, either a single sample in audio or a single time bin from a spectrogram. Output of a neural network.

  • timebin_dur (float) – Duration of a time bin in a spectrogram, e.g., as estimated from a vector of times using vak.timebins.timebin_dur_from_vec.

  • unlabeled_label (int) – Label that was given to segments that were not labeled in annotation, e.g., silent periods between annotated segments. Default is 0.

  • min_segment_dur (float) – Minimum duration of a segment, in seconds. If specified, then any segment with a duration less than min_segment_dur is removed from frame_labels, i.e., its labels are converted to unlabeled_label. Default is None, in which case no segments are removed.

  • majority_vote (bool) – If True, transform segments containing multiple labels into segments with a single label by taking a “majority vote”, i.e., assign all time bins in the segment the most frequently occurring label in the segment. This transform can only be applied if the labelmap contains an ‘unlabeled’ label, because unlabeled segments make it possible to identify the labeled segments. Default is False.

Returns:

frame_labels – Vector of frame labels after post-processing is applied.

Return type:

numpy.ndarray
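
To show how timebin_dur relates to min_segment_dur, here is a small sketch that estimates the time bin duration from a vector of bin times and converts a run length in bins to seconds. The helper names `estimate_timebin_dur` and `run_duration` are hypothetical, and taking the mean spacing between consecutive times is an assumption made for this example. vak.timebins.timebin_dur_from_vec may compute its estimate differently.

```python
import numpy as np


def estimate_timebin_dur(times):
    """Hypothetical helper: mean spacing between consecutive time bin values, in seconds."""
    times = np.asarray(times, dtype=float)
    if times.ndim != 1 or times.size < 2:
        raise ValueError("times must be a 1-D vector with at least two elements")
    return float(np.mean(np.diff(times)))


def run_duration(n_bins, timebin_dur):
    """Duration in seconds of a run of n_bins consecutive time bins."""
    return n_bins * timebin_dur


times = np.array([0.0, 0.002, 0.004, 0.006, 0.008])
timebin_dur = estimate_timebin_dur(times)  # approximately 0.002 s

# A run of two bins lasts about 0.004 s, so it would count as "short"
# for min_segment_dur=0.005, while a run of five bins would not.
is_short = run_duration(2, timebin_dur) < 0.005
is_long_enough = run_duration(5, timebin_dur) >= 0.005
```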