vak.transforms.frame_labels.functional.postprocess

vak.transforms.frame_labels.functional.postprocess(frame_labels: ndarray, timebin_dur: float, unlabeled_label: int = 0, min_segment_dur: float | None = None, majority_vote: bool = False) → ndarray

Apply post-processing transformations to a vector of frame labels.

Optional post-processing consists of two transforms, both of which rely on there being a label that corresponds to the “unlabeled” (or “background”) class. The first removes any segments that are shorter than a specified duration by converting the labels in those segments to the “unlabeled” / “background” class label. The second performs a “majority vote” within each run of labels that is bordered on both sides by the “background” label: it counts the number of times each label occurs in that segment, then assigns the most common label to all bins in the segment.

The function performs those steps in this order (pseudo-code):

if min_segment_dur:
    frame_labels = remove_short_segments(frame_labels, labelmap, min_segment_dur)
if majority_vote:
    frame_labels = majority_vote(frame_labels, labelmap)
return frame_labels
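
The two steps above can be illustrated with a minimal, pure-Python sketch. This is not vak's implementation: the helper names find_segments and postprocess_sketch are hypothetical, the sketch works on plain lists rather than numpy arrays, and it assumes that a "segment" is a maximal run of frames whose label differs from unlabeled_label. How ties are broken in the majority vote is also an assumption here (the label seen first in the segment wins).

```python
from collections import Counter
from itertools import groupby


def find_segments(frame_labels, unlabeled_label=0):
    """Return (start, stop) index pairs, one per run of non-background frames.

    Hypothetical helper: a segment is assumed to be a maximal run of
    frames whose label is not ``unlabeled_label``.
    """
    segments = []
    idx = 0
    for is_labeled, run in groupby(frame_labels, key=lambda lbl: lbl != unlabeled_label):
        run_len = len(list(run))
        if is_labeled:
            segments.append((idx, idx + run_len))
        idx += run_len
    return segments


def postprocess_sketch(frame_labels, timebin_dur, unlabeled_label=0,
                       min_segment_dur=None, majority_vote=False):
    """Illustrative re-statement of the two optional transforms."""
    out = list(frame_labels)  # work on a copy

    if min_segment_dur is not None:
        # Step 1: relabel segments shorter than min_segment_dur as background.
        for start, stop in find_segments(out, unlabeled_label):
            if (stop - start) * timebin_dur < min_segment_dur:
                out[start:stop] = [unlabeled_label] * (stop - start)

    if majority_vote:
        # Step 2: give every frame in a segment the segment's most common label.
        # Tie-breaking (first label encountered wins) is an assumption.
        for start, stop in find_segments(out, unlabeled_label):
            winner, _ = Counter(out[start:stop]).most_common(1)[0]
            out[start:stop] = [winner] * (stop - start)

    return out


labels = [0, 0, 1, 1, 2, 1, 0, 0, 3, 0, 2, 2, 2, 0]
cleaned = postprocess_sketch(labels, timebin_dur=0.5, min_segment_dur=1.0,
                             majority_vote=True)
```

With a time bin of 0.5 s and a minimum duration of 1.0 s, the one-bin segment labeled 3 is converted to background, and the stray 2 inside the first segment is outvoted by the surrounding 1s. Because short segments are removed first, they never take part in the vote.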

Parameters:

frame_labels (numpy.ndarray) – A vector where each element represents a label for a frame, either a single sample in audio or a single time bin from a spectrogram. Output of a neural network.
timebin_dur (float) – Duration of a time bin in a spectrogram, e.g., as estimated from a vector of times using vak.timebins.timebin_dur_from_vec.
unlabeled_label (int) – Label that was given to segments that were not labeled in annotation, e.g. silent periods between annotated segments. Default is 0.
min_segment_dur (float | None) – Minimum duration of a segment, in seconds. If specified, then any segment with a duration less than min_segment_dur is removed from frame_labels. Default is None, in which case no segments are removed.
majority_vote (bool) – If True, transform segments containing multiple labels into segments with a single label by taking a “majority vote”, i.e., assign all time bins in the segment the most frequently occurring label in the segment. This transform can only be applied if the labelmap contains an ‘unlabeled’ label, because unlabeled segments make it possible to identify the labeled segments. Default is False.

Returns:

frame_labels – Vector of frame labels after post-processing is applied.

Return type:

numpy.ndarray