vak.transforms.frame_labels.functional.postprocess
- vak.transforms.frame_labels.functional.postprocess(frame_labels: ndarray, timebin_dur: float, unlabeled_label: int = 0, min_segment_dur: float | None = None, majority_vote: bool = False) → ndarray
Apply post-processing transformations to a vector of frame labels.
Optional post-processing consists of two transforms, both of which rely on there being a label that corresponds to the "unlabeled" (or "background") class. The first removes any segment that is shorter than a specified duration, by converting the labels in that segment to the "background" / "unlabeled" class label. The second performs a "majority vote" transform within each run of labels that is bordered on both sides by the "background" label. That is, it counts the number of times each label occurs in that segment, and then assigns the most common label to all bins in the segment.
The function performs those steps in this order (pseudo-code):
if min_segment_dur:
    frame_labels = remove_short_segments(frame_labels, labelmap, min_segment_dur)
if majority_vote:
    frame_labels = majority_vote(frame_labels, labelmap)
return frame_labels
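The two steps described above can be illustrated with a minimal, self-contained NumPy sketch. This is not vak's implementation: the helper names (`find_segments`, `remove_short_segments`, `take_majority_vote`, `postprocess_sketch`) are hypothetical. It assumes that a segment is a run of consecutive frames that are not the unlabeled class, that the ends of the vector count as borders, and that ties in the vote go to the smallest label value.

```python
import numpy as np


def find_segments(frame_labels, unlabeled_label=0):
    """Return half-open (start, stop) index pairs for runs of labeled frames.

    Assumption of this sketch: a segment is any run of consecutive frames
    whose label differs from ``unlabeled_label``.
    """
    is_labeled = frame_labels != unlabeled_label
    # Pad with False so that runs touching either end are still detected.
    padded = np.concatenate(([False], is_labeled, [False]))
    # Transitions alternate between segment starts and segment stops.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[::2], edges[1::2]))


def remove_short_segments(frame_labels, timebin_dur, min_segment_dur,
                          unlabeled_label=0):
    """Relabel segments shorter than ``min_segment_dur`` (seconds) as unlabeled."""
    out = frame_labels.copy()
    for start, stop in find_segments(frame_labels, unlabeled_label):
        # Segment duration is the number of frames times the frame duration.
        if (stop - start) * timebin_dur < min_segment_dur:
            out[start:stop] = unlabeled_label
    return out


def take_majority_vote(frame_labels, unlabeled_label=0):
    """Give every frame in a segment that segment's most frequent label."""
    out = frame_labels.copy()
    for start, stop in find_segments(frame_labels, unlabeled_label):
        values, counts = np.unique(frame_labels[start:stop], return_counts=True)
        # argmax returns the first maximum, so ties go to the smallest label.
        out[start:stop] = values[np.argmax(counts)]
    return out


def postprocess_sketch(frame_labels, timebin_dur, unlabeled_label=0,
                       min_segment_dur=None, majority_vote=False):
    """Apply the two optional transforms in the order given in the pseudo-code."""
    if min_segment_dur:
        frame_labels = remove_short_segments(
            frame_labels, timebin_dur, min_segment_dur, unlabeled_label
        )
    if majority_vote:
        frame_labels = take_majority_vote(frame_labels, unlabeled_label)
    return frame_labels


labels = np.array([0, 1, 0, 0, 2, 2, 3, 2, 2, 0, 0])
cleaned = postprocess_sketch(
    labels, timebin_dur=0.002, min_segment_dur=0.005, majority_vote=True
)
print(cleaned)  # [0 0 0 0 2 2 2 2 2 0 0]
```

In this example each frame lasts 2 ms, so the one-frame segment labeled 1 (2 ms) falls below the 5 ms minimum and is converted to the unlabeled class. The five-frame segment is kept, and the single frame labeled 3 inside it is outvoted by the surrounding frames labeled 2.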
- Parameters:
frame_labels (numpy.ndarray) – A vector where each element represents a label for a frame, either a single sample in audio or a single time bin from a spectrogram. Output of a neural network.
timebin_dur (float) – Duration of a time bin in a spectrogram, e.g., as estimated from a vector of times using vak.timebins.timebin_dur_from_vec.
unlabeled_label (int) – Label that was given to segments that were not labeled in annotation, e.g. silent periods between annotated segments. Default is 0.
min_segment_dur (float) – Minimum duration of segment, in seconds. If specified, then any segment with a duration less than min_segment_dur is removed from frame_labels. Default is None, in which case no segments are removed.
majority_vote (bool) – If True, transform segments containing multiple labels into segments with a single label by taking a "majority vote", i.e. assign all time bins in the segment the most frequently occurring label in the segment. This transform can only be applied if the labelmap contains an "unlabeled" label, because unlabeled segments make it possible to identify the labeled segments. Default is False.
- Returns:
frame_labels – Vector of frame labels after post-processing is applied.
- Return type: