vak.transforms.frame_labels.functional.remove_short_segments

vak.transforms.frame_labels.functional.remove_short_segments(frame_labels: ndarray[Any, dtype[_ScalarType_co]], segment_inds_list: list[ndarray[Any, dtype[_ScalarType_co]]], timebin_dur: float, min_segment_dur: float | int, background_label: int = 0) → tuple[ndarray[Any, dtype[_ScalarType_co]], list[ndarray[Any, dtype[_ScalarType_co]]]]

Remove segments shorter than a specified duration from a vector of frame labels.

Parameters:
  • frame_labels (numpy.ndarray) – A vector where each element represents a label for a frame, either a single sample in audio or a single time bin from a spectrogram. Output of a neural network.

  • segment_inds_list (list) – List of numpy.ndarray, one per segment, where each array holds the indices of that segment's frames in frame_labels. Returned by function vak.labels.frame_labels_segment_inds_list.

  • timebin_dur (float) – Duration of a single timebin in the spectrogram, in seconds. Used to convert onset and offset indices in frame_labels to seconds.

  • min_segment_dur (float or int) – Minimum duration of a segment, in seconds. Any segment with a duration less than min_segment_dur is removed from frame_labels. Required: the function signature defines no default value.

  • background_label (int) – Label that was given to segments that were not labeled in annotation, e.g. silent periods between annotated segments. Default is 0.

Returns:

  • frame_labels (numpy.ndarray) – A vector where each element represents a label for a frame, either a single sample in audio or a single time bin from a spectrogram, with every segment whose duration is shorter than min_segment_dur set to background_label.

  • segment_inds_list (list) – List of numpy.ndarray, without the arrays that represented segments in frame_labels shorter than min_segment_dur.
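
The behavior described above can be illustrated with a minimal, self-contained NumPy sketch. It is not vak's implementation: both segment_inds_list_from and remove_short_segments_sketch are hypothetical names written for this example, and the sketch assumes that a segment is a run of consecutive frames with the same label, that a segment's duration is its number of frames multiplied by timebin_dur, and that the input array should be copied rather than modified in place.

```python
import numpy as np


def segment_inds_list_from(frame_labels):
    """Hypothetical helper: group frame indices into runs of identical labels."""
    # A non-zero difference between neighbors marks the start of a new segment.
    change_points = np.flatnonzero(np.diff(frame_labels)) + 1
    return np.split(np.arange(frame_labels.shape[0]), change_points)


def remove_short_segments_sketch(frame_labels, segment_inds_list, timebin_dur,
                                 min_segment_dur, background_label=0):
    """Sketch of the documented behavior; not the library's own code."""
    # Assumption: work on a copy so the caller's array is left untouched.
    frame_labels = frame_labels.copy()
    kept_inds_list = []
    for segment_inds in segment_inds_list:
        # Duration in seconds = number of frames * duration of one frame.
        if segment_inds.shape[0] * timebin_dur < min_segment_dur:
            # Too short: relabel the frames as background and drop the index array.
            frame_labels[segment_inds] = background_label
        else:
            kept_inds_list.append(segment_inds)
    return frame_labels, kept_inds_list


# One frame labeled 2 (index 7) lasts 0.002 s, which is below the 0.005 s minimum.
frame_labels = np.array([0, 0, 0, 1, 1, 1, 1, 2, 0, 0, 0, 3, 3, 3])
segment_inds_list = segment_inds_list_from(frame_labels)

new_labels, new_inds_list = remove_short_segments_sketch(
    frame_labels, segment_inds_list, timebin_dur=0.002, min_segment_dur=0.005
)
print(new_labels.tolist())  # [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 3, 3, 3]
print(len(segment_inds_list), len(new_inds_list))  # 5 4
```

In this sketch the relabeled frame is not merged into the index arrays of the neighboring background segments; the short segment's array is simply dropped, matching the description of the returned segment_inds_list.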