vak.datapipes.frame_classification.infer_datapipe.InferDatapipe
- class vak.datapipes.frame_classification.infer_datapipe.InferDatapipe(dataset_path: str | pathlib.Path, dataset_df: pd.DataFrame, input_type: str, split: str, sample_ids: npt.NDArray, inds_in_sample: npt.NDArray, frame_dur: float, window_size: int, frames_standardizer: FramesStandardizer | None = None, frames_padval: float = 0.0, frame_labels_padval: int = -1, return_padding_mask: bool = False, subset: str | None = None)
Bases: object
A datapipe class used for neural network models with the frame classification task, where the source data consists of audio signals or spectrograms of varying lengths.
- dataset_path
Path to directory that represents a frame classification dataset, as created by vak.prep.prep_frame_classification_dataset().
- Type:
pathlib.Path
- subset
Name of subset to use. If specified, this takes precedence over split. Subsets are typically taken from the training data for use when generating a learning curve.
- Type:
str, optional
- dataset_df
A frame classification dataset, represented as a pandas.DataFrame. This will be only the rows that correspond to either subset or split from the dataset_df that was passed in when instantiating the class.
- Type:
pandas.DataFrame
- frames_paths
Paths to npy files containing frames, either spectrograms or audio signals, that are input to the model.
- Type:
- frame_labels_paths
Paths to npy files containing vectors with a label for each frame. These are the targets for the outputs of the model.
- Type:
- sample_ids
Indexing vector representing which sample from the dataset every frame belongs to.
- Type:
numpy.ndarray
- inds_in_sample
Indexing vector representing which index within each sample from the dataset that every frame belongs to.
- Type:
numpy.ndarray
- frame_dur
Duration of a frame, i.e., a single sample in audio or a single timebin in a spectrogram.
- Type:
float
- frames_standardizer
Transform applied to frames, the input to the neural network model. Optional, default is None. If supplied, it is used together with the transform applied to inputs and targets, vak.transforms.defaults.frame_classification.InferItemTransform.
- Type:
vak.transforms.FramesStandardizer, optional
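The two indexing vectors, sample_ids and inds_in_sample, can be easier to picture with a small example. The following is a minimal sketch, not the library's implementation: make_indexing_vectors and frames_per_sample are hypothetical names introduced only for illustration, and plain Python lists stand in for the NumPy arrays the class actually holds.

```python
from itertools import chain


def make_indexing_vectors(frames_per_sample):
    """Build illustrative ``sample_ids`` and ``inds_in_sample`` vectors.

    ``frames_per_sample`` is a hypothetical list giving the number of
    frames in each sample of a dataset. Plain lists are used here in
    place of NumPy arrays.
    """
    # sample_ids: for every frame in the dataset, the index of the sample it came from
    sample_ids = list(
        chain.from_iterable(
            [sample_id] * n_frames
            for sample_id, n_frames in enumerate(frames_per_sample)
        )
    )
    # inds_in_sample: for every frame, its position within its own sample
    inds_in_sample = list(
        chain.from_iterable(range(n_frames) for n_frames in frames_per_sample)
    )
    return sample_ids, inds_in_sample


# Three samples with 3, 2, and 4 frames respectively
sample_ids, inds_in_sample = make_indexing_vectors([3, 2, 4])
print(sample_ids)      # [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(inds_in_sample)  # [0, 1, 2, 0, 1, 0, 1, 2, 3]
```

Both vectors have one element per frame in the whole dataset, so a single position can be mapped back to a sample and to an offset inside that sample.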
- __init__(dataset_path: str | pathlib.Path, dataset_df: pd.DataFrame, input_type: str, split: str, sample_ids: npt.NDArray, inds_in_sample: npt.NDArray, frame_dur: float, window_size: int, frames_standardizer: FramesStandardizer | None = None, frames_padval: float = 0.0, frame_labels_padval: int = -1, return_padding_mask: bool = False, subset: str | None = None)
Initialize a new instance of an InferDatapipe.
- Parameters:
dataset_path (pathlib.Path) – Path to directory that represents a frame classification dataset, as created by vak.prep.prep_frame_classification_dataset().
dataset_df (pandas.DataFrame) – A frame classification dataset, represented as a pandas.DataFrame.
input_type (str) – The type of input to the neural network model. One of {‘audio’, ‘spect’}.
split (str) – The name of a split from the dataset, one of {‘train’, ‘val’, ‘test’}.
sample_ids (numpy.ndarray) – Indexing vector representing which sample from the dataset every frame belongs to.
inds_in_sample (numpy.ndarray) – Indexing vector representing which index within each sample from the dataset that every frame belongs to.
frame_dur (float) – Duration of a frame, i.e., a single sample in audio or a single timebin in a spectrogram.
frames_standardizer (vak.transforms.FramesStandardizer, optional) – Transform applied to frames, the input to the neural network model. Optional, default is None. If supplied, it is used together with the transform applied to inputs and targets, vak.transforms.defaults.frame_classification.InferItemTransform.
window_size (int) – Size of windows to return; number of frames.
frames_padval (float) – Value to pad frames with. Added to end of array, the “right side”. Argument to PadToWindow transform. Default is 0.0.
frame_labels_padval (int) – Value to pad frame labels vector with. Added to the end of the array. Argument to PadToWindow transform. Default is -1. Used with the ignore_index argument of torch.nn.CrossEntropyLoss.
return_padding_mask (bool) – If True, the dictionary returned by ItemTransform classes will include a boolean vector to use for cropping back down to the size before padding. padding_mask has size equal to the width of the padded array, i.e. original size plus padding at the end, and has values of 1 where columns in the padded array are from the original array, and values of 0 where columns were added for padding.
subset (str, optional) – Name of subset to use. If specified, this takes precedence over split. Subsets are typically taken from the training data for use when generating a learning curve.
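The interplay of window_size, frame_labels_padval, and return_padding_mask described above can be sketched in a few lines. This is an illustration only, not the PadToWindow transform itself: pad_to_window_sketch is a hypothetical helper, it works on a plain 1-D list, and it assumes the padded length is rounded up to the next multiple of window_size (the parameter descriptions only state that padding is added at the end).

```python
import math


def pad_to_window_sketch(values, window_size, padval):
    """Right-pad ``values`` and return ``(padded, padding_mask)``.

    Assumption for illustration: the padded length is the smallest
    multiple of ``window_size`` that fits all of ``values``.
    """
    n_original = len(values)
    padded_len = math.ceil(n_original / window_size) * window_size
    n_pad = padded_len - n_original
    # Padding is appended at the end, the "right side"
    padded = list(values) + [padval] * n_pad
    # 1 where an element comes from the original, 0 where it was added as padding
    padding_mask = [1] * n_original + [0] * n_pad
    return padded, padding_mask


frame_labels = [0, 0, 1, 1, 2]
padded_labels, padding_mask = pad_to_window_sketch(
    frame_labels, window_size=4, padval=-1
)
print(padded_labels)  # [0, 0, 1, 1, 2, -1, -1, -1]
print(padding_mask)   # [1, 1, 1, 1, 1, 0, 0, 0]

# The mask crops the padded vector back down to its original size
cropped = [label for label, keep in zip(padded_labels, padding_mask) if keep]
print(cropped)        # [0, 0, 1, 1, 2]
```

Padding labels with a value such as -1, which is not a valid class index, is what allows a loss function to skip the padded frames through its ignore_index argument.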
Methods
__init__(dataset_path, dataset_df, ...[, ...]) – Initialize a new instance of an InferDatapipe.
from_dataset_path(dataset_path, window_size) – Make an InferDatapipe instance, given the path to a frame classification dataset.
Attributes
duration
shape
- classmethod from_dataset_path(dataset_path: str | pathlib.Path, window_size: int, frames_standardizer: FramesStandardizer | None = None, frames_padval: float = 0.0, frame_labels_padval: int = -1, return_padding_mask: bool = False, split: str = 'val', subset: str | None = None)
Make an InferDatapipe instance, given the path to a frame classification dataset.
- Parameters:
dataset_path (pathlib.Path) – Path to directory that represents a frame classification dataset, as created by vak.prep.prep_frame_classification_dataset().
window_size (int) – Size of windows to return; number of frames.
frames_standardizer (vak.transforms.FramesStandardizer, optional) – Transform applied to frames, the input to the neural network model. Optional, default is None. If supplied, it is used together with the transform applied to inputs and targets, vak.transforms.defaults.frame_classification.InferItemTransform.
frames_padval (float) – Value to pad frames with. Added to end of array, the “right side”. Argument to PadToWindow transform. Default is 0.0.
frame_labels_padval (int) – Value to pad frame labels vector with. Added to the end of the array. Argument to PadToWindow transform. Default is -1. Used with the ignore_index argument of torch.nn.CrossEntropyLoss.
return_padding_mask (bool) – If True, the dictionary returned by ItemTransform classes will include a boolean vector to use for cropping back down to the size before padding. padding_mask has size equal to the width of the padded array, i.e. original size plus padding at the end, and has values of 1 where columns in the padded array are from the original array, and values of 0 where columns were added for padding.
split (str) – The name of a split from the dataset, one of {‘train’, ‘val’, ‘test’}. Default is “val”.
subset (str, optional) – Name of subset to use. If specified, this takes precedence over split. Subsets are typically taken from the training data for use when generating a learning curve.
- Returns:
infer_datapipe
- Return type:
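A call to from_dataset_path might look like the sketch below. It uses only the import path, parameter names, and defaults shown in the signature above; the wrapper function make_infer_datapipe, the dataset path, and the window size in the usage comment are hypothetical and chosen purely for illustration. The import is placed inside the function so the snippet can be loaded even where vak is not installed.

```python
from pathlib import Path


def make_infer_datapipe(dataset_path, window_size, split="val"):
    """Hypothetical convenience wrapper around ``InferDatapipe.from_dataset_path``.

    ``dataset_path`` should point to a directory created by
    ``vak.prep.prep_frame_classification_dataset()``.
    """
    # Lazy import: vak is only required when the function is actually called
    from vak.datapipes.frame_classification.infer_datapipe import InferDatapipe

    return InferDatapipe.from_dataset_path(
        dataset_path=Path(dataset_path),
        window_size=window_size,
        frames_padval=0.0,         # documented default
        frame_labels_padval=-1,    # documented default
        return_padding_mask=True,  # also return the mask used to crop off padding
        split=split,               # ignored if a subset is specified
    )


# Example usage (placeholder path and window size, not values from the docs):
# datapipe = make_infer_datapipe("path/to/prepared_dataset", window_size=176)
```

Setting return_padding_mask=True is a choice made for this sketch; the default is False.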