vak.transforms.defaults.frame_classification.InferItemTransform

class vak.transforms.defaults.frame_classification.InferItemTransform(window_size, frames_standardizer=None, frames_padval=0.0, frame_labels_padval=-1, return_padding_mask=True, channel_dim=1)

Bases: object

Default transform used when running inference on frame classification models, for evaluation or to generate new predictions.

Returned item includes frames reshaped into a stack of windows, with padding added to make reshaping possible. Any frame_labels are neither padded nor reshaped, but are converted to torch.LongTensor. If return_padding_mask is True, item includes ‘padding_mask’ that can be used to crop off any predictions made on the padding.
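The padding and reshaping described above can be sketched with NumPy alone. This is a minimal illustration, not vak's implementation: the helper names pad_to_window and view_as_window_batch are hypothetical, and the sketch assumes frames are a 2-D array of shape (frequency bins, time bins) that is padded on the right along the last axis.

```python
import numpy as np


def pad_to_window(frames, window_size, padval=0.0):
    """Right-pad the last axis so its length is a multiple of window_size.

    Hypothetical helper for illustration; returns the padded array and a
    boolean mask that is True for original columns and False for padding.
    """
    n_frames = frames.shape[-1]
    n_pad = (-n_frames) % window_size  # 0 when already a multiple
    pad_width = [(0, 0)] * (frames.ndim - 1) + [(0, n_pad)]
    padded = np.pad(frames, pad_width, mode="constant", constant_values=padval)
    padding_mask = np.zeros(n_frames + n_pad, dtype=bool)
    padding_mask[:n_frames] = True
    return padded, padding_mask


def view_as_window_batch(padded, window_size):
    """Reshape (n_freq, n_time) into a stack of (n_windows, n_freq, window_size)."""
    n_freq, n_time = padded.shape
    n_windows = n_time // window_size
    return padded.reshape(n_freq, n_windows, window_size).transpose(1, 0, 2)


# 2 frequency bins, 10 time bins, windows of 4 frames: 2 columns of padding
frames = np.arange(20, dtype=float).reshape(2, 10)
padded, padding_mask = pad_to_window(frames, window_size=4)
windows = view_as_window_batch(padded, window_size=4)
```

Here the 10 original columns are padded to 12, which splits evenly into 3 windows of 4 frames each.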

frames_standardizer

Instance that has already been fit to the dataset, using the fit_df method. Default is None, in which case no standardization transform is applied.

Type:

vak.transforms.FramesStandardizer

window_size

Width of window, in number of elements. Argument to PadToWindow transform.

Type:

int

frames_padval

Value to pad frames with. Added to end of array, the “right side”. Argument to PadToWindow transform. Default is 0.0.

Type:

float

frame_labels_padval

Value to pad frame labels vector with. Added to the end of the array. Argument to PadToWindow transform. Default is -1. Used with ignore_index argument of torch.nn.CrossEntropyLoss.

Type:

int

return_padding_mask

If True, the dictionary returned by ItemTransform classes will include a boolean vector, padding_mask, to use for cropping back down to the size before padding. padding_mask has size equal to the width of the padded array, i.e. original size plus padding at the end, and has values of 1 where columns in the padded array are from the original array, and values of 0 where columns were added for padding.

Type:

bool
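As a rough illustration of how a padding mask like the one described above might be used, the sketch below flattens per-window predictions back into one vector and drops the entries that fall on padding. The function crop_predictions and the toy arrays are hypothetical and are not part of vak; the sketch assumes predictions have shape (number of windows, window size) and that the mask is ordered the same way as the flattened frames.

```python
import numpy as np


def crop_predictions(window_preds, padding_mask):
    """Flatten (n_windows, window_size) predictions and keep only real frames.

    Hypothetical helper: padding_mask is True (1) for original frames and
    False (0) for frames that were added as padding at the end.
    """
    flat = window_preds.reshape(-1)
    return flat[padding_mask.astype(bool)]


n_frames, window_size = 10, 4
n_pad = (-n_frames) % window_size  # 2 frames of padding
padding_mask = np.concatenate(
    [np.ones(n_frames, dtype=bool), np.zeros(n_pad, dtype=bool)]
)

# Stand-in for per-frame predictions from a model, one row per window
window_preds = np.arange(n_frames + n_pad).reshape(-1, window_size)
cropped = crop_predictions(window_preds, padding_mask)
```

After cropping, the prediction vector has the same length as the original, unpadded frames.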

__init__(window_size, frames_standardizer=None, frames_padval=0.0, frame_labels_padval=-1, return_padding_mask=True, channel_dim=1)

Methods

__init__(window_size[, frames_standardizer, ...])