vak.eval.eval_.eval

Evaluate a trained model.

Parameters:

model_name (str) – Model name, must be one of vak.models.registry.MODEL_NAMES.
model_config (dict) – Model configuration in a dict, as loaded from a .toml file, and used by the model method from_config.
dataset_path (str, pathlib.Path) – Path to dataset, e.g., a csv file generated by running vak prep.
checkpoint_path (str, pathlib.Path) – Path to directory with checkpoint files saved by Torch, used to reload the model.
output_dir (str, pathlib.Path) – Path to location where .csv files with evaluation metrics should be saved.
num_workers (int) – Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader. Default is 2.
labelmap_path (str, pathlib.Path, optional) – Path to ‘labelmap.json’ file. Optional, default is None.
batch_size (int, optional) – Number of samples per batch fed into model. Optional, default is None.
transform_params (dict, optional) – Parameters for data transform. Passed as keyword arguments. Optional, default is None.
dataset_params (dict, optional) – Parameters for dataset. Passed as keyword arguments. Optional, default is None.
split (str) – Split of dataset on which the model should be evaluated. One of {‘train’, ‘val’, ‘test’}. Default is ‘test’.
spect_scaler_path (str, pathlib.Path) – Path to a saved SpectScaler object used to normalize spectrograms. If spectrograms were normalized and this is not provided, evaluation will give incorrect results. Default is None.
post_tfm_kwargs (dict) – Keyword arguments to the post-processing transform. If None (the default), no additional clean-up is applied when transforming labeled timebins to segments. The transform used is vak.transforms.frame_labels.PostProcess. Valid keyword argument names are 'majority_vote' and 'min_segment_dur', and their values should be appropriate for those arguments: a Boolean for majority_vote, a float for min_segment_dur. See the docstring of the transform for more details on these arguments and how they work.
device (str) – Device on which to work with model + data. Defaults to ‘cuda’ if torch.cuda.is_available() returns True.
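
Example

The parameters above can be collected into a dict of keyword arguments before calling the function. The following is a minimal sketch, not a definitive recipe: every path, the model name, the empty model_config, and the batch size are placeholders, and check_post_tfm_kwargs is a hypothetical helper (not part of vak) that only enforces the key names and value types stated for post_tfm_kwargs. The call itself is left commented out because it needs an installed vak and a prepared dataset.

```python
from pathlib import Path

# Key names as listed in the description of ``post_tfm_kwargs``.
VALID_POST_TFM_KEYS = {"majority_vote", "min_segment_dur"}


def check_post_tfm_kwargs(post_tfm_kwargs):
    """Hypothetical helper (not part of vak): validate ``post_tfm_kwargs``."""
    if post_tfm_kwargs is None:
        return None  # default: no additional clean-up is applied
    unknown = set(post_tfm_kwargs) - VALID_POST_TFM_KEYS
    if unknown:
        raise ValueError(f"invalid post_tfm_kwargs keys: {sorted(unknown)}")
    if "majority_vote" in post_tfm_kwargs:
        if not isinstance(post_tfm_kwargs["majority_vote"], bool):
            raise TypeError("majority_vote must be a bool")
    if "min_segment_dur" in post_tfm_kwargs:
        # This sketch is strict and accepts only float, as the docs describe.
        if not isinstance(post_tfm_kwargs["min_segment_dur"], float):
            raise TypeError("min_segment_dur must be a float")
    return dict(post_tfm_kwargs)


# All values below are placeholders chosen for illustration.
eval_kwargs = dict(
    model_name="TweetyNet",  # assumed to be in vak.models.registry.MODEL_NAMES
    model_config={},  # normally loaded from a .toml file
    dataset_path=Path("path/to/prepared_dataset"),
    checkpoint_path=Path("path/to/checkpoints"),
    output_dir=Path("path/to/eval_results"),
    num_workers=2,
    labelmap_path=Path("path/to/labelmap.json"),
    batch_size=8,
    split="test",
    spect_scaler_path=None,  # set this if spectrograms were normalized
    post_tfm_kwargs=check_post_tfm_kwargs(
        {"majority_vote": True, "min_segment_dur": 0.02}
    ),
)

# With vak installed and real paths filled in, the call would look like:
# from vak.eval.eval_ import eval as vak_eval
# vak_eval(**eval_kwargs)
```

Validating post_tfm_kwargs before the call surfaces a misspelled key or a wrongly typed value early, rather than after the model and dataset have been loaded.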

Notes

Unlike core.predict, this function may modify labelmap so that metrics such as edit distance are computed correctly: any string labels in labelmap with multiple characters are converted to (mock) single-character labels with vak.labels.multi_char_labels_to_single_char.
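
To see why this conversion matters, consider the sketch below. It is an illustration only and does not reproduce the implementation of vak.labels.multi_char_labels_to_single_char: mock_single_char_labels and edit_distance are hypothetical stand-ins, and labelmap is assumed to map string labels to integer class indices. Edit distance is computed on strings, one character per segment, so a label such as 'ab' would be counted as two segments unless it is first replaced by a single character.

```python
def mock_single_char_labels(labelmap):
    """Hypothetical stand-in: map each label to a single-character label.

    Single-character labels are kept; every other label gets an unused
    printable ASCII character as a mock replacement.
    """
    used = {lbl for lbl in labelmap if len(lbl) == 1}
    candidates = (chr(c) for c in range(33, 127) if chr(c) not in used)
    return {
        lbl: (lbl if len(lbl) == 1 else next(candidates)) for lbl in labelmap
    }


def edit_distance(a, b):
    """Levenshtein distance between strings ``a`` and ``b``."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,  # deletion
                    curr[j - 1] + 1,  # insertion
                    prev[j - 1] + (ca != cb),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


# Assumed labelmap layout: string label -> integer class index.
labelmap = {"a": 0, "b": 1, "ab": 2}
to_single = mock_single_char_labels(labelmap)
single_char_labelmap = {to_single[lbl]: idx for lbl, idx in labelmap.items()}

reference = ["ab", "a"]  # two segments
predicted = ["a", "b", "a"]  # three segments, none labeled "ab"

# Joining the raw labels hides the errors: both sequences become "aba".
naive = edit_distance("".join(reference), "".join(predicted))

# With one character per segment, the mismatch is counted.
fixed = edit_distance(
    "".join(to_single[lbl] for lbl in reference),
    "".join(to_single[lbl] for lbl in predicted),
)
```

In this example naive is 0 while fixed is 2, which is the kind of undercount the conversion to single-character labels avoids.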