vak.eval.eval_.eval
- vak.eval.eval_.eval(model_config: dict, dataset_config: dict, trainer_config: dict, checkpoint_path: str | Path, output_dir: str | Path, num_workers: int, labelmap_path: str | Path | None = None, batch_size: int | None = None, frames_standardizer_path: str | Path = None, post_tfm_kwargs: dict | None = None, device: str | None = None) → None
Evaluate a trained model.
- Parameters:
model_config (dict) – Model configuration in a dict. Can be obtained by calling vak.config.ModelConfig.asdict().
dataset_config (dict) – Dataset configuration in a dict. Can be obtained by calling vak.config.DatasetConfig.asdict().
trainer_config (dict) – Configuration for lightning.pytorch.Trainer. Can be obtained by calling vak.config.TrainerConfig.asdict().
checkpoint_path (str, pathlib.Path) – Path to directory with checkpoint files saved by Torch, to reload model.
output_dir (str, pathlib.Path) – Path to location where .csv files with evaluation metrics should be saved.
num_workers (int) – Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader. Default is 2.
labelmap_path (str, pathlib.Path, optional) – Path to ‘labelmap.json’ file. Optional, default is None.
batch_size (int, optional) – Number of samples per batch fed into model. Optional, default is None.
split (str) – split of dataset on which model should be evaluated. One of {‘train’, ‘val’, ‘test’}. Default is ‘test’.
frames_standardizer_path (str, pathlib.Path) – path to a saved FramesStandardizer object used to standardize frames. If frames were standardized during training, and this is not provided, then evaluation will give incorrect results. Default is None.
post_tfm_kwargs (dict) – Keyword arguments to post-processing transform. If None, then no additional clean-up is applied when transforming labeled timebins to segments, the default behavior. The transform used is vak.transforms.frame_labels.PostProcess. Valid keyword argument names are 'majority_vote' and 'min_segment_dur', and the values should be appropriate for those arguments: a Boolean for majority_vote, a float value for min_segment_dur. See the docstring of the transform for more details on these arguments and how they work.
device (str) – Device on which to work with model + data. Defaults to ‘cuda’ if torch.cuda.is_available is True.
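Example

A minimal sketch of assembling the keyword arguments described above, assuming the three configuration dicts have already been obtained. The helper check_post_tfm_kwargs and all paths are hypothetical and not part of vak; the helper only mirrors the constraints stated for post_tfm_kwargs (allowed key names, a Boolean and a float value). The call to eval itself is left as a comment so that the snippet runs without vak installed.

```python
from pathlib import Path

# Key names taken from the parameter description above.
ALLOWED_POST_TFM_KEYS = {"majority_vote", "min_segment_dur"}


def check_post_tfm_kwargs(post_tfm_kwargs):
    """Hypothetical helper (not part of vak): validate post_tfm_kwargs."""
    if post_tfm_kwargs is None:
        return None  # no additional clean-up, the default behavior
    unknown = set(post_tfm_kwargs) - ALLOWED_POST_TFM_KEYS
    if unknown:
        raise ValueError(f"invalid post_tfm_kwargs keys: {sorted(unknown)}")
    if "majority_vote" in post_tfm_kwargs and not isinstance(
        post_tfm_kwargs["majority_vote"], bool
    ):
        raise TypeError("majority_vote should be a Boolean")
    if "min_segment_dur" in post_tfm_kwargs and not isinstance(
        post_tfm_kwargs["min_segment_dur"], float
    ):
        raise TypeError("min_segment_dur should be a float")
    return post_tfm_kwargs


# Placeholder paths and values, chosen only for illustration.
eval_kwargs = dict(
    checkpoint_path=Path("results/checkpoints"),
    output_dir=Path("results/eval"),
    num_workers=2,
    batch_size=1,
    post_tfm_kwargs=check_post_tfm_kwargs(
        {"majority_vote": True, "min_segment_dur": 0.02}
    ),
    device="cpu",
)

# With vak installed and the three config dicts in hand, the call would be:
# vak.eval.eval_.eval(model_config, dataset_config, trainer_config, **eval_kwargs)
```

Validating the dict before the call surfaces a misspelled key early, rather than after the model and dataset have been loaded.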
Notes
Note that unlike core.predict, this function can modify labelmap so that metrics like edit distance are correctly computed, by converting any string labels in labelmap with multiple characters to (mock) single-character labels, with vak.labels.multi_char_labels_to_single_char.
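The idea behind this conversion can be illustrated with a short sketch. It assumes that labelmap maps string labels to integer class indices, and the function to_single_char_labelmap is a hypothetical stand-in, not the implementation of vak.labels.multi_char_labels_to_single_char. Each multi-character label is swapped for an unused single character, so that every predicted segment contributes exactly one character to the string on which edit distance is computed.

```python
import string


def to_single_char_labelmap(labelmap):
    """Hypothetical stand-in: replace multi-character labels with
    unused single characters, keeping the integer class indices."""
    used = {label for label in labelmap if len(label) == 1}
    # Candidate replacement characters that do not clash with existing labels.
    spare = (c for c in string.ascii_letters + string.digits if c not in used)
    new_map = {}
    for label, index in labelmap.items():
        if len(label) == 1:
            new_map[label] = index
        else:
            new_map[next(spare)] = index
    return new_map


labelmap = {"a": 0, "b": 1, "unlabeled": 2}
single = to_single_char_labelmap(labelmap)
print(single)
```

Without such a mapping, a single segment labeled "unlabeled" would add nine characters to the label string, and one misclassified segment would be counted as several edits.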