vak.eval.frame_classification.eval_frame_classification_model
- vak.eval.frame_classification.eval_frame_classification_model(model_config: dict, dataset_config: dict, trainer_config: dict, checkpoint_path: str | Path, labelmap_path: str | Path, output_dir: str | Path, num_workers: int, frames_standardizer_path: str | Path = None, post_tfm_kwargs: dict | None = None) -> dict[str, Tensor]
Evaluate a trained model.
- Parameters:
model_config (dict) – Model configuration in a dict. Can be obtained by calling vak.config.ModelConfig.asdict().
dataset_config (dict) – Dataset configuration in a dict. Can be obtained by calling vak.config.DatasetConfig.asdict().
trainer_config (dict) – Configuration for lightning.pytorch.Trainer. Can be obtained by calling vak.config.TrainerConfig.asdict().
checkpoint_path (str, pathlib.Path) – Path to directory with checkpoint files saved by Torch, used to reload the model.
output_dir (str, pathlib.Path) – Path to location where .csv files with evaluation metrics should be saved.
labelmap_path (str, pathlib.Path) – Path to ‘labelmap.json’ file.
num_workers (int) – Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader. Default is 2.
frames_standardizer_path (str, pathlib.Path) – Path to a saved vak.transforms.FramesStandardizer object used to standardize (normalize) frames. If frames were standardized during training and this is not provided, the results will be incorrect. Default is None.
post_tfm_kwargs (dict) – Keyword arguments to the post-processing transform. If None, then no additional clean-up is applied when transforming labeled timebins to segments, the default behavior. The transform used is vak.transforms.frame_labels.PostProcess. Valid keyword argument names are 'majority_vote' and 'min_segment_dur', and the values should be appropriate for those arguments: a Boolean for majority_vote, a float value for min_segment_dur. See the docstring of the transform for more details on these arguments and how they work.
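The parameters above could be assembled roughly as follows. This is a minimal sketch, not code from vak: the helper names `check_post_tfm_kwargs` and `evaluate` are hypothetical, every path and the `num_workers` value are placeholders, and the check only mirrors the key names and value types stated in the parameter description. The import of vak happens inside `evaluate`, so the checking helper can be tried without vak installed.

```python
from __future__ import annotations

# Key names and value types as stated in the post_tfm_kwargs description.
VALID_POST_TFM_KWARGS = {"majority_vote": bool, "min_segment_dur": float}


def check_post_tfm_kwargs(post_tfm_kwargs: dict | None) -> dict | None:
    """Hypothetical helper: reject unknown keys or wrongly typed values."""
    if post_tfm_kwargs is None:
        # None means no additional clean-up is applied (the default).
        return None
    for name, value in post_tfm_kwargs.items():
        if name not in VALID_POST_TFM_KWARGS:
            raise ValueError(f"invalid post_tfm_kwargs key: {name!r}")
        expected = VALID_POST_TFM_KWARGS[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"{name} should be {expected.__name__}, got {type(value).__name__}"
            )
    return post_tfm_kwargs


def evaluate(model_config: dict, dataset_config: dict, trainer_config: dict):
    """Hypothetical wrapper; the config dicts come from the asdict() calls above."""
    # Imported here so that the rest of this sketch runs without vak installed.
    from vak.eval.frame_classification import eval_frame_classification_model

    return eval_frame_classification_model(
        model_config=model_config,
        dataset_config=dataset_config,
        trainer_config=trainer_config,
        checkpoint_path="path/to/checkpoints",          # placeholder path
        labelmap_path="path/to/labelmap.json",          # placeholder path
        output_dir="path/to/eval_output",               # placeholder path
        num_workers=2,                                  # placeholder value
        frames_standardizer_path="path/to/standardizer",  # placeholder; needed if frames were standardized in training
        post_tfm_kwargs=check_post_tfm_kwargs(
            {"majority_vote": True, "min_segment_dur": 0.02}
        ),
    )
```

Passing every argument by keyword keeps the call readable, since output_dir and labelmap_path appear in a different order in the signature than in the parameter list.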
Notes
Note that unlike predict(), this function can modify labelmap so that metrics like edit distance are correctly computed, by converting any string labels in labelmap with multiple characters to (mock) single-character labels, with vak.labels.multi_char_labels_to_single_char().
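To illustrate why this conversion matters: edit distance compares sequences character by character, so a label such as "ch" would otherwise count as two symbols. The sketch below is an illustration only, not the implementation of vak.labels.multi_char_labels_to_single_char(). It assumes that labelmap is a dict mapping string labels to integers, and the function name `to_single_char_labels` and the choice of stand-in characters are hypothetical.

```python
from __future__ import annotations

import string


def to_single_char_labels(labelmap: dict[str, int]) -> dict[str, int]:
    """Replace each multi-character label with an unused single character."""
    # Single-character labels already present must not be reused as stand-ins.
    used = {label for label in labelmap if len(label) == 1}
    candidates = (
        char for char in string.ascii_letters + string.digits if char not in used
    )
    converted = {}
    for label, index in labelmap.items():
        if len(label) == 1:
            converted[label] = index
            continue
        stand_in = next(candidates, None)
        if stand_in is None:
            raise ValueError("ran out of single-character stand-in labels")
        converted[stand_in] = index
    return converted


# Hypothetical labelmap with two multi-character labels.
labelmap = {"a": 0, "bb": 1, "ch": 2}
single_char_labelmap = to_single_char_labels(labelmap)
```

The integer values are left untouched, so the class indices still line up with the model's outputs. Only the strings used when computing string-based metrics change.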