vak.eval.frame_classification.eval_frame_classification_model

vak.eval.frame_classification.eval_frame_classification_model(model_config: dict, dataset_config: dict, trainer_config: dict, checkpoint_path: str | Path, labelmap_path: str | Path, output_dir: str | Path, num_workers: int, frames_standardizer_path: str | Path | None = None, post_tfm_kwargs: dict | None = None) → None

Evaluate a trained model.

Parameters:
  • model_config (dict) – Model configuration in a dict. Can be obtained by calling vak.config.ModelConfig.asdict().

  • dataset_config (dict) – Dataset configuration in a dict. Can be obtained by calling vak.config.DatasetConfig.asdict().

  • trainer_config (dict) – Configuration for lightning.pytorch.Trainer. Can be obtained by calling vak.config.TrainerConfig.asdict().

  • checkpoint_path (str, pathlib.Path) – Path to directory with checkpoint files saved by Torch, used to reload the model.

  • output_dir (str, pathlib.Path) – Path to location where .csv files with evaluation metrics should be saved.

  • labelmap_path (str, pathlib.Path) – Path to ‘labelmap.json’ file.

  • num_workers (int) – Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader. The signature gives no default, so this argument is required.

  • frames_standardizer_path (str, pathlib.Path, or None) – Path to a saved vak.transforms.FramesStandardizer object used to standardize (normalize) frames. If frames were standardized during training and this path is not provided, evaluation will give incorrect results. Default is None.

  • post_tfm_kwargs (dict) – Keyword arguments to the post-processing transform, vak.transforms.frame_labels.PostProcess. If None (the default), no additional clean-up is applied when transforming labeled timebins to segments. Valid keyword argument names are 'majority_vote' and 'min_segment_dur', and the values should be appropriate for those arguments: a Boolean for majority_vote, a float for min_segment_dur. See the docstring of the transform for more details on these arguments and how they work.
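As a minimal sketch of how the path, worker, and post-processing arguments described above might be assembled before a call, the example below builds a keyword dict and checks post_tfm_kwargs against the two argument names the documentation lists. The helper check_post_tfm_kwargs and every path shown are hypothetical placeholders, not part of vak; the three config dicts are left out because they come from the asdict() calls named above.

```python
from pathlib import Path

# The two keyword names the parameter description lists as valid.
VALID_POST_TFM_KEYS = {"majority_vote", "min_segment_dur"}


def check_post_tfm_kwargs(post_tfm_kwargs):
    """Hypothetical helper (not part of vak): validate post_tfm_kwargs."""
    if post_tfm_kwargs is None:
        return None  # default: no additional clean-up is applied
    unknown = set(post_tfm_kwargs) - VALID_POST_TFM_KEYS
    if unknown:
        raise ValueError(f"Invalid post_tfm_kwargs keys: {sorted(unknown)}")
    if "majority_vote" in post_tfm_kwargs and not isinstance(
        post_tfm_kwargs["majority_vote"], bool
    ):
        raise TypeError("majority_vote must be a bool")
    if "min_segment_dur" in post_tfm_kwargs and not isinstance(
        post_tfm_kwargs["min_segment_dur"], float
    ):
        raise TypeError("min_segment_dur must be a float")
    return dict(post_tfm_kwargs)


# All paths below are placeholders chosen for illustration only.
eval_kwargs = dict(
    checkpoint_path=Path("results/checkpoints"),
    labelmap_path=Path("results/labelmap.json"),
    output_dir=Path("results/eval"),
    num_workers=2,
    frames_standardizer_path=Path("results/FramesStandardizer"),
    post_tfm_kwargs=check_post_tfm_kwargs(
        {"majority_vote": True, "min_segment_dur": 0.02}
    ),
)

# With vak installed and the three config dicts in hand, the call would
# presumably look like this (left commented out so the sketch runs alone):
# vak.eval.frame_classification.eval_frame_classification_model(
#     model_config=model_config,
#     dataset_config=dataset_config,
#     trainer_config=trainer_config,
#     **eval_kwargs,
# )
```

Validating the dict up front is a design choice of this sketch: it surfaces a misspelled key before any model or data is loaded.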

Notes

Unlike predict(), this function can modify labelmap so that metrics like edit distance are computed correctly: it converts any multi-character string labels in labelmap to (mock) single-character labels, using vak.labels.multi_char_labels_to_single_char().
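To illustrate why single-character labels matter for string metrics such as edit distance, here is a small sketch of the idea. It is an illustrative re-implementation under assumed behavior, not the code of vak.labels.multi_char_labels_to_single_char(); the function name to_single_char_labelmap and the example labelmap are hypothetical.

```python
import string


def to_single_char_labelmap(labelmap):
    """Illustrative only: replace multi-character labels with unused
    single characters, keeping the integer class each label maps to."""
    candidates = iter(string.ascii_letters + string.digits)
    used = {label for label in labelmap if len(label) == 1}
    converted = {}
    for label, index in labelmap.items():
        if len(label) == 1:
            converted[label] = index
            continue
        # Take the next candidate character not already used as a label.
        mock = next(c for c in candidates if c not in used)
        used.add(mock)
        converted[mock] = index
    return converted


labelmap = {"a": 0, "b": 1, "intro": 2, "call": 3}
single = to_single_char_labelmap(labelmap)

# A sequence of four predicted segments, given as class indices.
segment_classes = [0, 2, 3, 1]

inverse_original = {index: label for label, index in labelmap.items()}
inverse_single = {index: label for label, index in single.items()}

as_original = "".join(inverse_original[i] for i in segment_classes)
as_single = "".join(inverse_single[i] for i in segment_classes)
```

With the original labels, the four segments join into an 11-character string, so a character-level edit distance would count several edits for one mislabeled segment. With the mock labels, each segment is exactly one character, so one mislabeled segment costs one edit.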