vak.train.parametric_umap.train_parametric_umap_model

vak.train.parametric_umap.train_parametric_umap_model(model_config: dict, dataset_config: dict, trainer_config: dict, batch_size: int, num_epochs: int, num_workers: int, checkpoint_path: str | Path | None = None, root_results_dir: str | Path | None = None, results_path: str | Path | None = None, shuffle: bool = True, val_step: int | None = None, ckpt_step: int | None = None, subset: str | None = None) → None

Train a model from the parametric UMAP family and save results.

Saves checkpoint files for model, label map, and spectrogram scaler. These are saved either in results_path if specified, or a new directory made inside root_results_dir.
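The choice between results_path and root_results_dir described above can be sketched as follows. This is only an illustration, not vak's actual code: the helper name resolve_results_path, the timestamped directory name, and the ValueError raised when neither argument is given are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_results_path(
    results_path: Optional[PathLike] = None,
    root_results_dir: Optional[PathLike] = None,
) -> Path:
    """Hypothetical helper: pick the directory where results would be saved.

    results_path takes precedence; otherwise a new directory name is
    built inside root_results_dir. Nothing is created on disk here.
    """
    if results_path is not None:
        # An explicit results_path overrides root_results_dir.
        return Path(results_path)
    if root_results_dir is None:
        # Assumption: the documentation does not say what happens in this case.
        raise ValueError("specify either results_path or root_results_dir")
    # Assumption: the new directory is named with a timestamp.
    timestamp = datetime.now().strftime("%y%m%d_%H%M%S")
    return Path(root_results_dir) / f"results_{timestamp}"
```

Calling resolve_results_path("runs/exp1", "runs") returns Path("runs/exp1"), while resolve_results_path(None, "runs") returns a new path inside runs.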

Parameters:
  • model_config (dict) – Model configuration in a dict. Can be obtained by calling vak.config.ModelConfig.asdict().

  • dataset_config (dict) – Dataset configuration in a dict. Can be obtained by calling vak.config.DatasetConfig.asdict().

  • trainer_config (dict) – Configuration for lightning.pytorch.Trainer in a dict. Can be obtained by calling vak.config.TrainerConfig.asdict().

  • batch_size (int) – Number of samples per batch presented to models during training.

  • num_epochs (int) – Number of training epochs. One epoch is one pass through the entire training set.

  • num_workers (int) – Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader.

  • checkpoint_path (str, pathlib.Path, optional) – Path to a checkpoint file, e.g., one generated by a previous run of vak.core.train. If specified, this checkpoint will be loaded into the model. Used when continuing training. Default is None, in which case a new model is initialized.

  • root_results_dir (str, pathlib.Path, optional) – Root directory in which a new directory will be created where results will be saved.

  • results_path (str, pathlib.Path, optional) – Directory where results will be saved. If specified, this parameter overrides root_results_dir.

  • val_step (int, optional) – Computes the loss on the validation set every val_step epochs. Default is None, in which case no validation is done.

  • ckpt_step (int, optional) – Step on which to save a checkpoint file. If ckpt_step is n, then a checkpoint is saved every time the global step / n is a whole number, i.e., when the global step modulo ckpt_step is 0. Default is None, in which case a checkpoint is saved only at the last epoch.

  • shuffle (bool) – If True, shuffle training data before each epoch. Default is True.

  • subset (str, optional) – Name of a subset of the training data, from the dataset specified by dataset_config, to use when training the model. Default is None. This parameter is used by vak.learncurve.learncurve to specify subsets of the training set to use when training models for a learning curve.
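The rule for ckpt_step can be made concrete with a small sketch. The function is_checkpoint_step below is a hypothetical name used only for illustration and is not part of vak; it encodes nothing more than the rule stated in the parameter description, that a checkpoint is saved when the global step divided by ckpt_step is a whole number.

```python
from typing import Optional


def is_checkpoint_step(global_step: int, ckpt_step: Optional[int]) -> bool:
    """Hypothetical helper: True if a periodic checkpoint falls on global_step."""
    if ckpt_step is None:
        # No periodic checkpoints; only the one at the last epoch is saved.
        return False
    # global_step / ckpt_step is a whole number exactly when the remainder is 0.
    return global_step % ckpt_step == 0


# Steps in 1..10 on which a checkpoint would be saved with ckpt_step=5.
steps = [step for step in range(1, 11) if is_checkpoint_step(step, 5)]
print(steps)  # [5, 10]
```

With ckpt_step=None the helper never returns True, which mirrors the documented default of saving only at the last epoch.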