vak.train.parametric_umap.train_parametric_umap_model

Train a model from the parametric UMAP family and save results.

Saves checkpoint files for the model, label map, and spectrogram scaler. These are saved in results_path if it is specified; otherwise, they are saved in a new directory created inside root_results_dir.
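The choice between the two save locations can be sketched as follows. This is a minimal illustration, not vak's implementation: the helper name `resolve_results_dir`, the fallback to the current directory, and the timestamped `results_...` directory name are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path


def resolve_results_dir(root_results_dir=None, results_path=None):
    """Hypothetical helper: pick the directory where results are saved."""
    if results_path is not None:
        # results_path, when given, takes precedence over root_results_dir.
        results_dir = Path(results_path)
    else:
        # Otherwise make a new directory inside root_results_dir.
        # The fallback to the current directory and the timestamped
        # directory name are assumptions of this sketch.
        root = Path(root_results_dir) if root_results_dir is not None else Path(".")
        timestamp = datetime.now().strftime("%y%m%d_%H%M%S")
        results_dir = root / f"results_{timestamp}"
    # Creating the directory here is also an assumption of this sketch.
    results_dir.mkdir(parents=True, exist_ok=True)
    return results_dir
```

Passing both arguments returns the explicit results_path unchanged, which mirrors the documented rule that results_path overrides root_results_dir.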

Parameters:

model_name (str) – Model name, must be one of vak.models.registry.MODEL_NAMES.
model_config (dict) – Model configuration in a dict, as loaded from a .toml file, and used by the model method from_config.
dataset_path (str) – Path to dataset, a directory generated by running vak prep.
batch_size (int) – Number of samples per batch presented to models during training.
num_epochs (int) – Number of training epochs. One epoch = one iteration through the entire training set.
num_workers (int) – Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader.
train_dataset_params (dict, optional) – Parameters for training dataset. Passed as keyword arguments to vak.datasets.parametric_umap.ParametricUMAP. Optional, default is None.
val_dataset_params (dict, optional) – Parameters for validation dataset. Passed as keyword arguments to vak.datasets.parametric_umap.ParametricUMAP. Optional, default is None.
checkpoint_path (str, pathlib.Path, optional) – Path to a checkpoint file, e.g., one generated by a previous run of vak.core.train. If specified, this checkpoint will be loaded into the model. Used when continuing training. Default is None, in which case a new model is initialized.
root_results_dir (str, pathlib.Path, optional) – Root directory in which a new directory will be created where results will be saved.
results_path (str, pathlib.Path, optional) – Directory where results will be saved. If specified, this parameter overrides root_results_dir.
val_step (int) – Compute the loss on the validation set every val_step epochs. Default is None, in which case no validation is done.
ckpt_step (int) – Step on which to save to checkpoint file. If ckpt_step is n, then a checkpoint is saved every time the global step / n is a whole number, i.e., when the global step modulo ckpt_step is 0. Default is None, in which case a checkpoint is only saved at the last epoch.
device (str) – Device on which to work with model + data. Default is None. If None, then a device will be selected with vak.split.get_default. That function defaults to ‘cuda’ if torch.cuda.is_available() returns True.
shuffle (bool) – If True, shuffle training data before each epoch. Default is True.
split (str) – Name of split from dataset found at dataset_path to use when training model. Default is ‘train’. This parameter is used by vak.learncurve.learncurve to select specific subsets of the training set when training models for a learning curve.
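
The ckpt_step rule described above amounts to a modulo test on the global step. The snippet below is a minimal sketch of that test only; the function name `should_save_checkpoint` is hypothetical and is not part of vak.

```python
def should_save_checkpoint(global_step, ckpt_step):
    """Return True when an intermediate checkpoint is due at this global step."""
    if ckpt_step is None:
        # With the default of None, no intermediate checkpoints are saved;
        # the only checkpoint is the one written at the last epoch.
        return False
    # A checkpoint is due when global_step / ckpt_step is a whole number.
    return global_step % ckpt_step == 0


# Global steps in 1..10 at which a checkpoint would be saved for ckpt_step=4.
saved_at = [step for step in range(1, 11) if should_save_checkpoint(step, 4)]
print(saved_at)  # [4, 8]
```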