Configuration files
This document contains the specification for the .toml configuration files used when running vak commands through the command-line interface, as described in vak command-line interface.
A .toml configuration file is split up into sections. The sections and their valid options are represented in the vak code by classes. To ensure that the code and this documentation do not go out of sync, the options are presented below exactly as documented in the code for each class.
Valid section names
Following is the set of valid section names: {eval, learncurve, predict, prep, train}. In the code, these names correspond to attributes of the main Config class, as shown below.
The only other valid section name is the name of a class representing a neural network. For such sections to be recognized as valid, the model must be installed via the vak.models entry point, so that it can be recognized by the function vak.config.validators.is_valid_model_name.
- class vak.config.config.Config(prep=None, train=None, eval=None, predict=None, learncurve=None)
Class that represents the TOML configuration file used with the vak command-line interface.
- prep
Represents the [vak.prep] table of the config.toml file.
- train
Represents the [vak.train] table of the config.toml file.
- eval
Represents the [vak.eval] table of the config.toml file.
- predict
Represents the [vak.predict] table of the config.toml file.
- learncurve
Represents the [vak.learncurve] table of the config.toml file.
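The naming rule for sections can be sketched in a few lines of Python. This is only an illustration, not vak's own validation code: the helper invalid_section_names is hypothetical, and "TweetyNet" merely stands in for a model installed via the vak.models entry point. In vak itself, model names are checked by vak.config.validators.is_valid_model_name.

```python
# Section names taken from the set listed in this document.
VALID_SECTION_NAMES = {"eval", "learncurve", "predict", "prep", "train"}


def invalid_section_names(section_names, installed_models=()):
    """Return, sorted, the names that are neither a standard section
    nor one of the (hypothetical) installed model names."""
    allowed = VALID_SECTION_NAMES | set(installed_models)
    return sorted(name for name in section_names if name not in allowed)


# "TweetyNet" stands in for a model installed via the vak.models entry point;
# "trian" is a deliberate typo that should be reported.
names = ["prep", "train", "TweetyNet", "trian"]
print(invalid_section_names(names, installed_models=["TweetyNet"]))
```

A check like this would report only the misspelled name, since the model name is accepted once it is in the set of installed models.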
Valid Options by Section
Each section of the .toml config has a set of option names that are considered valid. Valid options for each section are presented below.
[vak.prep] section
- class vak.config.prep.PrepConfig(data_dir, output_dir, dataset_type, input_type, audio_format=None, spect_format=None, spect_params=None, annot_file=None, annot_format=None, labelset=None, audio_dask_bag_kwargs=None, train_dur=None, val_dur=None, test_dur=None, train_set_durs=None, num_replicates=None)
Class that represents the [vak.prep] table of the configuration file.
- output_dir
Path to location where data sets should be saved. Default is None, in which case data sets are saved in the current working directory.
- Type:
- dataset_type
String name of the type of dataset, e.g., ‘frame_classification’. Dataset types are defined by machine learning tasks, e.g., a ‘frame_classification’ dataset would be used with a vak.models.FrameClassificationModel model. Valid dataset types are defined as vak.prep.prep.DATASET_TYPES.
- Type:
- spect_format
Format of files containing spectrograms as 2-d matrices. One of {‘mat’, ‘npy’}.
- Type:
- spect_params
Parameters for Short-Time Fourier Transform and post-processing of spectrograms. Instance of the vak.config.SpectParamsConfig class. Optional, default is None.
- Type:
vak.config.SpectParamsConfig, optional
- annot_format
Format of annotations. Any format that can be used with the crowsetta library is valid.
- Type:
- annot_file
Path to a single annotation file. Default is None. Used when a single file contains annotations for multiple audio files.
- Type:
- labelset
The set of labels (str or int) that correspond to annotated segments that a network should learn to segment and classify. Note that if there are segments that are not annotated, e.g. silent gaps between songbird syllables, then vak will assign a dummy label to those segments, so you don’t have to give them a label here. The value for labelset is converted to a Python set using vak.config.converters.labelset_from_toml_value. See the help for that function for details on how to specify labelset.
- Type:
- audio_dask_bag_kwargs
Keyword arguments used when calling dask.bag.from_sequence inside vak.io.audio, where it is used to parallelize the conversion of audio files into spectrograms. The option should be specified in the config.toml file as an inline table, e.g., audio_dask_bag_kwargs = { npartitions = 20 }. Allows for finer-grained control when needed to process files of different sizes.
- Type:
- train_dur
Total duration of the training set, in seconds. When creating a learning curve, training subsets of shorter duration (specified by the ‘train_set_durs’ option, which appears in the PrepConfig signature above) are drawn from this set.
- Type:
[vak.prep.spect_params] section
- class vak.config.spect_params.SpectParamsConfig(fft_size=512, step_size=64, freq_cutoffs=None, thresh=None, transform_type=None, spect_key='s', freqbins_key='f', timebins_key='t', audio_path_key='audio_path')
Represents parameters for making spectrograms from audio and saving them in files.
- freq_cutoffs
Two elements, the lower and higher cutoff frequencies. Used to bandpass filter audio (using a Butterworth filter) before generating the spectrogram. Default is None, in which case no bandpass filtering is applied.
- Type:
- transform_type
One of {‘log_spect’, ‘log_spect_plus_one’}. ‘log_spect’ transforms the spectrogram to log(spectrogram), and ‘log_spect_plus_one’ does the same thing but adds one to each element. Default is None. If None, no transform is applied.
- Type:
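The three possible values of transform_type can be illustrated with a small pure-Python sketch. It is not vak's implementation: the helper apply_transform is hypothetical, the spectrogram is a plain list of rows rather than an array, and two details are assumptions, namely that the natural logarithm is used and that one is added to each element before the log is taken.

```python
import math


def apply_transform(spect, transform_type=None):
    """Apply the transform named by ``transform_type`` to a spectrogram
    given as a list of rows (frequency bins) of non-negative floats."""
    if transform_type is None:
        return spect  # no transform is applied
    if transform_type == "log_spect":
        # Assumption: natural log; the log base is not stated in this document.
        return [[math.log(value) for value in row] for row in spect]
    if transform_type == "log_spect_plus_one":
        # Assumption: one is added to each element before taking the log.
        return [[math.log(value + 1) for value in row] for row in spect]
    raise ValueError(f"invalid transform_type: {transform_type!r}")


spect = [[1.0, 2.0], [4.0, 8.0]]
print(apply_transform(spect, "log_spect_plus_one"))
```

Under these assumptions, adding one keeps the result finite for elements equal to zero, which a plain log would not.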
[vak.train] section
- class vak.config.train.TrainConfig(model, num_epochs, batch_size, root_results_dir, dataset: DatasetConfig, trainer: TrainerConfig, results_dirname=None, standardize_frames=False, num_workers=2, shuffle=True, val_step=None, ckpt_step=None, patience=None, checkpoint_path=None, frames_standardizer_path=None)
Class that represents the [vak.train] table of the configuration file.
- model
The model to use: its name, and the parameters to configure it. Must be an instance of vak.config.ModelConfig.
- Type:
vak.config.ModelConfig
- num_epochs
Number of training epochs. One epoch = one iteration through the entire training set.
- Type:
- root_results_dir
Directory in which results will be created. The vak.cli.train function will create a subdirectory in this directory each time it runs.
- Type:
- dataset
The dataset to use: the path to it, and optionally a path to a file representing splits, and the name, if it is a built-in dataset. Must be an instance of vak.config.DatasetConfig.
- Type:
vak.config.DatasetConfig
- trainer
Configuration for lightning.pytorch.Trainer. Must be an instance of vak.config.TrainerConfig.
- Type:
vak.config.TrainerConfig
- num_workers
Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader.
- Type:
- standardize_frames
If True, use vak.transforms.FramesStandardizer to standardize the frames. Normalization is done by subtracting the mean for each frequency bin (row) of the training set and then dividing by the standard deviation for that frequency bin. This same normalization is then applied to validation and test data.
- Type:
- val_step
Step on which to estimate accuracy using the validation set. If val_step is n, then validation is carried out every time the global step / n is a whole number, i.e., when the global step modulo val_step is 0. Default is None, in which case no validation is done.
- Type:
- ckpt_step
Step on which to save to checkpoint file. If ckpt_step is n, then a checkpoint is saved every time the global step / n is a whole number, i.e., when the global step modulo ckpt_step is 0. Default is None, in which case a checkpoint is only saved at the last epoch.
- Type:
- patience
Number of validation steps to wait without performance on the validation set improving before stopping the training. Default is None, in which case training only stops after the specified number of epochs.
- Type:
- checkpoint_path
Path to directory with checkpoint files saved by Torch, to reload model. Default is None, in which case a new model is initialized.
- Type:
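The step-based options val_step, ckpt_step, and patience can be sketched as follows. This is a minimal illustration and not vak's training loop: both helpers are hypothetical, the global step is assumed to count from 1, and "waiting" is assumed to mean consecutive validation steps without a new best accuracy.

```python
def steps_that_fire(total_steps, every=None):
    """Return the global steps on which an action runs when it is
    configured to run every ``every`` steps (None disables it)."""
    if every is None:
        return []
    # The action runs whenever the global step is a multiple of ``every``.
    return [step for step in range(1, total_steps + 1) if step % every == 0]


def early_stop_index(val_accuracies, patience=None):
    """Return the index of the validation step at which training would
    stop early, or None if it runs for the full number of epochs."""
    if patience is None:
        return None
    best = float("-inf")
    waited = 0
    for index, accuracy in enumerate(val_accuracies):
        if accuracy > best:
            best = accuracy
            waited = 0  # improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return index
    return None


print(steps_that_fire(total_steps=10, every=4))  # e.g. val_step = 4
print(early_stop_index([0.5, 0.6, 0.6, 0.55, 0.7], patience=2))
```

With val_step = 4 and ten global steps, validation would run on steps 4 and 8; with patience = 2, the accuracies shown would stop training at the fourth validation step (index 3), before the later improvement is seen.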
[vak.eval] section
- class vak.config.eval.EvalConfig(checkpoint_path, output_dir, model, batch_size, dataset: DatasetConfig, trainer: TrainerConfig, labelmap_path=None, frames_standardizer_path=None, post_tfm_kwargs: dict | None = None, num_workers=2)
Class that represents the [vak.eval] table in the configuration file.
- model
The model to use: its name, and the parameters to configure it. Must be an instance of vak.config.ModelConfig.
- Type:
vak.config.ModelConfig
- dataset
The dataset to use: the path to it, and optionally a path to a file representing splits, and the name, if it is a built-in dataset. Must be an instance of vak.config.DatasetConfig.
- Type:
vak.config.DatasetConfig
- trainer
Configuration for lightning.pytorch.Trainer. Must be an instance of vak.config.TrainerConfig.
- Type:
vak.config.TrainerConfig
- num_workers
Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader. Default is 2.
- Type:
- frames_standardizer_path
Path to a saved vak.transforms.FramesStandardizer object used to standardize (normalize) frames. If spectrograms were normalized during training and this path is not provided, evaluation will give incorrect results.
- Type:
- post_tfm_kwargs
Keyword arguments to post-processing transform. If None, then no additional clean-up is applied when transforming labeled timebins to segments, the default behavior. The transform used is vak.transforms.frame_labels.PostProcess. Valid keyword argument names are 'majority_vote' and 'min_segment_dur', and their values should be appropriate for those arguments: a Boolean for majority_vote, a float value for min_segment_dur. See the docstring of the transform for more details on these arguments and how they work.
- Type:
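What the two post_tfm_kwargs arguments do can be sketched on a sequence of per-frame labels. This is an illustration only, not the code of the vak transform: the function post_process and its signature are hypothetical, the 'unlabeled' class is assumed to be the integer 0, a segment is taken to be a run of consecutive frames that are not unlabeled, and applying min_segment_dur before the majority vote is an assumed order.

```python
from collections import Counter
from itertools import groupby

UNLABELED = 0  # assumption: integer standing for the 'unlabeled' class


def post_process(frame_labels, timebin_dur, majority_vote=False,
                 min_segment_dur=None, unlabeled=UNLABELED):
    """Clean up per-frame labels, one segment at a time."""
    cleaned = []
    # Split the frames into runs that are entirely unlabeled or entirely labeled.
    for is_unlabeled, run in groupby(frame_labels, key=lambda lbl: lbl == unlabeled):
        run = list(run)
        if is_unlabeled:
            cleaned.extend(run)
            continue
        if min_segment_dur is not None and len(run) * timebin_dur < min_segment_dur:
            # Segment is too short: remove it by marking its frames unlabeled.
            cleaned.extend([unlabeled] * len(run))
            continue
        if majority_vote:
            # Give every frame in the segment its most frequent label.
            winner = Counter(run).most_common(1)[0][0]
            run = [winner] * len(run)
        cleaned.extend(run)
    return cleaned


frames = [0, 1, 1, 2, 1, 0, 0, 3, 0]
print(post_process(frames, timebin_dur=0.002, majority_vote=True, min_segment_dur=0.004))
```

In this example the four-frame segment becomes all 1s by majority vote, and the single-frame segment labeled 3 is removed because it lasts less than min_segment_dur. The sketch also shows why an 'unlabeled' class is needed: without it there would be no boundaries between segments.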
[vak.predict] section
- class vak.config.predict.PredictConfig(checkpoint_path, labelmap_path, model, batch_size, dataset: DatasetConfig, trainer: TrainerConfig, frames_standardizer_path=None, num_workers=2, annot_csv_filename=None, output_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/vak/checkouts/latest/doc'), min_segment_dur=None, majority_vote=True, save_net_outputs=False)
Class that represents the [vak.predict] table of the configuration file.
- checkpoint_path
Path to directory with checkpoint files saved by Torch, to reload model.
- Type:
str
- labelmap_path
Path to ‘labelmap.json’ file.
- Type:
str
- model
The model to use: its name, and the parameters to configure it. Must be an instance of vak.config.ModelConfig.
- Type:
vak.config.ModelConfig
- batch_size
Number of samples per batch presented to the model.
- Type:
int
- dataset
The dataset to use: the path to it, and optionally a path to a file representing splits, and the name, if it is a built-in dataset. Must be an instance of vak.config.DatasetConfig.
- Type:
vak.config.DatasetConfig
- trainer
Configuration for lightning.pytorch.Trainer. Must be an instance of vak.config.TrainerConfig.
- Type:
vak.config.TrainerConfig
- num_workers
Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader. Default is 2.
- Type:
int
- frames_standardizer_path
Path to a saved vak.transforms.FramesStandardizer object used to standardize (normalize) frames. If spectrograms were normalized during training and this path is not provided, prediction will give incorrect results.
- Type:
str
- annot_csv_filename
Name of .csv file containing predicted annotations. Default is None, in which case the name of the dataset .csv is used, with ‘.annot.csv’ appended to it.
- Type:
str
- output_dir
Path to location where the .csv containing predicted annotations should be saved. Defaults to the current working directory (the PosixPath shown in the signature above is the working directory of the documentation build).
- Type:
str
- min_segment_dur
Minimum duration of segment, in seconds. If specified, then any segment with a duration less than min_segment_dur is removed from lbl_tb. Default is None, in which case no segments are removed.
- Type:
float
- majority_vote
If True, transform segments containing multiple labels into segments with a single label by taking a “majority vote”, i.e. assign all time bins in the segment the most frequently occurring label in the segment. This transform can only be applied if the labelmap contains an ‘unlabeled’ label, because unlabeled segments make it possible to identify the labeled segments. Default is True.
- Type:
bool
- save_net_outputs
If True, save ‘raw’ outputs of neural networks before they are converted to annotations. Default is False. Typically the output will be “logits” to which a softmax transform might be applied. For each item in the dataset (each row in the dataset_path .csv), the output will be saved in a separate file in output_dir, with the extension {MODEL_NAME}.output.npz. E.g., if the input is a spectrogram with spect_path filename gy6or6_032312_081416.npz, and the network is TweetyNet, then the net output file will be gy6or6_032312_081416.tweetynet.output.npz.
- Type:
bool
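The file-naming rule for save_net_outputs can be sketched with pathlib. The helper net_output_path is hypothetical and not part of vak; lowercasing the model name is inferred from the TweetyNet example above and should be treated as an assumption.

```python
from pathlib import Path


def net_output_path(spect_path, model_name, output_dir):
    """Build the path of the file that would hold a network's raw outputs."""
    stem = Path(spect_path).stem  # file name without its final extension
    # Assumption: the model name is lowercased, as in the TweetyNet example.
    return Path(output_dir) / f"{stem}.{model_name.lower()}.output.npz"


print(net_output_path("gy6or6_032312_081416.npz", "TweetyNet", "."))
```

For the spectrogram file and model named in the description, this reproduces the file name gy6or6_032312_081416.tweetynet.output.npz.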
[vak.learncurve] section
- class vak.config.learncurve.LearncurveConfig(model, num_epochs, batch_size, root_results_dir, dataset: DatasetConfig, trainer: TrainerConfig, results_dirname=None, standardize_frames=False, num_workers=2, shuffle=True, val_step=None, ckpt_step=None, patience=None, checkpoint_path=None, frames_standardizer_path=None, post_tfm_kwargs: dict | None = None)
Class that represents the [vak.learncurve] table in the configuration file.
- model
The model to use: its name, and the parameters to configure it. Must be an instance of vak.config.ModelConfig.
- Type:
vak.config.ModelConfig
- num_epochs
Number of training epochs. One epoch = one iteration through the entire training set.
- Type:
- root_results_dir
Directory in which results will be created. The vak.cli.train function will create a subdirectory in this directory each time it runs.
- Type:
- dataset
The dataset to use: the path to it, and optionally a path to a file representing splits, and the name, if it is a built-in dataset. Must be an instance of vak.config.DatasetConfig.
- Type:
vak.config.DatasetConfig
- trainer
Configuration for lightning.pytorch.Trainer. Must be an instance of vak.config.TrainerConfig.
- Type:
vak.config.TrainerConfig
- num_workers
Number of processes to use for parallel loading of data. Argument to torch.utils.data.DataLoader.
- Type:
- standardize_frames
If True, use vak.transforms.FramesStandardizer to standardize the frames. Normalization is done by subtracting the mean for each frequency bin (row) of the training set and then dividing by the standard deviation for that frequency bin. This same normalization is then applied to validation and test data.
- Type:
- val_step
Step on which to estimate accuracy using the validation set. If val_step is n, then validation is carried out every time the global step / n is a whole number, i.e., when the global step modulo val_step is 0. Default is None, in which case no validation is done.
- Type:
- ckpt_step
Step/epoch at which to save to checkpoint file. Default is None, in which case a checkpoint is only saved at the last epoch.
- Type:
- patience
Number of epochs to wait without the error dropping before stopping the training. Default is None, in which case training continues for num_epochs.
- Type:
- post_tfm_kwargs
Keyword arguments to post-processing transform. If None, then no additional clean-up is applied when transforming labeled timebins to segments, the default behavior. The transform used is vak.transforms.frame_labels.ToSegmentsWithPostProcessing. Valid keyword argument names are 'majority_vote' and 'min_segment_dur', and their values should be appropriate for those arguments: a Boolean for majority_vote, a float value for min_segment_dur. See the docstring of the transform for more details on these arguments and how they work.
- Type:
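The normalization described for standardize_frames can be sketched in pure Python. This is not the API of vak.transforms.FramesStandardizer: the two helpers are hypothetical, spectrograms are plain lists of rows with one row per frequency bin, and the population standard deviation is an assumed choice. The point of the sketch is that the statistics are computed from the training set only and then reused for other data.

```python
from statistics import mean, pstdev


def fit_standardizer(train_spect):
    """Compute the per-frequency-bin mean and standard deviation from a
    training spectrogram given as a list of rows, one per frequency bin."""
    means = [mean(row) for row in train_spect]
    # Assumption: population standard deviation; a bin with zero variance
    # would need special handling to avoid division by zero.
    stds = [pstdev(row) for row in train_spect]
    return means, stds


def standardize(spect, means, stds):
    """Subtract each bin's training mean and divide by its training std."""
    return [
        [(value - m) / s for value in row]
        for row, m, s in zip(spect, means, stds)
    ]


train = [[1.0, 3.0], [2.0, 6.0]]
means, stds = fit_standardizer(train)

# Validation data is standardized with the statistics of the training set.
val = [[2.0, 4.0], [4.0, 8.0]]
print(standardize(val, means, stds))
```

Reusing the training statistics is also why frames_standardizer_path matters at evaluation and prediction time: a model trained on standardized frames expects its inputs to be scaled the same way.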