vak.prep.split.algorithms.validate.validate_split_durations

vak.prep.split.algorithms.validate.validate_split_durations(train_dur, val_dur, test_dur, dataset_dur)

Helper function that validates the durations specified for splits, so that other functions can do the actual splitting.

First, the function checks for invalid conditions:

If train_dur, val_dur, and test_dur are all None, a ValueError is raised.
If any of train_dur, val_dur, or test_dur has a negative value other than -1, a ValueError is raised. -1 is interpreted differently, as explained below.
If only val_dur is specified, a ValueError is raised, because it is not clear what the durations of the training and test sets should be.
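
The three checks above can be sketched as follows. This is only an illustration written from the description on this page, not the library's source; the helper name `check_invalid_conditions` and the error messages are hypothetical.

```python
def check_invalid_conditions(train_dur, val_dur, test_dur):
    """Hypothetical helper mirroring the three invalid conditions listed above."""
    durs = {"train_dur": train_dur, "val_dur": val_dur, "test_dur": test_dur}

    # 1. At least one split duration must be specified.
    if all(dur is None for dur in durs.values()):
        raise ValueError("train_dur, val_dur, and test_dur were all None")

    # 2. A negative value is only allowed when it is exactly -1.
    for name, dur in durs.items():
        if dur is not None and dur < 0 and dur != -1:
            raise ValueError(f"{name} must be non-negative or -1, got {dur}")

    # 3. val_dur on its own does not say how to divide the rest of the dataset.
    if val_dur is not None and train_dur is None and test_dur is None:
        raise ValueError("cannot specify only val_dur")


# Passes silently: every duration is specified and non-negative.
check_invalid_conditions(train_dur=100, val_dur=20, test_dur=30)
```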

Then, if either train_dur or test_dur is None, it is set to 0. None means the user did not specify a value.

Finally, the function validates that the sum of the specified split durations is not greater than the total duration of the dataset, dataset_dur.
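
A minimal sketch of the defaulting and total-duration check described above. The helper name `check_total_duration` is hypothetical, and the sketch assumes that an unspecified val_dur and any -1 value contribute nothing to the sum; the page does not state how those cases are counted.

```python
def check_total_duration(train_dur, val_dur, test_dur, dataset_dur):
    """Hypothetical helper: default unspecified durations, then check the total."""
    # None means the user did not specify a value, so default to 0.
    train_dur = 0 if train_dur is None else train_dur
    test_dur = 0 if test_dur is None else test_dur

    # Assumption: a val_dur of None and any -1 value are left out of the sum.
    specified = [
        dur for dur in (train_dur, val_dur, test_dur)
        if dur is not None and dur != -1
    ]
    total = sum(specified)
    if total > dataset_dur:
        raise ValueError(
            f"total of split durations, {total} s, is greater than "
            f"dataset duration, {dataset_dur} s"
        )
    return train_dur, val_dur, test_dur


# test_dur was not specified, so it comes back as 0.
print(check_total_duration(100, 20, None, dataset_dur=200))
```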

If any split is specified as -1, this value is interpreted as “first get the splits for the sets with a duration specified, then use the remainder of the dataset for the split whose duration is set to -1”. Functions that do the splitting have to “know” about this meaning of -1, so this validation function does not modify the value.
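
To illustrate the meaning of -1, here is a sketch of how a downstream splitting function might turn it into a concrete duration. The helper `resolve_remainder` is hypothetical and is not part of the library; it also assumes that at most one split may be set to -1, which this page does not state.

```python
def resolve_remainder(train_dur, val_dur, test_dur, dataset_dur):
    """Hypothetical downstream helper: replace a single -1 with the leftover duration."""
    # Treat an unspecified duration as 0 so the arithmetic below is well defined.
    durs = {
        "train": 0 if train_dur is None else train_dur,
        "val": 0 if val_dur is None else val_dur,
        "test": 0 if test_dur is None else test_dur,
    }

    remainder_splits = [name for name, dur in durs.items() if dur == -1]
    if len(remainder_splits) > 1:
        # Assumption: the remainder can only be assigned to one split.
        raise ValueError("only one split can be set to -1")

    if remainder_splits:
        # Sum the explicitly specified durations, then give the rest to the -1 split.
        fixed_total = sum(dur for dur in durs.values() if dur != -1)
        durs[remainder_splits[0]] = dataset_dur - fixed_total

    return durs["train"], durs["val"], durs["test"]


# 100 s go to training, and the test set receives the remaining 150 s.
print(resolve_remainder(100, None, -1, dataset_dur=250))  # (100, 0, 150)
```

Keeping this resolution out of the validation step matches the description above: the validator returns -1 unchanged, and only the code that actually assigns files to splits needs to compute the remainder.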

Parameters:

train_dur (int, float) – Target duration for training set split, in seconds.
val_dur (int, float) – Target duration for validation set, in seconds.
test_dur (int, float) – Target duration for test set, in seconds.
dataset_dur (int, float) – Total duration of dataset of vocalizations that will be split.

Returns:

train_dur, val_dur, test_dur

Return type:

int, float