vak.prep.split.algorithms.validate.validate_split_durations

vak.prep.split.algorithms.validate.validate_split_durations(train_dur, val_dur, test_dur, dataset_dur)

Helper function that validates the durations specified for splits, so that other functions can do the actual splitting.

First, the function checks for invalid conditions:
  • If train_dur, val_dur, and test_dur are all None, a ValueError is raised.

  • If any of train_dur, val_dur, or test_dur has a negative value other than -1, a ValueError is raised. The value -1 has a special meaning, explained below.

  • If only val_dur is specified, a ValueError is raised, because it is unclear what the durations of the training and test sets should be.

Then, if either train_dur or test_dur is None, it is set to 0. None means the user did not specify a value.

Finally, the function validates that the sum of the specified split durations is not greater than the total duration of the dataset, dataset_dur.
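The checks described so far can be sketched in a few lines of Python. This is an illustrative re-implementation written from the description on this page, not vak's actual source. The name validate_split_durations_sketch is hypothetical, and skipping None and -1 entries when summing is an assumption.

```python
def validate_split_durations_sketch(train_dur, val_dur, test_dur, dataset_dur):
    """Illustrative sketch of the documented checks (not vak's implementation)."""
    durs = (train_dur, val_dur, test_dur)

    # 1. At least one split duration must be specified.
    if all(dur is None for dur in durs):
        raise ValueError("train_dur, val_dur, and test_dur are all None")

    # 2. Negative values are invalid, except the special value -1.
    if any(dur is not None and dur < 0 and dur != -1 for dur in durs):
        raise ValueError("durations must be non-negative, or -1")

    # 3. Specifying only val_dur is ambiguous.
    if val_dur is not None and train_dur is None and test_dur is None:
        raise ValueError("cannot specify only val_dur")

    # 4. None for train_dur or test_dur means "not specified"; use 0.
    #    The description only mentions these two, so val_dur is left as given.
    train_dur = 0 if train_dur is None else train_dur
    test_dur = 0 if test_dur is None else test_dur

    # 5. The specified durations must fit inside the dataset.
    #    Assumption: None and -1 entries are skipped when summing.
    specified = [
        dur for dur in (train_dur, val_dur, test_dur)
        if dur is not None and dur != -1
    ]
    if sum(specified) > dataset_dur:
        raise ValueError("sum of split durations exceeds dataset_dur")

    return train_dur, val_dur, test_dur
```

For example, under these assumptions, validate_split_durations_sketch(100, None, None, 200) returns (100, None, 0), while validate_split_durations_sketch(150, 50, 50, 200) raises a ValueError because the splits sum to 250 seconds.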

If any split is specified as -1, this value is interpreted as “first get the split for the set with a value specified, then use the remainder of the dataset in the split whose duration is set to -1”. Functions that do the splitting have to “know” about this meaning of -1, so this validation function does not modify the value.
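To make the meaning of -1 concrete, here is a minimal sketch of how a downstream splitting function might turn -1 into an actual duration. The helper resolve_remainder is hypothetical and is not part of vak. It assumes that at most one split is set to -1 and that None counts as 0 seconds.

```python
def resolve_remainder(train_dur, val_dur, test_dur, dataset_dur):
    """Hypothetical helper: replace a -1 duration with the remaining dataset duration."""
    durs = {"train": train_dur, "val": val_dur, "test": test_dur}

    # Total of the durations given explicitly (None and -1 are skipped).
    fixed_total = sum(
        dur for dur in durs.values() if dur is not None and dur != -1
    )

    # Whatever is left of the dataset goes to the split marked -1.
    # Assumption: at most one split is set to -1.
    remainder = dataset_dur - fixed_total

    return {
        name: (remainder if dur == -1 else dur)
        for name, dur in durs.items()
    }
```

With a 200-second dataset, resolve_remainder(-1, None, 50, 200) assigns 50 seconds to the test split and the remaining 150 seconds to the training split.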

Parameters:
  • train_dur (int, float) – Target duration for training set split, in seconds.

  • val_dur (int, float) – Target duration for validation set, in seconds.

  • test_dur (int, float) – Target duration for test set, in seconds.

  • dataset_dur (int, float) – Total duration of dataset of vocalizations that will be split.

Returns:

train_dur, val_dur, test_dur

Return type:

int, float