vak.prep.parametric_umap.dataset_arrays.move_files_into_split_subdirs

vak.prep.parametric_umap.dataset_arrays.move_files_into_split_subdirs(dataset_df: DataFrame, dataset_path: Path, purpose: str) → None

Move npy files in dataset into sub-directories, one for each split in the dataset.

This is run after calling vak.prep.unit_dataset.prep_unit_dataset() to generate dataset_df.

Parameters:
  • dataset_df (pandas.DataFrame) – A pandas.DataFrame returned by vak.prep.unit_dataset.prep_unit_dataset() with a 'split' column added. The column is added either by calling vak.prep.split.unit_dataframe() or “manually” by calling vak.core.prep.prep_helper.add_split_col() (as is done for ‘predict’, where the entire DataFrame belongs to that one “split”).

  • dataset_path (pathlib.Path) – Path to directory that represents dataset.

  • purpose (str) – A string indicating what the dataset will be used for. One of {‘train’, ‘eval’, ‘predict’, ‘learncurve’}. Determined by vak.core.prep.prep() using the TOML configuration file.

Returns:
  None – The DataFrame is modified in place as the files are moved, so nothing is returned.
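
The behavior described above can be illustrated with a minimal sketch. This is not vak's implementation: the function name move_into_split_subdirs_sketch and the column name 'npy_path' are hypothetical, chosen only for illustration, and the purpose argument is omitted. It assumes each row stores a file path relative to dataset_path and that the 'split' column is already present.

```python
import shutil
import tempfile
from pathlib import Path

import pandas as pd


def move_into_split_subdirs_sketch(df, dataset_path, path_col="npy_path"):
    """Move each file into dataset_path/<split>/ and update df in place.

    `path_col` is a hypothetical column name holding paths relative to
    `dataset_path`; the real vak column name may differ.
    """
    for split in sorted(df["split"].dropna().unique()):
        split_dir = dataset_path / split
        split_dir.mkdir(exist_ok=True)

        mask = df["split"] == split
        new_paths = []
        for rel_path in df.loc[mask, path_col]:
            src = dataset_path / rel_path
            dst = split_dir / Path(rel_path).name
            shutil.move(str(src), str(dst))
            # Store the new location relative to the dataset root.
            new_paths.append(dst.relative_to(dataset_path).as_posix())

        # Modify the DataFrame in place; nothing is returned.
        df.loc[mask, path_col] = new_paths


# Usage on a throwaway dataset directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    dataset_path = Path(tmp)
    for name in ("a.npy", "b.npy", "c.npy"):
        (dataset_path / name).write_bytes(b"")

    df = pd.DataFrame(
        {
            "npy_path": ["a.npy", "b.npy", "c.npy"],
            "split": ["train", "train", "val"],
        }
    )
    move_into_split_subdirs_sketch(df, dataset_path)

    # Collect the final layout before the temporary directory is removed.
    moved = sorted(
        p.relative_to(dataset_path).as_posix()
        for p in dataset_path.rglob("*.npy")
    )
```

Because the paths in the DataFrame are rewritten as the files move, the caller keeps using the same DataFrame object afterwards, which matches the "modified in place" note in the Returns section.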