vak.prep.parametric_umap.dataset_arrays.move_files_into_split_subdirs
- vak.prep.parametric_umap.dataset_arrays.move_files_into_split_subdirs(dataset_df: DataFrame, dataset_path: Path, purpose: str) -> None
Move npy files in dataset into sub-directories, one for each split in the dataset.
This is run after calling vak.prep.unit_dataset.prep_unit_dataset() to generate dataset_df.
- Parameters:
dataset_df (pandas.DataFrame) – A pandas.DataFrame returned by vak.prep.unit_dataset.prep_unit_dataset() with a 'split' column added, as a result of calling vak.prep.split.unit_dataframe() or because it was added “manually” by calling vak.core.prep.prep_helper.add_split_col() (as is done for ‘predict’ when the entire DataFrame belongs to this “split”).
dataset_path (pathlib.Path) – Path to directory that represents dataset.
purpose (str) – A string indicating what the dataset will be used for. One of {‘train’, ‘eval’, ‘predict’, ‘learncurve’}. Determined by vak.core.prep.prep() using the TOML configuration file.
- Returns:
None
The DataFrame is modified in place as the files are moved, so nothing is returned.
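The behavior described above (one sub-directory per split, files moved, DataFrame updated in place, nothing returned) can be illustrated with a minimal sketch. This is not vak's implementation: the helper name `move_into_split_subdirs_sketch` and the `npy_path` column are hypothetical, chosen only for illustration, and the sketch ignores the `purpose` argument entirely.

```python
import tempfile
from pathlib import Path

import pandas as pd


def move_into_split_subdirs_sketch(
    df: pd.DataFrame, dataset_path: Path, path_col: str = "npy_path"
) -> None:
    """Hypothetical helper: move each file into dataset_path/<split>/.

    Assumes `df` has a 'split' column and a column (`path_col`, an
    invented name) holding file paths relative to `dataset_path`.
    """
    new_paths = []
    for split, rel_path in zip(df["split"], df[path_col]):
        split_dir = dataset_path / split
        split_dir.mkdir(exist_ok=True)  # one sub-directory per split
        src = dataset_path / rel_path
        dst = split_dir / Path(rel_path).name
        src.rename(dst)  # move the file on disk
        new_paths.append(dst.relative_to(dataset_path).as_posix())
    # Overwrite the column on the caller's DataFrame; nothing is returned.
    df[path_col] = new_paths


# Small demonstration in a temporary directory with two empty files.
with tempfile.TemporaryDirectory() as tmp:
    dataset_path = Path(tmp)
    for name in ("a.npy", "b.npy"):
        (dataset_path / name).write_bytes(b"")
    df = pd.DataFrame(
        {"npy_path": ["a.npy", "b.npy"], "split": ["train", "val"]}
    )
    result = move_into_split_subdirs_sketch(df, dataset_path)
    moved = sorted(
        p.relative_to(dataset_path).as_posix()
        for p in dataset_path.rglob("*.npy")
    )
```

Because the path column is overwritten on the DataFrame that was passed in, the caller sees the new locations without needing a return value, which mirrors the "modified in place" note in the Returns section.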