vak.prep.audio_dataset.prep_audio_dataset
- vak.prep.audio_dataset.prep_audio_dataset(data_dir: str | Path, audio_format: str, annot_format: str | None = None, annot_file: str | Path | None = None, labelset: set | None = None) → DataFrame
Gets a set of audio files from a directory, optionally paired with an annotation file or files, and returns a pandas DataFrame that represents the set of files.
Finds all files with audio_format in data_dir, then finds any annotations with annot_format if specified, and additionally filters the audio and annotation files by labelset if specified. Then creates the dataframe with columns specified by vak.prep.audio_dataset.DF_COLUMNS: "audio_path", "annot_path", "annot_format", "samplerate", "sample_dur", and "duration".
- Parameters:
data_dir (str, pathlib.Path) – Path to directory containing audio files that should be used in dataset.
audio_format (str) – A string representing the format of audio files. One of vak.common.constants.VALID_AUDIO_FORMATS.
annot_format (str) – Name of annotation format. Added as a column to the DataFrame if specified. Used by other functions that open annotation files via their paths from the DataFrame. Should be a format that the crowsetta library recognizes. Default is None.
annot_file (str, pathlib.Path) – Path to a single annotation file. Used when a single file contains annotations for multiple audio files. Default is None.
labelset (str, list, set) – Iterable of str or int, the set of unique labels for annotations. Default is None. If not None, files whose associated annotation contains labels not found in labelset are skipped. labelset is converted to a Python set using vak.common.converters.labelset_to_set(). See the docstring of that function for details on how to specify labelset.
- Returns:
source_files_df – A set of source files that will be used to prepare a dataset for use with neural network models, represented as a pandas.DataFrame. Will contain paths to audio files, possibly paired with annotation files. The columns of the dataframe are specified by vak.prep.audio_dataset.DF_COLUMNS.
- Return type:
pandas.DataFrame
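
The find, pair, and filter steps described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not vak's implementation: the helper names `sketch_prep_audio_rows` and `write_silent_wav` are hypothetical, the `annots` mapping stands in for annotations that vak would load through crowsetta, the `"simple-seq"` value is only a placeholder format name, and `sample_dur` is assumed to mean the duration of one sample (1 / samplerate). Rows are plain dicts rather than a pandas DataFrame so the sketch has no third-party dependencies.

```python
import tempfile
import wave
from pathlib import Path

# Column names as listed in the description above.
DF_COLUMNS = [
    "audio_path", "annot_path", "annot_format",
    "samplerate", "sample_dur", "duration",
]


def sketch_prep_audio_rows(data_dir, audio_format, annots=None,
                           annot_format=None, labelset=None):
    """Illustrative only: build one row (dict) per audio file in data_dir.

    `annots` is a hypothetical mapping {audio filename: (annot_path, labels)}.
    """
    rows = []
    # Step 1: find all files with the given audio format in data_dir.
    for audio_path in sorted(Path(data_dir).glob(f"*.{audio_format}")):
        # Step 2: pair each audio file with its annotation, if any.
        annot_path, labels = (None, [])
        if annots is not None:
            annot_path, labels = annots[audio_path.name]
        # Step 3: skip files whose annotation has labels not in labelset.
        if labelset is not None and not set(labels) <= set(labelset):
            continue
        with wave.open(str(audio_path), "rb") as wav:
            samplerate = wav.getframerate()
            n_frames = wav.getnframes()
        rows.append({
            "audio_path": str(audio_path),
            "annot_path": annot_path,
            "annot_format": annot_format,
            "samplerate": samplerate,
            "sample_dur": 1 / samplerate,  # assumption: seconds per sample
            "duration": n_frames / samplerate,
        })
    return rows


def write_silent_wav(path, samplerate=8000, n_frames=4000):
    """Write a short mono 16-bit WAV file of silence for the demo."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(samplerate)
        wav.writeframes(b"\x00\x00" * n_frames)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.wav", "b.wav"):
        write_silent_wav(Path(tmp) / name)
    # b.wav carries the label "z", which is not in the labelset below.
    annots = {
        "a.wav": ("a.csv", ["x", "y"]),
        "b.wav": ("b.csv", ["x", "z"]),
    }
    rows = sketch_prep_audio_rows(
        tmp, "wav", annots, annot_format="simple-seq", labelset={"x", "y"}
    )
```

With the two demo files, only the row for a.wav remains, because the annotation for b.wav contains a label outside labelset. With vak installed, the corresponding call would follow the signature documented above, for example prep_audio_dataset(data_dir, audio_format="wav", annot_format=..., labelset=...).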