vak.prep.spectrogram_dataset.spect_helper.make_dataframe_of_spect_files

vak.prep.spectrogram_dataset.spect_helper.make_dataframe_of_spect_files(spect_format: str, spect_dir: str | Path | None = None, spect_files: list | None = None, spect_ext: str | None = None, annot_list: list | None = None, annot_format: str | None = None, labelset: set | None = None, n_decimals_trunc: int = 5, freqbins_key: str = 'f', timebins_key: str = 't', spect_key: str = 's', audio_path_key: str = 'audio_path') → DataFrame

Gets a set of spectrogram files from a directory, optionally paired with an annotation file or files, and returns a pandas DataFrame that represents all the files.

Spectrogram files contain arrays, saved either in npz files created by numpy or in mat files created by Matlab.

Parameters:

spect_format (str) – Format of files containing spectrograms. One of {‘mat’, ‘npz’}
spect_dir (str) – Path to directory of files containing spectrograms as arrays. Default is None.
spect_files (list) – List of paths to array files. Default is None.
annot_list (list) – List of annotations for array files. Default is None.
annot_format (str) – Name of annotation format. Added as a column to the DataFrame if specified. Used by other functions that open annotation files via their paths from the DataFrame. Should be a format that the crowsetta library recognizes. Default is None.
labelset (str, list, set) – Set of unique labels for vocalizations, of str or int. Default is None. If not None, then files will be skipped where the associated annotation contains labels not found in labelset. labelset is converted to a Python set using vak.common.converters.labelset_to_set(). See help for that function for details on how to specify labelset.
n_decimals_trunc (int) – Number of decimal places to keep when truncating the time bin duration calculated from the vector of time bins. Default is 5, i.e. assumes tens of microseconds is the last significant digit when time bins are in seconds.
freqbins_key (str) – Key for accessing vector of frequency bins in files. Default is ‘f’.
timebins_key (str) – Key for accessing vector of time bins in files. Default is ‘t’.
spect_key (str) – Key for accessing spectrogram in files. Default is ‘s’.
audio_path_key (str) – Key for accessing path to source audio file for spectrogram in files. Default is ‘audio_path’.
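
To illustrate what n_decimals_trunc controls, here is a minimal sketch of computing a time bin duration from a vector of time bins and truncating it. The helper name truncated_timebin_dur is hypothetical, and averaging the differences between bin centers is an assumption; this is not necessarily how vak computes the value internally.

```python
import numpy as np


def truncated_timebin_dur(timebins, n_decimals_trunc=5):
    """Hypothetical helper: estimate time bin duration, then truncate it.

    Assumes the duration is the mean difference between adjacent
    bin centers, which may differ from what vak does internally.
    """
    timebin_dur = np.diff(timebins).mean()
    decade = 10 ** n_decimals_trunc
    # Truncate (drop extra digits) rather than round to nearest.
    return np.trunc(timebin_dur * decade) / decade


# Bin centers spaced 0.0026789 s apart (made-up values for illustration).
timebins = np.arange(100) * 0.0026789

dur_5 = truncated_timebin_dur(timebins, n_decimals_trunc=5)
dur_3 = truncated_timebin_dur(timebins, n_decimals_trunc=3)
print(dur_5, dur_3)
```

Truncating instead of rounding means small floating-point differences between files are less likely to yield different durations, at the cost of always discarding the trailing digits.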

Returns:

source_files_df – A set of source files that will be used to prepare a data set for use with neural network models, represented as a pandas.DataFrame. Will contain paths to spectrogram files, possibly paired with annotation files, as well as the original audio files if the spectrograms were generated from audio by vak.prep.audio_helper.make_spectrogram_files_from_audio_files(). The columns of the dataframe are specified by vak.prep.spectrogram_dataset.spect_helper.DF_COLUMNS.

Return type:

pandas.DataFrame

Notes

Each file should contain a spectrogram as a matrix, along with two vectors associated with it: a vector of frequency bins and a vector of time bins, where the values in those vectors are the values at the bin centers. (As far as vak is concerned, “vector” and “matrix” are synonymous with “array”.)

Since both mat files and npz files load into a dictionary-like structure, the arrays are accessed with keys. By convention, these keys are ‘s’, ‘f’, and ‘t’. If you use different keys, you can let this function know by changing the appropriate arguments: spect_key, freqbins_key, and timebins_key.
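
The key convention described above can be sketched with numpy alone. The file names, array sizes, and the alternative key names (‘spect’, ‘freqs’, ‘times’) below are made up for illustration, and the orientation of the spectrogram (rows as frequency bins, columns as time bins) is an assumption.

```python
import tempfile
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)

# Made-up arrays: 64 frequency bins and 100 time bins (bin centers).
freqbins = np.linspace(0, 8000, 64)
timebins = np.arange(100) * 0.002 + 0.001
# Assumed orientation: rows are frequency bins, columns are time bins.
spect = rng.random((freqbins.size, timebins.size))

spect_dir = Path(tempfile.mkdtemp())

# File that follows the conventional keys 's', 'f', and 't'.
default_path = spect_dir / "example-default-keys.npz"
np.savez(default_path, s=spect, f=freqbins, t=timebins)

# File that uses different (hypothetical) keys.
custom_path = spect_dir / "example-custom-keys.npz"
np.savez(custom_path, spect=spect, freqs=freqbins, times=timebins)

# npz files load into a dictionary-like object, so arrays are read by key.
with np.load(default_path) as spect_dict:
    loaded_spect = spect_dict["s"]
    loaded_freqbins = spect_dict["f"]
    loaded_timebins = spect_dict["t"]

with np.load(custom_path) as custom_dict:
    custom_keys = sorted(custom_dict.files)

print(loaded_spect.shape, custom_keys)
```

For a file like the second one, the arguments would be spect_key='spect', freqbins_key='freqs', and timebins_key='times'.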