vak.prep.spectrogram_dataset.audio_helper.make_spectrogram_files_from_audio_files

Make spectrograms from audio files and save them in npz array files.

Parameters:

audio_format (str) – A string representing the format of audio files. One of the values in vak.common.constants.VALID_AUDIO_FORMATS.
spect_params (dict or config.spect_params.SpectParamsConfig) – Parameters for computing spectrograms, from the .toml file. To see all related parameters, run:
>>> help(vak.config.spect_params.SpectParamsConfig)
To get a default configuration, create a SpectParamsConfig with no arguments and then pass that to to_spect:
>>> default_spect_params = vak.config.spect_params.SpectParamsConfig()
>>> to_spect(audio_format='wav', spect_params=default_spect_params, output_dir='.')
audio_dir (str) – Path to directory containing audio files from which to make spectrograms.
audio_files (list) – of str, full paths to audio files from which to make spectrograms
annot_list (list) – of annotations for array files. Default is None.
audio_annot_map (dict) – Dictionary whose keys are paths to array files and whose values are the annotations for those array files. Default is None.
output_dir (str) – directory in which to save .spect.npz file generated for each audio file.
labelset (str, list) – of str or int, set of unique labels for vocalizations. Default is None. If not None, skip files where the associated annotations contain labels not in labelset. labelset is converted to a Python set using vak.converters.labelset_to_set. See help for that function for details on how to specify labelset.
dask_bag_kwargs (dict) – Keyword arguments used when calling dask.bag.from_sequence. E.g., {"npartitions": 20}. Allows for finer-grained control when processing files of different sizes.
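
The effect of labelset described above can be pictured with a minimal sketch. This is not vak's implementation: the helper name files_with_valid_labels and the plain dict mapping each file to its labels are hypothetical stand-ins for the annotation objects vak works with, and the labelset is assumed to have already been converted to a collection of labels.

```python
def files_with_valid_labels(labels_by_file, labelset):
    """Return paths whose labels all appear in labelset (hypothetical helper)."""
    if labelset is None:
        # No labelset given, so no file is skipped.
        return list(labels_by_file)
    allowed = set(labelset)
    return [
        path
        for path, labels in labels_by_file.items()
        # Keep a file only if every one of its labels is in the allowed set.
        if set(labels) <= allowed
    ]


# Made-up file names and labels, for illustration only.
labels_by_file = {
    "bird1_a.wav": ["a", "b", "c"],
    "bird1_b.wav": ["a", "x"],  # "x" is not in the labelset, so this file is skipped
}
kept = files_with_valid_labels(labels_by_file, labelset=["a", "b", "c"])
```

With these inputs only the first file is kept; passing labelset=None keeps both.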

Returns:

spect_files – of str, full paths to .spect.npz files

Return type:

list

Notes

For each audio file, a corresponding ‘.spect.npz’ file will be created. Each ‘.spect.npz’ file contains the following arrays:

s (numpy.ndarray) – spectrogram, a 2-d array
f (numpy.ndarray) – vector of centers of frequency bins from spectrogram
t (numpy.ndarray) – vector of centers of time bins from spectrogram
audio_path (numpy.ndarray) – path to source audio file used to create spectrogram

The names of the arrays are defaults, and will change if different values are specified in spect_params for ‘spect_key’, ‘freqbins_key’, ‘timebins_key’, or ‘audio_path_key’.
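
As a rough illustration of reading one of these files back, the sketch below uses numpy.load with the default array names. The file is a stand-in written with numpy.savez so the example runs without vak; the file name, array sizes, values, and the (frequency bins, time bins) orientation of the spectrogram are assumptions made for the example, not output of this function.

```python
import os
import tempfile

import numpy as np

# Build a stand-in file with the default array names; real files come from vak.
tmp_dir = tempfile.mkdtemp()
spect_path = os.path.join(tmp_dir, "example.wav.spect.npz")
np.savez(
    spect_path,
    s=np.random.rand(257, 100),        # assumed shape: (frequency bins, time bins)
    f=np.linspace(0, 16000, 257),      # centers of frequency bins
    t=np.arange(100) * 0.002,          # centers of time bins
    audio_path=np.array("example.wav"),
)

# np.load returns a dict-like object; index it with the array names.
with np.load(spect_path) as spect_dict:
    s = spect_dict["s"]
    f = spect_dict["f"]
    t = spect_dict["t"]
    audio_path = str(spect_dict["audio_path"])
```

If spect_params sets different values for the key names, the same lookups would use those names in place of "s", "f", "t", and "audio_path".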