(spect-file-format)=

# Spectrogram file format

## File type

`vak` uses pre-computed files containing spectrograms.
For these files, it accepts two types, either `.npz` or `.mat`.
`.npz` is a `numpy` library format, for a file that can contain multiple arrays.
`.mat` is the Matlab data file format; many labs have existing codebases
that generate spectrograms using Matlab.
To work with one of these formats, you will specify either `npz` or `mat`
in the `[PREP]` section of your `.toml` configuration file.

:::{note}
`vak` loads `.mat` files with the function `scipy.io.loadmat`.
That function can only load v4 (Level 1.0), v6 and v7 to 7.2 matfiles,
as stated in the `scipy` documentation for that function.
Version 7.3 of the matfile format uses an HDF5-based format,
which is not supported by `scipy` or `vak`.
(For more details see
[this page in the Matlab documentation](https://www.mathworks.com/help/matlab/import_export/mat-file-versions.html).)
If you are working with Matlab, please either save your `.mat` files
in a format that can be read by `scipy.io.loadmat`,
or convert your data to `.npz` files as described in {ref}`howto-user-spect`.
:::

## Conventions

Regardless of whether they are `.npz` files or `.mat` files,
`vak` expects any spectrogram files to obey the following conventions.

### Content

A spectrogram array file should contain (at least) three items.

1. The spectrogram, an *m x n* matrix
2. A vector of *m* frequency bins, where the value of each element is the frequency at the center of the bin
3. A vector of *n* time bins, where the value of each element is the time at the center of the bin

A fourth item is not required, but is suggested.

4. A string path to the audio file from which the spectrogram was generated.

Other arrays can be in the file, but they will be ignored.

### Array naming

By convention each item should be associated with a string key.
The defaults built into `vak` are: `'s'`, `'f'`, `'t'`, and `'audio_path'`.
These defaults can be changed when preparing a dataset
by changing the corresponding options in the {ref}`[SPECT_PARAMS] ` section
of a `.toml` configuration file.
If you are using Matlab to generate the spectrogram files,
then you will need to either save your workspace variables with the default names,
or tell `vak` what names you used by changing the {ref}`[SPECT_PARAMS] ` options.

As noted above, the `audio_path` is not required,
but it is added by `vak prep` when generating a dataset of spectrogram files from audio.

### Spectrogram file naming

There are two valid ways to name spectrogram files.

The first is to give each spectrogram file the full name of the audio file it was created from,
with the spectrogram file extension appended.
E.g., if your audio file is `bird1.wav`, then the spectrogram file should be `bird1.wav.npz`.

The second way is to name the spectrogram file
by replacing the audio file extension with the array file extension,
e.g., the spectrogram from `bird1.wav` would be saved in `bird1.npz`.

The second way may be more intuitive,
while the first allows for other `.npz` files with the same stem in the same directory,
e.g., `day1/bird1.wav.npz` and `day1/bird1.ftr.npz` can be found side by side.

For more detail, please see the page {ref}`file-naming-conventions`.

(example-spect-file-format)=

### Example array files that meet this spectrogram file format specification

Please click on this link to download a .tar.gz archive containing spectrogram files
generated by a run of `vak prep` on audio data:

You can inspect the contents of the `.npz` array files by loading them with `numpy.load`.

These files are provided to demonstrate the specification described here.
You may find them helpful as examples
if you prefer to generate your own spectrograms
and need to write a script that creates array files containing them
so `vak` can work with them, as described in {ref}`howto-user-spect`.
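
The following is a minimal sketch of such a script, not the way `vak` itself generates files.
It assumes the default keys (`s`, `f`, `t`, `audio_path`) and the first naming convention (`bird1.wav.npz`).
The array shapes, bin spacings, and the random spectrogram values are hypothetical placeholders;
in practice they would come from your own spectrogram code.
The file is written to a temporary directory only so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical stand-in data; replace with the output of your own spectrogram code.
n_freqbins, n_timebins = 257, 1000
rng = np.random.default_rng(seed=42)
s = rng.random((n_freqbins, n_timebins))    # spectrogram, an m x n matrix
f = np.linspace(0, 16000, n_freqbins)       # m frequency bin centers (assumed Hz)
t = np.arange(n_timebins) * 0.002 + 0.001   # n time bin centers (assumed seconds)
audio_path = "bird1.wav"                    # optional fourth item

with tempfile.TemporaryDirectory() as tmp_dir:
    # Name the file after the audio file, with the array extension appended.
    spect_path = Path(tmp_dir) / "bird1.wav.npz"

    # The keyword names become the string keys of the arrays in the file.
    np.savez(spect_path, s=s, f=f, t=t, audio_path=audio_path)

    # Inspect the saved file by loading it with numpy.load.
    with np.load(spect_path) as spect_file:
        keys = sorted(spect_file.files)
        loaded = {key: spect_file[key] for key in keys}

print(keys)
print(loaded["s"].shape, loaded["f"].shape, loaded["t"].shape)
```

A quick consistency check on any file you write is that the spectrogram has as many rows as there are frequency bins and as many columns as there are time bins.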