File naming conventions

This page documents naming conventions for data files consumed by vak: audio, annotation, and spectrogram files. In some cases vak can generate some of these files itself, but they differ from the other files vak works with in that they are required to produce any output, e.g., the files that represent training and test datasets and the files that represent parameters of trained neural network models.

Audio files

vak assumes that audio files are the raw data that all other derived data can be traced back to, where derived data includes annotation files and array files containing spectrograms. As such, any filename is valid for an audio file, as long as the extension corresponds to one of the formats listed in vak.constants.VALID_AUDIO_FORMATS.
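As a minimal sketch of the rule above, the check below accepts any filename whose extension appears in a list of valid formats. The tuple `ASSUMED_VALID_AUDIO_FORMATS` is a stand-in for `vak.constants.VALID_AUDIO_FORMATS`, its values are assumed for illustration, and the helper `has_valid_audio_extension` is hypothetical rather than part of vak.

```python
from pathlib import Path

# Stand-in for vak.constants.VALID_AUDIO_FORMATS.
# These values are an assumption made for this example.
ASSUMED_VALID_AUDIO_FORMATS = ("wav", "cbin")


def has_valid_audio_extension(filename, valid_formats=ASSUMED_VALID_AUDIO_FORMATS):
    """Return True if the extension of filename (without the dot) is in valid_formats.

    This hypothetical helper compares extensions exactly, so it is case-sensitive.
    """
    extension = Path(filename).suffix.lstrip(".")
    return extension in valid_formats


print(has_valid_audio_extension("BB_SGP16-1___20160521_214723.wav"))
```

Only the extension is inspected; the rest of the filename is unconstrained, which matches the statement that any filename is valid for an audio file.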

Annotation files

There are two ways that annotation files can refer to the files they annotate (below, “annotated” files, either audio or spectrogram files). The first is when there is a one-to-one relationship; each annotated file has a corresponding annotation file. The second is when a single annotation file contains annotations for multiple annotated files.

One annotation file per annotated file

When there is one annotation file per annotated file, there are two valid ways of naming the annotation file so that vak can associate it with other files.

The first way is to name each annotation file so that it contains the name of the audio file that it annotates. This assumes that the audio file is the raw data which the annotations can be traced back to, even if there is an intermediate spectrogram saved in an array file (see below).

For example, if you have an audio file named “BB_SGP16-1___20160521_214723.wav”, then the annotation file should be named “BB_SGP16-1___20160521_214723.wav.csv”.

This convention makes it possible to have other files with the .csv extension in the same directory, e.g., if you are also extracting features from each audio file and storing them in a .csv file. A file named “BB_SGP16-1___20160521_214723.wav.csv” can coexist with “BB_SGP16-1___20160521_214723.ftr.csv”, and lists of both file types in the same sort order are easy to produce with a glob using a wildcard and the “double extensions”, e.g. *.wav.csv and *.ftr.csv, facilitating analysis pipelines.
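The globbing pattern described above can be sketched as follows. The example creates empty placeholder files in a temporary directory; the second stem is made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    # The second stem is hypothetical, added so there is more than one file to sort.
    stems = ["BB_SGP16-1___20160521_214723", "BB_SGP16-1___20160521_220000"]
    for stem in stems:
        (data_dir / f"{stem}.wav.csv").touch()  # annotation file
        (data_dir / f"{stem}.ftr.csv").touch()  # feature file

    # Each "double extension" selects exactly one type of file.
    annot_names = sorted(p.name for p in data_dir.glob("*.wav.csv"))
    feature_names = sorted(p.name for p in data_dir.glob("*.ftr.csv"))

# Because both lists share the same stems, sorting them puts them in matching order.
paired = list(zip(annot_names, feature_names))
for annot_name, feature_name in paired:
    print(annot_name, feature_name)
```

Pairing by position only works when every audio file has both an annotation file and a feature file; a pipeline that cannot guarantee this would need to match on the stem explicitly.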

The second valid way to name an annotation file is to use the “stem” of the annotated file, i.e., the part of the filename before the extension, followed by the extension of the annotation format. For example, if you have an audio file named “BB_SGP16-1___20160521_214723.wav”, then the annotation file could be named “BB_SGP16-1___20160521_214723.csv”. This convention may be more intuitive for many users.
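To make the two conventions concrete, the sketch below derives both valid annotation filenames from an audio filename. The function `candidate_annotation_names` is a hypothetical helper written for this page, not part of vak, and the `.csv` default is taken from the examples above.

```python
from pathlib import Path


def candidate_annotation_names(audio_filename, annot_ext=".csv"):
    """Return both valid annotation filenames for one audio file.

    The first name appends the annotation extension to the full audio filename;
    the second replaces the audio extension with the annotation extension.
    """
    audio_path = Path(audio_filename)
    full_name_style = audio_path.name + annot_ext
    stem_style = audio_path.stem + annot_ext
    return full_name_style, stem_style


for name in candidate_annotation_names("BB_SGP16-1___20160521_214723.wav"):
    print(name)
```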

One annotation file, multiple annotated files

When a single annotation file contains annotations for multiple files, there are no restrictions on the naming of the annotation file. This is because the annotation file itself must contain the name of each file that it annotates. An example of this format is the 'generic-seq' format used by crowsetta.
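The sketch below illustrates why the name of such an annotation file does not matter: each row names the file it annotates, so annotations can be grouped by that field alone. The column names and file contents are hypothetical and chosen only for illustration; they are not the schema of the 'generic-seq' format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical contents of a single annotation file covering two audio files.
# The column names are illustrative only.
annot_csv = io.StringIO(
    "annotated_file,label,onset_s,offset_s\n"
    "bird1_song1.wav,a,0.10,0.25\n"
    "bird1_song1.wav,b,0.30,0.48\n"
    "bird1_song2.wav,a,0.05,0.21\n"
)

# Group the annotated segments by the file named in each row.
segments_by_file = defaultdict(list)
for row in csv.DictReader(annot_csv):
    segments_by_file[row["annotated_file"]].append(
        (row["label"], float(row["onset_s"]), float(row["offset_s"]))
    )

for annotated_file, segments in segments_by_file.items():
    print(annotated_file, segments)
```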

Spectrogram file naming convention

For array files that contain spectrograms, which here we will just call “spectrogram files”, we assume that each file contains one spectrogram for one audio file. I.e., as with some annotation formats, we assume a one-to-one mapping from the derived file back to the source audio file. For this reason, the naming convention for spectrogram files is the same as for annotation files in formats that follow this one-to-one mapping.

There are two valid ways, then, to name spectrogram files. The first way is to give each spectrogram file the name of the audio file it was created from, with the extension of the spectrogram file format added. For example, if you have an audio file named “BB_SGP16-1___20160521_214723.wav”, then the spectrogram file should be named “BB_SGP16-1___20160521_214723.wav.npz”. As with the naming convention for annotation files, this allows multiple files with the same stem and .npz extension to coexist in the same directory, by inserting an “extra” extension, i.e., there could also be a “BB_SGP16-1___20160521_214723.ftr.npz”.

The second valid way to name spectrogram files is to replace the audio extension with the array file extension. For example, if you have an audio file named “BB_SGP16-1___20160521_214723.wav”, then the spectrogram file could be named “BB_SGP16-1___20160521_214723.npz”.
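Because both conventions keep the audio filename or its stem intact, a spectrogram file can be traced back to its source audio file by name alone. The sketch below shows one way to do that; `find_source_audio` is a hypothetical helper written for this page and is not how vak itself performs the lookup.

```python
from pathlib import Path


def find_source_audio(spect_filename, audio_filenames, spect_ext=".npz"):
    """Return the audio filename a spectrogram file was derived from, or None."""
    spect_name = Path(spect_filename).name
    if not spect_name.endswith(spect_ext):
        return None
    # Filename with the array extension removed.
    base = spect_name[: -len(spect_ext)]
    for audio_filename in audio_filenames:
        audio_path = Path(audio_filename)
        # First convention: base is the full audio filename ("name.wav").
        # Second convention: base is only the stem of the audio filename ("name").
        if base == audio_path.name or base == audio_path.stem:
            return audio_filename
    return None


audio_files = ["BB_SGP16-1___20160521_214723.wav"]
print(find_source_audio("BB_SGP16-1___20160521_214723.wav.npz", audio_files))
print(find_source_audio("BB_SGP16-1___20160521_214723.npz", audio_files))
```

A file such as “BB_SGP16-1___20160521_214723.ftr.npz” matches neither convention, so this helper returns None for it rather than mistaking it for a spectrogram of the audio file.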