About `vak`

Background

Are humans unique among animals? We speak languages, but is speech somehow like other animal behaviors, such as birdsong? Questions like these are answered by studying how animals communicate with sound. This research requires cutting-edge computational methods and big team science across a wide range of disciplines, including ecology, ethology, bioacoustics, psychology, neuroscience, linguistics, and genomics [1][2][3]. As in many other domains, this research is being revolutionized by deep learning algorithms [1][2][3]. Deep neural network models make it possible to answer questions that were previously out of reach, in part because these models automate the analysis of very large datasets.

Goals

Within the study of animal acoustic communication, multiple models have been proposed for similar tasks, often implemented as research code with different libraries, such as Keras and PyTorch. This situation has created a real need for a framework that allows researchers to easily benchmark models and apply trained models to their own data. To address this need, we developed vak.

The vak library has two main goals:

- make it easier for researchers studying animal vocalizations to apply neural network algorithms to their data
- provide a common framework for benchmarking neural network algorithms on tasks related to animal vocalizations

Although models in the vak library can be used more generally for bioacoustics [2], our focus is on animal acoustic communication [1]. More colloquially, we can call this “vocal behavior”, a term that encompasses related research areas: not only communication [4] but also culture [5] and vocal learning [6][7]. Models in the vak library include deep learning algorithms developed for bioacoustics, but are designed specifically for computational studies of vocal behavior.

Publications using vak

We originally developed vak to benchmark a neural network model, TweetyNet [8], that automates annotation of birdsong by segmenting spectrograms. TweetyNet and vak have been used in both neuroscience [9][10][11][12] and bioacoustics [13][14].