Scaling Bioacoustic Signal Pre-training with Million Samples Via Mask-Modeling

We introduce GPM-BT (General Pre-training Model for Bioacoustic Tasks), a self-supervised, Transformer-based model pre-trained on approximately 1.2 million unannotated bioacoustic audio samples. GPM-BT achieves state-of-the-art performance on the BEANS benchmark, demonstrating a strong ability to represent and understand bioacoustic audio signals against complex background noise. See our paper for more details.

GPM-BT builds upon the research of BEANS and SSAST. We presume you have prior knowledge of these two projects.

Installation

Create the conda environment from the provided environment file:
```
conda env create -f environment.yaml
```
Install the benchmark itself as described in BEANS:
```
pip install -e .
```
Prepare the data as described in BEANS.

Download a pre-trained model:

| pre-trained model | avg-classification | avg-detection | avg-auxiliary | avg-all |
|---|---|---|---|---|
| secondary-1.23M-bio-patch | 0.797 | 0.440 | 0.879 | 0.662 |
| from scratch-1.23M-bio-patch | 0.793 | 0.440 | 0.869 | 0.658 |
| from scratch-0.80M-bio-patch | 0.774 | 0.416 | 0.850 | 0.637 |
| from scratch-0.46M-bio-patch | 0.749 | 0.358 | 0.851 | 0.603 |
| from scratch-0.12M-bio-patch | 0.738 | 0.351 | 0.812 | 0.589 |
| from scratch-1.23M-bio-frame | 0.781 | 0.349 | 0.822 | 0.608 |
| from scratch-2.23M-gen-frame | 0.766 | 0.318 | 0.805 | 0.586 |
| from scratch-2.23M-gen-patch | 0.785 | 0.433 | 0.873 | 0.653 |
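
As a quick reading aid, the sketch below copies the avg-all scores of the four from-scratch bio-patch rows from the table and prints the change between consecutive pre-training set sizes. The names `BIO_PATCH_AVG_ALL` and `score_gains` are illustrative only and are not part of the GPM-BT code base.

```python
# (pre-training samples in millions, avg-all) copied from the four
# from-scratch bio-patch rows of the results table.
BIO_PATCH_AVG_ALL = [
    (0.12, 0.589),
    (0.46, 0.603),
    (0.80, 0.637),
    (1.23, 0.658),
]


def score_gains(rows):
    """Return (smaller_size, larger_size, gain) for consecutive data sizes."""
    ordered = sorted(rows)
    return [
        (lo_size, hi_size, round(hi_score - lo_score, 3))
        for (lo_size, lo_score), (hi_size, hi_score) in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    for lo_size, hi_size, gain in score_gains(BIO_PATCH_AVG_ALL):
        print(f"{lo_size:.2f}M -> {hi_size:.2f}M samples: avg-all {gain:+.3f}")
```

In these four rows, every increase in pre-training set size raises avg-all.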

Running the GPM-BT

Run the GPM-BT model with:

python run_benchmark.py

Before running, set the --data-directory-path parameter to the path of your data folder, set the --load-pretrained-mdl-path parameter to the location of the pre-trained model, and choose the correct masking method for the --Patch-or-Frame parameter.

This will write result files into the logs directory.
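
The invocation described above can be sketched as a small launcher, assuming run_benchmark.py reads the three parameters as command-line arguments. The flag names come from this README; the accepted values of --Patch-or-Frame ("patch" or "frame"), the idea of reading the masking method from the model-name suffix, and all paths are assumptions made for illustration, so check run_benchmark.py before relying on them.

```python
import shlex
import subprocess
import sys


def masking_from_model_name(model_name):
    """Guess the masking method from a model name such as
    'secondary-1.23M-bio-patch' (assumption: the suffix names the method)."""
    suffix = model_name.rsplit("-", 1)[-1].lower()
    if suffix not in ("patch", "frame"):
        raise ValueError(f"cannot infer masking method from {model_name!r}")
    return suffix


def build_benchmark_command(data_dir, checkpoint, masking):
    """Assemble the argument list for run_benchmark.py."""
    if masking not in ("patch", "frame"):
        raise ValueError(f"masking must be 'patch' or 'frame', got {masking!r}")
    return [
        sys.executable, "run_benchmark.py",
        "--data-directory-path", str(data_dir),
        "--load-pretrained-mdl-path", str(checkpoint),
        "--Patch-or-Frame", masking,  # assumed value format
    ]


if __name__ == "__main__":
    # Hypothetical paths: replace with your own data folder and checkpoint.
    masking = masking_from_model_name("secondary-1.23M-bio-patch")
    cmd = build_benchmark_command("data", "checkpoints/model.pth", masking)
    print(shlex.join(cmd))
    # Uncomment to launch the benchmark from the repository root:
    # subprocess.run(cmd, check=True)
```

Printing the command before launching makes it easy to confirm that the masking method matches the downloaded model.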

