We introduce GPM-BT (General Pre-training Model for Bioacoustic Tasks), a self-supervised, Transformer-based model pre-trained on approximately 1.2 million unannotated bioacoustic audio samples. GPM-BT achieves state-of-the-art performance on the BEANS benchmark, demonstrating a strong ability to represent and understand bioacoustic audio signals amid complex background noise. See our paper for more details.
GPM-BT builds upon the research of BEANS and SSAST. We presume you have prior knowledge of these two projects.
- Use the environment.yaml file to create a conda environment:
conda env create -f environment.yaml
- Refer to BEANS to install the benchmark itself:
pip install -e .
- Refer to BEANS for data preparation.
- Download the pre-trained model:
| pre-trained model | avg-classification | avg-detection | avg-auxiliary | avg-all |
| --- | --- | --- | --- | --- |
| secondary-1.23M-bio-patch | 0.797 | 0.440 | 0.879 | 0.662 |
| from scratch-1.23M-bio-patch | 0.793 | 0.440 | 0.869 | 0.658 |
| from scratch-0.80M-bio-patch | 0.774 | 0.416 | 0.850 | 0.637 |
| from scratch-0.46M-bio-patch | 0.749 | 0.358 | 0.851 | 0.603 |
| from scratch-0.12M-bio-patch | 0.738 | 0.351 | 0.812 | 0.589 |
| from scratch-1.23M-bio-frame | 0.781 | 0.349 | 0.822 | 0.608 |
| from scratch-2.23M-gen-frame | 0.766 | 0.318 | 0.805 | 0.586 |
| from scratch-2.23M-gen-patch | 0.785 | 0.433 | 0.873 | 0.653 |
Run the GPM-BT model with:
python run_benchmark.py
Before running, replace the --data-directory-path parameter with the path to your data folder, replace the --load-pretrained-mdl-path parameter with the location of the pre-trained model, and choose the correct masking method for the --Patch-or-Frame parameter.
This will write result files into the logs directory.
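As a rough illustration of the run step above, the following sketch assembles the run_benchmark.py command from the three parameters. It is not part of the repository. The paths are hypothetical placeholders, and the values "patch" and "frame" passed to --Patch-or-Frame are an assumption based on the model-name suffixes in the table; check run_benchmark.py for the values it actually accepts.

```python
import subprocess
import sys


def masking_from_name(model_name: str) -> str:
    """Infer the masking method from a model name ending in -patch or -frame.

    Assumption: the suffix of the names in the table matches the masking
    method the checkpoint was pre-trained with.
    """
    suffix = model_name.rsplit("-", 1)[-1].lower()
    if suffix not in {"patch", "frame"}:
        raise ValueError(f"cannot infer masking method from {model_name!r}")
    return suffix


def build_command(data_dir: str, model_path: str, masking: str) -> list:
    """Assemble the run_benchmark.py call with the three documented flags."""
    return [
        sys.executable, "run_benchmark.py",
        "--data-directory-path", data_dir,
        "--load-pretrained-mdl-path", model_path,
        # Assumed value format ("patch" / "frame"); verify in run_benchmark.py.
        "--Patch-or-Frame", masking,
    ]


if __name__ == "__main__":
    # Hypothetical locations; replace with your own data folder and checkpoint.
    model_name = "secondary-1.23M-bio-patch"
    cmd = build_command(
        data_dir="/path/to/beans/data",
        model_path="/path/to/checkpoints/" + model_name,
        masking=masking_from_name(model_name),
    )
    print(" ".join(cmd))
    # Pass --run to actually launch the benchmark instead of only printing.
    if "--run" in sys.argv:
        subprocess.run(cmd, check=True)
```

Deriving the masking method from the model name keeps the checkpoint and the --Patch-or-Frame setting from drifting apart when switching between rows of the table.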