GPM-BT

Scaling Bioacoustic Signal Pre-training with Million Samples Via Mask-Modeling

[Figure: GPM-BT5] [Figure: tSNE]

We introduce GPM-BT (General Pre-training Model for Bioacoustic Tasks), a self-supervised, Transformer-based model pre-trained on approximately 1.2 million unannotated bioacoustic audio samples. GPM-BT achieves state-of-the-art performance on the BEANS benchmark, demonstrating a strong ability to represent and understand bioacoustic audio signals against complex background noise. See our paper for more details.

GPM-BT builds upon the research of BEANS and SSAST. We presume you have prior knowledge of these two projects.

Installation

  1. Use the environment.yaml file to create a conda environment

    conda env create -f environment.yaml
    
  2. Refer to BEANS to install the benchmark itself

    pip install -e .
    
  3. Refer to BEANS for data preparation

  4. Download the pre-trained model

| pre-trained model | avg-classification | avg-detection | avg-auxiliary | avg-all |
| --- | --- | --- | --- | --- |
| secondary-1.23M-bio-patch | 0.797 | 0.440 | 0.879 | 0.662 |
| from scratch-1.23M-bio-patch | 0.793 | 0.440 | 0.869 | 0.658 |
| from scratch-0.80M-bio-patch | 0.774 | 0.416 | 0.850 | 0.637 |
| from scratch-0.46M-bio-patch | 0.749 | 0.358 | 0.851 | 0.603 |
| from scratch-0.12M-bio-patch | 0.738 | 0.351 | 0.812 | 0.589 |
| from scratch-1.23M-bio-frame | 0.781 | 0.349 | 0.822 | 0.608 |
| from scratch-2.23M-gen-frame | 0.766 | 0.318 | 0.805 | 0.586 |
| from scratch-2.23M-gen-patch | 0.785 | 0.433 | 0.873 | 0.653 |
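
The avg-all column is not the plain mean of the other three columns. It appears consistent with an average weighted by the number of tasks in each group. The sketch below checks this for three rows, assuming 5 classification, 5 detection, and 2 auxiliary tasks. These task counts and the `weighted_average` helper are assumptions made for illustration and are not stated in this README.

```python
# Assumed number of tasks per group (not stated in this README).
TASK_COUNTS = {"classification": 5, "detection": 5, "auxiliary": 2}


def weighted_average(classification, detection, auxiliary):
    """Average the three group scores, weighted by the assumed task counts."""
    total_tasks = sum(TASK_COUNTS.values())
    weighted_sum = (
        TASK_COUNTS["classification"] * classification
        + TASK_COUNTS["detection"] * detection
        + TASK_COUNTS["auxiliary"] * auxiliary
    )
    return weighted_sum / total_tasks


# (avg-classification, avg-detection, avg-auxiliary, reported avg-all) from the table.
rows = {
    "secondary-1.23M-bio-patch": (0.797, 0.440, 0.879, 0.662),
    "from scratch-0.46M-bio-patch": (0.749, 0.358, 0.851, 0.603),
    "from scratch-2.23M-gen-patch": (0.785, 0.433, 0.873, 0.653),
}

for name, (cls, det, aux, reported) in rows.items():
    estimate = weighted_average(cls, det, aux)
    print(f"{name}: estimated {estimate:.3f}, reported {reported:.3f}")
```

Small differences in the last digit are expected, because the per-group scores in the table are already rounded.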

Running GPM-BT

Run the GPM-BT benchmark with:

python run_benchmark.py

Before running, set the --data-directory-path parameter to the path of your data folder, set the --load-pretrained-mdl-path parameter to the location of the pre-trained model, and choose the masking method that matches that model for the --Patch-or-Frame parameter.

This will write result files into the logs directory.
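
As a minimal sketch, the three parameters above can be assembled into a single invocation from Python. It assumes that run_benchmark.py accepts them as command-line flags rather than as defaults edited inside the script. The paths are placeholders, the value "patch" is a guess at an accepted setting, and `build_benchmark_command` is a hypothetical helper, so check the script's argument parser before relying on it.

```python
import shlex
import sys


def build_benchmark_command(data_dir, pretrained_path, masking):
    """Return the run_benchmark.py invocation as an argument list.

    Hypothetical helper: the flag names come from this README, but whether
    the script reads them from the command line is an assumption.
    """
    return [
        sys.executable,
        "run_benchmark.py",
        "--data-directory-path", str(data_dir),
        "--load-pretrained-mdl-path", str(pretrained_path),
        "--Patch-or-Frame", masking,
    ]


# Placeholder values; replace them with your own paths and masking method.
command = build_benchmark_command(
    data_dir="/path/to/data",
    pretrained_path="/path/to/pretrained-model",
    masking="patch",  # assumed value, not confirmed by this README
)

# Print the command for inspection instead of running it.
print(shlex.join(command))

# To launch it from the repository root:
# import subprocess
# subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when a path contains spaces.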