(A pre-trained model and API for detecting humpback whale vocalizations in hydrophone recordings.)
Detecting and localizing biological sounds in hydrophone recordings makes it possible to study the social behavior of marine mammals remotely. Manually listening through recordings to label the occurrence of each vocalization is extremely time-consuming and is not feasible for annotating hundreds to thousands of hours of audio. To reduce the time researchers must spend simply locating the signals they wish to study, this repository aims to automatically detect and localize vocalizations in hydrophone recordings. To that end, it trains a convolutional neural network (CNN) to detect humpback whale vocalizations in spectrograms (a visual representation of audio). Detecting objects in images with CNNs is a well-studied problem, so by using spectrograms as input to our model we can benefit from the many successful approaches published in the object detection literature, as well as from the open-source libraries that support this task.
First, we'll review object detection. The PASCAL VOC and MSCOCO datasets are both canonical examples of the object detection task: given an image as input, produce a set of bounding boxes and class labels which identify and localize each object in the image. In the context of our task: given a spectrogram of an audio recording, produce a set of bounding boxes and class labels which identify the species of each vocalization and localize it in both time and frequency.
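To make the mapping concrete, the following sketch shows how a single annotated vocalization (start and end time, low and high frequency) could be turned into a pixel-space bounding box on a spectrogram image. The class and function names, the linear frequency axis, and the example numbers are assumptions for illustration and are not taken from this repository.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One labeled vocalization (hypothetical structure, for illustration)."""
    label: str
    start_s: float   # start time in seconds
    end_s: float     # end time in seconds
    low_hz: float    # lowest frequency in Hz
    high_hz: float   # highest frequency in Hz


def to_pixel_box(annot, clip_duration_s, max_freq_hz, width_px, height_px):
    """Map an annotation to (x_min, y_min, x_max, y_max) in pixels.

    Assumes a linear time axis along x, a linear frequency axis along y,
    and image row 0 at the top (the highest frequency).
    """
    x_min = annot.start_s / clip_duration_s * width_px
    x_max = annot.end_s / clip_duration_s * width_px
    # Higher frequencies sit nearer the top of the image, so y is flipped.
    y_min = (1.0 - annot.high_hz / max_freq_hz) * height_px
    y_max = (1.0 - annot.low_hz / max_freq_hz) * height_px
    return (x_min, y_min, x_max, y_max)


call = Annotation("humpback", start_s=2.0, end_s=4.0, low_hz=256.0, high_hz=768.0)
box = to_pixel_box(call, clip_duration_s=8.0, max_freq_hz=1024.0,
                   width_px=512, height_px=256)
print(box)  # (128.0, 64.0, 256.0, 192.0)
```

A real spectrogram may use a logarithmic or mel frequency axis, in which case the vertical mapping would need to follow that scale instead of the linear one assumed here.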
(TODO: include example spectrogram and labels)
(TODO: more specifically, which CNN architecture does this repo currently use and how is it trained?)
(TODO: Provide an example workflow from input files to detections and performance evaluations.)
The bioacoustic-detection package is organized with all of the source code in src/bioacoustic_detection, executable scripts in scripts/, and tests in tests/. Nearly all of the interesting code is contained in the package's source code, while the executable scripts are only meant to provide a convenient command line interface (CLI) for using the package. The bioacoustic-detection package itself is composed of:
- utils/
  - io.py contains code for reading and writing annotation and wav files.
  - annot.py contains code for cleaning raw annotations and other annotation-related utilities.
  - wav.py contains code for pre-processing wav files, including decimation.
  - vis.py contains code for constructing and saving visualizations.
- spectrogram_dataset_generator.py contains the code for generating spectrogram datasets from annotations and recordings.
- (TODO) run_inference.py contains the code for running inference on a recording.
- (TODO) evaluate.py contains code for evaluating a trained model on a spectrogram dataset with annotations.
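The README does not say which metric evaluate.py uses. As background, object detection benchmarks such as PASCAL VOC typically decide whether a predicted box matches an annotated box by their intersection over union (IoU). The sketch below is a generic illustration of that measure, not code from this package, and the function name and box layout are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0.0, 0.0, 4.0, 2.0)   # e.g. seconds on x, kHz on y
annotated = (2.0, 0.0, 6.0, 2.0)
print(iou(predicted, annotated))   # overlap area 4 over union area 12
```

A detection is commonly counted as correct when its IoU with an annotation of the same class exceeds a chosen threshold; 0.5 is the conventional PASCAL VOC value.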
- Install python3 and python3-venv.
- Clone the repo with git clone --recursive git@github.com:jackson-waschura/bioacoustic-detection.git (the --recursive flag is important so that you also retrieve the TF models submodule, which is used to train object detection models).
- Change the current directory with cd bioacoustic-detection.
- Run the setup script with ./setup.sh. This will create a python3 virtual environment (venv) in the directory "env/" and install the prerequisite packages there along with the bioacoustic-detection package.
- Run source env/bin/activate to enter the python venv. When you are done using the bioacoustic-detection package, you can exit the python venv by running deactivate. (You must activate the venv every time you want to use the bioacoustic-detection package unless you prefer to install the package directly onto your local machine.)
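If you are unsure whether the venv is currently active, a quick check from Python is to compare the interpreter's prefix with its base prefix. This is a generic snippet, not part of the package, and the function name is only illustrative.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("venv active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

When the environment created by the setup script is active, the printed prefix should end in the env directory.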
The following scripts provide helpful interfaces for using the bioacoustic-detection package from the terminal:
clean_annotations.py [-h] [-i IN_DIR] [-o OUT_DIR] [-q]
generate_spectrogram_dataset.py [-h] --splits SPLITS -o OUT_DIR [optional parameters...]
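For readers unfamiliar with this usage notation, the sketch below builds an argparse parser that reproduces the clean_annotations.py usage line. Only the flags themselves come from the line above; the destination names, the help strings, the example directory names, and the reading of -q as a "quiet" switch are assumptions, and the real script may be implemented differently.

```python
import argparse


def build_parser():
    """Build a parser whose usage line matches the one shown above.

    Destination names, help strings, and the meaning of -q are
    assumptions; only the flags come from the README.
    """
    parser = argparse.ArgumentParser(prog="clean_annotations.py")
    parser.add_argument("-i", dest="in_dir",
                        help="directory of raw annotations (assumed)")
    parser.add_argument("-o", dest="out_dir",
                        help="directory for cleaned annotations (assumed)")
    parser.add_argument("-q", dest="quiet", action="store_true",
                        help="suppress progress output (assumed)")
    return parser


parser = build_parser()
# Hypothetical directory names, used only to show how the flags are parsed.
args = parser.parse_args(["-i", "raw_annotations", "-o", "clean_annotations", "-q"])
print(parser.format_usage().strip())
print(args.in_dir, args.out_dir, args.quiet)
```

Square brackets in a usage line mark optional arguments, so every flag of clean_annotations.py can be omitted, while generate_spectrogram_dataset.py requires both --splits and -o.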
You can train a new model using the TensorFlow Object Detection API. First, you'll need to create a .config file which defines the model's architecture as well as its training and evaluation procedures. See data/example_model.config for a specific example (and check out the TensorFlow Object Detection documentation for more information). Notice that your training dataset is also specified in the .config file. To run training (on a machine with a GPU available), execute the following command:
python3 models/research/object_detection/model_main_tf2.py \
--pipeline_config_path="data/example_model.config" \
--model_dir="experiments/example_model" \
--alsologtostderr
This uses the example model and outputs the results into the directory experiments/example_model. You can then view the progress of training with tensorboard:
tensorboard --logdir experiments/example_model
This hosts a web app on your local machine that displays training info. You can connect to it (usually) at localhost:6006.
- [DONE] Add a very permissive license.
- [STARTED] Translate Jupyter Notebooks into python scripts
  - [DONE] IO utility methods
  - [DONE] Annotation utility methods
  - [DONE] Data cleaning methods
  - [STARTED] Visualization utility methods
  - [STARTED] Spectrogram Dataset Generation
  - Training methods (adapted from TF's Object Detection)
  - Inference methods
  - Evaluation methods
- Create unit tests for python scripts
  - IO utility methods
  - [DONE] Annotation utility methods
  - Wav utility methods (PCEN)
  - Visualization utility methods
  - [DONE] Data cleaning methods
  - Spectrogram Dataset Generation
  - Inference methods
  - Evaluation methods
- [DONE] List all dependencies with specific versions in a requirements.txt file
- [DONE] Allow this codebase to be installed with pip (setuptools).
- [DONE] Create a setup script which creates and sets up a python virtual environment (installing requirements and then this package).
- Write notebooks in colab which others can use to execute tasks such as running inference or evaluation. (Pulls this source from github, installs all requirements, installs package, calls necessary functions to perform task, returns / visualizes results).
- [STARTED] Add detailed documentation in this README as well as within each of the notebooks which use it. (Look into expandable items in markdown)
- [DONE] Add a .gitignore to ignore all of the __pycache__ files etc.