/pitch-detection

YIN(-FFT), PYIN, MPM, PMPM, SWIPE'

Primary language: C++. License: MIT.

Pitch detection algorithms

Autocorrelation-based C++ pitch detection algorithms with O(n log n) or lower running time:

*: SWIPE' appears to be O(n) but with an enormous constant factor. The implementation complexity is much higher than MPM and YIN and it brings in additional dependencies (BLAS + LAPACK).

**: There's a parallel version of SWIPE, Aud-SWIPE-P.
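
To illustrate what "autocorrelation-based" means, here is a minimal sketch of the naive time-domain approach: pick the lag at which the signal correlates best with a shifted copy of itself, then convert that lag to a frequency. This is not the library's code. The function name, the default frequency limits, and the -1 return value for "no pitch" are assumptions made for the example, and the naive loop runs in O(n * lags) rather than the O(n log n) that FFT-based implementations achieve.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of pitch_detection: naive autocorrelation
// pitch estimate. Returns -1.0 when no lag has a positive correlation.
double naive_autocorrelation_pitch(const std::vector<double>& audio,
                                   double sample_rate,
                                   double min_freq = 50.0,
                                   double max_freq = 1000.0)
{
    assert(sample_rate > 0.0 && min_freq > 0.0 && max_freq > min_freq);

    // Search only lags that correspond to [min_freq, max_freq]; never lag 0,
    // where the autocorrelation is trivially at its maximum.
    const std::size_t min_lag = std::max<std::size_t>(
        1, static_cast<std::size_t>(sample_rate / max_freq));
    const std::size_t max_lag = static_cast<std::size_t>(sample_rate / min_freq);

    std::size_t best_lag = 0;
    double best_value = 0.0;
    for (std::size_t lag = min_lag; lag <= max_lag && lag < audio.size(); ++lag) {
        // Autocorrelation at this lag: sum of products of the signal with
        // a copy of itself shifted by `lag` samples.
        double sum = 0.0;
        for (std::size_t i = 0; i + lag < audio.size(); ++i) {
            sum += audio[i] * audio[i + lag];
        }
        if (sum > best_value) {
            best_value = sum;
            best_lag = lag;
        }
    }

    if (best_lag == 0) {
        return -1.0;
    }
    // A period of best_lag samples corresponds to sample_rate / best_lag Hz.
    return sample_rate / static_cast<double>(best_lag);
}
```

Because the estimate is quantized to whole-sample lags, a 440 Hz sine sampled at 48000 Hz comes out a fraction of a hertz off. YIN and MPM refine this basic idea with normalization and interpolation.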

Suggested usage of this library can be seen in the utility wav_analyzer, which divides a wav file into chunks of 0.01s and checks the pitch of each chunk. Sample output of wav_analyzer:

At t: 0.5
        mpm: 162.529
        yin: 162.543
        swipe: 162.183
        pmpm: 162.529
        pyin: 162.543
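
The chunking step described above can be sketched as follows. This is an illustration only, not the code of wav_analyzer: the helper name split_into_chunks is hypothetical, and dropping a trailing partial chunk is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: split samples into consecutive chunks that each
// cover chunk_seconds of audio. A trailing partial chunk is dropped
// (an assumption; wav_analyzer may handle the tail differently).
std::vector<std::vector<double>> split_into_chunks(
    const std::vector<double>& samples,
    int sample_rate,
    double chunk_seconds = 0.01)
{
    std::vector<std::vector<double>> chunks;

    // Round to the nearest whole sample count, e.g. 48000 * 0.01 = 480.
    const long rounded = std::lround(sample_rate * chunk_seconds);
    if (rounded <= 0) {
        return chunks;
    }
    const std::size_t chunk_size = static_cast<std::size_t>(rounded);

    for (std::size_t start = 0; start + chunk_size <= samples.size();
         start += chunk_size) {
        const auto first = samples.begin() + static_cast<std::ptrdiff_t>(start);
        chunks.emplace_back(first, first + static_cast<std::ptrdiff_t>(chunk_size));
    }
    return chunks;
}
```

Chunk i then starts at time i * chunk_seconds, and each chunk can be passed to one of the pitch functions together with the sample rate.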

Degraded audio tests

All testing files are here. The progressive degradations are described by the respective numbered JSON files, generated using audio-degradation-toolbox. The original clip is a viola playing E3 from the University of Iowa MIS.

The results come from parsing the output of wav_analyzer to count how many 0.1s slices of the input clip were in the ballpark of the expected value of 164.81 Hz (E3). I considered anything from 160 to 169 Hz to be acceptable:

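| Degradation level | MPM # correct | YIN # correct | SWIPE' # correct |
|-------------------|---------------|---------------|------------------|
| 0                 | 26            | 22            | 5                |
| 1                 | 23            | 21            | 13               |
| 2                 | 19            | 21            | 9                |
| 3                 | 18            | 19            | 7                |
| 4                 | 19            | 19            | 6                |
| 5                 | 18            | 19            | 5                |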
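
The acceptance rule behind these counts can be sketched as a small function. The name count_acceptable is hypothetical, and treating both ends of the 160-169 Hz range as inclusive is an assumption; the original counts were produced by parsing wav_analyzer output, which is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: count how many pitch estimates (in Hz) fall inside
// the accepted range. Both bounds are treated as inclusive (an assumption).
std::size_t count_acceptable(const std::vector<double>& pitches_hz,
                             double low_hz = 160.0,
                             double high_hz = 169.0)
{
    return static_cast<std::size_t>(std::count_if(
        pitches_hz.begin(), pitches_hz.end(),
        [low_hz, high_hz](double pitch) {
            return pitch >= low_hz && pitch <= high_hz;
        }));
}
```

With the three estimates from the sample output above (162.529, 162.543, 162.183), all three would count as correct.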

Build and install

Using this project should be as easy as make && sudo make install on Linux with a modern GCC - I don't officially support other platforms.

This project depends on ffts, BLAS/LAPACK, and mlpack. Running the tests requires googletest: make -C test/ && ./test/test. Running the benchmarks requires google benchmark: make -C test/ bench && ./test/bench.

Build and install pitch_detection, run the tests, and build the sample application, wav_analyzer:

# build libpitch_detection.so
make clean all

# build tests and benches
make -C test clean all

# run tests and benches 
./test/test
./test/bench

# install the library and headers to `/usr/local/lib` and `/usr/local/include`
sudo make install

# build and run C++ sample
make -C wav_analyzer clean all
./wav_analyzer/wav_analyzer

Docker image

To allow running and prototyping on unsupported operating systems, there's a Dockerfile that sets up a Linux container with all the dependencies for compiling the library and running the included tests and benchmarks.

To build the image, make sure you have installed and set up Docker on your computer. After following the setup instructions, run the following command from the cloned repository root:

docker build --rm --pull -f "Dockerfile" -t pitchdetection:latest "."

The --rm flag removes the intermediate containers after a successful build, which just frees up space on your system. The --pull flag makes Docker always attempt to pull the newest version of the base image (ubuntu:latest) from Docker Hub instead of reusing a locally cached copy.

Once you've built your image (which should take about 20 minutes), you can run the image in a new container using the following command:

docker run --rm --init -it pitchdetection:latest

The --rm flag here tells Docker to remove the container after you exit the shell; omit it if you want to keep the container. The --init flag runs a small init process as PID 1 inside the container, which forwards signals and reaps zombie processes. The -it flags keep STDIN open and allocate a pseudo-terminal, so you land in an interactive shell and can start sending commands to the container right away.

Your container is now set up. You can run the following commands to confirm that everything is working properly:

./test/test
./test/bench

A pre-compiled image can be found at esimkowitz/pitchdetection.

To use this, rather than building and running the image as described above, run the following commands:

docker pull esimkowitz/pitchdetection:latest
docker run --rm --init -it esimkowitz/pitchdetection:latest

Usage

Read the header and sample wav_analyzer.

The namespaces are pitch and pitch_alloc. The functions and classes are templated for <double> and <float> support.

The pitch namespace functions perform automatic buffer allocation, while pitch_alloc::{Yin, Mpm} give you a reusable object (useful for computing pitch for multiple uniformly-sized buffers):

#include <pitch_detection.h>

#include <vector>

int main()
{
    // One buffer of audio samples (fill it with real data before use)
    std::vector<double> audio_buffer(8092);

    // pitch:: functions allocate their working buffers on every call;
    // the second argument is the sample rate in Hz
    double pitch_yin = pitch::yin<double>(audio_buffer, 48000);
    double pitch_mpm = pitch::mpm<double>(audio_buffer, 48000);
    double pitch_pyin = pitch::pyin<double>(audio_buffer, 48000);
    double pitch_pmpm = pitch::pmpm<double>(audio_buffer, 48000);
    double pitch_swipe = pitch::swipe<double>(audio_buffer, 48000);

    // pitch_alloc:: objects allocate once for a fixed buffer size
    pitch_alloc::Mpm<double> ma(8092);
    pitch_alloc::Yin<double> ya(8092);

    // Reuse the same objects for many buffers of that size
    for (int i = 0; i < 10000; ++i) {
        auto pitch_yin = ya.pitch(audio_buffer, 48000);
        auto pitch_mpm = ma.pitch(audio_buffer, 48000);
        auto pitch_pyin = ya.probabilistic_pitch(audio_buffer, 48000);
        auto pitch_pmpm = ma.probabilistic_pitch(audio_buffer, 48000);
    }

    return 0;
}