I have written a very basic music matching app similar to Shazam.
This repo contains various experiments to help:
- understand some of the concepts
- get some experience using the Web Audio API
I have tried it successfully on the following browsers:
- Chrome on macOS
- Firefox on macOS
- Chrome on Android
NOTE: matchTrackWithStreaming.html does not work on Firefox because it doesn't seem to support AudioWorkletNode:
ReferenceError: AudioWorkletNode is not defined [Learn More] pcmObservable.js:1:1
<anonymous> http://localhost:3002/common/utils/pcmObservable.js:1
InnerModuleEvaluation self-hosted:4364
InnerModuleEvaluation self-hosted:4353
evaluation self-hosted:4317
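A page could avoid this error by checking for AudioWorkletNode before loading the streaming code. The following is only a minimal sketch of that idea: the helper name supportsAudioWorklet is hypothetical and is not part of this repo.

```javascript
// Hypothetical helper (not in this repo): reports whether the given
// global scope exposes the AudioWorkletNode constructor.
function supportsAudioWorklet(scope = globalThis) {
  return typeof scope.AudioWorkletNode === 'function';
}

// Example use: decide whether the streaming page can run at all.
if (!supportsAudioWorklet()) {
  console.warn('AudioWorkletNode is not available; streaming match is disabled.');
}
```

Because the ReferenceError above is raised while the module is being evaluated, a check like this would need to run before the module that references AudioWorkletNode is imported.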
You can try it out by playing one of the YouTube links below and clicking Record
on one of these pages:
- matchTrack.html
  - capture chosen length of sample then send to server for matching
- matchTrackWithDiagnostics.html
  - as above but with various diagnostic charts
- matchTrackWithStreaming.html
  - stream fingerprinted data to the server over a web socket for progressive matching
I have fingerprinted 12 tracks. I bought them all from iTunes. The .m4a files are not in the repo. However, you can play the tracks on YouTube:
- Walton: Henry V - A Musical Scenario after Shakespeare / Henry V: IV. Interlude: Touch Her Soft Lips and Part
- Cecilia Bartoli - Arie Antiche: Se tu m'ami / Caro mio ben
- Strauss: Vier letzte Lieder, Die Nacht, Allerseelen / Morgen, Op. 27, No. 4
- Stämning (feat. Eric Ericson) [22 Swedish songs] / I Seraillets Have (feat. Eric Ericson)
- Bridge: Piano Music, Vol. 3 / 3 Lyrics: No. 1, Heart's Ease. Andante tranquillo – Lento
- Minimal Piano Collection, Vol. X-XX / Ellis Island for Two Pianos
- Fantasie / Après un rêve
- Victoria: Requiem (Officium defunctorum). Lobo: Versa est in luctum / Taedet animam meam: I. Taedet animam meam
- Too Hot to Handle / Boogie Nights
- Reflections / Gun
- Bach: Cantatas for Alto Solo / Cantata No. 170, BWV 170: I. Aria "Vergnügte Ruh! Beliebte Seelenlust!"
- Haydn: Strings Quartets Op. 71 / Griogal Cridhe "Gregor’s Lament"
  - this is actually a live performance, but some of it still matches against the fingerprinted iTunes track!
Just being lazy really. The client-side code does not need to be built: no Babel, Browserify, webpack, npm run build, etc. This is achieved using the type=module attribute of the <script> tag. Here is an example:
<script type="module" src="matchTrack.js"></script>
After cloning the repo, do the following to run everything locally. This assumes that you have Node.js, npm and Docker installed.
# restore npm packages
npm install
# create a .env file
cat << 'EOF' >> .env
PGSSLMODE=disable
DATABASE_URL=postgres://postgres:mypassword@localhost:5432/postgres
EOF
# run PostgreSQL in a Docker container
scripts/db.sh run
# create database schema
scripts/db.sh create
# restore the database
scripts/db.sh restore
# (optional) show summary of restored database
scripts/db.sh show
# start the Express server
npm start
Then, open a browser to http://localhost:3002.
Or, you can specify a different port number e.g.:
PORT=3200 npm start
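The commands above suggest that the server listens on port 3002 unless PORT is set. A minimal sketch of that kind of lookup is shown here. The resolvePort helper is hypothetical and the repo's actual server code may read the variable differently.

```javascript
// Hypothetical helper (not taken from this repo): pick the port from the
// environment, falling back to 3002 when PORT is missing or invalid.
function resolvePort(env = process.env, fallback = 3002) {
  const parsed = Number.parseInt(env.PORT, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

const port = resolvePort();
console.log(`Server would listen on http://localhost:${port}`);
```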
- Copy data from the local Docker database instance to the Heroku database instance
- Store more track metadata e.g. artist and album artwork
- Add more tracks to the database
- Create a new experiment that is a copy of experiment4 but without all the charting
  - i.e. a slimmed-down test page
- Extend experiment4 to include visualisations of the hash matching (more charts!)
  - see Fig 3A and Fig 3B in the original paper
- Add a web page to list the tracks in the database
- Tune the app to perform better.
  Concentrate on reducing the amount of stored data whilst increasing the ability to find a match.
  Some of the settings that can be adjusted are:
  - Sample rate (currently 16 kHz)
  - FFT size (currently 1024)
  - Frequency bands (currently 0-100 Hz, 100-200 Hz, 200-400 Hz, 400-800 Hz, 800-1600 Hz, 1600-8000 Hz)
  - Number of slivers (currently 20 per second)
  - Size of the target zone (currently 5 points)
- Write a back end console tool in C# or F# to fingerprint a track and add it to the database
  - Reconcile the FFT & fingerprint data calculated by JavaScript & Web Audio API vs C#/F# & Math.NET Numerics
    - Currently, I am getting results that don't quite match
    - Consequently, I am currently having to use seedTracks.html to fingerprint a track and store it in the database
  - Call out to ffmpeg to convert the track from an mp3/m4a file to pcm
  - Calculate the fingerprint data
  - Store the fingerprint data and track metadata in the database
- Investigate hosting on AWS
  - Load static resources from an Amazon S3 bucket configured for static website hosting
  - Move the match web api endpoint to AWS Lambda
  - Store the track data in Amazon RDS for PostgreSQL
- Implement streaming-based match
  - Rather than capture a 5 second sample and then send it to the server for matching, stream data to the server and return a match as soon as possible e.g. after 2 seconds.
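To make the tuning settings listed above more concrete, here is a rough sketch of the arithmetic they imply: the width of each FFT bin, the number of samples per sliver, and which bins fall into each frequency band. The constant and function names are hypothetical, and the rule used to assign bins to bands (a bin belongs to a band if its starting frequency lies in that band) is an assumption; the app may treat boundary bins differently.

```javascript
// Current settings, as listed in this README.
const SAMPLE_RATE = 16000;      // Hz
const FFT_SIZE = 1024;          // samples per FFT
const SLIVERS_PER_SECOND = 20;
const BANDS_HZ = [
  [0, 100], [100, 200], [200, 400], [400, 800], [800, 1600], [1600, 8000],
];

const binWidthHz = SAMPLE_RATE / FFT_SIZE;                   // 15.625
const binCount = FFT_SIZE / 2;                               // 512 bins up to 8000 Hz
const samplesPerSliver = SAMPLE_RATE / SLIVERS_PER_SECOND;   // 800

// Assumed mapping: bin k starts at k * binWidthHz and belongs to the band
// that contains that starting frequency.
function bandToBins([lowHz, highHz]) {
  const firstBin = Math.ceil(lowHz / binWidthHz);
  const lastBin = Math.min(binCount, Math.ceil(highHz / binWidthHz)) - 1;
  return { lowHz, highHz, firstBin, lastBin };
}

const bandBins = BANDS_HZ.map(bandToBins);
console.log({ binWidthHz, binCount, samplesPerSliver });
console.log(bandBins);
```

Under these assumptions the lowest band covers only 7 bins while the highest covers more than 400, which is worth keeping in mind when adjusting the FFT size or the band edges.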