Deep Learning for Music (DL4M)

By Yann Bayle (Website, GitHub) from LaBRI (Website, Twitter), Univ. Bordeaux (Website, Twitter), CNRS (Website, Twitter) and SCRIME (Website).

TL;DR Non-exhaustive list of scientific articles on deep learning for music: summary (Article title, pdf link and code), details (table - more info), details (bib - all info)

The role of this curated list is to gather scientific articles, theses and reports that use deep learning approaches applied to music. The list is currently under construction, but feel free to contribute to the missing fields and to add other resources! To do so, please refer to the How To Contribute section. The resources provided here come from my review of the state of the art for my PhD thesis, for which an article is being written. There are already surveys on deep learning for music generation, speech separation and speaker identification. However, these surveys do not cover the music information retrieval tasks that are included in this repository.

DL4M summary
DL4M details
Code without articles
Statistics and visualisations
Advices for reviewers of dl4m articles
How To Contribute
FAQ
Acronyms used
Sources
Contributors
Other useful related lists and resources
Cited by
License

DL4M summary

Year	Articles, Theses and Reports	Code
1988	Neural net modeling of music	No
1988	Creation by refinement: A creativity paradigm for gradient descent learning networks	No
1988	A sequential network design for musical applications	No
1989	The representation of pitch in a neural net model of chord classification	No
1989	Algorithms for music composition by neural nets: Improved CBR paradigms	No
1989	A connectionist approach to algorithmic composition	No
1994	Neural network music composition by prediction: Exploring the benefits of psychoacoustic constraints and multi-scale processing	No
1995	Automatic source identification of monophonic musical instrument sounds	No
1995	Neural network based model for classification of music type	No
1997	A machine learning approach to musical style recognition	No
1998	Recognition of music types	No
1999	Musical networks: Parallel distributed perception and performance	No
2001	Multi-phase learning for jazz improvisation and interaction	No
2002	A supervised learning approach to musical style recognition	No
2002	Finding temporal structure in music: Blues improvisation with LSTM recurrent networks	No
2002	Neural networks for note onset detection in piano music	No
2004	A convolutional-kernel based approach for note onset detection in piano-solo audio signals	No
2009	Unsupervised feature learning for audio classification using convolutional deep belief networks	No
2010	Audio musical genre classification using convolutional neural networks and pitch and tempo transformations	No
2010	Automatic musical pattern feature extraction using convolutional neural network	No
2011	Audio-based music classification with a pretrained convolutional network	No
2012	Rethinking automatic chord recognition with convolutional neural networks	No
2012	Moving beyond feature design: Deep architectures and automatic feature learning in music informatics	No
2012	Local-feature-map integration using convolutional neural networks for music genre classification	No
2012	Learning sparse feature representations for music annotation and retrieval	No
2012	Unsupervised learning of local features for music classification	No
2013	Multiscale approaches to music audio feature learning	No
2013	Musical onset detection with convolutional neural networks	No
2013	Deep content-based music recommendation	No
2014	The Munich LSTM-RNN approach to the MediaEval 2014 Emotion In Music task	No
2014	End-to-end learning for music audio	No
2014	Deep learning for music genre classification	No
2014	Recognition of acoustic events using deep neural networks	No
2014	Deep image features in music information retrieval	No
2014	From music audio to chord tablature: Teaching deep convolutional networks to play guitar	No
2014	Improved musical onset detection with convolutional neural networks	No
2014	Boundary detection in music structure analysis using convolutional neural networks	No
2014	Improving content-based and hybrid music recommendation using deep learning	No
2014	A deep representation for invariance and music classification	No
2015	Auralisation of deep convolutional neural networks: Listening to learned features	GitHub
2015	Downbeat tracking with multiple features and deep neural networks	No
2015	Music boundary detection using neural networks on spectrograms and self-similarity lag matrices	No
2015	Classification of spatial audio location and content using convolutional neural networks	No
2015	Deep learning, audio adversaries, and music content analysis	No
2015	Deep learning and music adversaries	GitHub
2015	Singing voice detection with deep recurrent neural networks	No
2015	Automatic instrument recognition in polyphonic music using convolutional neural networks	No
2015	A software framework for musical data augmentation	No
2015	A deep bag-of-features model for music auto-tagging	No
2015	Music-noise segmentation in spectrotemporal domain using convolutional neural networks	No
2015	Musical instrument sound classification with deep convolutional neural network using feature fusion approach	No
2015	Environmental sound classification with convolutional neural networks	No
2015	Exploring data augmentation for improved singing voice detection with neural networks	GitHub
2015	Singer traits identification using deep neural network	No
2015	A hybrid recurrent neural network for music transcription	No
2015	An end-to-end neural network for polyphonic music transcription	No
2015	Deep karaoke: Extracting vocals from musical mixtures using a convolutional deep neural network	No
2015	Folk music style modelling by recurrent neural networks with long short term memory units	GitHub
2015	Deep neural network based instrument extraction from music	No
2015	A deep neural network for modeling music	No
2016	An efficient approach for segmentation, feature extraction and classification of audio signals	No
2016	Text-based LSTM networks for automatic music composition	No
2016	Towards playlist generation algorithms using RNNs trained on within-track transitions	No
2016	Automatic tagging using deep convolutional neural networks	No
2016	Automatic chord estimation on seventhsbass chord vocabulary using deep neural network	No
2016	DeepBach: A steerable model for Bach chorales generation	GitHub
2016	Bayesian meter tracking on learned signal representations	No
2016	Deep learning for music	No
2016	Learning temporal features using a deep neural network and its application to music genre classification	No
2016	On the potential of simple framewise approaches to piano transcription	No
2016	Feature learning for chord recognition: The deep chroma extractor	GitHub
2016	A fully convolutional deep auditory model for musical chord recognition	No
2016	A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction	No
2016	Event localization in music auto-tagging	GitHub
2016	Deep convolutional networks on the pitch spiral for musical instrument recognition	GitHub
2016	SampleRNN: An unconditional end-to-end neural audio generation model	GitHub
2016	Robust audio event recognition with 1-max pooling convolutional neural networks	No
2016	Experimenting with musically motivated convolutional neural networks	GitHub
2016	Singing voice melody transcription using deep neural networks	No
2016	Singing voice separation using deep neural networks and F0 estimation	Website
2016	Learning to pinpoint singing voice from weakly labeled examples	No
2016	Analysis of time-frequency representations for musical onset detection with convolutional neural network	No
2016	Note onset detection in musical signals via neural-network-based multi-ODF fusion	No
2016	Music transcription modelling and composition using deep learning	GitHub
2016	Convolutional neural network for robust pitch determination	No
2016	Deep convolutional neural networks and data augmentation for acoustic event detection	Website
2017	Gabor frames and deep scattering networks in audio processing	No
2017	Vision-based detection of acoustic timed events: A case study on clarinet note onsets	No
2017	Deep learning techniques for music generation - A survey	No
2017	JamBot: Music theory aware chord based generation of polyphonic music with LSTMs	GitHub
2017	XFlow: 1D <-> 2D cross-modal deep neural networks for audiovisual classification	No
2017	Machine listening intelligence	No
2017	Monoaural audio source separation using deep convolutional neural networks	GitHub
2017	Deep multimodal network for multi-label classification	No
2017	A tutorial on deep learning for music information retrieval	GitHub
2017	A comparison on audio signal preprocessing methods for deep neural networks on music tagging	GitHub
2017	Transfer learning for music classification and regression tasks	GitHub
2017	Convolutional recurrent neural networks for music classification	GitHub
2017	An evaluation of convolutional neural networks for music classification using spectrograms	No
2017	Large vocabulary automatic chord estimation using deep neural nets: Design framework, system variations and limitations	No
2017	Basic filters for convolutional neural networks: Training or design?	No
2017	Ensemble Of Deep Neural Networks For Acoustic Scene Classification	No
2017	Robust downbeat tracking using an ensemble of convolutional networks	No
2017	Music signal processing using vector product neural networks	No
2017	Transforming musical signals through a genre classifying convolutional neural network	No
2017	Audio to score matching by combining phonetic and duration information	GitHub
2017	Interactive music generation with positional constraints using anticipation-RNNs	No
2017	Deep rank-based transposition-invariant distances on musical sequences	No
2017	GLSR-VAE: Geodesic latent space regularization for variational autoencoder architectures	No
2017	Deep convolutional neural networks for predominant instrument recognition in polyphonic music	No
2017	CNN architectures for large-scale audio classification	No
2017	DeepSheet: A sheet music generator based on deep learning	No
2017	Talking Drums: Generating drum grooves with neural networks	No
2017	Singing voice separation with deep U-Net convolutional networks	GitHub
2017	Music emotion recognition via end-to-end multimodal neural networks	No
2017	Chord label personalization through deep learning of integrated harmonic interval-based representations	No
2017	End-to-end musical key estimation using a convolutional neural network	No
2017	MediaEval 2017 AcousticBrainz genre task: Multilayer perceptron approach	No
2017	Classification-based singing melody extraction using deep convolutional neural networks	No
2017	Multi-level and multi-scale feature aggregation using pre-trained convolutional neural networks for music auto-tagging	No
2017	Multi-level and multi-scale feature aggregation using sample-level deep convolutional neural networks for music classification	GitHub
2017	Sample-level deep convolutional neural networks for music auto-tagging using raw waveforms	No
2017	A SeqGAN for Polyphonic Music Generation	GitHub
2017	Harmonic and percussive source separation using a convolutional auto encoder	No
2017	Stacked convolutional and recurrent neural networks for music emotion recognition	No
2017	A deep learning approach to source separation and remixing of hiphop music	No
2017	Music Genre Classification Using Masked Conditional Neural Networks	No
2017	Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask	GitHub
2017	Generating data to train convolutional neural networks for classical music source separation	GitHub
2017	Monaural score-informed source separation for classical music using convolutional neural networks	GitHub
2017	Multi-label music genre classification from audio, text, and images using deep features	GitHub
2017	A deep multimodal approach for cold-start music recommendation	GitHub
2017	Melody extraction and detection through LSTM-RNN with harmonic sum loss	No
2017	Representation learning of music using artist labels	No
2017	Toward inverse control of physics-based sound synthesis	Website
2017	DNN and CNN with weighted and multi-task loss functions for audio event detection	No
2017	Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks	GitHub
2017	End-to-end learning for music audio tagging at scale	GitHub
2017	Designing efficient architectures for modeling temporal features with convolutional neural networks	GitHub
2017	Timbre analysis of music audio signals with convolutional neural networks	GitHub
2017	Deep learning and intelligent audio mixing	No
2017	Deep learning for event detection, sequence labelling and similarity estimation in music signals	No
2017	Music feature maps with convolutional neural networks for music genre classification	No
2017	Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks	GitHub
2017	Adversarial semi-supervised audio source separation applied to singing voice extraction	No
2017	Taking the models back to music practice: Evaluating generative transcription models built using deep learning	GitHub
2017	Generating nontrivial melodies for music as a service	No
2017	Invariances and data augmentation for supervised music transcription	GitHub
2017	Lyrics-based music genre classification using a hierarchical attention network	GitHub
2017	A hybrid DSP/deep learning approach to real-time full-band speech enhancement	GitHub
2017	Convolutional methods for music analysis	No
2017	Extending temporal feature integration for semantic audio analysis	No
2017	Recognition and retrieval of sound events using sparse coding convolutional neural network	No
2017	A two-stage approach to note-level transcription of a specific piano	No
2017	Reducing model complexity for DNN based large-scale audio classification	No
2017	Audio spectrogram representations for processing with convolutional neural networks	Website
2017	Unsupervised feature learning based on deep models for environmental audio tagging	No
2017	Attention and localization based on a deep convolutional recurrent model for weakly supervised audio tagging	GitHub
2017	Surrey-CVSSP system for DCASE2017 challenge task4	GitHub
2017	A study on LSTM networks for polyphonic music sequence modelling	Website
2018	MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment	GitHub
2018	Music transformer: Generating music with long-term structure	No
2018	Music theory inspired policy gradient method for piano music transcription	No
2019	Enabling factorized piano music modeling and generation with the MAESTRO dataset	No
2019	Generating Long Sequences with Sparse Transformers	GitHub

Go back to top

DL4M details

A human-readable, summarized table is available in the file dl4m.tsv. All details for each article are stored in the corresponding bib entry in dl4m.bib. Each entry has the regular bib fields:

author
year
title
journal or booktitle

Each entry in dl4m.bib also displays additional information:

link - HTML link to the PDF file
code - Link to the source code if available
archi - Neural network architecture
layer - Number of layers
task - The proposed tasks studied in the article
dataset - The names of the datasets used
dataaugmentation - The type of data augmentation technique used
time - The computation time
hardware - The hardware used
note - Additional notes and information
repro - Indication to what extent the experiments are reproducible
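As a rough illustration of how these extra fields might be read programmatically, here is a minimal sketch that uses only the Python standard library. The sample entry, its key and all of its values are hypothetical placeholders, not data copied from dl4m.bib, and the regular expression only handles `name = {value}` pairs without nested braces. For the real file, a dedicated parser such as Bibtexparser (listed in the License section) is likely the more robust choice.

```python
import re

# Hypothetical entry: the key and every field value below are placeholders,
# not data copied from dl4m.bib.
SAMPLE_BIB = """
@inproceedings{Doe2017,
  author = {Doe, Jane},
  year = {2017},
  title = {An example title},
  booktitle = {An example conference},
  archi = {CNN},
  layer = {5},
  task = {Genre classification},
  code = {No},
  repro = {No}
}
"""

# Matches `name = {value}` where the value contains no nested braces.
FIELD_PATTERN = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def parse_fields(entry_text):
    """Return a dict mapping lower-cased field names to their raw values."""
    return {
        name.lower(): value.strip()
        for name, value in FIELD_PATTERN.findall(entry_text)
    }


fields = parse_fields(SAMPLE_BIB)
print(fields["archi"], fields["layer"], fields["task"])
```

The same dictionary gives access to the regular fields (author, year, title, booktitle) and to the DL4M-specific ones (archi, layer, task, and so on).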

Go back to top

Code without articles

Go back to top

Statistics and visualisations

165 papers referenced. See the details in dl4m.bib. There are more papers from 2017 than from any other year. Number of articles per year:
If you are applying DL to music, there are 356 other researchers in your field.
34 tasks investigated. See the list of tasks. Tasks pie chart:
53 datasets used. See the list of datasets. Datasets pie chart:
30 architectures used. See the list of architectures. Architectures pie chart:
9 frameworks used. See the list of frameworks. Frameworks pie chart:
Only 44 articles (about 27%) provide their source code. Repeatability is the key to good science, so check out the list of useful resources on reproducibility for MIR and ML.
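The per-year counts and the share of articles with code can be recomputed from the summary table. The sketch below is one possible way to do it, not the script used to produce the figures above. It works on a handful of rows copied from the DL4M summary, and the function names are illustrative.

```python
from collections import Counter

# A few rows copied from the DL4M summary table: (year, title, code column).
ROWS = [
    (2015, "Deep learning and music adversaries", "GitHub"),
    (2015, "Singing voice detection with deep recurrent neural networks", "No"),
    (2016, "DeepBach: A steerable model for Bach chorales generation", "GitHub"),
    (2016, "Deep learning for music", "No"),
    (2017, "Machine listening intelligence", "No"),
    (2017, "Singing voice separation with deep U-Net convolutional networks", "GitHub"),
    (2017, "CNN architectures for large-scale audio classification", "No"),
]


def papers_per_year(rows):
    """Count how many rows fall in each year."""
    return Counter(year for year, _title, _code in rows)


def code_share(rows):
    """Percentage of rows whose code column is anything other than 'No'."""
    with_code = sum(1 for _year, _title, code in rows if code != "No")
    return 100.0 * with_code / len(rows)


print(sorted(papers_per_year(ROWS).items()))
print(f"{code_share(ROWS):.1f}% of the sample rows link to code")
```

Applied to the full counts quoted above (44 articles with code out of 165), the same ratio is roughly 26.7%.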

Go back to top

Advices for reviewers of dl4m articles

Please refer to the advice_review.md file.

How To Contribute

Contributions are welcome! Please refer to the CONTRIBUTING.md file.

Go back to top

FAQ

How are the articles sorted?

The articles are first sorted by decreasing year (to keep up with the latest news) and then alphabetically by the main author's family name.
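A minimal sketch of this two-level ordering, assuming each entry is reduced to a (year, family name) pair. The names below are hypothetical placeholders and do not refer to entries in the list.

```python
# Hypothetical (year, main author's family name) pairs, for illustration only.
ENTRIES = [
    (2016, "Nguyen"),
    (2017, "Baker"),
    (2017, "adams"),
    (2015, "Zhang"),
]


def sort_key(entry):
    """Newest year first, then family name in case-insensitive order."""
    year, family_name = entry
    return (-year, family_name.casefold())


sorted_entries = sorted(ENTRIES, key=sort_key)
for year, family_name in sorted_entries:
    print(year, family_name)
```

Negating the year gives the decreasing order, while `casefold()` keeps the alphabetical comparison independent of capitalisation.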

Why are preprints from arXiv included in the list?

I want the list to cover the research as exhaustively as possible and to include the latest news on DL4M. However, be careful with the information provided in articles that are still under review. If possible, wait for the final accepted and peer-reviewed version before citing an arXiv paper. I regularly update the arXiv links to the corresponding published papers when available.

How much can I trust the results published in an article?

The list provided here does not guarantee the quality of the articles. You should either try to reproduce the experiments described or submit a request to ReScience. Rely on an article's conclusions at your own risk.

Go back to top

Acronyms used

A list of useful acronyms used in deep learning and music is stored in acronyms.md.

Go back to top

Sources

The list of conferences, journals and aggregators used to gather the proposed materials is stored in sources.md.

Go back to top

Contributors

Yann Bayle (GitHub) - Instigator and principal maintainer
Vincent Lostanlen (GitHub)
Keunwoo Choi (GitHub)
Bob L. Sturm (GitHub)
Stefan Balke (GitHub)
Jordi Pons (GitHub)
Mirza Zulfan (GitHub) for the logo
Devin Walters
https://github.com/LegendJ

Go back to top

Other useful related lists and resources

Audio

DL4MIR tutorial with keras - Tutorial for Deep Learning on Music Information Retrieval by Thomas Lidy
Video talk from Ron Weiss - Ron Weiss (Google) Talk on Training neural network acoustic models on waveforms
Slides on DL4M - A personal (re)view of the state-of-the-art by Jordi Pons
DL4MIR tutorial - Python tutorials for learning to solve MIR tasks with DL
Awesome Python Scientific Audio - Python resources for Audio and Machine Learning
ISMIR resources - Community maintained list
ISMIR Google group - Daily dose of general MIR
Awesome Python - Audio section of Python resources
Awesome Web Audio - WebAudio packages and resources
Awesome Music - Music software
Awesome Music Production - Music creation
The Asimov Institute - 6 deep learning tools for music generation
DLM Google group - Deep Learning in Music group
MIR community on Slack - Link to subscribe to the MIR community's Slack
Unclassified list of MIR-related links - Cory McKay's list of various links on DL, MIR, ...
MIRDL - Unmaintained list of DL articles for MIR from Jordi Pons
WWW 2018 Challenge - Learning to Recognize Musical Genre on the FMA dataset
Music generation with DL - List of resources on music generation with deep learning
Auditory Scene Analysis - Book about the perceptual organization of sound by Albert Bregman, the "father of Auditory Scene Analysis".
Demonstrations of Auditory Scene Analysis - Audio demonstrations that illustrate examples of auditory perceptual organization.

Go back to top

Music datasets

Go back to top

Deep learning

DLPaper2Code: Auto-generation of Code from Deep Learning Research Papers
Model Convertors - Converters for DL frameworks and backends
Deep architecture genealogy - Genealogy of DL architectures
Deep Learning as an Engineer - Slides from Jan Schlüter
Awesome Deep Learning - General deep learning resources
Awesome Deep Learning Resources - Papers regarding deep learning and deep reinforcement learning
Awesome RNNs - RNNs code, theory and applications
Cheatsheets AI - Cheat Sheets for Keras, neural networks, scikit-learn,...
DL PaperNotes - Summaries and notes on general deep learning research papers
General lists
Echo State Network
DL in NLP - Best practices for using neural networks by Sebastian Ruder
CNN overview - Stanford Course
Dilated Recurrent Neural Networks - How to improve RNNs?
Encoder-Decoder in RNNs - How Does Attention Work in Encoder-Decoder Recurrent Neural Networks
On the use of DL - Misc fun around DL
ML from scratch - Python implementations of ML models and algorithms from scratch from Data Mining to DL
Comparison of DL frameworks - Presentation describing the different existing frameworks for DL
ELU > ReLU - Article describing the differences between ELU and ReLU
Reinforcement Learning: An Introduction - Book about reinforcement learning
Estimating Optimal Learning Rate - Blog post on the learning rate optimisation
GitHub repo for sklearn add-on for imbalanced learning - ML in uneven datasets
Video on DL from Nando de Freitas, Scott Reed and Oriol Vinyals - Deep Learning: Practice and Trends (NIPS 2017 Tutorial, parts I & II)
Article "Are GANs Created Equal? A Large-Scale Study" - Actually comparing DL algorithms
Battle of the Deep Learning frameworks - DL frameworks comparison and evolution
Black-box optimization - There are other optimization algorithms than just gradient descent

Go back to top

Cited by

If you use the information contained in this repository, please let us know! This repository is cited by:

Go back to top

License

You are free to copy, modify, and distribute Deep Learning for Music (DL4M) with attribution under the terms of the MIT license. See the LICENSE file for details. This project uses other projects; refer to each of them for the appropriate license information:

Readme checklist - To build a universal Readme.
Pylint - To clean the Python code.
Numpy - To manage Python data structures.
Matplotlib - To plot nice figures.
Bibtexparser - To deal with the bib entries.

Go back to top
