A tutorial for Sound Source Localization researchers and practitioners. The purpose of this repo is to organize the world’s resources for Sound Source Localization, and make them universally accessible and useful.
This is a curated list of Awesome Sound Source Localization tutorials, papers, libraries, datasets, tools, scripts and results.
To add items to this page, simply open a Pull Request.
Publications
Survey
A Survey of Sound Source Localization with Deep Learning Methods, The Journal of the Acoustical Society of America, 2022 [paper]
Databases
SLoClas: A Database for Joint Sound Localization and Classification, 2021 [paper][note]
The LOCATA Challenge: Acoustic Source Localization and Tracking, TASLP 2020 [paper]
Network design
MLP
Deep Neural Networks for Multiple Speaker Detection and Localization, ICRA 2018 [paper][code][note]
CNN
Deep Neural Networks for Multiple Speaker Detection and Localization, ICRA 2018 [paper][code][note]
Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network, Interspeech 2018 [paper]
Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-adversarial Training, ICASSP 2019 [paper][code]
Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation, TASLP 2021 [paper][code][note]
Broadband DOA Estimation Using Convolutional Neural Networks Trained with Noise Signals, 2017 [paper][note]
Multi-Speaker DOA Estimation Using Deep Convolutional Networks Trained with Noise Signals, JSTSP 2019 [paper][note]
Robust Source Counting and DOA Estimation Using Spatial Pseudo-Spectrum and Convolutional Neural Network, TASLP 2020 [paper][note]
RNN & LSTM & GRU
Time Difference of Arrival Estimation of Speech Signals Using Deep Neural Networks with Integrated Time-frequency Masking, ICASSP 2019 [paper][note]
CRNN
Direction of Arrival Estimation for Multiple Sound Sources Using Convolutional Recurrent Neural Network, EUSIPCO 2018 [paper][note]
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks, JSTSP 2018 [paper][note][code]
CRNN-Based Multiple DoA Estimation Using Acoustic Intensity Features for Ambisonics Recordings, 2019 [paper][note]
Attention
A Combination of Various Neural Networks for Sound Event Localization and Detection, DCASE 2021 Challenge
Sound Event Localization and Detection Using Cross-modal Attention and Parameter Sharing, DCASE 2021 Challenge
Blind Speech Separation Through Direction of Arrival Estimation Using Deep Neural Networks with a Flexibility on the Number of Speakers, MMSP 2022 [paper]
Speech Enhancement + SSL
Time Difference of Arrival Estimation of Speech Signals Using Deep Neural Networks with Integrated Time-frequency Masking, ICASSP 2019 [paper][note]
Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking, Interspeech 2019 [paper][note]
The Diverse Environments Multichannel Acoustic Noise Database (DEMAND) provides a set of recordings for testing algorithms against real-world noise in a variety of settings.
A noise bank for simulating noisy data from clean speech. Noises N1-N100 were collected by Guoning Hu; the other 15 home-made noise types were produced by USTC.