Source code for the ICASSP 2021 paper: “Multi-target DoA Estimation with an Audio-visual Fusion Mechanism”
-
Clone this repository:
git clone https://github.com/catherine-qian/Audio-visual-sound-localization.git
-
Download the extracted features (feature extraction source code) from
https://drive.google.com/drive/folders/1wDa3MNqVcYJ76uV2SQR1ZsaOzQ7fpDo_?usp=share_link
and put the features under data/
(or set a different data path in dataread.py).
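Before running the code, it can help to confirm that the downloaded features are where the scripts will look for them. The following is a minimal sketch, assuming the default data/ location; `list_feature_files` is a hypothetical helper and is not part of this repository.

```python
from pathlib import Path


def list_feature_files(data_dir="data"):
    """Return the sorted files found under data_dir, or raise if there are none.

    Hypothetical helper for illustration only; it does not read or
    validate the feature contents, it only checks that files exist.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"feature folder not found: {root}")
    # Walk the folder recursively and keep regular files only.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    if not files:
        raise FileNotFoundError(f"no feature files found under: {root}")
    return files
```

Calling `list_feature_files("data")` from the repository root returns the file paths, or raises `FileNotFoundError` if the folder is missing or empty. If you changed the data path in dataread.py, pass that path instead.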
-
Run the following command to get the results:
python main_sslr.py -model 'MLP3'
If you use this code, please cite:
@inproceedings{qian2021multi,
  title={Multi-target DoA Estimation with an Audio-visual Fusion Mechanism},
  author={Qian, Xinyuan and Madhavi, Maulik and Pan, Zexu and Wang, Jiadong and Li, Haizhou},
  booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={4280--4284},
  year={2021},
  organization={IEEE}
}