This repository contains the Python implementation of the paper "TSSL: Trusted Sound Source Localization".
- Source signals: LibriSpeech
- Noise signals: Noise92X
- Real-world dataset: LOCATA
The datasets mentioned above can be downloaded from this OneDrive link.
The data directory structure is shown as follows:
```
.
|---data
    |---LibriSpeech
        |---dev-clean
        |---test-clean
        |---train-clean-100
    |---NoiSig
        |---test
        |---train
        |---dev
```
Note: The data/ directory does not have to be inside your project; you can put it anywhere you like. Please remember to fill in the correct data path in config/tcrnn.yaml.
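For reference, the dataset path entries in config/tcrnn.yaml might look like the sketch below; the exact key names are assumptions here, so match them to the keys actually present in the released config.
```yaml
# Hypothetical excerpt of config/tcrnn.yaml -- key names are illustrative only.
data:
  libri_dir: /path/to/data/LibriSpeech   # source speech signals
  noise_dir: /path/to/data/NoiSig        # noise signals
```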
We strongly recommend using VS Code and Docker for this project; it can save you a lot of time! Note that the related configuration is already provided in .devcontainer. More details can be found in this Tutorial_for_Vscode&Dokcer.
The environment:
- CUDA: 11.8.0
- cuDNN: 8
- Python: 3.10
- PyTorch: 2.1.0
- PyTorch Lightning: 2.1
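You can quickly check that your environment matches these versions with a short Python snippet (a minimal sketch; it only prints the installed versions):
```python
# Quick sanity check: print the versions this project expects.
import torch
import lightning

print("CUDA (built with):", torch.version.cuda)       # expect 11.8
print("cuDNN:", torch.backends.cudnn.version())       # expect 8.x
print("PyTorch:", torch.__version__)                   # expect 2.1.0
print("Lightning:", lightning.__version__)             # expect 2.1.x
```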
The related configurations are all saved in config/.
- `data_simu.yaml` is used to configure the data generation.
- `tcrnn.yaml` is used to configure the dataloader, model training, and testing.
You can change the values of these items based on your needs.
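For orientation, the data-generation switches that the commands below override typically sit under a DATA_SIMU block in data_simu.yaml. The layout below is an assumption; only the key names used on the command line in this README (TRAIN, TRAIN_NUM, DEV, TEST) are certain.
```yaml
# Illustrative sketch of data_simu.yaml -- layout is assumed, not authoritative.
DATA_SIMU:
  TRAIN: False        # set True (or override on the command line) to simulate training data
  TRAIN_NUM: 10000    # number of training samples to generate
  DEV: False          # set True to simulate validation data
  TEST: False         # set True to simulate test data
```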
Note: Do not forget to install gpuRIR and webrtcvad.
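Both packages can usually be installed with pip; something like the following typically works, but check the gpuRIR repository for the exact, CUDA-compatible install instructions:
```bash
# webrtcvad is on PyPI; gpuRIR is typically installed from its GitHub repository.
pip install webrtcvad
pip install https://github.com/DavidDiazGuerra/gpuRIR/zipball/master
```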
- Data Generation
Generate the training data:
```
python data_simu.py DATA_SIMU.TRAIN=True DATA_SIMU.TRAIN_NUM=10000
```
In the same way, you can generate the validation and test datasets by changing DATA_SIMU.TRAIN=True to DATA_SIMU.DEV=True or DATA_SIMU.TEST=True.
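For example (these commands simply mirror the training command above with the flag changed; add a sample-count override analogous to DATA_SIMU.TRAIN_NUM if your config provides one):
```bash
# Generate validation and test data by switching the DATA_SIMU flag.
python data_simu.py DATA_SIMU.DEV=True
python data_simu.py DATA_SIMU.TEST=True
```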
- Model Training
```
python main_crnn.py fit --config /workspaces/tssl/config/tcrnn.yaml
```
The parameter passed to --config should point to your config file path.
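Since main_crnn.py exposes fit/test subcommands, it is presumably built on the PyTorch Lightning CLI, in which case individual config values can also be overridden directly on the command line. The override keys below are assumptions based on a standard LightningCLI setup:
```bash
# Override selected trainer options without editing the YAML
# (assumes main_crnn.py uses LightningCLI; adjust keys to your config).
python main_crnn.py fit --config /workspaces/tssl/config/tcrnn.yaml \
    --trainer.devices "0," --trainer.max_epochs 100
```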
- Model Evaluation
- Change the `ckpt_path` in `config/tcrnn.yaml` to the path of the trained model weights.
- Use multiple GPUs or a single GPU to test the model performance.
```
python main_crnn.py test --config /workspaces/tssl/config/tcrnn.yaml
```
If you want to evaluate the model on a single GPU, change the value of devices from "0,1" to "0," in config/tcrnn.yaml.
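The relevant part of config/tcrnn.yaml for single-GPU evaluation could look like the sketch below; the surrounding structure is an assumption, only the devices value and ckpt_path are taken from this README.
```yaml
# Illustrative excerpt for evaluation -- surrounding structure is assumed.
trainer:
  devices: "0,"                          # "0,1" for two GPUs, "0," for a single GPU
ckpt_path: /path/to/trained_model.ckpt   # trained model weights to evaluate
```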
If you find our work useful in your research, please consider citing:
This repository adapts and integrates code from the following excellent works: