Show example json to run ORPIT

Question

hangtingchen opened this issue 4 years ago · 2 comments

Hi,
First, thanks for your work on the ORPIT example, which is the only one I could find on GitHub.
The code is complex and I am in a hurry to run it. Would it be possible to show a few lines of the dataset JSON files? And what command should I run to include both the 2- and 3-speaker conditions?

Answer 1 · 2020-09-07T12:21:36.000Z

Hi!

I agree that the code is a little (too) complex. It was part of a bigger project and had to support complex use-cases there. The hardest part in this code is support for variable-length sequences, which is not required for simple source separation.

To create the JSON file for WSJ0-2mix and WSJ0-3mix data, you can have a look at the files in padertorch/padertorch/contrib/data/wsj0_mix/. In general, the JSON structure has to look like this:

{
    "datasets": {
        "<dataset_name>": {
            "<example_id>": {
                "audio_path": {
                    "observation": "<path/to/observation.wav>",
                    "speech_source": [
                        "path/to/first/speech/source.wav",
                        "path/to/second/speech/source.wav",
                        "... (for three speakers)"
                    ]
                }
            },
            "...": {}
        }
    }
}
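If it helps, here is a minimal Python sketch that generates a file with this layout. The dataset names, example IDs, and paths are hypothetical placeholders, and the helper functions are not part of padertorch; the scripts in the wsj0_mix directory mentioned above remain the reference for the real data.

```python
import json


def make_example(observation, speech_sources):
    """Return one example entry in the layout described above."""
    return {
        "audio_path": {
            "observation": str(observation),
            "speech_source": [str(path) for path in speech_sources],
        }
    }


def build_database(datasets):
    """Build the top-level dict.

    `datasets` maps a dataset name to a dict of
    example_id -> (observation_path, [source_paths]).
    """
    return {
        "datasets": {
            name: {
                example_id: make_example(observation, sources)
                for example_id, (observation, sources) in examples.items()
            }
            for name, examples in datasets.items()
        }
    }


# Hypothetical names and paths, for illustration only.
database = build_database({
    "two_spk_train": {
        "example_0001": (
            "/data/2mix/mix/example_0001.wav",
            ["/data/2mix/s1/example_0001.wav",
             "/data/2mix/s2/example_0001.wav"],
        ),
    },
    "three_spk_train": {
        "example_0001": (
            "/data/3mix/mix/example_0001.wav",
            ["/data/3mix/s1/example_0001.wav",
             "/data/3mix/s2/example_0001.wav",
             "/data/3mix/s3/example_0001.wav"],
        ),
    },
})

# Print the JSON text; write it to a file with json.dump if preferred.
print(json.dumps(database, indent=4))
```

Two-speaker examples list two source paths and three-speaker examples list three, so the same structure covers both conditions.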

To run the training on both two- and three-speaker mixtures, you have to include all datasets in the JSON file (or provide multiple JSON files) and set train_datasets to contain all datasets you want to train on. For WSJ0-2mix, the (simplest) command to run this (after installing padertorch and its dependencies) is

$ python -m padertorch.contrib.examples.source_separation.or_pit.train with database_json=/path/to/the/json(s) trainer.storage_dir=/path/to/store/experiment/checkpoints

This will run the training without fine-tuning (so, it only performs one iteration). If you want to perform fine-tuning, you can run

$ python -m padertorch.contrib.examples.source_separation.or_pit.train with database_json=/path/to/the/json(s) trainer.storage_dir=/path/to/store/experiment/checkpoints trainer.model.finetune=True load_model_from=/path/to/the/pre-trained/model
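If you would rather keep everything in a single JSON file, the two- and three-speaker files can be combined by merging their "datasets" entries. This is only a sketch, assuming both inputs follow the structure shown earlier in this answer; the `merge_databases` helper and the dataset names are hypothetical and not part of padertorch.

```python
import json


def merge_databases(*databases):
    """Merge the "datasets" entries of several database dicts into one."""
    merged = {"datasets": {}}
    for database in databases:
        for name, examples in database["datasets"].items():
            if name in merged["datasets"]:
                # Refuse to silently overwrite a dataset with the same name.
                raise ValueError(f"duplicate dataset name: {name}")
            merged["datasets"][name] = examples
    return merged


# Stand-ins for the contents of two JSON files (hypothetical names and paths).
# In practice these would come from json.load on the respective files.
two_spk = {"datasets": {"two_spk_train": {"example_0001": {"audio_path": {
    "observation": "/data/2mix/mix/example_0001.wav",
    "speech_source": ["/data/2mix/s1/example_0001.wav",
                      "/data/2mix/s2/example_0001.wav"],
}}}}}
three_spk = {"datasets": {"three_spk_train": {"example_0001": {"audio_path": {
    "observation": "/data/3mix/mix/example_0001.wav",
    "speech_source": ["/data/3mix/s1/example_0001.wav",
                      "/data/3mix/s2/example_0001.wav",
                      "/data/3mix/s3/example_0001.wav"],
}}}}}

merged = merge_databases(two_spk, three_spk)
print(json.dumps(merged, indent=4))
```

The dataset names in the merged file are the ones you would then list in train_datasets.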

I hope this helps. Let me know if you run into any trouble.

Answer 2 · 2020-09-07T13:51:38.000Z

Thanks a lot. We can run this!