SE-FFTNet-Docker

Docker-wrapped version of SE-FFTNet

SE-FFTNet in a handy Docker container

This is a small repo that makes it easier to use SE-FFTNet to clean up audio samples. It wraps the model in Docker, takes noisy WAV files as input and writes cleaned-up WAVs as output.

Requirements:

  • Linux (the only supported platform until CUDA for WSL2 is released)
  • Docker & Nvidia-docker (tested on 19.03.6)

The container follows the requirements of SE-FFTNet, so it is based on the tensorflow/tensorflow:1.14.0-gpu-py3 image.

Note that this setup uses the model provided in the SE-FFTNet repo.

Input and Output

Create a local directory for output (called output here) and, inside it, a directory called input. It should look something like this:

output/
├── input/
│   ├── input_file_1.wav
│   ├── input_file_2.wav
│   ├── input_file_3.wav

If you are happy to mount a directory from your clone of this repo, you could use, for example, /path/to/repo/se-fftnet_output/.
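The layout above can also be prepared with a few lines of Python. This is a minimal sketch and not part of the repo: the helper name prepare_layout and any paths you pass to it are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def prepare_layout(output_dir, wav_files):
    """Create output_dir/input and copy the given WAV files into it.

    Hypothetical helper for illustration; it is not shipped with this repo.
    """
    input_dir = Path(output_dir) / "input"
    input_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for wav in map(Path, wav_files):
        if wav.suffix.lower() != ".wav":
            continue  # skip anything that is not a WAV file
        # shutil.copy returns the destination path
        copied.append(Path(shutil.copy(wav, input_dir / wav.name)))
    return copied
```

Calling prepare_layout("output", ["noisy_1.wav", "noisy_2.wav"]) would create output/input/ and place copies of both files there, ready to be mounted into the container.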

Building and Running the Container

You can build with:

docker build -t se-fftnet:<version number> -f build_tf.dockerfile .

Put the files you wish to convert in input, then run the container, mounting your local output directory as below:

docker container run --runtime=nvidia -v <local path>:/se-fftnet/output/ se-fftnet:<version number>
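If you would rather drive the build and run steps from a script, the two commands above can be assembled in Python. This is only a sketch: the function names are hypothetical, and the flags are copied from the commands shown in this README rather than from any other source.

```python
import subprocess
from pathlib import Path


def docker_build_command(version):
    """Argument list for the `docker build` call shown in this README."""
    return ["docker", "build", "-t", f"se-fftnet:{version}",
            "-f", "build_tf.dockerfile", "."]


def docker_run_command(local_output, version):
    """Argument list for the `docker container run` call shown in this README."""
    # Resolve to an absolute path so that -v is treated as a bind mount
    # rather than as the name of a Docker volume.
    host_path = Path(local_output).resolve()
    return ["docker", "container", "run", "--runtime=nvidia",
            "-v", f"{host_path}:/se-fftnet/output/",
            f"se-fftnet:{version}"]


def build_and_run(local_output, version):
    """Build the image from the repo root, then process the mounted files."""
    subprocess.run(docker_build_command(version), check=True)
    subprocess.run(docker_run_command(local_output, version), check=True)
```

For example, build_and_run("output", "0.1") would build se-fftnet:0.1 and then run it against ./output, where "0.1" is a placeholder version number. Run it from the root of your clone so that build_tf.dockerfile is found.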

This will run run_sefftnet.py, which processes all files in input. The Python script replaces generate.sh from the SE-FFTNet repo to better handle directories.

There is not much in run_sefftnet.py to change apart from:

  • the path of config.json inside the container - though this should not need changing even if you want to edit the config.
  • the model ID, which should not need changing unless you have your own model files.

Changing the Config

SE-FFTNet comes with a prebuilt config file called config.json, which you can find in the SE-FFTNet-tensorflow-implemenatation repo, under config. To keep that repo as a submodule and avoid changing its files, I have created a local config, which Docker uses to overwrite the default config on build. You can find this here. Some useful things you might wish to change are:

  • data_dir - sets the base directory that the model takes input from
  • test_noisy_audio_dir - the directory within the base directory, where the model expects to find WAVs
  • base_dir - the base directory for the model
  • output_dir - where the model outputs clean WAVs
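As an illustration, the settings listed above could be edited programmatically before rebuilding the image. This is a sketch under two assumptions that are not confirmed by this README: that the four keys sit at the top level of the JSON file, and that you pass in the path of the local config yourself. The helper name update_config is hypothetical.

```python
import json
from pathlib import Path

# The four settings named in this README. Assumption: they are top-level keys.
EDITABLE_KEYS = {"data_dir", "test_noisy_audio_dir", "base_dir", "output_dir"}


def update_config(config_path, **overrides):
    """Rewrite selected top-level keys of a JSON config file in place."""
    unknown = set(overrides) - EDITABLE_KEYS
    if unknown:
        raise KeyError(f"unexpected config keys: {sorted(unknown)}")
    path = Path(config_path)
    config = json.loads(path.read_text())
    config.update(overrides)  # all other settings are left untouched
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Because the local config is copied into the image on build, any change made this way only takes effect after the image has been rebuilt.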