SE-FFTNet in a handy Docker container

This is a small repo that makes it easier to use SE-FFTNet to clean up audio samples. It wraps the model in Docker, takes noisy WAV files as input, and outputs cleaned-up WAVs.

Requirements:

Linux (currently, until CUDA for WSL2 is released)
Docker & Nvidia-docker (tested on 19.03.6)

The container follows the requirements of SE-FFTNet, so it is based on tensorflow/tensorflow:1.14.0-gpu-py3.

Note that this setup uses the model provided in the SE-FFTNet repo.

Input and Output

Set up a local directory for output (called output here) and inside that, put a directory called input. It should look something like this:

output/
├── input/
│   ├── input_file_1.wav
│   ├── input_file_2.wav
│   ├── input_file_3.wav

For example, if you are happy to mount a directory from your clone of this repo, you could use /path/to/repo/se-fftnet_output/.
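The layout above can be prepared and checked before starting the container. This is only a minimal sketch: the function name prepare_output_dir is made up for illustration and is not part of this repo or of SE-FFTNet.

```python
from pathlib import Path


def prepare_output_dir(root):
    """Create <root>/input if it is missing and list the WAV files inside it.

    Hypothetical helper, not part of this repo: it only mirrors the
    directory layout described above.
    """
    input_dir = Path(root) / "input"
    # parents=True also creates <root> itself when it does not exist yet.
    input_dir.mkdir(parents=True, exist_ok=True)
    # Only files ending in .wav are returned; everything else is ignored.
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix == ".wav"
    )
```

An empty list means there is nothing in input for the container to process yet.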

Building and Running the Container

You can build with:

docker build -t se-fftnet:<version number> -f build_tf.dockerfile .

Put the files you wish to convert in input, then run the container, mounting your local output directory as below:

docker container run --runtime=nvidia -v <local path>:/se-fftnet/output/ se-fftnet:<version number>
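If you would rather launch the container from a script, the same command can be assembled in Python. This is a sketch under assumptions: docker_run_command is a hypothetical helper, and the version tag in the usage comment is a placeholder. The host path is resolved to an absolute path because Docker treats a relative name passed to -v as a named volume rather than a directory.

```python
from pathlib import Path


def docker_run_command(local_path, version):
    """Build the argument list for the docker run command shown above.

    Hypothetical helper: the flags and the container path are the ones
    used in this README, everything else is illustrative.
    """
    # -v needs an absolute host path to create a bind mount.
    host_dir = Path(local_path).resolve()
    return [
        "docker", "container", "run",
        "--runtime=nvidia",
        "-v", f"{host_dir}:/se-fftnet/output/",
        f"se-fftnet:{version}",
    ]


# Usage (needs Docker and nvidia-docker on the host):
#   import subprocess
#   subprocess.run(docker_run_command("output", "0.1"), check=True)
```

Passing the command as a list avoids shell quoting problems when the local path contains spaces.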

This will run run_sefftnet.py, which will process all files in input. The Python script replaces generate.sh from the SE-FFTNet repo to better handle directories.

There is not much to change in run_sefftnet.py apart from:

the path of config.json inside the container, though this should not need changing even if you want to edit the config.
the model ID, which should not need changing unless you have your own model files.

Changing the Config

SE-FFTNet comes with a prebuilt config file called config.json, which you can find in the SE-FFTNet-tensorflow-implemenatation repo, under config. To keep that repo as a submodule and avoid changing its files, I have created a local config which Docker uses to overwrite the default config on build. You can find the local config in this repo. Some useful things you might wish to change are:

data_dir - sets the base directory that the model takes input from
test_noisy_audio_dir - the directory within the base directory, where the model expects to find WAVs
base_dir - the base directory for the model
output_dir - where the model outputs clean WAVs
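The settings listed above can also be changed from a script. The sketch below assumes that config.json is a JSON object with these four keys at the top level, which has not been checked against the real file; update_config and the path in the usage comment are hypothetical.

```python
import json
from pathlib import Path

# The four settings named in this README.
CONFIG_KEYS = ("data_dir", "test_noisy_audio_dir", "base_dir", "output_dir")


def update_config(config_path, **overrides):
    """Rewrite selected keys in a JSON config file and return the new settings.

    Assumes the keys sit at the top level of the JSON object.
    """
    unknown = set(overrides) - set(CONFIG_KEYS)
    if unknown:
        raise KeyError(f"unexpected config keys: {sorted(unknown)}")
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Keys that are not overridden keep their existing values.
    config.update(overrides)
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config


# Usage (the path is a placeholder for wherever the local config lives):
#   update_config("path/to/config.json", output_dir="output")
```

Because Docker copies the local config into the image on build, the image needs rebuilding before an edit takes effect.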

tangohead/SE-FFTNet-Docker