This repository contains configurations of the following speech enhancement (SE) models at different model sizes, as used in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement":
- Conv-TasNet
- DEMUCS-v4
- BSRNN
- TF-GridNet
The configurations are based on the ESPnet toolkit, and our implementations of the above models are provided in ./espnet2/ for reference.

| Model | Causal | #Params (M) | #MACs (G/s) at 16 kHz | #MACs (G/s) at 48 kHz | Config file | Model link |
|---|---|---|---|---|---|---|
| **BSRNN** (sampling-frequency-independent) | | | | | | |
| xtiny | ✔︎ | 0.5 | 0.1 | 0.4 | conf/bsrnn_xtiny.yaml | |
| xtiny | ✘ | 0.5 | 0.2 | 0.6 | conf/bsrnn_xtiny_noncausal.yaml | |
| tiny | ✔︎ | 1.3 | 0.6 | 1.7 | conf/bsrnn_tiny.yaml | |
| tiny | ✘ | 1.5 | 0.7 | 2.2 | conf/bsrnn_tiny_noncausal.yaml | |
| small | ✔︎ | 4.1 | 2.1 | 6.4 | conf/bsrnn_small.yaml | |
| small | ✘ | 4.8 | 2.8 | 8.5 | conf/bsrnn_small_noncausal.yaml | |
| medium | ✔︎ | 14.3 | 8.4 | 25.2 | conf/bsrnn_medium.yaml | |
| medium | ✘ | 16.9 | 11.2 | 33.4 | conf/bsrnn_medium_noncausal.yaml | |
| large | ✔︎ | 52.9 | 33.4 | 99.9 | conf/bsrnn_large.yaml | |
| large | ✘ | 63.1 | 44.3 | 132.5 | conf/bsrnn_large_noncausal.yaml | |
| xlarge | ✔︎ | 83.6 | 66.1 | 197.7 | conf/bsrnn_large_double.yaml | |
| xlarge | ✘ | 104.1 | 87.9 | 262.3 | conf/bsrnn_large_double_noncausal.yaml | |
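
As a quick sanity check on the BSRNN numbers, the 48 kHz MAC counts come out at roughly three times the 16 kHz counts, which matches the ratio of the two sampling rates. The sketch below recomputes that ratio from the values listed in the table. The names `bsrnn_macs` and `mac_ratio` are illustrative only and are not part of this repository. The smallest models deviate the most, which is plausibly an effect of the table values being rounded to one decimal place.

```python
# GMACs/s copied from the BSRNN table above: (size, causal) -> (16 kHz, 48 kHz).
bsrnn_macs = {
    ("xtiny", True): (0.1, 0.4),
    ("xtiny", False): (0.2, 0.6),
    ("tiny", True): (0.6, 1.7),
    ("tiny", False): (0.7, 2.2),
    ("small", True): (2.1, 6.4),
    ("small", False): (2.8, 8.5),
    ("medium", True): (8.4, 25.2),
    ("medium", False): (11.2, 33.4),
    ("large", True): (33.4, 99.9),
    ("large", False): (44.3, 132.5),
    ("xlarge", True): (66.1, 197.7),
    ("xlarge", False): (87.9, 262.3),
}


def mac_ratio(macs_16k: float, macs_48k: float) -> float:
    """Return how many times more MACs/s the 48 kHz setting needs."""
    return macs_48k / macs_16k


# Ratio per configuration, rounded to two decimals for readability.
ratios = {key: round(mac_ratio(*vals), 2) for key, vals in bsrnn_macs.items()}

for (size, causal), ratio in ratios.items():
    label = "causal" if causal else "non-causal"
    print(f"{size:>6} ({label}): {ratio:.2f}x")
```
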

| Model | Causal | #Params (M) | #MACs (G/s) at 16 kHz | #MACs (G/s) at 48 kHz | Config file | Model link |
|---|---|---|---|---|---|---|
| **Conv-TasNet** (input is always resampled to 48 kHz) | | | | | | |
| small | ✘ | 1.1 | - | 8.9 | conf/conv_tasnet_small.yaml | |
| medium | ✘ | 14.3 | - | 18.7 | conf/conv_tasnet_medium.yaml | |
| large | ✘ | 52.6 | - | 47.2 | conf/conv_tasnet_large.yaml | |
| xlarge | ✘ | 103.9 | - | 85.4 | conf/conv_tasnet_xlarge.yaml | |
| **DEMUCS-v4** (input is always resampled to 48 kHz) | | | | | | |
| tiny | ✘ | 1.0 | - | 1.0 | conf/demucsv4_tiny.yaml | |
| small | ✘ | 4.1 | - | 3.5 | conf/demucsv4_small.yaml | |
| medium | ✘ | 16.2 | - | 13.0 | conf/demucsv4_medium.yaml | |
| large | ✘ | 26.9 | - | 17.2 | conf/demucsv4_large.yaml | |
| xlarge | ✘ | 79.3 | - | 40.7 | conf/demucsv4_xlarge.yaml | |
| **TF-GridNet** (sampling-frequency-independent) | | | | | | |
| xxtiny | ✘ | 0.1 | 1.9 | 5.6 | conf/tfgridnet_xxtiny.yaml | |
| xtiny | ✘ | 0.5 | 7.4 | 21.7 | conf/tfgridnet_xtiny.yaml | |
| tiny | ✘ | 1.5 | 24.1 | 70.5 | conf/tfgridnet_tiny.yaml | |
| small | ✘ | 5.7 | 89.5 | 261.8 | conf/tfgridnet_small.yaml | |
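
The config file names listed in the tables follow a mostly regular pattern, with one exception: the BSRNN xlarge entries use the stem `bsrnn_large_double`. The helper below is a hypothetical convenience sketch for looking up a path by model, size, and causality. The function `config_path` and the `SIZES` mapping are assumptions made for illustration and do not exist in this repository. The paths it returns are exactly those listed in the tables.

```python
# Sizes listed in the tables for each model family (keys mirror the file name prefixes).
SIZES = {
    "bsrnn": ["xtiny", "tiny", "small", "medium", "large", "xlarge"],
    "conv_tasnet": ["small", "medium", "large", "xlarge"],
    "demucsv4": ["tiny", "small", "medium", "large", "xlarge"],
    "tfgridnet": ["xxtiny", "xtiny", "tiny", "small"],
}


def config_path(model: str, size: str, causal: bool = False) -> str:
    """Return the config path listed in the tables (hypothetical helper)."""
    if model not in SIZES or size not in SIZES[model]:
        raise ValueError(f"no config listed for {model!r} with size {size!r}")
    if model == "bsrnn":
        # The xlarge BSRNN configs are stored under the "large_double" stem.
        stem = "large_double" if size == "xlarge" else size
        suffix = "" if causal else "_noncausal"
        return f"conf/bsrnn_{stem}{suffix}.yaml"
    if causal:
        # Only BSRNN has causal variants in the tables.
        raise ValueError(f"no causal config listed for {model!r}")
    return f"conf/{model}_{size}.yaml"


print(config_path("bsrnn", "xlarge", causal=True))
print(config_path("tfgridnet", "xxtiny"))
```
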