Perceptual Contrast Stretching on Target Feature for Speech Enhancement

This repo is only dedicated to the post-processing PCS.

Contents

Introduction
PCS-tools
Scoring-tools
Online Post-processing PCS Demo
Citation
References

Update (May 11, 2024):

For speech enhancement (SE) systems that use a 400-sample window in the Short-Time Fourier Transform (STFT), we recommend using PCS400 instead of PCS. This avoids distortion caused by a mismatch between the STFT settings of the SE system and those of PCS.

Introduction

"PCS is derived based on the critical band importance function and applied to modify the targets of the SE model."
"It can also be used as a post-processing (PP) method to further sharpen the structure of enhanced speech and suppress residual noise."

More details can be found in the paper: http://arxiv.org/abs/2203.17152 (arXiv preprint; accepted by INTERSPEECH 2022)
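As a rough illustration of the post-processing idea, the sketch below scales each frequency bin of a log-compressed magnitude spectrogram by a per-bin weight and then undoes the compression. It assumes the weights are applied to `log1p` of the magnitude, and the function name `stretch_magnitude` and the weight values are hypothetical placeholders, not the coefficients shipped in this repo. The STFT before this step and the inverse STFT (with the original phase) after it are left out.

```python
import numpy as np


def stretch_magnitude(mag, band_weights):
    """Weight a log1p-compressed magnitude spectrogram per frequency bin.

    mag:          non-negative array of shape (n_bins, n_frames)
    band_weights: array of shape (n_bins,), one weight per frequency bin
    """
    mag = np.asarray(mag, dtype=float)
    w = np.asarray(band_weights, dtype=float)
    if mag.ndim != 2 or w.shape != (mag.shape[0],):
        raise ValueError("band_weights must have one entry per frequency bin")
    compressed = np.log1p(mag)            # compress the dynamic range
    stretched = w[:, None] * compressed   # scale each bin by its weight
    return np.expm1(stretched)            # return to the linear magnitude domain


# Toy spectrogram: 3 frequency bins, 2 frames.
mag = np.array([[0.5, 1.0],
                [2.0, 4.0],
                [0.1, 0.2]])
# Placeholder weights (assumption): only the middle bin is emphasized.
weights = np.array([1.0, 1.2, 1.0])
out = stretch_magnitude(mag, weights)
```

A weight of 1 leaves a bin unchanged, while a weight above 1 raises its magnitude, and more strongly for larger values, which is what increases the contrast within that bin.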

Enhanced audio is generated by several baseline models, and post-processing PCS is then applied to it.
The experimental results are as follows:

Some examples are shown below:

PCS-tools

The post-processing PCS tools are in the /PCS and /PCS400 folders.
Use them to post-process enhanced audio directly.
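The choice between the two folders follows from the STFT window of the SE system: PCS400 is recommended when that window is 400 samples. The sketch below encodes this rule. The helper names are hypothetical, it assumes the FFT size equals the window length, and it assumes plain PCS is the default for every other window length.

```python
def n_freq_bins(n_fft):
    """Number of bins in a one-sided STFT of size n_fft."""
    return n_fft // 2 + 1


def choose_pcs_folder(win_length):
    """Pick the tool folder for a given STFT window length (in samples).

    Assumption: only the 400-sample case needs PCS400;
    everything else falls back to PCS.
    """
    return "PCS400" if win_length == 400 else "PCS"


for win in (400, 512):
    print(win, n_freq_bins(win), choose_pcs_folder(win))
```

Because a per-bin weight vector must have one entry per frequency bin, a vector built for one window length cannot be applied unchanged to a spectrogram computed with another.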

Scoring-tools

Speech metric scores were computed with the tools in /speech_metrics.

Online Post-processing PCS Demo

https://lojoffy-pcs-online-demo-main-luu0rc.streamlitapp.com/

Citation:

If you find the code useful in your research, please cite:

@article{chao2022perceptual,
  title={Perceptual Contrast Stretching on Target Feature for Speech Enhancement},
  author={Chao, Rong and Yu, Cheng and Fu, Szu-Wei and Lu, Xugang and Tsao, Yu},
  journal={Proc. of INTERSPEECH},
  year={2022}
}

Reference:

DPT-FSNet:

arXiv: https://arxiv.org/pdf/2104.13002.pdf
Reproduced and denoted as DPT*
