runngezhang's Stars
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
kska32/ebooks
A collection of classic e-books on history, politics, psychology, philosophy, mathematics, and computer science (about 100,000 books)
philipperemy/keras-tcn
Keras Temporal Convolutional Network.
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
haoheliu/versatile_audio_super_resolution
Versatile audio super resolution (any -> 48kHz) with AudioSR.
cpuimage/WebRTC_NS
Noise Suppression Module Port From WebRTC
maum-ai/nuwave2
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates @ INTERSPEECH 2022
slp-rl/aero
This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)
maggie0830/DCCRN
PyTorch implementation of "DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement"
sp-uhh/storm
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
sj-li/MS-TCN2
MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation (TPAMI 2020)
fakufaku/fast_bss_eval
A fast implementation of bss_eval metrics for blind source separation
google-research/seanet
fakufaku/diffusion-separation
Single channel speech source separation by diffusion process (ICASSP 2023)
chomeyama/DualCycleGAN
Official implementation of DualCycleGAN for nonparallel audio super resolution
sp-uhh/deep-non-linear-filter
slp-rl/SC-PhASE
This repo contains the official PyTorch implementation of "A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement" (Interspeech 2022)
Hadryan/TFNet-for-Environmental-Sound-Classification
Learning discriminative and robust time-frequency representations for environmental sound classification: Convolutional neural networks (CNN) are one of the best-performing neural network architectures for environmental sound classification (ESC). Recently, attention mechanisms have been used in CNNs to capture the useful information from the audio signal for sound classification, especially for weakly labelled data where the training data provides sound class labels but no timing information about the acoustic events. In these methods, however, the inherent time-frequency characteristics and variations are not explicitly exploited when obtaining the deep features. In this paper, we propose a new method, called the time-frequency enhancement block (TFBlock), in which temporal attention and frequency attention are employed to enhance the features from relevant frames and frequency bands. Compared with other attention mechanisms, our method constructs parallel branches that allow the temporal and frequency features to be attended to separately, in order to mitigate interference from the sections where no sound events occur in the acoustic environments. The experiments on three benchmark ESC datasets show that our method improves the classification performance and also exhibits robustness to noise.
moodoki/tfnet
zeroone-universe/AECNN_for_Speech_Enhancement
Unofficial PyTorch Lightning implementation of "A New Framework for CNN-Based Speech Enhancement in the Time Domain"
tan90xx/audio-super-resolution-tf
https://tan90xx.github.io/SR-display.github.io/
zeroone-universe/TowardsRobustSpeechSR
Unofficial PyTorch Lightning implementation of "Towards Robust Speech Super-Resolution"
BerlinerA/DSVAE-NES
This repository contains the official PyTorch implementation of the paper: "Learning Discrete Structured VAE using NES".
nicolas-dufour/self-supervised-low-res-speech
This project transfers the self-supervised Wav2vec2 representation to low-resource languages
maggie0830/WebRTC_NS
Noise Suppression Module Port From WebRTC
zeroone-universe/BinauralEffectSimulator
andreeavoicu19/Music-Recommender-System
Based on sound processing and audio feature extraction
zeroone-universe/AdaSpeech
An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"
zeroone-universe/GM4MNIST
zeroone-universe/SRGAN
Unofficial PyTorch Lightning implementation of SRGAN