SiddGururani's Stars
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
adamian98/pulse
PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
belangeo/pyo
Python DSP module
philipperemy/deep-speaker
Deep Speaker: an End-to-End Neural Speaker Embedding System.
aliutkus/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
NVIDIA/mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
spotify/klio
Smarter data pipelines for audio.
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
auspicious3000/SpeechSplit
Unsupervised Speech Decomposition Via Triple Information Bottleneck
kamenbliznashki/normalizing_flows
PyTorch implementations of density estimation algorithms: BNAF, Glow, MAF, RealNVP, planar flows
liusongxiang/StarGAN-Voice-Conversion
A PyTorch implementation of the paper "StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks"
rosinality/glow-pytorch
PyTorch implementation of Glow
ivanvovk/WaveGrad
Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.
pranaymanocha/PerceptualAudio
Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM
vBaiCai/python-pesq
A Python package for calculating PESQ.
GuitarML/SmartGuitarPedal
Guitar plugin made with JUCE that uses neural network models to emulate real-world hardware.
huyanxin/phasen
An unofficial PyTorch implementation of Microsoft's PHASEN
craigmacartney/Wave-U-Net-For-Speech-Enhancement
Improved speech enhancement with the Wave-U-Net, a deep convolutional neural network architecture for audio source separation, implemented for the task of speech enhancement in the time-domain.
yistLin/FragmentVC
Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention
asuni/wavelet_prosody_toolkit
acids-ircam/flow_synthesizer
Universal audio synthesizer control learning with normalizing flows
zomux/lanmt
LaNMT: Latent-variable Non-autoregressive Neural Machine Translation with Deterministic Inference
L0SG/NanoFlow
PyTorch implementation of the paper "NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity." (NeurIPS 2020)
adrienchaton/PerceptualAudio_Pytorch
PyTorch implementation of "A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences", Pranay Manocha et al. (unofficial work in progress)
russellgeum/Phase-aware-Deep-Complex-UNet
[Not Official] Implementation of DC-UNet, ICLR 2019
thuhcsi/icassp2021-emotion-tts
Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/
ViEm-ccy/GEDLoss_pytorch
A PyTorch implementation of Google GEDLoss
hifi-gan/code01