rishikksh20

Dubpro.ai
New Delhi, India

Pinned Repositories

convolution-vision-transformers
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers
Language: Python 216 7 634
CrossViT-pytorch
Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Language: Python 169 2 218
FastSpeech2
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Language: Jupyter Notebook 211 10 1251
FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
Language: Python 247 5 536
hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Language: Python 191 10 942
iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
Language: Python 208 10 1545
MLP-Mixer-pytorch
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision
Language: Python 206 2 328
ResUnet
PyTorch implementation of ResUnet and ResUnet++
Language: Python 411 3 1065
ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
Language: Python 454 8 960
VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Language: Python 318 13 1860

rishikksh20's Repositories

rishikksh20/ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
Language: Python 454 8 960
rishikksh20/VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Language: Python 318 13 1860
rishikksh20/FastSpeech2
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Language: Jupyter Notebook 211 10 1251
rishikksh20/iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
Language: Python 208 10 1545
rishikksh20/AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
Language: Jupyter Notebook 155 7 1140
rishikksh20/HiFiplusplus-pytorch
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement
Language: Python 143 12 618
rishikksh20/Avocodo-pytorch
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Language: Python 114 15 415
rishikksh20/SoundStorm-pytorch
Google's SoundStorm: Efficient Parallel Audio Generation
Language: Python 113 17 512
rishikksh20/Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Language: Python 99 7 932
rishikksh20/vae_tacotron2
VAE Tacotron 2, an alternative to GST Tacotron
Language: Python 85 7 929
rishikksh20/HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python 77 7 723
rishikksh20/LightSpeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Language: Python 77 9 57
rishikksh20/AdaSpeech2
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
Language: Jupyter Notebook 69 9 019
rishikksh20/UnivNet-pytorch
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Language: Python 68 6 410
rishikksh20/NaturalSpeech2
Language: Python 65 13 03
rishikksh20/AudioMAE-pytorch
Unofficial PyTorch implementation of Masked Autoencoders that Listen
Language: Python 60 4 26
rishikksh20/Liveness-Detection
Liveness Detection for human faces
Language: Python 52 4 017
rishikksh20/gmvae_tacotron
Gaussian Mixture VAE Tacotron
Language: Python 51 6 312
rishikksh20/iSTFT-Avocodo-pytorch
Ultrafast GAN-based Vocoder for Text to Speech
Language: Python 51 6 27
rishikksh20/Phone-Level-Mixture-Density-Network-for-TTS
Rich Prosody Diversity Modelling with Phone-level Mixture Density Network
Language: Jupyter Notebook 45 5 16
rishikksh20/Zero-Shot-TTS
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Language: Python 34 4 23
rishikksh20/NU-Wave2-pytorch
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]
Language: Python 24 6 03
rishikksh20/Bidirectional-LEM-pytorch
PyTorch Implementation of Bidirectional Long Expressive Memory
Language: Python 9 2 01
rishikksh20/WaveFlow
WaveFlow: A Compact Flow-based Model for Raw Audio
Language: Python 4 2 02
rishikksh20/ai-audio-startups
Community list of startups working with AI in audio and music technology
3 1 0
rishikksh20/Inception-Transformer-pytorch
iFormer: Inception Transformer
1 2 1
rishikksh20/rishikksh20
0 3 62
rishikksh20/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Language: Python 0 0
rishikksh20/ahmetfurkaann
1 0
rishikksh20/PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
Language: Python 0 0