rishikksh20
Generative AI Models | Deep Learning Researcher | Open Source enthusiast | Text to Speech | Speech Synthesis | Object detection | Computer Vision
Dubpro.ai | New Delhi, India
Pinned Repositories
convolution-vision-transformers
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers
CrossViT-pytorch
Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
FastSpeech2
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
MLP-Mixer-pytorch
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision
ResUnet
PyTorch implementation of ResUnet and ResUnet++
ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
rishikksh20's Repositories
rishikksh20/ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
rishikksh20/VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
rishikksh20/FastSpeech2
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
rishikksh20/iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
rishikksh20/AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
rishikksh20/HiFiplusplus-pytorch
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement
rishikksh20/Avocodo-pytorch
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
rishikksh20/SoundStorm-pytorch
Google's SoundStorm: Efficient Parallel Audio Generation
rishikksh20/Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
rishikksh20/vae_tacotron2
VAE Tacotron 2, an alternative to GST Tacotron
rishikksh20/HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
rishikksh20/LightSpeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
rishikksh20/AdaSpeech2
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
rishikksh20/UnivNet-pytorch
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
rishikksh20/NaturalSpeech2
rishikksh20/AudioMAE-pytorch
Unofficial PyTorch implementation of Masked Autoencoders that Listen
rishikksh20/Liveness-Detection
Liveness Detection for human faces
rishikksh20/gmvae_tacotron
Gaussian Mixture VAE Tacotron
rishikksh20/iSTFT-Avocodo-pytorch
Ultrafast GAN-based Vocoder for Text to Speech
rishikksh20/Phone-Level-Mixture-Density-Network-for-TTS
Rich Prosody Diversity Modelling with Phone-level Mixture Density Network
rishikksh20/Zero-Shot-TTS
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
rishikksh20/NU-Wave2-pytorch
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]
rishikksh20/Bidirectional-LEM-pytorch
PyTorch Implementation of Bidirectional Long Expressive Memory
rishikksh20/WaveFlow
WaveFlow: A Compact Flow-based Model for Raw Audio
rishikksh20/ai-audio-startups
Community list of startups working with AI in audio and music technology
rishikksh20/Inception-Transformer-pytorch
iFormer: Inception Transformer
rishikksh20/rishikksh20
rishikksh20/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
rishikksh20/ahmetfurkaann
rishikksh20/PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions