Pinned Repositories
academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
agri_crop_prediction
asteroid
The PyTorch-based audio source separation toolkit for researchers || Pretrained models available
Audio-and-text-based-emotion-recognition
A multimodal approach to emotion recognition using audio and text.
awesome-kaldi
A list of features, scripts, blogs, and resources for making better use of Kaldi (http://kaldi-asr.org/)
cuddly-garbanzo
deepspeech-ASR
Mozilla DeepSpeech automatic speech recognition system
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
fastspeech2_custom
image-background-changer
Change the background of an image using semantic segmentation
raikarsagar's Repositories
raikarsagar/image-background-changer
Change the background of an image using semantic segmentation
raikarsagar/vocode-react-sdk
raikarsagar/academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
raikarsagar/agri_crop_prediction
raikarsagar/asteroid
The PyTorch-based audio source separation toolkit for researchers || Pretrained models available
raikarsagar/Audio-and-text-based-emotion-recognition
A multimodal approach to emotion recognition using audio and text.
raikarsagar/awesome-kaldi
A list of features, scripts, blogs, and resources for making better use of Kaldi (http://kaldi-asr.org/)
raikarsagar/cuddly-garbanzo
raikarsagar/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
raikarsagar/fastspeech2_custom
raikarsagar/flowtron
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
raikarsagar/From-0-to-Research-Scientist-resources-guide
A detailed, tailored guide for undergraduate students or anybody who wants to dig deep into the field of AI with a solid foundation.
raikarsagar/FullSubNet
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
raikarsagar/GradTTS
PyTorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
raikarsagar/ml-with-audio
HF's ML for Audio study group
raikarsagar/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
raikarsagar/multimodal-speech-emotion-recognition
Lightweight and Interpretable ML Model for Speech Emotion Recognition and Ambiguity Resolution (trained on IEMOCAP dataset)
raikarsagar/NeMo
NeMo: a toolkit for conversational AI
raikarsagar/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, and speaker embedding
raikarsagar/pyloudnorm
Flexible audio loudness meter in Python with an implementation of the ITU-R BS.1770-4 loudness algorithm
raikarsagar/raikarsagar.github.io
Website
raikarsagar/speechbrain
A PyTorch-based Speech Toolkit
raikarsagar/SpeechDenoisingWithDeepFeatureLosses
Speech Denoising with Deep Feature Losses
raikarsagar/spyder
Simple Python package for fast DER computation
raikarsagar/svrinfo_proj
raikarsagar/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
raikarsagar/tacotron2_inference
NVIDIA tacotron2 repo with custom inference scripts
raikarsagar/tacotron2_waveglow
raikarsagar/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E, WIP
raikarsagar/waveglow
A Flow-based Generative Network for Speech Synthesis