shreeshailgan

Pinned Repositories

phonemizer
Simple text to phones converter for multiple languages
Language: Python 1.3k 23 154175
voxceleb_trainer
In defence of metric learning for speaker recognition
Language: Python 1.1k 30 174274
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Language: Python 1.9k 28 220543
Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
Language: Python 1.4k 36 724252
python-audio-separator
Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)
Language: Python 558 11 11393
NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python 12.5k 210 2.3k 2.6k
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python 8k 76 226601
Speech-Editing-Toolkit
It's a repository for implementations of neural speech editing algorithms.
Language: Python 192 9 2419

shreeshailgan doesn’t have any repositories yet.