AshwinSankar17
MS DSAI @ Indian Institute of Technology Madras | AI4Bharat | Working in Speech x Multi-Modal AI.
AI4Bharat | Indian Institute of Technology, Madras | Chennai
Pinned Repositories
IndicVoices-R
A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
AI4Bharat_TTS_Paper_Reading_Group
List of papers, optionally with summaries, covered in the TTS team's paper reading sessions at AI4Bharat
astravani
Weapons to wield sound
DeepRL
Various algorithms implemented with a CLI for easier training and testing
idle_time
A module that reports the computer's idle time. Note: on Linux it requires xprintidle; install it with "sudo apt install xprintidle" for the module to work.
Mimi
NewsCluster
Scrape and cluster news based on the headlines
ot-flow-matching-tts
Flow-matching DiT for speech editing and synthesis
Roar
Roar - a toolkit for Indic Speech AI
AshwinSankar17's Repositories
AshwinSankar17/AI4Bharat_TTS_Paper_Reading_Group
List of papers, optionally with summaries, covered in the TTS team's paper reading sessions at AI4Bharat
AshwinSankar17/astravani
Weapons to wield sound
AshwinSankar17/Mimi
AshwinSankar17/ot-flow-matching-tts
Flow-matching DiT for speech editing and synthesis
AshwinSankar17/advice
A repository of links with advice on grad school applications, research, PhD, etc.
AshwinSankar17/Roar
Roar - a toolkit for Indic Speech AI
AshwinSankar17/AshwinSankar17.github.io
Personal website using ai-folio
AshwinSankar17/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
AshwinSankar17/Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
AshwinSankar17/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
AshwinSankar17/DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
AshwinSankar17/Diff-HierVC
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"
AshwinSankar17/google-research
Google Research
AshwinSankar17/HoMM
High order Moment Models
AshwinSankar17/iamunr4v31
AshwinSankar17/lightning-hydra-template
AshwinSankar17/moshi
AshwinSankar17/parler-tts
Inference and training library for high-quality TTS models.
AshwinSankar17/penn
Pitch Estimating Neural Networks (PENN)
AshwinSankar17/pyxis
Container plugin for Slurm Workload Manager
AshwinSankar17/rfpp
The codebase of our paper "Improving the Training of Rectified Flows"
AshwinSankar17/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
AshwinSankar17/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
AshwinSankar17/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
AshwinSankar17/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
AshwinSankar17/TTS-data-pipeline
WiP: A TTS data scraping pipeline for YouTube with music separation, denoising, and auto-transcription
AshwinSankar17/VITS
AshwinSankar17/VITS_hf_port
Code to port VITS checkpoint to AutoModel compatible version.
AshwinSankar17/VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
AshwinSankar17/x-clip
A concise but complete implementation of CLIP with various experimental improvements from recent papers