Pinned Repositories
AutoVocoder
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models, a dark horse in the field of Generative Models
diffusion-audio-restoration-nvidia-SR
Audio-to-Audio Schrodinger Bridges is a diffusion-based audio restoration model for bandwidth extension and inpainting.
F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
golf_diff_Glottal_Flow_LPC_synthesis
A DDSP-based neural vocoder.
MB-iSTFT-VITS2_super-monotonic-align
Application of MB-iSTFT-VITS components to vits2_pytorch
tacospawn
PyTorch implementation of TacoSpawn, Speaker Generation
unconditional-diff-STFT
Unconditional music synthesis using a diffusion model in the STFT domain
WaveletAttention
Wavelet-Attention CNNs for Image Classification
SynthAether's Repositories
SynthAether/auraloss
Collection of audio-focused loss functions in PyTorch
SynthAether/RAVE
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder
SynthAether/speechbrain_Conversational-AI
A PyTorch-based Speech Toolkit
SynthAether/whisperX
WhisperX: Timestamp-Accurate Automatic Speech Recognition.
SynthAether/BABE2_music_restoration_enhancement
SynthAether/bark_TTS
🔊 Text-prompted Generative Audio Model
SynthAether/CML-TTS-Dataset
CML-TTS: A Multilingual Dataset for Speech Synthesis
SynthAether/dpm-solver
Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps"
SynthAether/fairseq2
FAIR Sequence Modeling Toolkit 2
SynthAether/FastBERT
The repository for the code of the FastBERT paper
SynthAether/gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
SynthAether/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
SynthAether/Grad-TTS
Implementation of Grad-TTS with multilingual cleaners
SynthAether/LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
SynthAether/llama.cpp
Port of Facebook's LLaMA model in C/C++
SynthAether/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
SynthAether/nansypp
Unofficial implementation of NANSY++ in PyTorch Lightning
SynthAether/Neural-HMM
Neural HMMs are all you need (for high-quality attention-free TTS)
SynthAether/penn_Pitch-Estimating-Neural-Networks-
Pitch Estimating Neural Networks (PENN)
SynthAether/PitchSqueezer
A robust pitch tracker using synchro-squeezed FFT and frequency-domain autocorrelation
SynthAether/podcast-summarizer
OpenAI Whisper + davinci for podcast summarization
SynthAether/ppgs_High-Fidelity-Neural-Phonetic-Posteriorgrams
High-Fidelity Neural Phonetic Posteriorgrams
SynthAether/praat
Praat: Doing Phonetics By Computer
SynthAether/pysptk
A Python wrapper for Speech Signal Processing Toolkit (SPTK).
SynthAether/tango
Code and models for the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model"
SynthAether/tts-arabic-pytorch
TTS models for Arabic (Tacotron2, FastPitch)
SynthAether/vitsgpt-vits
The code for VITS in the vitsGPT project
SynthAether/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
SynthAether/VoRAS_VC
VoRAS: Vocos Retrieval and self-Augmentation for Speech
SynthAether/XPhoneBERT
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)