lbehringer's Stars
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
openai/gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
152334H/tortoise-tts-fast
Fast TorToiSe inference (5x or your money back!)
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
HazyResearch/safari
Convolutions for Sequence Modeling
b04901014/MQTTS
alvinlindstam/grapheme
A python package for grapheme aware string handling
JonathanFly/bark
🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
elevenlabs/elevenlabs-python
The official Python API for ElevenLabs Text to Speech.
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
resemble-ai/Resemblyzer
A python package to analyze and compare voices with deep learning
AndreevP/wvmos
MOS score prediction by fine-tuned wav2vec2.0 model
common-voice/common-voice
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
miguelmota/intent-utterance-expander
Expand custom utterance slots of phrases, to use with Alexa Skills Kit Sample Utterances.
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
thunlp/OpenDelta
A plug-and-play library for parameter-efficient-tuning (Delta Tuning)
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
as-ideas/ForwardTacotron
⏩ Generating speech in a single forward pass without any attention!
iisys-hof/HUI-Audio-Corpus-German
This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repository it is possible to automatically recreate the dataset. It is also possible to add more speakers to the processing pipeline.
dmort27/panphon
Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features.
openai/openai-cookbook
Examples and guides for using the OpenAI API
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
py-pdf/pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
jitsi/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
cldf-clts/clts
Cross-Linguistic Transcription Systems