lbehringer's Stars

jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Language: Jupyter Notebook 7.5k 88 124734
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python 4.5k 58 151382
huggingface/parler-tts
Inference and training library for high-quality TTS models.
Language: Python 4.2k 55 93412
yisol/IDM-VTON
[ECCV2024] IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
Language: Python 3.6k 54 145577
libAudioFlux/audioFlux
A library for audio and music analysis and feature extraction.
Language: C 2.7k 32 16117
gedeck/practical-statistics-for-data-scientists
Code repository for O'Reilly book
Language: Jupyter Notebook 2.7k 71 261.7k
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
Language: Jupyter Notebook 2.4k 35 47197
daochenzha/data-centric-AI
A curated, but incomplete, list of data-centric AI resources.
1k 15 372
oborchers/Fast_Sentence_Embeddings
Compute Sentence Embeddings Fast!
Language: Jupyter Notebook 619 12 5583
maxrmorrison/torchcrepe
PyTorch implementation of the CREPE pitch tracker
Language: Python 397 9 2661
hubertsiuzdak/snac
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Language: Python 354 7 1921
keonlee9420/Comprehensive-Transformer-TTS
A non-autoregressive Transformer-based Text-to-Speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.
Language: Python 318 12 2041
huggingface/dataspeech
Language: Python 272 13 1535
k2-fsa/libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
Language: Python 171 6 610
nyrahealth/CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
Language: Python 1557
marianne-m/brouhaha-vad
Predicts the level of noise and reverberation in your audio files
Language: Jupyter Notebook 134 10 1624
Wataru-Nakata/miipher
Unofficial implementation of miipher
Language: Python 103 5 714
SamsungLabs/SummaryMixing
This repository implements SummaryMixing, a simpler, faster, and much cheaper replacement for self-attention in automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.
Language: Python 99 10 311
DataDome/sliceline
✂️ Fast slice finding for Machine Learning model debugging.
Language: Python 87 5 105
interactiveaudiolab/ppgs
High-Fidelity Neural Phonetic Posteriorgrams
Language: Python 74 8 134
k2-fsa/text_search
Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup
Language: Python 56 12 1314
dhimasryan/MOSA-Net-Cross-Domain
Language: Python 47 3 79
yeounoh/slicefinder
automatic data slicing
Language: Jupyter Notebook 34 2 16
msalhab96/SNR-Estimation-Using-Deep-Learning
An implementation for Frame-level Speech Signal-to-Noise Ratio Estimation using deep learning
Language: Jupyter Notebook 30 1 15
Fraunhofer-IIS/ODAQ
24 4 00
audeering/w2v2-age-gender-how-to
How to use our public wav2vec2 age and gender model
Language: Jupyter Notebook 23 3 52
sarulab-speech/Coco-Nut
Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus
22 1 00
utter-project/mHuBERT-147-scripts
Collection of scripts from mHuBERT-147.
Language: Python 202
cisco/multilingual-speech-testing
Test software and data for evaluation of speech processing algorithms in multiple languages
Language: Python 7 6 00
SonyResearch/project_ethics_augmented_datasheets_for_speech_datasets
Public code repo for research paper
Language: TeX 6 1 01