jhauret
PhD student in machine learning applied to acoustics at Cnam, Paris. Bringing deep learning from research to production.
Conservatoire National des Arts et Métiers, Paris
jhauret's Stars
bshall/hubert
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Aria-K-Alethia/BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
kyutai-labs/moshi
MarcLafon/heatood
This repo contains the official implementation of Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection (ICML'23).
NathanGodey/headless-lm
Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https://arxiv.org/abs/2309.08351)
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
jhauret/vibravox
Speech to Phoneme, Bandwidth Extension and Speaker Verification using the Vibravox dataset.
haoheliu/SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
urgent-challenge/urgent2024_challenge
Official data preparation scripts for the URGENT 2024 Challenge
perladoubinsky/SemAug
[WACV 2024] Official implementation of the paper: Semantic Generative Augmentations for Few-shot Counting
muqiaoy/PAAP
huggingface/competitions
jhauret/eben
Repo for source code of EBEN: Extreme Bandwidth Extension Network
SamsungLabs/hifi_plusplus
HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement (ICASSP 2023)
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
resemble-ai/Resemblyzer
A python package to analyze and compare voices with deep learning
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
QxLabIreland/listening-test
An open source platform for browser based speech and audio subjective quality tests.
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
samsad35/source-filter-vae
Learning and controlling the source-filter representation of speech with a variational autoencoder
google/visqol
Perceptual Quality Estimator for speech and audio
facebookresearch/Noresqa
This GitHub repo is for the NeurIPS 2021 and Interspeech 2022 papers on Non-Matching Reference based estimation of speech quality assessment.
elevoctech/ESMB-corpus
RookieJunChen/Inter-SubNet
The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
facebookresearch/textlesslib
Library for Textless Spoken Language Processing
lucidrains/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch