noetits's Stars
VincentStimper/normalizing-flows
PyTorch implementation of normalizing flow models
noetits/SCRIBE
Data processing and analysis of SCRIBE (Spoken Corpus Recordings In British English)
Ekeany/Clustering-Mixed-Data
A repository with various methods for clustering mixed datasets in Python
noetits/MUST_P-SRL
MUST&P-SRL: Multi-lingual and Unified Syllabification in Text and Phonetic Domains for Speech Representation Learning
juice500ml/dysarthria-gop
rathaumons/pyppbox
Toolbox for detecting, tracking, and re-identifying people.
nektos/act
Run your GitHub Actions locally 🚀
numediart/MBROLA
MBROLA is a speech synthesizer based on the concatenation of diphones
lingjzhu/charsiu
Charsiu: A neural phonetic aligner.
nathanhubens/fasterai
FasterAI: Prune and Distill your models with FastAI and PyTorch
noetits/ICE-Talk
Interface for Controllable Expressive Talking Machine
SoftwareImpacts/SIMPAC-2020-65
Interface for Controllable Expressive Talking Machine
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
SuperKogito/SER-datasets
A collection of datasets for the purpose of emotion recognition/detection in speech.
numediart/ICE-Talk
Interface for Controllable Expressive Talking Machine
nii-yamagishilab/multi-speaker-tacotron
VCTK multi-speaker Tacotron for ICASSP 2020
ebranlard/matlab2python
Simple matlab2python converter
Tomiinek/Blizzard2013_Segmentation
Transcripts and segmentation for the Blizzard 2013 audiobooks, also known as the Lessac or Blizzard 2013 dataset.
Emotional-Text-to-Speech/dl-for-emo-tts
💻 🤖 A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈
tugstugi/pytorch-dc-tts
Text to Speech with PyTorch (English and Mongolian)
numediart/LaughterSynthesis
This repository contains laughter-related synthesis systems.
jjery2243542/adaptive_voice_conversion
mravanelli/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
facebookresearch/nevergrad
A Python toolbox for performing gradient-free optimization
sterling239/audio-emotion-recognition
jbdel/MOSEI_UMONS
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
jbdel/WMT18_MNMT
Solution from the UMONS system at WMT 18
descriptinc/melgan-neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis