unilight's Stars
joonson/syncnet_python
Out of time: automated lip sync in the wild
serengil/deepface
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
facebookresearch/pytorchvideo
A deep learning library for video understanding research.
yistLin/universal-vocoder
A PyTorch implementation of the universal neural vocoder
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
felixkreuk/UnsupSeg
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)
xinjli/ucla-phonetic-corpus
Dataset for the ICASSP 2021 paper "Multilingual Phonetic Dataset for Low Resource Speech Recognition"
lilianemomeni/KWS-Net
Seeing Wake Words: Audio-visual Keyword Spotting
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
maxrmorrison/torchcrepe
PyTorch implementation of the CREPE pitch tracker
kylebgorman/textgrid
A Python module for interacting with Praat TextGrid files. Also includes a class for reading HTK .mlf files into Praat
zhouhaoyi/Informer2020
The GitHub repository for the paper "Informer" accepted by AAAI 2021.
dmort27/epitran
A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
tuanvu92/VCC2020
BYVoid/OpenCC
Conversion between Traditional and Simplified Chinese
nii-yamagishilab/VCC2020-listeningtest
n1243645679976/espnet
End-to-End Speech Processing Toolkit
Sinica-SLAM/Bottleneck_feature_extractor
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
nii-yamagishilab/VCC2020-database
openai/vdvae
Repository for the paper "Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images"
HLTSingapore/Emotional-Speech-Data
This is the GitHub page for publicly available emotional speech data.
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
taesungp/contrastive-unpaired-translation
Contrastive unpaired image-to-image translation, with faster and lighter training than CycleGAN (ECCV 2020, in PyTorch)
microsoft/P.808
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).
twtrubiks/docker-tutorial
Docker basics tutorial, from scratch (Docker-Beginners-Guide): teaches you how to set up Django + PostgreSQL with Docker 📝
kpu/kenlm
KenLM: Faster and Smaller Language Model Queries
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.