MingjieChen
Postdoc researcher, Speech Processing, Natural Language Processing, Conversational AI, at University of Sheffield
MingjieChen's Stars
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
guillaumekln/faster-whisper
Faster Whisper transcription with CTranslate2
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
microsoft/i-Code
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
atong01/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
hollobit/GenAI_LLM_timeline
ChatGPT, GenerativeAI and LLMs Timeline
DmitryRyumin/INTERSPEECH-2023-Papers
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
yangdongchao/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
ivanvovk/WaveGrad
Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.
audiolabs/webMUSHRA
A MUSHRA-compliant experiment software based on the Web Audio API
lablab-ai/Whisper-transcription_and_diarization-speaker-identification-
How to use OpenAI's Whisper to transcribe and diarize audio files
microsoft/Pengi
An Audio Language model for Audio Tasks
adelacvg/NS2VC
Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech
DongKeon/Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
roatienza/efficientspeech
PyTorch code implementation of EfficientSpeech, to be presented at ICASSP 2023.
jasonppy/PromptingWhisper
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
ga642381/Speech-Prompts-Adapters
This repository surveys papers focusing on Prompting and Adapters for Speech Processing.
pyf98/DPHuBERT
INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"
patrickltobing/cyclevae-vc-neuralvoco
facebookresearch/Noresqa
This GitHub repo is for the NeurIPS 2021 and Interspeech 2022 papers on Non-Matching Reference based speech quality assessment.
linan2/Voice-activity-detection-VAD-paper-and-code
Voice activity detection (VAD) papers and code (from 198*~) and their classification.
caskcsg/SPCL
code for "Supervised Prototypical Contrastive Learning for Emotion Recognition in Conversation, EMNLP 22"
guxm2021/ALT_SpeechBrain
[ISMIR 2022] Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
W-Wu/DEER
ZhihaoDU/du2022sond
Speaker overlap-aware Neural Diarization