mysxs
Ph.D. student at Institute of Information Engineering, Chinese Academy of Sciences
University of Chinese Academy of Sciences, Beijing
Pinned Repositories
emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
emotion2vec
Python-Wrapper-for-World-Vocoder
A Python wrapper for the high-quality vocoder "World"
SlowFast-master
SpeechTokenizer
This is the code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
sxs.github.io
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Speech-Emotion-Recognition
Speech emotion recognition implemented in Keras (LSTM, CNN, SVM, MLP)
mysxs's Repositories
mysxs/SlowFast-master
mysxs/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
mysxs/emotion2vec
mysxs/Python-Wrapper-for-World-Vocoder
A Python wrapper for the high-quality vocoder "World"
mysxs/SpeechTokenizer
This is the code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on