Pinned Repositories
AdaSpeech
An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"
AuxFormer
AuxFormer: Robust Approach to Audiovisual Emotion Recognition
bark
🔊 Text-Prompted Generative Audio Model
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
espnet
End-to-End Speech Processing Toolkit
FastSpeech
An implementation of FastSpeech based on PyTorch.
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
yxfy's Repositories
yxfy/AdaSpeech
An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"
yxfy/AuxFormer
AuxFormer: Robust Approach to Audiovisual Emotion Recognition
yxfy/bark
🔊 Text-Prompted Generative Audio Model
yxfy/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
yxfy/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
yxfy/CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
yxfy/espnet
End-to-End Speech Processing Toolkit
yxfy/FastSpeech
An implementation of FastSpeech based on PyTorch.
yxfy/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
yxfy/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
yxfy/Information-Extraction-Chinese
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT (Chinese entity recognition and relation extraction)
yxfy/LLaSM
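The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.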
yxfy/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
yxfy/MOSNet-pytorch
A PyTorch implementation of MOSNet
yxfy/MOSNettf
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
yxfy/Multimodal-Emotion-Recognition
This repository contains the code for the paper `End-to-End Multimodal Emotion Recognition using Deep Neural Networks`.
yxfy/NeuralSpeech
yxfy/PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System and End-to-End Speech Simultaneous Translation.
yxfy/Robust_Fine_Grained_Prosody_Control
PyTorch Implementation of Robust and fine-grained prosody control of end-to-end speech synthesis
yxfy/seed-tts-eval
yxfy/SenseVoice
Multilingual Voice Understanding Model
yxfy/SoundLabel
A labeling tool for building speech datasets
yxfy/speech
yxfy/TeleSpeech-ASR
yxfy/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supported languages include English, Korean, and Chinese)
yxfy/VoiceprintRecognition-Pytorch
This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
yxfy/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit