CaoYuhang

Master's degree at USTC. Speech enhancement, ASR, LLM

Pinned Repositories

Active-Noise-Cancellation-System
LMS, RLS, QR-RLS based FIR adaptive filter for active noise cancellation with and without quantization effects.
Language: MATLAB
AdaptiveFilterandActiveNoiseCancellation
Adaptive Filter and Active Noise Cancellation: LMS, NLMS, RLS
Language: MATLAB
Algorithms
Assignments for Coursera's Stanford Algorithms Specialization
Language: C++
APNet2
Source code of APNet2, a vocoder
Language: Python
Artificial-Intelligence
Awesome AI Learning with +100 AI Cheat-Sheets, Free online Books, Top Courses, Best Videos and Lectures, Papers, Tutorials, +99 Researchers, Premium Websites, +121 Datasets, Conferences, Frameworks, Tools
ASR_Course
Language: C
audio-visual-speech-enhancement
Official Implementation of "Visual Speech Enhancement", Interspeech 2018.
Language: Python
Auto-Tuning-Spectral-Clustering
This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"
Language: Python
practical-machine-learning-with-python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Language: Jupyter Notebook
SpeechAlgorithms
Speech Algorithms
Language: C

CaoYuhang's Repositories

CaoYuhang/SpeechAlgorithms
Speech Algorithms
Language: C
CaoYuhang/APNet2
Source code of APNet2, a vocoder
Language: Python
CaoYuhang/awesome
😎 Awesome lists about all kinds of interesting topics
CaoYuhang/Awesome-GPT-Store
A collection of major GPTs available to the public
CaoYuhang/ChatTTS-api-ui-docker
One command to run ChatTTS
Language: Jupyter Notebook
CaoYuhang/ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use.
CaoYuhang/dataspeech
CaoYuhang/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Language: Python
CaoYuhang/fluency_scorer
An unofficial implementation of a speech fluency assessment model
CaoYuhang/Free-Certifications
A curated list of free courses & certifications.
CaoYuhang/g2p_mix
Language: Python
CaoYuhang/GPT-vup
GPT-vup Bilibili | Douyin | AI | Virtual streamer
Language: Python
CaoYuhang/hackingtool
ALL IN ONE Hacking Tool For Hackers
Language: Python
CaoYuhang/IP_LAP
CVPR2023 talking face implementation for Identity-Preserving Talking Face Generation With Landmark and Appearance Priors
Language: Python
CaoYuhang/LiveWhisper
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.
Language: Python
CaoYuhang/megatts2
Unofficial implementation of Megatts2
CaoYuhang/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
CaoYuhang/moshi
CaoYuhang/mustango
Mustango: Toward Controllable Text-to-Music Generation
CaoYuhang/OpenPhonemizer
Permissively licensed, open sourced, local IPA Phonemizer (G2P) powered by deep learning.
Language: Python
CaoYuhang/RefAudioEmoTagger
A script for automatic batch emotion labeling of audio, based on Emotion2Vec
CaoYuhang/roop
one-click face swap
Language: Python
CaoYuhang/SpatialCodec
Language: Python
CaoYuhang/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Language: Python
CaoYuhang/stable-audio-tools
Generative models for conditional audio generation
CaoYuhang/stable-speech
Reproduction of Stability AI's Text-to-Speech model.
Language: Python
CaoYuhang/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Language: Python
CaoYuhang/supervoice-voicebox
VoiceBox neural network implementation
CaoYuhang/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Language: Python
CaoYuhang/wukong-robot
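🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may also be the first open-source smart speaker project to support brain-computer interaction.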