Pinned Repositories
Active-Noise-Cancellation-System
FIR adaptive filters based on LMS, RLS, and QR-RLS for active noise cancellation, with and without quantization effects.
AdaptiveFilterandActiveNoiseCancellation
Adaptive Filter and Active Noise Cancellation: LMS, NLMS, RLS
Algorithms
Assignments for Stanford's Algorithms Specialization on Coursera
APNet2
Source code of APNet2, a vocoder
Artificial-Intelligence
Awesome AI Learning with +100 AI Cheat-Sheets, Free online Books, Top Courses, Best Videos and Lectures, Papers, Tutorials, +99 Researchers, Premium Websites, +121 Datasets, Conferences, Frameworks, Tools
ASR_Course
audio-visual-speech-enhancement
Official Implementation of "Visual Speech Enhancement", Interspeech 2018.
Auto-Tuning-Spectral-Clustering
This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"
practical-machine-learning-with-python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
SpeechAlgorithms
Speech Algorithms
CaoYuhang's Repositories
CaoYuhang/SpeechAlgorithms
Speech Algorithms
CaoYuhang/APNet2
Source code of APNet2, a vocoder
CaoYuhang/awesome
😎 Awesome lists about all kinds of interesting topics
CaoYuhang/Awesome-GPT-Store
A collection of major publicly available GPTs
CaoYuhang/ChatTTS-api-ui-docker
One command to run ChatTTS
CaoYuhang/ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use.
CaoYuhang/dataspeech
CaoYuhang/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
CaoYuhang/fluency_scorer
An unofficial implementation of a speech fluency assessment model
CaoYuhang/Free-Certifications
A curated list of free courses & certifications.
CaoYuhang/g2p_mix
CaoYuhang/GPT-vup
GPT-vup: Bilibili | Douyin | AI | virtual streamer
CaoYuhang/hackingtool
ALL IN ONE Hacking Tool For Hackers
CaoYuhang/IP_LAP
CVPR 2023 talking-face implementation of "Identity-Preserving Talking Face Generation With Landmark and Appearance Priors"
CaoYuhang/LiveWhisper
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.
CaoYuhang/megatts2
Unofficial implementation of Megatts2
CaoYuhang/mini-omni
Open-source multimodal large language model that can hear and talk while thinking. It features real-time end-to-end speech input and streaming audio output for conversation.
CaoYuhang/moshi
CaoYuhang/mustango
Mustango: Toward Controllable Text-to-Music Generation
CaoYuhang/OpenPhonemizer
Permissively licensed, open-source, local IPA phonemizer (G2P) powered by deep learning.
CaoYuhang/RefAudioEmoTagger
A script for automatic batch emotion labeling of audio, based on Emotion2Vec
CaoYuhang/roop
one-click face swap
CaoYuhang/SpatialCodec
CaoYuhang/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
CaoYuhang/stable-audio-tools
Generative models for conditional audio generation
CaoYuhang/stable-speech
Reproduction of Stability AI's Text-to-Speech model.
CaoYuhang/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
CaoYuhang/supervoice-voicebox
VoiceBox neural network implementation
CaoYuhang/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
CaoYuhang/wukong-robot
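🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports multi-turn conversation with ChatGPT and may be the first open-source smart speaker project to support brain-computer interaction.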