Pinned Repositories
book-text-to-speech
A book about Text-to-Speech (TTS) in Chinese.
cc-compare
A free code synchronization and comparison tool that can replace Beyond Compare, from **.
downkyi
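downkyi (哔哩下载姬), a Bilibili video downloader that supports batch downloads, 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).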
Grad-TTS-Chinese
Huawei Grad-TTS for Chinese
LLMBook-zh.github.io
"Large Language Models" (《大语言模型》), by Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, and Wen Jirong.
MiniThunder
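A mini version of Xunlei (Thunder) for Android. Supports downloading from thunder://, ftp://, http://, and ed2k:// links, magnet links, and torrent files; audio and video files can be played while they download.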
so-vits-svc
SoftVC VITS Singing Voice Conversion
TikTokDownloader
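Completely free and open source, built on the Requests module: a data collection tool for TikTok profiles/videos/photo albums/original sounds and for Douyin profiles/videos/photo albums/favorites/livestreams/original sounds/collections/comments/accounts/search/trending lists.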
tuning_playbook
"Deep Learning Tuning Guide" (《深度学习调优指南》). A playbook for systematically maximizing the performance of deep learning models.
VI-Speaker
Speaker embedding for VI-SVC and VI-SVS, and also for VITS. Use it in place of the speaker ID to implement voice cloning.
MaxMax2016's Repositories
MaxMax2016/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
MaxMax2016/MeloTTS.cpp
A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting mixed English and Chinese languages.
MaxMax2016/REAL_TIME_NKF_AEC
Neural-network acoustic echo cancellation, implemented in C.
MaxMax2016/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
MaxMax2016/BigVGAN-Official
Finally open-sourced: official implementation of BigVGAN in PyTorch.
MaxMax2016/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
MaxMax2016/Diff-MST
Music generation. Multitrack music mixing style transfer given a reference song, using a differentiable mixing console.
MaxMax2016/e2-tts-pytorch
Flow-matching Transformer. Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch.
MaxMax2016/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
MaxMax2016/FasterLivePortrait
Bring portraits to life in real time! ONNX/TensorRT support!
MaxMax2016/FireRedTTS
Xiaohongshu's large speech synthesis model.
MaxMax2016/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
MaxMax2016/gryannote
Speaker recognition. Provides Gradio custom components to make the diarization-based audio labeling process easier and faster.
MaxMax2016/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
MaxMax2016/HierSpeechpp
The official implementation of HierSpeech++
MaxMax2016/LivePortrait
Portrait motion transfer. Make a portrait come alive!
MaxMax2016/moshi
Similar to GPT-4o: end-to-end voice interaction.
MaxMax2016/optispeech
TTS, A lightweight end-to-end text-to-speech model
MaxMax2016/Parrot-TTS
Official Code for ParrotTTS
MaxMax2016/promptttspp
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
MaxMax2016/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
MaxMax2016/seed-vc
seed-tts: zero-shot voice conversion with in-context learning
MaxMax2016/SenseVoice-onnx
SenseVoice with ONNX Runtime
MaxMax2016/stable-speech
Reproduction of Stability AI's Text-to-Speech model.
MaxMax2016/SysMocap
A complete solution for digital-human motion capture and driving. A real-time motion capture system for animating 3D virtual characters.
MaxMax2016/tinyspeech
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
MaxMax2016/TTS-arxiv-daily
Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
MaxMax2016/viet-tts
(Vietnamese) VietTTS: An Open-Source Vietnamese Text to Speech
MaxMax2016/x-vits
Experiments to confirm the performance of PeriodVITS
MaxMax2016/zerovox
zero-shot realtime TTS system, fully offline, free and open source