FChin39's Stars
X-LANCE/AniTalker
[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"
Plachtaa/seed-vc
State-of-the-Art zero-shot voice conversion & singing voice conversion with in context learning
divan/txqr
Transfer data via animated QR codes
sz3/libcimbar
Optimized implementation for color-icon-matrix barcodes
Stirling-Tools/Stirling-PDF
#1 Locally hosted web application that allows you to perform various operations on PDF files
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
jianchang512/pyvideotrans
Translate the video from one language to another and add dubbing. Also supports speech-recognition transcription, speech synthesis, and subtitle translation.
facebookresearch/voxpopuli
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
CosmosShadow/gptpdf
Using GPT to parse PDF
2noise/ChatTTS
A generative speech model for daily dialogue.
spatialaudio/jackclient-python
🂻 JACK Audio Connection Kit (JACK) Client for Python 🐍
niedev/RTranslator
Open source real-time translation app for Android that runs locally
andrewyng/translation-agent
CrazyBoyM/llama3-Chinese-chat
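Llama3 and Llama3.1 Chinese repository (being written alongside a book... interesting weights from fine-tuned and modified versions by community members and vendors, plus tutorial videos and docs on training, inference, evaluation, and deployment)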
ben0oil1/GPT-SoVITS-Server
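[Minimal-configuration inference service, free of complex environment setup and bundled packages] A pure inference service solution extracted from the GPT-SoVITS project.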
PantsuDango/Dango-Translator
Dango Translator (团子翻译器): an OCR-based translator built as a personal hobby project
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
koodo-reader/koodo-reader
A modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux and Web
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
yerfor/GeneFacePlusPlus
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
KevinWang676/Bark-Voice-Cloning
Bark Voice Cloning and Voice Cloning for Chinese Speech
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
facebookresearch/audioseal
Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector
wavmark/wavmark
AI-based Audio Watermarking Tool
Far-Se/win32audio
Flutter package to handle Windows audio devices. Also extracts native icons to bytes in Dart
jackaudio/jack2
jack2 codebase
yxlllc/DDSP-SVC
Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)