yaleimeng's Stars
2dust/v2rayNG
A V2Ray client for Android, support Xray core and v2fly core
ZiqiaoPeng/SyncTalk
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
MetaCubeX/ClashMetaForAndroid
A rule-based tunnel for Android.
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
yerfor/GeneFacePlusPlus
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
yerfor/Real3DPortrait
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
lyswhut/lx-music-mobile
一个基于 React native 开发的音乐软件
IntelLabs/fastRAG
Efficient Retrieval Augmentation and Generation Framework
Zz-ww/SadTalker-Video-Lip-Sync
本项目基于SadTalkers实现视频唇形合成的Wav2lip。通过以视频文件方式进行语音驱动生成唇形,设置面部区域可配置的增强方式进行合成唇形(人脸)区域画面增强,提高生成唇形的清晰度。使用DAIN 插帧的DL算法对生成视频进行补帧,补充帧间合成唇形的动作过渡,使合成的唇形更为流畅、真实以及自然。
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
TencentARC/GFPGAN
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
datawhalechina/self-llm
《开源大模型食用指南》针对**宝宝量身打造的基于Linux环境快速微调(全参数/Lora)、部署国内外开源大模型(LLM)/多模态大模型(MLLM)教程
zvxme/wav2lip_chinese
eosphoros-ai/DB-GPT
AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
Picovoice/cobra
On-device voice activity detection (VAD) powered by deep learning
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
guoqincode/Open-AnimateAnyone
Unofficial Implementation of Animate Anyone
fishaudio/fish-speech
SOTA Open Source TTS
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Plachtaa/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
PlayVoice/vits_chinese
Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out!
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
hiroi-sora/Umi-OCR
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
LC044/WeChatMsg
提取微信聊天记录,将其导出成HTML、Word、Excel文档永久保存,对聊天记录进行分析生成年度聊天报告,用聊天数据训练专属于个人的AI聊天助手
Raudaschl/rag-fusion
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
Stability-AI/generative-models
Generative Models by Stability AI
SagerNet/sing-box
The universal proxy platform
leoFitz1024/wallhaven
基于wallhaven.cc的一款壁纸管理工具