taichuai

Focus, focus, focus

Mars

taichuai's Stars

hacksider/Deep-Live-Cam
real time face swap and one-click video deepfake with only a single image
Language: Python 41.9k 247 5806.1k
1adrianb/face-alignment
:fire: 2D and 3D face alignment library built using PyTorch
Language: Python 7.2k 174 3161.4k
google-deepmind/alphafold3
AlphaFold 3 inference pipeline.
Language: Python 5.7k 49 201671
mseitzer/pytorch-fid
Compute FID scores with PyTorch.
Language: Python 3.5k 13 86516
Tencent/Hunyuan3D-1
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
Language: Python 2.5k 187
genmoai/models
The best OSS video generation models
Language: Python 2.1k 34 67209
deepseek-ai/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
Language: Python 1.3k 23 2260
wenqsun/DimensionX
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Language: Python 1.1k 41 3266
NVIDIA/Cosmos-Tokenizer
A suite of image and video neural tokenizers
Language: Python 1k 17 026
edwko/OuteTTS
Interface for OuteTTS models.
Language: Python 786 20 4162
mqz111a/virtual_human_stream
The "virtual_human_stream" project is a real-time digital human system supporting audio-video dialogue. It integrates models like ernerf, musetalk, and wav2lip for voice cloning, video stitching, and streaming via RTMP/WebRTC. It’s optimized for high performance and easy customization, with support for ChatGPT dialogue integration.
Language: Python 613 42 091
Henry-23/VideoChat
Real-time voice-interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Appearance and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3s.
Language: Python 564 9 3573
EnVision-Research/Lotus
Official Implementation of LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Language: Python 532 16 2429
adobe-research/MakeItTalk
Language: Jupyter Notebook 500 25 23308
OleehyO/TexTeller
TexTeller converts images to LaTeX formulas (image2latex, LaTeX OCR) with higher accuracy and exhibits superior generalization ability, enabling it to cover most usage scenarios.
Language: Python 401 6 1746
xg-chu/GAGAvatar
[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar
Language: Python 381 16 2436
wdndev/tiny-llm-zh
Implementing a small-parameter Chinese large language model from scratch.
Language: Python 373 6 1741
dongzhuoyao/awesome-flow-matching
A summary of related works about flow matching, stochastic interpolants
359 14 213
Femoon/tts-azure-web
TTS Azure Web is an Azure Text-to-Speech (TTS) web application that you can deploy locally or in the cloud with a single click using your Azure Key.
Language: TypeScript 355 2 742
VideoVerses/VideoTuna
Let's finetune video generation models!
Language: Python 337 7 1111
VITA-MLLM/Freeze-Omni
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Language: Python 22815
ddlBoJack/Awesome-Speech-Language-Model
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
14312
shaopengw/Awesome-Music-Generation
Awesome music generation model: MG²
Language: Python 121 9 711
cantabile-kwok/vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
Language: Python 58 10 55
LeCAR-Lab/wococo
Language: Python 56 0 04
XingliangJin/MCM-LDM
[CVPR 2024] Arbitrary Motion Style Transfer with Multi-condition Motion Latent Diffusion Model
Language: Python 47 4 129
GhostCai/PortraitRelighting
Official PyTorch implementation of the CVPR 2024 Highlight Paper "Real-time 3D-aware Portrait Video Relighting"
Language: Python 36 5 65
audeering/w2v2-age-gender-how-to
How to use our public wav2vec2 age and gender model
Language: Jupyter Notebook 30 3 52
liuhuang31/Megatts2_HierSpeechpp
Megatts2 using HierSpeechpp's vocoder
Language: Python 171
PeizhiYan/mediapipe-blendshapes-to-flame
Mapping Mediapipe's 52 blendshapes to FLAME's expression coefficients and poses.
Language: Jupyter Notebook 6 2 10
