chenxwh

Research Scientist @facebookresearch

University of Cambridge

Pinned Repositories

bark
🔊 Text-Prompted Generative Audio Model
Language: Python 102 6 020
cog-RMBG
Fork of https://huggingface.co/briaai/RMBG-1.4
Language: Python 92 2 015
cog-sd-txt2imghd
Stable-diffusion with Real-ESRGAN for upsampling
Language: Python 75 3 710
cog-themed-diffusion
Language: Python 41 1 416
cog-whisper
Language: Roff 81 1 1428
insanely-fast-whisper
Incredibly fast Whisper-large-v3
Language: Jupyter Notebook 1.9k 13 0109
Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
Language: Jupyter Notebook 88 3 036
rudalle-sr
A Cog implementation of the Real-ESRGAN super-resolution model from ruDALL-E.
Language: Python 32 2 16
SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
Language: Python 97 1 09
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Language: Python 59 1 010

chenxwh's Repositories

chenxwh/insanely-fast-whisper
Incredibly fast Whisper-large-v3
Language: Jupyter Notebook 1.9k 13 0109
chenxwh/SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
Language: Python 97 1 09
chenxwh/Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
Language: Jupyter Notebook 88 3 036
chenxwh/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Language: Python 59 1 010
chenxwh/SadTalker
(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Language: Python 29 5 017
chenxwh/OpenVoice
Instant voice cloning by MyShell.
Language: Python 24 1 05
chenxwh/Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Language: Python 6 0 02
chenxwh/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Language: Python 5 0 0
chenxwh/CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Language: Python 5 0 01
chenxwh/Omost
Your image is almost there!
Language: Python 5 0 02
chenxwh/Lotus
Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Language: Python 4
chenxwh/MeloTTS
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
Language: Python 4 0 01
chenxwh/DeepSeek-VL2
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Language: Python 32
chenxwh/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Language: Python 3 0 0
chenxwh/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language: Python 2
chenxwh/ml-depth-pro
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
Language: Python 23
chenxwh/chenxwh.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
Language: JavaScript 1 0 0
chenxwh/DiffSynth-Studio
Enjoy the magic of Diffusion models!
Language: Python 1 0 0
chenxwh/Florence-VL
Language: Python 1 0 0
chenxwh/NOVA
NOVA: Autoregressive Video Generation without Vector Quantization
Language: Python 1 0 0
chenxwh/OminiControl
A minimal and universal controller for FLUX.1.
Language: Python 1
chenxwh/OmniParser
A simple screen-parsing tool towards a pure vision-based GUI agent
Language: Jupyter Notebook 1 0 0
chenxwh/CogView3
Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)
Language: Python
chenxwh/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Language: Python
chenxwh/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Language: Python 0 02
chenxwh/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Language: Python
chenxwh/echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Language: Python 0 0
chenxwh/hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Language: Python 0 0
chenxwh/LTX-Video
Official repository for LTX-Video
Language: Python 0 0
chenxwh/OneDiffusion
Language: Python