chenxwh

Research Scientist @facebookresearch

University of Cambridge

Pinned Repositories

bark
🔊 Text-Prompted Generative Audio Model
Language: Python · 102 6 021
cog-RMBG
Fork of https://huggingface.co/briaai/RMBG-1.4
Language: Python · 100 2 016
cog-sd-txt2imghd
Stable-diffusion with Real-ESRGAN for upsampling
Language: Python · 74 3 78
cog-themed-diffusion
Language: Python · 43 1 416
cog-whisper
Language: Roff · 81 1 1428
insanely-fast-whisper
Incredibly fast Whisper-large-v3
Language: Jupyter Notebook · 1.9k 13 0109
Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
Language: Jupyter Notebook · 87 3 036
SadTalker
(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Language: Python · 32 5 018
SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
Language: Python · 98 1 010
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Language: Python · 60 1 010

chenxwh's Repositories

chenxwh/bark
🔊 Text-Prompted Generative Audio Model
Language: Python · 102 6 021
chenxwh/cog-RMBG
Fork of https://huggingface.co/briaai/RMBG-1.4
Language: Python · 100 2 016
chenxwh/i2vgen-xl
Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models
Language: Python · 25 1 010
chenxwh/Semantic-Segment-Anything
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
Language: Python · 23 0 02
chenxwh/cog-deforum-stable-diffusion
Language: Python · 16 0 08
chenxwh/Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Stable Diffusion & Tag2Text & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Audio Inputs
Language: Jupyter Notebook · 15 0 03
chenxwh/VideoCrafter
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Language: Python · 13 0 01
chenxwh/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Language: Python · 11 0 04
chenxwh/CogVLM
a state-of-the-art-level open visual language model | multimodal pretrained model
Language: Python · 10 0 03
chenxwh/ControlVideo
Official pytorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
Language: Python · 8 0 03
chenxwh/replicate-sd-textual-inversion
Language: Python · 8 1 014
chenxwh/AudioSep
Official implementation of "Separate Anything You Describe"
Language: Python · 5 0 02
chenxwh/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Language: Python · 4 0 05
chenxwh/Prompt-Free-Diffusion
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
Language: Python · 3 0 01
chenxwh/UnIVAL
Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks.
Language: Jupyter Notebook · 3 0 0
chenxwh/cog-I2VGen-XL
Language: Python · 2 1 01
chenxwh/Cutie
[arXiv 2023] Putting the Object Back Into Video Object Segmentation
Language: Python · 2 0 01
chenxwh/InternLM-XComposer
Language: Python · 2 0 01
chenxwh/shap-e
Generate 3D objects conditioned on text or images
Language: Python · 2 1 03
chenxwh/StableSR
Exploiting Diffusion Prior for Real-World Image Super-Resolution
Language: Python · 2 0 0
chenxwh/StyleDrop-PyTorch
Unofficial implementation of [StyleDrop](https://arxiv.org/abs/2306.00983)
Language: Python · 2 0 0
chenxwh/T2I-Adapter
T2I-Adapter
Language: Python · 2 0 02
chenxwh/cog-ledits
Language: Python · 1 1 0
chenxwh/Depth-Anything
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Language: Python · 1 0 01
chenxwh/TokenFlow
Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)
Language: Python · 1 0 02
chenxwh/Wuerstchen
Official implementation of Würstchen: Efficient Pretraining of Text-to-Image Models
Language: Jupyter Notebook · 1 0 01
chenxwh/FastSAM
Fast Segment Anything
Language: Python · 0 01
chenxwh/LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Language: Python · 0 0
chenxwh/ResShift
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (PyTorch)
Language: Python · 0 0
chenxwh/webie
Dataset for web-scaled information extraction.
Language: Python · 0 0