junkunyuan
A PhD student from Zhejiang University, working on artificial intelligence.
Zhejiang University
Shenzhen, China
junkunyuan's Stars
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
CASIA-IVA-Lab/FastSAM
Fast Segment Anything
google-research/text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
sczhou/ProPainter
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
ZHKKKe/MODNet
A Trimap-Free Portrait Matting Solution in Real Time [AAAI 2022]
Tencent/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
z-x-yang/Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
TMElyralab/MusePose
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
traveller59/spconv
Spatial Sparse Convolution Library
NUS-HPC-AI-Lab/OpenDiT
OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference
Picsart-AI-Research/StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
ShareGPT4Omni/ShareGPT4Video
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
baofff/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
wenquanlu/HandRefiner
Boese0601/MagicDance
[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
maitrix-org/Pandora
Pandora: Towards General World Model with Natural Language Actions and Video States
cxgincsu/SemanticGuidedHumanMatting
Robust Human Matting via Semantic Guidance, ACCV 2022.
Karine-Huang/T2I-CompBench
[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
zhengkw18/face-vid2vid
Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis
hustvl/DiG
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
nowsyn/InstMatt
Official repository for Instance Human Matting via Mutual Guidance and Multi-Instance Refinement
ViTAE-Transformer/P3M-Net
The official repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving"
zengbohan0217/FADM
nowsyn/SparseMat