yumingj's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
xai-org/grok-1
Grok open release
NanmiCoder/MediaCrawler
小红书笔记 | 评论爬虫、抖音视频 | 评论爬虫、快手视频 | 评论爬虫、B 站视频 | 评论爬虫、微博帖子 | 评论爬虫、百度贴吧帖子 | 百度贴吧评论回复爬虫 | 知乎问答文章|评论爬虫
danielgatis/rembg
Rembg is a tool to remove images background
PKU-YuanGroup/Open-Sora-Plan
This project aim to reproduce Sora (Open AI T2V model), we wish the open source community contribute to this project.
xuebinqin/U-2-Net
The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
LargeWorldModel/LWM
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
FoundationVision/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MooreThreads/Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
ai-forever/Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
emeryberger/CSrankings
A web app for ranking computer science departments according to their research output in selective venues, and for finding active faculty across a wide range of areas.
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
layerdiffusion/LayerDiffuse
Transparent Image Layer Diffusion using Latent Transparency
TencentARC/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Frameworkfor Features at Any Resolution" ICLR 2024
openai/Video-Pre-Training
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
chflame163/ComfyUI_LayerStyle
A set of nodes for ComfyUI that can composite layer and mask to achieve Photoshop like functionality.
Computer-Vision-in-the-Wild/CVinW_Readings
A collection of papers on the topic of ``Computer Vision in the Wild (CVinW)''
stylegan-human/StyleGAN-Human
StyleGAN-Human: A Data-Centric Odyssey of Human Generation
williamyang1991/FRESCO
[CVPR 2024] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
3DTopia/3DTopia
Text-to-3D Generation within 5 Minutes
microsoft/XPretrain
Multi-modality pre-training
Meituan-AutoML/VisionLLaMA
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
mtli/HTML4Vision
A simple HTML visualization tool for computer vision research :hammer_and_wrench:
NIRVANALAN/LN3Diff
[ECCV-2024] LN3Diff creates high-quality 3D object mesh from text within 8 V100-SECONDS.
wang-chen/thesis_template_ntu
Thesis Latex Template for Nanyang Technological University (NTU)
NannanLi999/UniHuman
Official code for CVPR 2024 paper UniHuman: A Unified Model For Editing Human Images in the Wild