wangzheallen

Computer Vision Researcher, SF, US

wangzheallen's Stars

instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Language: Python | 11.3k 128 233823
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
Language: Jupyter Notebook | 10.1k 85 249826
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C++ | 8k 78 172417
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
Language: TypeScript | 7.9k 55 671k
HumanAIGC/EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
7.5k 335 266922
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Language: Python | 6.6k 44 83592
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Language: Python | 6.6k 62 140483
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Language: Jupyter Notebook | 6.4k 121 107428
OpenGVLab/DragGAN
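Unofficial Implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation with an online demo and local deployment for trial use; code and models are fully open source; supports Windows, macOS, Linux)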
Language: Python | 5k 66 113491
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
Language: Python | 4.9k 49 209510
MarkFzp/mobile-aloha
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Language: Jupyter Notebook | 3.9k 75 17681
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
Language: Python | 3.8k 48 176289
MooreThreads/Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
Language: Python | 3.3k 37 154253
MarkFzp/act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
Language: Python | 3.1k 47 57567
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
Language: Python | 2.7k 37 56259
isl-org/ZoeDepth
Metric depth estimation from a single image
Language: Jupyter Notebook | 2.4k 34 118221
siliconflow/onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
Language: Jupyter Notebook | 1.8k 39 463108
NVlabs/FoundationPose
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Language: Python | 1.6k 30 272224
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
Language: Python | 1.6k 40 99308
rese1f/StableVideo
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Language: Python | 1.4k 21 2489
chuanyangjin/fast-DiT
Fast Diffusion Models with Transformers
Language: Python | 763 6 1599
OpenRobotLab/PointLLM
[ECCV 2024 Best Paper Candidate] PointLLM: Empowering Large Language Models to Understand Point Clouds
Language: Python | 676 14 4333
exiawsh/StreamPETR
[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Language: Python | 605 12 24265
csuhan/OneLLM
[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language
Language: Python | 604 11 2733
tianweiy/DMD2
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
Language: Python | 580 6 5031
DerryHub/BEVFormer_tensorrt
BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Language: Python | 448 5 8171
OpenGVLab/PonderV2
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Language: Python | 328 19 256
Tsinghua-MARS-Lab/futr3d
Code for paper: FUTR3D: a unified sensor fusion framework for 3d detection
Language: Python | 285 16 6139
jiawei-ren/diffmimic
[ICLR 2023] DiffMimic: Efficient Motion Mimicking with Differentiable Physics https://arxiv.org/abs/2304.03274
Language: Python | 280 12 621
DiT-3D/DiT-3D
🔥🔥🔥Official Codebase of "DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation"
Language: Python | 237 12 2617
