ztianlin's Stars
robodhruv/visualnav-transformer
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
ai4ce/CityWalker
[CVPR 2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
leggedrobotics/viplanner
ViPlanner: Visual Semantic Imperative Learning for Local Navigation
Yzichen/FlashOCC
unitreerobotics/unitree_rl_gym
microsoft/MoGe
[CVPR'25] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
IGL-HKUST/DiffusionAsShader
[arXiv 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
ZGCTroy/RealCam-Vid
Open-source video dataset with dynamic scenes and camera-movement annotations
ZGCTroy/RealCam-I2V
ZGCTroy/CamI2V
Official repository for the paper "CamI2V: Camera-Controlled Image-to-Video Diffusion Model"
hehao13/CameraCtrl
OpenDriveLab/AgiBot-World
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
cure-lab/MagicDrive
[ICLR 2024] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
isaac-sim/IsaacLab
Unified framework for robot learning built on NVIDIA Isaac Sim
deepseek-ai/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
yuantianyuan01/StreamMapNet
erwold/qwen2vl-flux
NVIDIA/Cosmos
Cosmos is a world-model development platform, purpose-built for Physical AI, consisting of world foundation models, tokenizers, and a video-processing pipeline to accelerate development at robotics and AV labs. The repository lets end users run the Cosmos models and inference scripts to generate videos.
IDEA-Research/Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
LMD0311/DAPT
[CVPR 2024] Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
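A minimal inference sketch using the repo's predictor API; the checkpoint filename is the released ViT-H weight, while the image path and click point are illustrative assumptions:

import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

# Load the ViT-H SAM checkpoint (download it from the repo's model links first).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SAM expects an HxWx3 uint8 RGB array; "photo.jpg" is a placeholder path.
image = np.array(Image.open("photo.jpg").convert("RGB"))
predictor.set_image(image)

# Prompt with a single foreground click (label 1) at an example pixel.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
)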
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
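As a quick illustration of the library's pipeline API, a minimal text-to-image sketch; the model id is one public example checkpoint, and any diffusers-compatible checkpoint would work the same way:

import torch
from diffusers import StableDiffusionPipeline

# Load a pretrained pipeline from the Hugging Face Hub (example model id).
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# One call: prompt in, PIL image out.
image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")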
mit-han-lab/nunchaku
[ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
lllyasviel/ControlNet
Let us control diffusion models!
AutonomousVehicleLaboratory/SemVecNet
open-mmlab/mmdeploy
OpenMMLab Model Deployment Framework
PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
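A minimal zero-shot sketch of that usage, following the repo's published API; the image file and candidate captions are illustrative:

import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Encode one image and a few candidate captions ("cat.png" is a placeholder).
image = preprocess(Image.open("cat.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a photo of a cat", "a photo of a dog"]).to(device)

with torch.no_grad():
    # Similarity logits between the image and each caption.
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print(probs)  # highest probability = most relevant caption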
wudongming97/Prompt4Driving
[AAAI 2025] Language Prompt for Autonomous Driving