sankin97's Stars
jiawei-ren/dreamgaussian4d
[arXiv 2023] DreamGaussian4D: Generative 4D Gaussian Splatting
JeffWang987/WorldDreamer
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
zhanghm1995/Forge_VFM4AD
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.
robodrive-24/toolkit
Official Toolkit for The RoboDrive Challenge
ActiveVisionLab/Awesome-LLM-3D
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
3DTopia/GPTEval3D
[CVPR 2024] Implementation for "GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation"
skyhehe123/ScatterFormer
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
OpenDriveLab/ViDAR
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
zju3dv/street_gaussians
Code for "Street Gaussians for Modeling Dynamic Urban Scenes"
wayveai/LingoQA
Official GitHub repository for the paper "LingoQA: Video Question Answering for Autonomous Driving"
yifanlu0227/ChatSim
[CVPR 2024 Highlight] Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration
BiDiff/bidiff
[CVPR'24] Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors
Open3DA/LL3DA
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
OpenGVLab/DriveMLM
opendilab/LMDrive
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
fudan-zvg/Reason2Drive
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
chaytonmin/UniScene
Official implementation of our RAL'24 paper: Multi-Camera Unified Pre-training for Autonomous Driving
fudan-zvg/WoVoGen
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
NVlabs/BEV-Planner
vlm-driver/Dolphins
Doubiiu/DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
shalfun/DrivingDiffusion
Layout-Guided multi-view driving scene video generation with latent diffusion model
huang-yh/SelfOcc
[CVPR 2024] SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
thuml/ContextWM
Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499
wzzheng/OccWorld
3D World Model for Autonomous Driving
USC-GVL/Agent-Driver
A Language Agent for Autonomous Driving
BraveGroup/Drive-WM
[CVPR 2024] A world model for autonomous driving.
ZiqinZhou66/ZegCLIP
Official implementation of CVPR 2023 ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
PJLab-ADG/GPT4V-AD-Exploration
On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent
AlmoonYsl/QTNet
[NeurIPS 2023] Query-based Temporal Fusion with Explicit Motion for 3D Object Detection