skskgrowl's Stars
NotACracker/COTR
[CVPR24] COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction
keithAND2020/awesome-Occupancy-research
Papers on occupancy prediction, including monocular and multi-view approaches, in autonomous driving scenarios
feizc/DiS
Scalable Diffusion Models with State Space Backbone
weiyithu/SurroundOcc
[ICCV 2023] SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
ldtho/DifFUSER
DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
duanyiqun/DiffusionDepth
[ECCV 2024] PyTorch implementation of DiffusionDepth, a diffusion denoising approach to monocular depth estimation
dome272/MaskGIT-pytorch
Pytorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)
Rorisis/Co-Occ
[IEEE RA-L] Co-Occ: Coupling Explicit Feature Fusion with Volume Rendering Regularization for Multi-Modal 3D Semantic Occupancy Prediction
Event-AHU/Mamba_State_Space_Model_Paper_List
[Mamba-Survey-2024] Paper list for State Space Models/Mamba and their applications
autodriving-heart/Awesome-Autonomous-Driving
awesome-autonomous-driving
BarqueroGerman/FlowMDM
[CVPR 2024] Official Implementation of "Seamless Human Motion Composition with Blended Positional Encodings".
OpenRobotLab/UniHSI
[ICLR 2024 Spotlight] Unified Human-Scene Interaction via Prompted Chain-of-Contacts
Szy-Young/ActFormer
🔥 ActFormer in PyTorch (ICCV 2023)
OpenMotionLab/MotionGPT
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
scenediffuser/Scene-Diffuser
Official implementation of CVPR23 paper "Diffusion-based Generation, Optimization, and Planning in 3D Scenes"
afford-motion/afford-motion
Official implementation of CVPR24 highlight paper "Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance"
UrbanArchitect/UrbanArchitect
The official repository of our paper: "Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior"
52CV/CVPR-2024-Papers
UMass-Foundation-Model/3D-VLA
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
embodied-generalist/embodied-generalist
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
allenai/Holodeck
[CVPR 2024] Language Guided Generation of 3D Embodied AI Environments.
hzxie/CityDreamer
The official implementation of "CityDreamer: Compositional Generative Model of Unbounded 3D Cities". (Xie et al., CVPR 2024)
3dlg-hcvc/M3DRef-CLIP
[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects
Open3DA/LL3DA
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
scene-verse/SceneVerse
Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding"
OpenRobotLab/EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
MzeroMiko/VMamba
VMamba: Visual State Space Models; code is based on Mamba
hustvl/Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
ATR-DBI/CityRefer
ZhanYang-nwpu/Mono3DVG
[AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images