Pinned Repositories
3D-Box-Segment-Anything
We extend Segment Anything to 3D perception by combining it with VoxelNeXt.
3D-LLM
Preliminary Code for 3D-LLM: Injecting the 3D World into Large Language Models
3D-Occupancy-Perception
A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective
3DChatbot
SKT AI FLY Challengers 3rd cohort project, Passion Team 1
AnimatedDrawings
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
DB-GPT-Hub
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
MedicalGPT-zh
MedicalGPT-zh: a Chinese medical dialogue language model based on ChatGLM, fine-tuned on a high-quality instruction dataset
ml-ferret
scGPT
Video-ChatGPT
Video-ChatGPT is a large vision-language model with a dedicated video-encoder and large language model (LLM), enabling video understanding and conversation about videos.
2132660698's Repositories
2132660698/3D-Occupancy-Perception
A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective
2132660698/3DRealCar_Dataset
2132660698/Awesome-Autonomous-Driving
2132660698/Awesome-Multi-Camera-3D-Occupancy-Prediction
Awesome papers and code about Multi-Camera 3D Occupancy Prediction, such as TPVFormer, SurroundOcc, PanoOcc, OccFormer, FB-OCC, SelfOcc, COTR, SparseOcc. In this repository, you will see the latest 3D occupancy prediction papers and code.
2132660698/Bench2Drive
Closed-loop multi-ability evaluation of end-to-end autonomous driving algorithms
2132660698/Bench2DriveZoo
BEVFormer, UniAD, VAD in CARLA under Closed-Loop Evaluation
2132660698/BEV-Perception
Bird's Eye View Perception
2132660698/carla_garage
[ICCV'23] Hidden Biases of End-to-End Driving Models
2132660698/CVT-Occ
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
2132660698/Dolphins111
[ECCV 2024] The official code for "Dolphins: Multimodal Language Model for Driving"
2132660698/DriveLM
[ECCV 2024] DriveLM: Driving with Graph Visual Question Answering
2132660698/DriveVLM
2132660698/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
2132660698/GenAD
[ECCV 2024] GenAD: Generative End-to-End Autonomous Driving
2132660698/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
2132660698/Hunyuan3D-2
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
2132660698/Hydra-MDP
2132660698/LMDrive
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
2132660698/MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
2132660698/OccNet-Course
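The first full-stack course on occupancy grid networks in China, "From BEV to Occupancy Network: Algorithm Principles and Engineering Practice", including on-device deployment. Surrounding Semantic Occupancy Perception Course for Autonomous Driving (docs, ppt and source code). Online course homepage: http://111.229.117.200:8100/ (built independently by the author)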
2132660698/OmniCorpus
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
2132660698/OmniDrive
2132660698/open-r1
Fully open reproduction of DeepSeek-R1
2132660698/SensorsCalibration
OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving
2132660698/SparseDrive
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
2132660698/SparseOcc
Official implementation for 'SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction' (CVPR 2024)
2132660698/Talk2Drive
2132660698/VAD
[ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
2132660698/ViewFormer-Occ
[ECCV 2024] ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers
2132660698/Vista
A Generalizable World Model for Autonomous Driving