ShaohuaDong2021's Stars
geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
UZ-SLAMLab/ORB_SLAM3
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
Lightning-AI/lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
open-mmlab/mmdetection3d
OpenMMLab's next-generation platform for general 3D object detection.
vikhyat/moondream
tiny vision language model
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
mit-biomimetics/Cheetah-Software
mit-han-lab/bevfusion
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
traveller59/second.pytorch
SECOND for KITTI/NuScenes object detection
DLYuanGod/TinyGPT-V
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥 Latest Papers, Codes and Datasets on Vid-LLMs.
CVPR2023-3D-Occupancy-Prediction/CVPR2023-3D-Occupancy-Prediction
CVPR2023-Occupancy-Prediction-Challenge
cwchenwang/awesome-3d-diffusion
A collection of papers on diffusion models for 3D generation.
allenai/unified-io-2
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
vasgaowei/BEV-Perception
Bird's Eye View Perception
llm-efficiency-challenge/neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day
zhanghm1995/Forge_VFM4AD
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.
wudongming97/RMOT
[CVPR2023] Referring Multi-Object Tracking
liangxuy/Inter-X
[CVPR 2024] Official implementation of the paper "Towards Versatile Human-Human Interaction Analysis"
jjwang/HanOS
Microkernel-based General Purpose Operating System #Hobby OS#
zyrant/SPGroup3D
[AAAI 2024] SPGroup3D: Superpoint Grouping Network for Indoor 3D Object Detection
YingLv1106/CAINet
This is a multimodal semantic segmentation method, named CAINet: Context-Aware Interaction Network for RGB-T Semantic Segmentation.