otbzi's Stars
xai-org/grok-1
Grok open release
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/autogen
A programming framework for agentic AI 🤖
facebookresearch/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
open-mmlab/mmdetection
OpenMMLab Detection Toolbox and Benchmark
karpathy/llm.c
LLM training in simple, raw C/CUDA
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
amusi/CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
facebookresearch/detr
End-to-End Object Detection with Transformers
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
apple/ml-ferret
google-deepmind/mujoco
Multi-Joint dynamics with Contact. A general purpose physics simulator.
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
nilsherzig/LLocalSearch
LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.
open-mmlab/mmdetection3d
OpenMMLab's next-generation platform for general 3D object detection.
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
MarkFzp/mobile-aloha
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
argmaxinc/WhisperKit
On-device Speech Recognition for Apple Silicon
fundamentalvision/BEVFormer
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
fundamentalvision/Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
MarkFzp/act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
baaivision/Painter
Painter & SegGPT Series: Vision Foundation Models from BAAI
tonyzhaozh/aloha
wzzheng/TPVFormer
[CVPR 2023] An academic alternative to Tesla's occupancy network for autonomous driving.
UMass-Foundation-Model/3D-LLM
Code for 3D-LLM: Injecting the 3D World into Large Language Models
weiyithu/SurroundOcc
[ICCV 2023] SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
roboterax/humanoid-gym
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695
OpenRobotLab/PointLLM
[ECCV 2024 Best Paper Candidate] PointLLM: Empowering Large Language Models to Understand Point Clouds
JeffWang987/OpenOccupancy
[ICCV 2023] OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception