coltontravers's Stars
ByteByteGoHq/system-design-101
Explains complex systems using visuals and simple terms. Helps you prepare for system design interviews.
microsoft/autogen
A programming framework for agentic AI 🤖
OpenBMB/ChatDev
Create Customized Software from a Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
AtsushiSakai/PythonRobotics
Python sample code for robotics algorithms.
NVlabs/instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
THUDM/CogVLM
A state-of-the-art open visual language model | multimodal pre-trained model
1rgs/jsonformer
A Bulletproof Way to Generate Structured JSON from Language Models
cyberbotics/webots
Webots Robot Simulator
Ly0n/awesome-robotic-tooling
Tooling for professional robotic development in C++ and Python with a touch of ROS, autonomous driving and aerospace.
Josh-XT/AGiXT
AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.
UX-Decoder/Semantic-SAM
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
spcl/graph-of-thoughts
Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
open-mmlab/Multimodal-GPT
Multimodal-GPT
roboflow/inference
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
allenai/ai2thor
An open-source platform for Visual AI.
showlab/Show-1
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
CesiumGS/cesium-unreal
Bringing the 3D geospatial ecosystem to Unreal Engine
Jumpat/SegmentAnythingin3D
Segment Anything in 3D with NeRFs (NeurIPS 2023)
SilenceOverflow/Awesome-SLAM
A curated list of SLAM resources
waymo-research/waymax
A JAX-based simulator for autonomous driving research.
showlab/MotionDirector
[ECCV 2024 Oral] MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
IDEA-Research/OpenSeeD
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
JonasSchult/Mask3D
Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.
YingqingHe/ScaleCrafter
[ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time.
facebookresearch/VLPart
[ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation
yzfly/Awesome-Multimodal-Prompts
Prompts for GPT-4V & DALL-E3 to fully utilize their multi-modal abilities. GPT4V Prompts, DALL-E3 Prompts.