vaesl's Stars
ChatGPTNextWeb/ChatGPT-Next-Web
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Have your own cross-platform ChatGPT/Gemini application with one click.
chenfei-wu/TaskMatrix
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
THUDM/ChatGLM3
ChatGLM3 series: open-source bilingual chat LLMs
openai/shap-e
Generate 3D objects conditioned on text or images
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
geekyutao/Inpaint-Anything
Inpaint anything using Segment Anything and inpainting models.
princeton-vl/infinigen
Infinite Photorealistic Worlds using Procedural Generation
QwenLM/Qwen-VL
The official repository of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as MiniGPT-4, StableLM, and MOSS.
z-x-yang/Segment-and-Track-Anything
An open-source project for tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
Farama-Foundation/HighwayEnv
A minimalist environment for decision-making in autonomous driving
ttengwang/Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating captions tailored to diverse user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
AGI-Edgerunners/LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
wilson1yan/VideoGPT
Thinklab-SJTU/Awesome-LLM4AD
A curated list of awesome LLM for Autonomous Driving resources (continually updated)
weiyithu/SurroundOcc
[ICCV 2023] SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
vimalabs/VIMA
Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"
PJLab-ADG/neuralsim
neuralsim: 3D surface reconstruction and simulation based on 3D neural rendering.
exiawsh/StreamPETR
[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
PJLab-ADG/DriveLikeAHuman
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models
danijar/daydreamer
DayDreamer: World Models for Physical Robot Learning
haoningwu3639/StoryGen
[CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models
wenyuqing/panacea
[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
wudongming97/TopoMLP
[ICLR2024] TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning
megvii-research/Far3D
[AAAI2024] Far3D: Expanding the Horizon for Surround-view 3D Object Detection
wudongming97/Prompt4Driving
wudongming97/OnlineRefer
[ICCV 2023] OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation
WayneMao/PillarNeSt
The Official Implementation of PillarNeSt
zyayoung/Awesome-Video-LLMs
VLM-Eval, a framework for evaluating Video Large Language Models.