liangxiaowei00's Stars
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
RLHF-V/RLAIF-V
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
BAAI-DCAI/SpatialBot
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models."
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
andimarafioti/florence2-finetuning
A quick exploration into fine-tuning Florence-2.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
openvla/openvla
OpenVLA: An open-source vision-language-action model for robotic manipulation.
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
OpenGVLab/Ask-Anything
[CVPR 2024 Highlight] [VideoChatGPT] ChatGPT with video understanding, plus support for many more LMs such as MiniGPT-4, StableLM, and MOSS.
magic-research/PLLaVA
Official repository for the paper PLLaVA
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V-Level MLLM for Single-Image, Multi-Image, and Video Understanding on Your Phone
google-deepmind/open_x_embodiment
octo-models/octo
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o.
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
meta-llama/llama3
The official Meta Llama 3 GitHub site
lpiccinelli-eth/UniDepth
Universal Monocular Metric Depth Estimation
NHirose/SACSoN
Scalable Autonomous Control for Social Navigation
robodhruv/visualnav-transformer
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
WooooDyy/LLM-Agent-Paper-List
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Paitesanshi/LLM-Agent-Survey
ros2/examples
Example packages for ROS 2
MarkFzp/act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Genesis-Embodied-AI/Genesis
A generative world for general-purpose robotics & embodied AI learning.
meta-llama/llama
Inference code for Llama models
Genesis-Embodied-AI/RoboGen
A generative and self-guided robotic agent that endlessly proposes and masters new skills.