Cherryjingyao's Stars
verlab/accelerated_features
Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
zd11024/NaviLLM
[CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation'
ZhengyiLuo/PHC
Official Implementation of the ICCV 2023 paper: Perpetual Humanoid Control for Real-time Simulated Avatars
ZhengyiLuo/PULSE
Official Implementation of the ICLR 2024 spotlight paper: Universal Humanoid Motion Representations for Physics-Based Control
LYX0501/InstructNav
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
robodhruv/visualnav-transformer
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
NeoVertex1/SuperPrompt
SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
CraftJarvis/JARVIS-1
JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models
sail-sg/Agent-Smith
[ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
jun0wanan/awesome-large-multimodal-agents
web-arena-x/visualwebarena
VisualWebArena is a benchmark for multimodal agents.
YangXuanyi/Multi-Agent-GPT
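Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, image, and audio, and supports local deployment and building private databases.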
liangwq/Chatglm_lora_multi-gpu
ChatGLM multi-GPU with deepspeed and
alfworld/alfworld
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
microsoft/JARVIS
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
chenfei-wu/TaskMatrix
anchen1011/FireAct
FireAct: Toward Language Agent Fine-tuning
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
modelscope/modelscope-agent
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
microsoft/autogen
A programming framework for agentic AI 🤖 (PyPi: autogen-agentchat)
antgroup/agentUniverse
agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.
Ag2S1/Sibyl-System
mem0ai/mem0
The Memory layer for your AI apps
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Vision-CAIR/MiniGPT4-video
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
MarkFzp/humanplus
[CoRL 2024] HumanPlus: Humanoid Shadowing and Imitation from Humans
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation