Cherryjingyao

Cherryjingyao's Stars

microsoft/autogen
A programming framework for agentic AI 🤖
Language: Python 35.3k 412 2k5.1k
chenfei-wu/TaskMatrix
Language: Python 34.6k 301 3553.3k
microsoft/JARVIS
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
Language: Python 23.8k 383 1812k
mem0ai/mem0
The Memory layer for your AI apps
Language: Python 23.1k 130 6872.1k
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python 12.8k 106 596894
NeoVertex1/SuperPrompt
SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.
5.5k 74 19517
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Language: Jupyter Notebook 4.9k 34 199648
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Language: Python 4.3k 31 378392
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
Language: Python 3.8k 48 176286
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Language: Python 3.4k 29 412213
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Language: Python 3.1k 37 234253
modelscope/modelscope-agent
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
Language: Python 2.8k 39 205316
lucidrains/flamingo-pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch
Language: Python 1.2k 21 1359
alipay/agentUniverse
agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.
Language: Python 910 14 34115
robodhruv/visualnav-transformer
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
Language: Python 654 34 5179
MarkFzp/humanplus
[CoRL 2024] HumanPlus: Humanoid Shadowing and Imitation from Humans
Language: Python 592 16 096
Vision-CAIR/MiniGPT4-video
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
Language: Python 561 12 4261
liangwq/Chatglm_lora_multi-gpu
ChatGLM on multiple GPUs with DeepSpeed and
Language: Python 404 5 5061
alfworld/alfworld
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
Language: Python 372 8 8457
jun0wanan/awesome-large-multimodal-agents
358 5 221
CraftJarvis/JARVIS-1
JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models
Language: Java 342 42 917
anchen1011/FireAct
FireAct: Toward Language Agent Fine-tuning
Language: Python 255 2 519
web-arena-x/visualwebarena
VisualWebArena is a benchmark for multimodal agents.
Language: Python 249 4 5048
YangXuanyi/Multi-Agent-GPT
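Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, image, and audio, and supports local deployment and building private databases.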
Language: Python 221 2 35
remyxai/VQASynth
Compose multimodal datasets 🎹
Language: Python 220 5 1013
flowersteam/lamorel
Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).
Language: Python 199 4 2418
Ag2S1/Sibyl-System
Language: Python 103 6 710
sail-sg/Agent-Smith
[ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Language: Python 90 7 413
real-stanford/reflect
[CoRL 2023] REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction
Language: Jupyter Notebook 77 1 66
omron-sinicx/ViLaIn
An official implementation of Vision-Language Interpreter (ViLaIn)
Language: SAS 25 7 22