# Awesome-Minecraft-Agents

## Our Minecraft Agent

### Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

[🍎 Project Page] [📖 arXiv Paper] [🌟 GitHub]

We propose a Hybrid Multimodal Memory module that integrates structured knowledge and multimodal experiences into the agent's memory mechanism. Building on it, we introduce Optimus-1, a powerful Minecraft agent that achieves a 30% improvement over existing agents on 67 long-horizon tasks. ✨
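Optimus-1's actual design is detailed in the paper; as a toy illustration of the idea only, the hypothetical sketch below pairs a structured knowledge store (crafting prerequisites, used for planning) with a pool of multimodal experience records (used for recall). All names and the schema are assumptions, not the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Experience:
    """One multimodal episode summary (hypothetical schema)."""
    task: str
    observations: list[str]  # stand-in for image/frame references
    success: bool


@dataclass
class HybridMultimodalMemory:
    """Toy sketch: structured knowledge plus a pool of multimodal experiences."""
    # structured knowledge: item -> list of prerequisite items
    knowledge: dict[str, list[str]] = field(default_factory=dict)
    experiences: list[Experience] = field(default_factory=list)

    def add_knowledge(self, item: str, prerequisites: list[str]) -> None:
        self.knowledge[item] = prerequisites

    def plan(self, goal: str) -> list[str]:
        """Expand prerequisites depth-first into an ordered plan ending at the goal."""
        plan: list[str] = []

        def visit(item: str) -> None:
            for dep in self.knowledge.get(item, []):
                visit(dep)
            if item not in plan:
                plan.append(item)

        visit(goal)
        return plan

    def recall(self, task: str) -> list[Experience]:
        """Retrieve past successful experiences for a task (exact-match here)."""
        return [e for e in self.experiences if e.task == task and e.success]


memory = HybridMultimodalMemory()
memory.add_knowledge("wooden_pickaxe", ["planks", "stick"])
memory.add_knowledge("stick", ["planks"])
memory.add_knowledge("planks", ["log"])
print(memory.plan("wooden_pickaxe"))  # ['log', 'planks', 'stick', 'wooden_pickaxe']
```

In a real agent the structured half would be a knowledge graph queried by an LLM planner, and recall would use learned similarity over image and text embeddings rather than exact string matching.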


## Table of Contents

- [Our Minecraft Agent](#our-minecraft-agent)
- [Awesome Policy](#awesome-policy)
  - [Vision-driven Policy](#vision-driven-policy)
  - [Goal-conditioned Policy](#goal-conditioned-policy)
- [Awesome Agent](#awesome-agent)
  - [Policy-based Agent](#policy-based-agent)
  - [Code-based Agent](#code-based-agent)
## Awesome Policy

### Vision-driven Policy

| Title | Venue | Year | Code | Demo |
| --- | --- | --- | --- | --- |
| Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos | NeurIPS | 2022 | GitHub | - |
| GROOT: Learning to Follow Instructions by Watching Gameplay Videos | ICLR | 2024 | GitHub | Demo |

### Goal-conditioned Policy

| Title | Venue | Year | Code | Demo |
| --- | --- | --- | --- | --- |
| MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge | NeurIPS | 2022 | GitHub | Demo |
| Mastering Diverse Domains through World Models | arXiv | 2023 | GitHub | Demo |
| STEVE-1: A Generative Model for Text-to-Behavior in Minecraft | NeurIPS | 2023 | GitHub | Demo |
| Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction | CVPR | 2023 | GitHub | - |
| MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control | NeurIPS Workshop | 2024 | GitHub | Demo |
| Pre-Training Goal-Based Models for Sample-Efficient Reinforcement Learning | ICLR | 2024 | GitHub | - |
| Vision-Language Models Provide Promptable Representations for Reinforcement Learning | arXiv | 2024 | GitHub | - |
| Reinforcement Learning Friendly Vision-Language Model for Minecraft | ECCV | 2024 | GitHub | - |
| ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting | arXiv | 2024 | GitHub | Demo |

## Awesome Agent

### Policy-based Agent

| Title | Venue | Year | Code | Demo |
| --- | --- | --- | --- | --- |
| Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents | NeurIPS | 2023 | GitHub | Demo |
| Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks | NeurIPS Workshop | 2023 | GitHub | Demo |
| Learning from Visual Observation via Offline Pretrained State-to-Go Transformer | NeurIPS | 2023 | GitHub | Demo |
| LLaMA Rider: Spurring Large Language Models to Explore the Open World | NAACL Findings | 2024 | - | - |
| JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models | NeurIPS Workshop | 2023 | GitHub | Demo |
| Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds | ICLR | 2024 | GitHub | Demo |
| Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks | NeurIPS | 2024 | GitHub | Demo |
| MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception | CVPR | 2024 | GitHub | Demo |
| OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents | NeurIPS | 2024 | GitHub | Demo |

### Code-based Agent

| Title | Venue | Year | Code | Demo |
| --- | --- | --- | --- | --- |
| Voyager: An Open-Ended Embodied Agent with Large Language Models | NeurIPS | 2023 | GitHub | Demo |
| Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | arXiv | 2023 | GitHub | - |
| Creative Agents: Empowering Agents with Imagination for Creative Tasks | arXiv | 2023 | GitHub | Demo |
| Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft | CVPR | 2024 | - | Demo |
| Odyssey: Empowering Minecraft Agents with Open-World Skills | arXiv | 2024 | GitHub | - |
| See and Think: Embodied Agent in Virtual Environment | ECCV | 2024 | GitHub | Demo |
| RL-GPT: Integrating Reinforcement Learning and Code-as-policy | NeurIPS | 2024 | - | Demo |
| LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence | arXiv | 2024 | - | Demo |
| Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification | arXiv | 2024 | - | - |