Awesome-Minecraft-Agents

Our Minecraft Agent

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

[🍎 Project Page] [📖 arXiv Paper] [🌟 GitHub]

We propose a Hybrid Multimodal Memory module that integrates structured knowledge and multimodal experiences into the agent's memory mechanism. Building on it, we introduce Optimus-1, a Minecraft agent that achieves a 30% improvement over existing agents on 67 long-horizon tasks. ✨

Table of Contents

- Awesome Policy
  - Vision-driven Policy
  - Goal-conditioned Policy
- Awesome Agent
  - Policy-based Agent
  - Code-based Agent

Awesome Policy

Vision-driven Policy

Title	Venue	Year	Code	Demo
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos	NeurIPS	2022	Github	-
GROOT: Learning to Follow Instructions by Watching Gameplay Videos	ICLR	2024	Github	Demo

Goal-conditioned Policy

Title	Venue	Year	Code	Demo
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge	NeurIPS	2022	Github	Demo
Mastering Diverse Domains through World Models	arXiv	2023	Github	Demo
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft	NeurIPS	2023	Github	Demo
Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction	CVPR	2023	Github	-
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control	NeurIPS Workshop	2024	Github	Demo
Pre-Training Goal-Based Models for Sample-Efficient Reinforcement Learning	ICLR	2024	Github	-
Vision-Language Models Provide Promptable Representations for Reinforcement Learning	arXiv	2024	Github	-
Reinforcement Learning Friendly Vision-Language Model for Minecraft	ECCV	2024	Github	-
ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting	arXiv	2024	Github	Demo

Awesome Agent

Policy-based Agent

Title	Venue	Year	Code	Demo
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents	NeurIPS	2023	Github	Demo
Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks	NeurIPS Workshop	2023	Github	Demo
Learning from Visual Observation via Offline Pretrained State-to-Go Transformer	NeurIPS	2023	Github	Demo
LLaMA Rider: Spurring Large Language Models to Explore the Open World	NAACL Findings	2024	-	-
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models	NeurIPS Workshop	2023	Github	Demo
Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds	ICLR	2024	Github	Demo
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks	NeurIPS	2024	Github	Demo
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception	CVPR	2024	Github	Demo
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents	NeurIPS	2024	Github	Demo

Code-based Agent

Title	Venue	Year	Code	Demo
Voyager: An Open-Ended Embodied Agent with Large Language Models	NeurIPS	2023	Github	Demo
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory	arXiv	2023	Github	-
Creative Agents: Empowering Agents with Imagination for Creative Tasks	arXiv	2023	Github	Demo
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft	CVPR	2024	-	Demo
Odyssey: Empowering Minecraft Agents with Open-World Skills	arXiv	2024	Github	-
See and Think: Embodied Agent in Virtual Environment	ECCV	2024	Github	Demo
RL-GPT: Integrating Reinforcement Learning and Code-as-policy	NeurIPS	2024	-	Demo
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence	arXiv	2024	-	Demo
Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification	arXiv	2024	-	-
