Diplomacy-AI

We're building an AI to play the board game Diplomacy!

tl;dr

Diplomacy embeds simple board play in a framework of rich, unstructured negotiations; using reinforcement learning to play the 'gunboat' variant (without negotiations) should be relatively easy, although it still involves coalitions; the natural language processing needed for full Diplomacy will be a real challenge.

Basics

For more details about Diplomacy and bots playing it, see these slides, based on this lightning talk at Bornhack 2018.

We intend to develop this project via the Diplomacy AI Meetup. Please join us, either in person or on Slack.

Goals

near term: an AI that systematically out-performs state-of-the-art bots (e.g. Albert, D-Brane, KestasBot) in Gunboat Diplomacy (without negotiation).

articles in internationally recognised AI journals describing members' work. (The ANAC Diplomacy League was part of IJCAI, the International Joint Conference on Artificial Intelligence.)
beat human players in online Diplomacy fora (e.g. webDiplomacy). Initially, Gunboat Diplomacy, then full Diplomacy online and, finally, full Diplomacy face-to-face (e.g. at WorldDipCon).

Key resources

de Jonge, Baarslag, Aydoğan, Jonker, Fujita and Ito (2018), "The Challenge of Negotiation in the Game of Diplomacy", The 6th International Conference on Agreement Technologies 2018, Bergen, Norway.

The best introduction to negotiating agents in Diplomacy. This article contains a brief overview of Diplomacy, the BANDANA environment, the leading entries in the 2017 and 2018 ANAC Diplomacy Challenges, and the challenges facing negotiating agents. None of the entries so far seems to engage in learning, much less reinforcement learning.

none of the negotiation algorithms ... have been able to significantly improve the performance over a non-negotiating baseline agent

Now that modern Chess and Go computers are already far superior to any human player [15], we expect that Diplomacy will start to draw more attention as the next big challenge for computer science.

we are not expecting the Diplomacy Challenge to have a winner any time soon. We regard it as a long term challenge which might take several years to tackle.

The BANDANA framework

A Java framework for developing automated agents to play Diplomacy. It calls the Parlance game server. BANDANA extends the DipGame framework; neither supports convoy orders.

Downloads include:

March 2018 manual: download this first and follow the installation instructions for the BANDANA Java framework and Parlance game server;
the accompanying Java framework;
2018 ANAC agents.

The scoring system (12 points for a solo victory, etc.) is drawn from PlayDiplomacy's.

The Rules of Diplomacy (Avalon Hill)

2008, 5th edition. Here is the 2000 4th edition. See Kruijswijk's DATC for a list of previous editions.

Paquette, Lu, Bocco, Smith, Ortiz-Gagné, Kummerfeld, Singh, Pineau and Courville (2019), "No Press Diplomacy: Modeling Multi-Agent Gameplay", NeurIPS

Introduces DipNet, a gunboat bot that is first trained (supervised learning) on a dataset of about 150,000 human Diplomacy games (of which c. 33,000 are gunboat), primarily from webDiplomacy; this initialises self-play reinforcement learning. Its reward function includes both local terms (as supply centres are won/lost) and a terminal one (34 points for a solo victory, etc.). As inputs, the model takes the "current board state and previous phase orders".
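As a rough illustration of how a reward of this shape combines local and terminal terms, here is a minimal sketch. Only the 34 points for a solo victory comes from the description above. The weight of 1 per supply centre, the 18-centre solo threshold (standard map), the function names, and the choice to score every non-solo ending as 0 are assumptions made for illustration; the paper's actual handling of non-solo outcomes (the "etc.") is not reproduced here.

```python
from typing import Sequence

SOLO_REWARD = 34.0    # terminal reward for a solo victory, as summarised above
SOLO_THRESHOLD = 18   # assumption: standard-map solo victory at 18 supply centres


def local_reward(centres_before: int, centres_after: int, weight: float = 1.0) -> float:
    """Local term: positive when centres are won, negative when they are lost.

    The weight of 1.0 per centre is a hypothetical choice.
    """
    return weight * (centres_after - centres_before)


def terminal_reward(final_centres: int) -> float:
    """Terminal term: 34 for a solo; 0.0 otherwise is only a placeholder."""
    return SOLO_REWARD if final_centres >= SOLO_THRESHOLD else 0.0


def episode_return(centre_counts: Sequence[int]) -> float:
    """Sum the local terms over consecutive phases, then add the terminal term."""
    local = sum(
        local_reward(before, after)
        for before, after in zip(centre_counts, centre_counts[1:])
    )
    return local + terminal_reward(centre_counts[-1])


# A power that grows from 3 to 18 centres earns 15 local points plus the solo bonus.
print(episode_return([3, 4, 6, 18]))
```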

DipNet outperforms a number of existing bots, including its benchmark, Jason van Hal's Albert.

The Python 3 DipNet code is available here. The code also includes a game server, implemented as an OpenAI Gym environment. DipNet bots are available on webdiplomacy.net, allowing a human to play against six bots (of which there are two variants).

the supervised agent was able to learn to coordinate support orders while this behaviour appears to deteriorate during self-play training.

A discussion of this is here, where Squigs44 explains the deterioration during self-play: "when it learned from webdip, it was able to more effectively support others. When it learned from itself, it was less effective at supporting others, and was more effective at winning. The goal of the bot isn't to support, it is to win. Since the bot is in a gunboat setting, it makes sense that supporting other countries wasn't as rewarding."

Episode 52 of the Diplomacy Games podcast series, Rise of the Bots, discusses these bots. The 'Jane' bot is drawn from Ender's Game. The ethical question of whether to add messaging (thus teaching bots to lie to humans) is raised, and countered by the observation that bots already bluff in poker. It's noted (without explanation) that the bots' performance declines as games progress.

Anthony, Eccles, Tacchetti, Kramár, Gemp, Hudson, Porcel, Lanctot, Pérolat, Everett, Singh, Graepel, Bachrach, "Learning to Play No-Press Diplomacy with Best Response Policy Iteration", DeepMind

Builds on Paquette et al. (2019) by introducing sampled best responses (SBRs) for policy iteration. Along with some changes to the DipNet neural architecture, the result outperforms existing benchmarks, including DipNet. The agents will be made open source once the paper is accepted for publication.
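The idea behind a sampled best response can be sketched in a few lines: rather than searching the whole action space, sample a handful of candidate actions, sample some opponent actions, and keep the candidate with the best average value. The toy rock-paper-scissors game, the function names and the sample sizes below are all assumptions for illustration; this is not the paper's implementation, which relies on learned policy and value networks.

```python
import random
from statistics import mean

ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def value(action, opponent_action):
    """Toy stand-in for a value estimate: +1 win, 0 tie, -1 loss."""
    if action == opponent_action:
        return 0.0
    return 1.0 if BEATS[action] == opponent_action else -1.0


def sampled_best_response(sample_candidate, sample_opponent,
                          n_candidates=6, n_profiles=16, rng=None):
    """Pick the sampled candidate with the highest mean value.

    sample_candidate and sample_opponent are hypothetical callables that
    take a random.Random and return one action each.
    """
    rng = rng or random.Random(0)
    # Sample what the opponent might do.
    profiles = [sample_opponent(rng) for _ in range(n_profiles)]
    # Sample a small set of distinct candidate actions for ourselves.
    candidates = sorted({sample_candidate(rng) for _ in range(n_candidates)})
    # Score each candidate against the sampled opponent actions.
    return max(candidates, key=lambda a: mean(value(a, p) for p in profiles))


# Against an opponent that always plays rock, with uniform candidate sampling.
choice = sampled_best_response(lambda rng: rng.choice(ACTIONS), lambda rng: "rock")
print(choice)
```

Because only sampled candidates are scored, the result is a best response among the samples, not necessarily over all actions; that trade-off is what makes the approach usable when the action space is too large to enumerate.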

de Jonge and Sierra (2017), "D-Brane: a diplomacy playing agent for automated negotiations research", Applied Intelligence

Detailed description of the D-Brane (Diplomacy BRAnch & bound NEgotiator) modules, which seek to myopically maximise the Supply Centres gained in the current round using And/Or tree search with Branch & Bound. Thus, while tactically strong, it is not built for longer-term planning.

Other resources

Parlance

A Python 2 framework for playing the Diplomacy board game over a network. BANDANA calls it. Successor to DAIDE.

Parang

A set of bots to play Diplomacy over Parlance, from blabberbot ("A simple bot that sends constant streams of random press") to neurotic ("A neural-network bot, which unfortunately has no memory yet") and peacebot ("A simple bot that invites each player to be peaceful").

Kemmerling, Ackermann, Preuss (2011), "Nested look-ahead evolutionary algorithm based planning for a believable diplomacy bot", European Conference on the Applications of Evolutionary Computation

Provides a history of Diplomacy bots, including Stragotiator (reportedly able to pass the Turing Test in some short games). Estimates that the number of possible placements for 34 units is 4.09 x 10^27, with each capable of executing about 7.24 moves on average (without convoying). Reports on results of play by a version of Stragotiator with enhanced planning against raw Stragotiator and Albert.
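To get a feel for these magnitudes, the sketch below multiplies out the per-unit figure. It treats every one of the 34 units as having the average number of options, so it is only an order-of-magnitude estimate; the variable names are ours.

```python
import math

UNITS = 34              # units on a full board, as quoted above
MOVES_PER_UNIT = 7.24   # average moves per unit (without convoying), as quoted above

# Rough count of joint order sets for one position, assuming each unit
# has the average number of available moves.
joint_orders = MOVES_PER_UNIT ** UNITS

print(f"joint order sets per position: roughly 10^{math.log10(joint_orders):.0f}")
```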

Shapiro, Fuchs and Levinson (2002), "Learning a Game Strategy Using Pattern-Weights and Self-Play", Third International Conference on Computers and Games

Trained a temporal-difference (TD) reinforcement learning system on an existing knowledge base. In contrast to the ANAC project, this system plays 'gunboat' Diplomacy, moving without negotiations. §2 describes the game graph and action space.

Stormont and Allan (2012), "A comparison of Diplomacy gameboard graph search algorithms", Fourth International Conference on Agents and Artificial Intelligence (ICAART)

"This paper addresses one element of creating a planner for a Diplomacy agent: an efficient search algorithm for determining the shortest path to achieving victory in the game".
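For readers new to graph search on the board, here is a minimal sketch of the simplest case: breadth-first search for the fewest army moves between two provinces. The adjacency table is a small illustrative fragment of the standard map, not the full board, and breadth-first search is used only as a familiar stand-in; it is not claimed to be one of the algorithms the paper compares.

```python
from collections import deque

# Illustrative fragment of the standard map (army adjacencies only).
ADJACENT = {
    "PAR": ["BUR", "PIC"],
    "PIC": ["PAR", "BUR", "BEL"],
    "BUR": ["PAR", "PIC", "BEL", "RUH", "MUN"],
    "BEL": ["PIC", "BUR", "RUH"],
    "RUH": ["BEL", "BUR", "MUN"],
    "MUN": ["BUR", "RUH", "BER"],
    "BER": ["MUN"],
}


def shortest_path(start, goal, graph=ADJACENT):
    """Return a list of provinces with the fewest moves from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal not reachable within this fragment


print(shortest_path("PAR", "BER"))
```

A real planner would also have to account for fleets, coasts, convoys and opposing units, none of which this fragment models.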

World Diplomacy Database

Includes:

tournaments;
player rankings under a number of systems;
different scoring systems.

Diplomacy Archive

An archive of 'zines and articles; FAQ last updated in 2002.

online Diplomacy games

Some online fora for playing Diplomacy are:

webDiplomacy

A GitHub project, which - in 2019 - became the first online Diplomacy game to host bots (see Paquette et al. 2019, above).

In Sept/Oct 2019, webDiplomacy ran its first human v bot Terminator Tournament. The overall rate of solo victories by the c. 70 human players was about 30%; of the c. 20 top human players, the solo rate was still less than 50%. Discussions on webDiplomacy and in the Rise of the Bots podcast indicate that the bots are unexpectedly strong.

webDiplomacy's Discord channel is here.

PlayDiplomacy

Top 25 players are displayed here. Like Elo ratings, PlayDiplomacy rankings reflect the strength of the opposition that players have overcome. The exact formula is deliberately secret to prevent "play[ing] the system rather than the game", but seems to assign 12 points for a solo victory, and decreasing weight to older games (a system called 'fading echoes').

A basic order adjudication tool is available here. Their online discussion forum is here.

Backstabbr

Backstabbr's order adjudication claims to be compliant with DATC. It is supported by a Discord server.

The DPjudge

A deceptively unimpressive website disguising serious games and content, including The Diplomatic Pouch 'zine, dating back to 1995. Their order adjudication algorithm is described here. They provided messages for analysis by Niculae et al. (2015).

books

Sharp (1978), "The game of Diplomacy"

Written with dated verve, this book colourfully introduces Diplomacy theory, including opening and endgame theory informed by basic statistics. Available online here.

consider one of the most important facts about the game of Diplomacy: each player is outnumbered six to one

I had pulled off a particularly vicious stab (not even a very good one, as it turned out) on an ally who had served me faithfully for five game years. I waited, cringing, for the next development; and there in the next post was a letter with the familiar postmark. I opened it apprehensively. 'Dear Richard,' it began. 'Ouch! That hurt. It looks as if I shall be playing a rather minor role in our partnership from now on. ...' And it went on to discuss some tactical possibilities for the coming season. Now, this letter had two effects: first, it made me feel like a louse; second, it induced me to let him off the hook, because I knew that in the last resort I would rather get a letter from him than from any of the other potential allies in that area. That is the way to treat a stab.

I once induced England to play it as part of an elaborate long-term bargain: England was never to occupy the North Sea, in exchange for which Germany undertook to build no fleets at all.

Kostick (2015), "The Art of Correspondence in the Game of Diplomacy"

A book about negotiation in (human) Diplomacy by Conor Kostick, historian, novelist, game designer and international Diplomacy champion.

online communities

a subreddit;
a Discord server;
the rec.games.diplomacy newsgroup (or via Google Groups).

Kruijswijk's Diplomacy Adjudicator Test Cases (DATC)

Detailed discussion of algorithmic order adjudication based primarily on the 2000 Diplomacy rules. Last updated in 2009.

Diplomacy AI Development Environment (DAIDE)

A UK-based Diplomacy AI community project begun in 2002, which is now moribund. There seems to have been no significant new material added to the website since 2013. A brief history is here.

Its message syntax is the most sophisticated developed for Diplomacy AIs. The March 2010 DAIDE Message Syntax document is here.

NLP

Niculae, Kumar, Boyd-Graber, Danescu-Niculescu-Mizil (2015), "Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game", Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 1650–1659, Beijing, China

Collected 145,000 dyadic messages from 249 games (JSON format) played on The DPjudge and another online forum. The messages were then analysed using tools such as Stanford's Sentiment Analyzer and politeness classifiers.

half of the games have over 515 messages exchanged between the players, while the top quartile has over 750 messages per game. Also, non-trivial messages (with at least one sentence) tend to be complex: over half of them have at least five sentences, and the top quartile consists of messages with eight or more sentences.

The resulting model achieves a cross-validation accuracy of 57% and a Matthews correlation coefficient of 0.14, significantly above chance ... This indicates that, unlike the actual players, the classifier is able to exploit subtle linguistic signals that surface in the conversation.
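For readers unfamiliar with the Matthews correlation coefficient, the sketch below computes it from a binary confusion matrix. The counts in the example are hypothetical, chosen only to show that a balanced classifier with 57% accuracy has an MCC of 0.14; they are not the paper's confusion matrix.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical balanced test set: 100 positives and 100 negatives, 57% of each correct.
print(matthews_corrcoef(tp=57, tn=57, fp=43, fn=43))
```

The gap between the two figures is a reminder that 57% accuracy, although above chance, corresponds to only a weak correlation between predictions and outcomes.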

The authors' page here summarises their findings and presents their anonymised data, along with video and slides from their ACL 2015 talk. Also mentioned in a blog post by security expert Bruce Schneier.

Master's theses

Cruz and Lopes Cardoso (2019), "Deep Reinforcement Learning in Strategic Multi-Agent Games: the case of No-Press Diplomacy", Universidade do Porto

Introduces DeepDip, an OpenAI Gym environment based on Parlance and BANDANA to play no-press Diplomacy on three map variants: the original, a three-player version, and a two-player version.

Fernandes de Mascarenhas (2017), "AI player for board game Diplomacy", Técnico Lisboa

Surveys DAIDE, including the bots developed for it, and presents Tagus, its own (hard-coded) bot.

Webb, Chin, Wilkins, Payce, Dedoyard (2008), "Automated Negotiation in the Game of Diplomacy", Imperial College

Project report by MEng students, supervised by Iain Phillips. On AI/ML:

Machine Learning: Currently most bots learn very little during the course of play. If a bot can be designed that interprets and learns from the actions of its opponents a better bot may be created.

Huff, Chan, Tondelier, Bundred, Egan (2005), "Automated Negotiation in the Game of Diplomacy", Imperial College

Project report by MEng students, again supervised by Iain Phillips. §2 reviews existing Diplomacy bots. No AI/ML used.

Ritchie (2003), "Diplomacy — A.I.", University of Glasgow

MSc thesis supervised by Ron Poet. Chapter 3 discusses theories of Diplomacy play.

podcasts!?

Yup, a This American Life podcast episode, Absolutely Stabulous, emphasising the importance of the strategies beyond the board:

"And the way he said it to me made me feel like it wasn't a manipulation." (Ambassador Dennis Ross)

"I want to tell you how much ... I was impressed by this" (Ambassador Dennis Ross)

The podcast covers some of the material in David Hill's "The Board Game of the Alpha Nerds":

If you’ve ever heard of Diplomacy, chances are you know it as “the game that ruins friendships.”

Edi took out scissors and a pen from a drawer and set to work on the letter, painstakingly doctoring it. When he was finished he held up the letter proudly and read it to himself. I am against a three-way draw and I will take three more supply centers … He put the forgery in the mail to the third player. Then he waited.

“What I love about the game is that no one can really play the game ‘right,’” Birsan said. “Otherwise, after 50 years, that would have been discovered long ago and the hobby would have died of boredom.”

Would you believe, a whole Diplomacy Games podcast series?

Diplicity

Android app implementing a non-copyrighted variant of Diplomacy; the "spiritual successor" to Droidippy.
