Learning to play Diplomacy is a big deal for several reasons. Not only does it involve multiple players who all move at the same time, but each turn is preceded by a round of negotiation in which players chat in pairs, trying to form alliances or team up against rivals. Only after these negotiations do players decide which pieces to move, and whether to honor or break the agreements they made.
At each point in the game, Cicero models how other players might act based on the state of the board and previous conversations with them. It then determines how players can work together for mutual benefit and creates messages designed to achieve those goals.
To create Cicero, Meta marries two different types of AI: a reinforcement learning model that decides what moves to make and a large language model that communicates with other players.
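The turn-by-turn loop this describes, predicting what another player will do, choosing a plan in response, and then generating a message that matches the plan, can be sketched in miniature. This is purely an illustrative toy, not Meta's actual code: the function names, the one-line "prediction" heuristic, and the canned message templates are all invented stand-ins for the real strategic and language models.

```python
# Toy sketch of a plan-then-message loop like the one described above.
# All names and logic are hypothetical stand-ins, not Cicero's real components.

def predict_ally_move(board_state, dialogue_history):
    """Stand-in for the strategic model: guess what an ally will do,
    based (crudely) on whether the last message sounded cooperative."""
    return "support" if "agree" in dialogue_history[-1] else "hold"

def choose_plan(board_state, predicted_ally_move):
    """Stand-in for the planner: pick our move given the prediction."""
    return "attack" if predicted_ally_move == "support" else "defend"

def generate_message(plan):
    """Stand-in for the language model: put the intent into words.
    Real systems generate free text; here we use fixed templates."""
    templates = {
        "attack": "Let's move on Munich together this turn.",
        "defend": "I'll hold my position; let's regroup next turn.",
    }
    return templates[plan]

def take_turn(board_state, dialogue_history):
    """One turn: predict the other player, plan, then communicate."""
    ally_move = predict_ally_move(board_state, dialogue_history)
    plan = choose_plan(board_state, ally_move)
    return plan, generate_message(plan)

plan, msg = take_turn({"unit": "Vienna"}, ["I agree, let's ally."])
print(plan, "->", msg)
```

The point of the structure is the ordering: the message is generated *after* and *conditioned on* the chosen plan, so what the agent says is tied to what it actually intends to do.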
Cicero is not perfect. It sometimes sent messages that contained errors, contradicted its own plans, or made strategic mistakes. Even so, Meta claims that human players often chose to cooperate with it over other players.
And that’s significant because while games like chess or Go end with a winner and a loser, real-world problems usually don’t have such straightforward resolutions. Finding trade-offs and workarounds is often more valuable than winning. Meta claims that Cicero is a step toward AI that can help with a range of complex problems that require compromise, from route planning around busy traffic to contract negotiations.