The Diplomacy AI Development Environment (DAIDE) was started in 2002 as an
attempt to centralize (or centralise, as David would spell it) efforts to
develop Artificial Intelligence (AI) robots ("bots") that can replace a
human Diplomacy player with a computer-programmed one. Since then, a very
active Yahoo Group (dipai) and numerous web sites have grown up to support
the project. David Norman, the designer and author of the DAIDE language,
server, and mapper, has written an introductory Diplomatic Pouch article on
the project, as well as comments in that and other forums. Today I'd like
to ask David a few probing questions about the past, present, and future of
the project, and see if I can excite more of you to think about
participating as an observer, commentator, or programmer.
Jim Burgess (JB):
Thanks, David, for being willing to talk about this.
David Norman (DN):
No problem.
Just to add to that description, I think it's worth
repeating the opening of the article you cite. The key aim of the DAIDE
project was not so much to centralise efforts, as to provide a framework
for development where Bots could easily compete against each other and
against humans. Before the DAIDE project, there had already been two
hobby projects to develop a Diplomacy AI - Danny Loeb's DPP, and then
Sean Lorber's SeaNail. Both of these had had a huge amount of
development effort put into them, but neither had that much use, as the
only way for them to play in a game was for a person to manage the
program, entering results from the game into the AI, and then submitting
the orders generated by the AI to the GM.
So, the DAIDE project
set out to provide an environment where AIs could be developed, and then
play against each other and against humans. By allowing them to play a
lot more games, we could not just develop AIs, but also find out how
well they were playing, and refine and improve them.
JB:
Let me ask a general question first about the current scope of the
project. I know that there are currently 195 members of the Yahoo Group,
though many of them, like me, do not intend to actually program a bot
themselves. Roughly how many working bots have been designed, to your
knowledge, and how many of the 195 group members would you classify as
active programmers?
DN: There have been
ten Bots developed so far, by nine different authors, although of course
for each of those Bots, there are many different versions. The Bots vary
from DumbBot, the first Bot produced by the project, which I wrote in
two days, to Albert, which Jason van Hal wrote recently, and is the best
Bot to date. Playing a no-press game against six Alberts is very
difficult. And I should emphasise, the Bots do not know that six of the
powers are being played by the same Bot, or which power is the human
player.
JB:
You've developed what I think is a neat three-letter token language whose
syntax, to me, strikes a near-perfect balance (especially for this
development phase) between being readable by computer programs and by
humans, while still expressing most of the types and levels of negotiation
that players use in working out tactics and cooperation on the board.
Could I get some of your thoughts today on how well this is working in
actual operation and negotiation between bots in the tests you've seen?
DN: So far, it hasn't been used that much by the Bots.
The language is split up into 13 levels of increasing complexity: from the
first level, where all you can do is offer an alliance, and the second
level, where you can suggest specific orders, up to the top levels, where
you can ask for an explanation of a power's press or orders and pass on
messages that you've received from other powers. By splitting it into
levels, you can have games where only language up to a certain level is
allowed, allowing Bots to build up their press capabilities in stages and
still compete with more advanced Bots.
So far, none of the Bots can
handle more than the bottom two levels.
Having said that, we have had one game using the full language: a game
between seven human players where the only negotiation allowed was in the
DAIDE language, mainly to test the language and find any problems with it
before Bots started to use it. This was easier to do than it sounds,
because the DAIDE Mapper has a press entry system which allows you to
enter press in English by selecting from a list of options, and then
translates to and from the tokenised language for you. And of course, we
found a number of problems - mostly questions which could be asked but for
which there was no way to express the answer you wanted to give!
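[For illustration - these examples are mine, not David's, reconstructed
from the published DAIDE syntax documents, with the client/server message
wrapping omitted. Press at those first two levels looks something like
this:

    PRP (PCE (ENG FRA))                  propose peace between England and France
    PRP (ALY (ENG FRA) VSS (GER))        propose an England-France alliance against Germany
    PRP (XDO ((ENG FLT LON) MTO NTH))    suggest a specific order: English fleet London to North Sea
    YES (PRP (PCE (ENG FRA)))            accept a proposal; REJ (...) would refuse it

Each three-letter token has a fixed meaning, so a program can parse a
message with a trivial grammar, while a person can still read it almost
directly.]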
JB:
Of those, about how many have implemented language syntax above Level 0
(no press)?
DN: Of
the ten Bots that have been written so far, seven are no-press only, and
three support some press. But as I said, none of these three can handle
more than the first two press levels.
JB:
Do you feel that current playtest efforts around these have pushed the
negotiation side of the project very far to date? As a non-programmer
participant in the group discussion on dipai, I've not seen that much
discussion of this - or are people mostly trying to master the efforts to
evaluate and improve coordinated tactical movement amongst one's own
units?
DN: Yes,
the tactical and strategic side is receiving a lot more focus at the
moment.
There are two theories on how to write a Diplomacy AI that negotiates. The
first is that you need to understand the tactical and strategic side of
the game; once you understand that, you can then see where cooperation
would improve the prospects of both you and a potential ally, and that
becomes the foundation for your negotiation. The second is that you
negotiate with your neighbours, and the agreements you make with them
determine your strategy and tactics.
Currently, the first theory seems to be prominent, so people are
concentrating on putting all their effort into writing a Bot that can play
no-press well, with the expectation that once that works well, press will
follow on.
Of
course, there is a third theory that the two sides need to feed into
each other. But that's well beyond anything anybody's trying to do at
the moment!
JB:
One of the things that strikes me is the sheer range of types and goals of
programming that must be accomplished to design a good bot. It seems to me
that more "jointly designed" bots, where one person works on one piece
while someone else works on something else (with an understood and
planned-for goal of integration), would push things forward faster. This
was what Daniel Loeb was doing in the early 1990s in the original
Diplomacy Programming Project, as he had numerous students working for him
on various parts. One failure in that was the "coordination" part, so
there always is a tradeoff between the single mind of a designer and a
group effort. What do you think of the joint design/single designer issue,
both historically and in the future of DAIDE?
DN: In the long term, I think the best Bots will have to be a joint
development - there's just too much involved for a single person to write
it. But the disadvantage of a joint project is that you're unlikely to get
several competing joint projects - and at the moment, nobody knows the
best way to write a Dip AI. So for the moment, I think we are better off
with people doing their own thing, letting the different results play each
other, and learning what works and what doesn't.
JB:
My understanding of the mapping is that DAIDE would support variant maps
(variant rules might be a bit more problematic), but I think one really
good use for Diplomacy bots would be in playtesting maps to get a general
sense of the balance between powers. Most playtests are extremely limited,
while it is easily possible to run thousands of DAIDE games on a variant
map to test its characteristics. I think I actually have a series of
questions about this. First, do most of the bots people are designing have
the capability of operating on other maps?
DN: As far as I know,
they all do.
One of the early decisions we made was that the project should not be
limited to the standard map, as this might lead to Bots that are coded to
take advantage of the public knowledge and specific features of the
standard map (such as coding in the opening book, the stalemate lines,
etc.), rather than learning how to take a map and work out its features on
the fly.
Hence there is very little to
do to make a Bot handle all maps. The full definition of the map is sent
to the AI from the server when it connects (whether it's a variant or
standard).
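[Again for illustration - this sketch is mine, abbreviated from the DAIDE
protocol documents, and the real messages are much longer. When a client
connects, the server announces the map and then sends its full definition,
roughly of the form:

    MAP ('standard')
    MDF (AUS ENG FRA GER ITA RUS TUR)
        (((AUS BUD TRI VIE) (ENG EDI LON LVP) ... (UNO BEL DEN ...)) (ADR AEG ALB ...))
        ((ADR (FLT ALB APU ION TRI VEN)) (ALB (AMY GRE SER TRI) (FLT ADR GRE ION TRI)) ...)

The first list names the powers, the second the provinces (home supply
centres grouped by power, unowned centres under UNO, then the non-centre
provinces), and the third gives each province's adjacencies by unit type.
A Bot that works purely from this message needs nothing hard-coded about
the standard map.]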
JB:
And to the extent they do, it seems it wouldn't be hard to code them
into your mapper, would it?
DN: The easiest way to code a new variant map is to enter it into MapMaker (www.ellought.demon.co.uk/mapmaker.htm). From there, I
have a process which can fairly quickly convert it into all the files
required by the server and the mapper. Plus MapMaker has a lot of
internal checking built in, which will pick up a lot of the common
errors made when defining a map.
Entering a variant the size of
Standard into MapMaker takes about an hour.
JB:
Given current bot capability, do you think a variant map designer would
learn much from repeated bot tests of their maps in the design phase? How
do bots do at replicating some of the statistics from regular Diplomacy
games (realizing that there are large differences in those across playing
groups and over time)?
DN: With the early Bots, it definitely wasn't worth it. There was a huge
disadvantage to playing some powers. For instance, playing as Austria or
Germany against six DumbBots is pretty difficult, as you tend to get
attacked from all sides, while playing England, France, Italy or Turkey
against six DumbBots is extremely easy - and if you set seven DumbBots
playing against each other, it'd almost always be one of those four that
won.
But as the Bots have improved, so has the balance of their play. And as
that happens, they become a much better source of testing.
We have run a few DAIDE tournaments between the different Bots, with
around 2000 games per tournament. The statistics from these tournaments do
show a significant variation in the results for each power compared with
human games, but unfortunately, no such tournament has been run recently
enough to involve the latest Bots, which I would expect to give results
far closer to those seen in human games of Standard.
Even when Bots are able to play
sufficiently well, there are still things that a variant tester would
have to note. For instance, a game between Bots has never ended in an
agreed draw, as there is no Bot that is yet able to agree to a draw.
Furthermore, they also don't have any specific knowledge of how to set
up a stalemate line, so almost all games end in a solo. The few that
don't are where a Bot manages to form a stalemate line through its other
algorithms, and the game is eventually ended by the server terminating
it (which is usually set to happen if there have been 50 years without a
change of centre ownership!). Because of this, play testing with the
current Bots wouldn't tell you if the game is prone to stalemates or
solos. But it should give you a good idea of the balance of the
strengths of the powers in the variant. And hopefully future Bots will
resolve this issue.
Another thing the Bots can't do is tell you whether it's actually an
interesting variant to play!
Of
course, there is one additional advantage of testing with Bots. With
human players, your results are going to be skewed by the skill level of
the players. By testing with every power played by an instance of the
same Bot, you have a perfectly level playing field from the player
ability perspective!
JB:
In my view, the negotiation part is not hugely important; I would think
that testing a variant map at no-press Level 0 would give a designer most
of the input they need, especially regarding statistics on which centers
particular bot countries ended up holding. Do you agree?
DN: I would go
further than that. My experience of testing variants is that No Press
games generally show up problems with a variant better than press games.
Playing a game with press allows the players to compensate for
weaknesses in their power, and counteract the strengths of other powers,
much better than they are able to in a no-press game. Hence if there is
an imbalance, I believe it will show up much better in repeated no-press
play than in repeated press play.
Of course, if you are trying to make an unbalanced variant - one where one
power is unusually strong and the other powers have to work together to
deal with it - then this doesn't follow. But variants like this are a
small minority.
JB:
One of the problems we all have is that this is a hobby. Daniel Loeb made
a fairly significant amount of progress in a relatively short period of
time by making his project a school/student activity. Some of the efforts
at developing bots have come from people working on Master's degrees. But
the "professionals" have done a horrible job (my opinion) of designing
bots, probably because they were up against commercial constraints that
made them repeatedly take inappropriate shortcuts. I've also heard the
comment lately about "programming projects taking over your life" (knowing
that you, like me, are much too busy a person to let this or any other
part of the hobby actually take over). How would you assess the
"incentive" problems, "time" problems, and "gosh darn it, this is just a
really difficult programming task" problems in determining the speed and
direction of DAIDE to date?
DN: I don't think it should be that big a problem yet. Some people spend
years working on a hobby project - indeed, I know Sean Lorber says he
spent 15 years developing SeaNail. And yet Albert, the best DAIDE Bot to
date, was developed in a number of months. Given this, I don't see why
there should be barriers to other people writing better Bots than we
currently have while still keeping it as a hobby.
When the time comes that the best Bots really are so good that it's more
than a one-man hobby commitment to write a new competitive Bot, that's
when I think we really need to look at forming a community project to
write the next generation of Bot. But I don't think we're anywhere near
that yet.
JB:
I'd now like to turn to the future. I've often said, and still believe,
that truly solving the dipai problem is synonymous with the task of
solving the "Turing test" of AI that currently fascinates futurists like
Ray Kurzweil and Mitch Kapor, but not many others. In that sense, solving
the dipai problem is a game - really interesting to crack, but not of much
external use. On the other hand, many of the futurists believe this is a
really important hurdle to cross, and thus solving the dipai problem in
that way (having bots be "indistinguishable" from human players in an open
test) could be a huge breakthrough in human evolution. I don't quite
believe either of these extremes, though I remain fascinated by the ideas
generated. What do you think?
DN: It's not
something I've really considered. I think when it comes to Diplomacy,
Bots have some huge advantages and some huge disadvantages. They can
calculate a massive number of possible orders in a very short length of
time, but on the other hand, they don't have the natural ability to
empathise with their ally, or to talk about anything other than the
game. Hence I think that when Diplomacy Bots do become competitive with
human players, they will do so by out-playing them in the parts of the
game they are good at, not by playing like them.
JB:
Would you care to give odds on a DAIDE bot passing a Turing test by 2029
(Kurzweil's date)?
DN: As in actually playing like a human, not just playing as
well as a human? I'd be very surprised. They may manage it in a no-press
game, but in a press game, even using the DAIDE language (or something
similar), I wouldn't expect them to be able to accurately mimic a human
in the way they use the language.
JB:
Any other thoughts on all this you would like to convey?
DN:
If people want to get involved in the project, then there are two ways
they can. The first is to write their own Bot. If this is of interest,
then join the DipAI Yahoo Group and have a look at the DAIDE Homepage
(www.daide.org.uk).
The other way they can help is by joining the Real Time Diplomacy group.
This is a group of players who play a complete no-press Diplomacy game
online in a couple of hours, using the DAIDE software. When there are
seven of them available, they play an all-human game, but when fewer are
available, the spaces are filled by Bots. Hence this is a great way for
Bots to get some playing experience in a human environment.
There have also been a couple of spinoffs from this project. One of them
is that, having put together a list of all the concepts you need to
negotiate in Diplomacy, I've laid them out on a double-sided A4 sheet in
multiple languages. Hence you have an instant translator for when you're
playing FtF Diplomacy with someone with whom you don't share a common
language. See www.ellought.demon.co.uk/dip_translator. It currently covers
five languages (English, French, German, Dutch, Italian).
And taking this one step further, I've already said that the DAIDE Mapper
can translate between the tokenised DAIDE language and English. Well,
there's nothing English-specific about that link, so it could equally
translate between DAIDE and French, German, or any other language. Once
this has been done, you could have two Mappers in a game, one in English
and one in French. Each player enters their negotiation in their own
language, and it's automatically translated into the language of the other
player! It's not there yet, but it's something to look out for in the
future...
JB:
I wish you luck in this project and hope that more people engage with it
over time. One wishes one didn't have to work so much and had more time
for play... People can see your site on this project at:
http://www.ellought.demon.co.uk/dipai/