Jump'n'Bump AI

Pyry Matikainen
CS 152

 

>>> Here is the applet (what you're undoubtedly here for) <<<

Overview

Jump'n'Bump is a simple arcade game for up to four players (I deal with only two) on a single keyboard: each player controls a rabbit that can be moved left and right and made to jump. The simplicity of the game (there are only three controls: left, right, and jump) means that it ought to be easy to evolve neural-network-based AIs for it. This project is an attempt to do just that.

Phase I

The first attempt to evolve AIs for Jump'n'Bump evolved just the weights of fixed-topology, non-recurrent neural networks; the structure of the network had to be decided ahead of time. The first evolution method tried was to separate the population into two non-breeding subpopulations; to determine the fitness of an individual, it was pitted against the champions of both subpopulations. Unfortunately, this method produced AIs that exploited the quirks of the other AIs but had no 'general' Jump'n'Bump playing ability, at least as judged against a human. To encourage the AIs to be more aggressive, a different evolution technique was then tried: rather than pitting AIs against AIs, they were pitted against, at least initially, a stationary target: a rabbit that does not move at all of its own volition (though it can still be pushed around). Remarkably, this produced AIs that were significantly better than those from the competitive technique; however, they were still very easy for a human to defeat. The suspected reason for the failure was that the search space was simply too large: a good AI is unlikely to need every output to depend on every input, so a fully-connected network structure is wasteful. As a result, I looked for a method that would evolve the network structure at the same time as the weights.
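The stationary-target fitness evaluation described above can be sketched as follows. All names here (StationaryFitness, simulateMatch, and so on) are hypothetical stand-ins for the real simulator, and the stand-in scoring is purely illustrative:

```java
// Sketch of fitness evaluation against a stationary target.
// simulateMatch is a hypothetical stand-in for a full Jump'n'Bump game.
public class StationaryFitness {
    /** Stand-in match: returns {candidate kills, target kills} over one game. */
    static int[] simulateMatch(double[] weights, int updates) {
        // Illustrative only: here the candidate "scores" once per positive
        // weight. The real version would run the game simulator for
        // `updates` ticks with a network built from these weights.
        int candidateKills = 0;
        for (double w : weights) if (w > 0) candidateKills++;
        return new int[] { candidateKills, 0 }; // stationary target never scores
    }

    /** Fitness = kills scored minus kills suffered over one fixed-length game. */
    static int evaluate(double[] weights) {
        int[] result = simulateMatch(weights, 1200); // ~1 minute of game time
        return result[0] - result[1];
    }

    public static void main(String[] args) {
        System.out.println(evaluate(new double[] { 0.5, -0.2, 1.0 })); // prints 2
    }
}
```

The point of the structure is that the opponent contributes nothing to the fitness signal, so the candidate cannot win by exploiting another evolving AI's quirks.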

Phase II

For the second phase, a technique called NEAT (NeuroEvolution of Augmenting Topologies) was used to evolve AIs for Jump'n'Bump. As its full name suggests, NEAT evolves both the structure of a neural network and its weights; additionally, it starts minimally, that is, without any hidden nodes, so that the evolved networks grow only as large as necessary. A paper on NEAT by Kenneth O. Stanley and Risto Miikkulainen suggested that NEAT was competitive with other techniques for evolving neural networks. Furthermore, the benchmark task in the paper was double pole balancing, which seemed similar in complexity to Jump'n'Bump. Thus, I was optimistic that NEAT would quickly solve the problem.

In fact, I was so optimistic that I wrote in my proposal that I hoped to move on to varied terrain once the AIs could beat a human (me). That would prove to be far too optimistic. (What I actually proposed was adding an indirect encoding to NEAT; that turned out not to be useful, as I will explain later.)

Rather than re-implementing NEAT, I availed myself of one of the many Java implementations: ANJI, or Another NEAT Java Implementation. I had already ported Jump'n'Bump to Java from the original source code for Phase I, so it was relatively simple to plug my Jump'n'Bump simulator into NEAT as a fitness function. After looking over the code, I found that ANJI supports an entire class of relevant fitness functions: tournaments. Thus, I converted my Java Jump'n'Bump to work as the underlying game for an ANJI tournament.

However, the results from ANJI, while initially promising (within ten generations the champions already seemed qualitatively as good as 100 generations of fixed-topology evolution), were not going anywhere. As with the fixed-topology networks, any sort of competitive tournament produced AIs that simply exploited weaknesses in other AIs rather than being 'generally good' at Jump'n'Bump. Since evolving against a stationary target had greatly improved the fixed-topology case, I reasoned that if I wrote a heuristic-based AI myself, training against it would force the networks to become at least as generally good as that heuristic. My AI is fairly simple: if it is above the other player, it attempts to stay above them, and if it is below the other player, it attempts to move out of the way. Nevertheless, training against this heuristic does produce AIs that are better than it, and which have a slightly 'human-like' quality. They might be comparable in skill to a human who has never played Jump'n'Bump before but understands the basic principles.
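The heuristic just described can be sketched as a small decision rule. The names, the control encoding, and the screen-coordinate convention (y increasing downward, so "above" means a smaller y) are my own assumptions; the real AI also has to handle jumping and level geometry:

```java
// Sketch of the hand-coded heuristic: stay above the opponent when above,
// dodge sideways when below. Names and control encoding are hypothetical.
public class HeuristicAI {
    // Controls: -1 = move left, 0 = stay, +1 = move right.
    // Assumes screen coordinates: smaller y means higher up.
    static int chooseMove(double myX, double myY, double otherX, double otherY) {
        if (myY < otherY) {
            // We are above: track the opponent's x to stay over them.
            if (myX < otherX) return 1;
            if (myX > otherX) return -1;
            return 0;
        } else {
            // We are below (or level): move away from the opponent.
            return (myX < otherX) ? -1 : 1;
        }
    }

    public static void main(String[] args) {
        // Above and to the left of the opponent: chase right.
        System.out.println(chooseMove(10, 5, 20, 15)); // prints 1
    }
}
```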

Interestingly, these AIs had very few hidden nodes: in fact, an AI about as good as the fixed-topology networks' best result had only one hidden node. In general, the NEAT-evolved networks had at most ten hidden nodes, with most having only three or four. Hence, my proposed indirect encoding would have been less useful than I had hoped: there simply were no repeated structures that could be more compactly encoded. Additionally, my proposal did not take into account that the ordering of the genes would have to be preserved for the 'blocks' to work properly, and saving that ordering introduces a whole new level of complexity. Hence, I did not add any indirect encoding.

The Tournament

To resolve what might be the best evolutionary technique, I decided to evolve a number of AIs and place them, along with the heuristic hand-written AI and the stationary 'dumb' AI, in a round-robin tournament.

Competitors

DumbCompetitor and HandCodedCompetitor: these correspond to the dumb (just sits there) AI and the hand-coded heuristic AI.

fixedsmall100, fixedsmall300, fixedlarge100, and fixedlarge300: fixed-topology networks evolved against stationary targets. The 'small' fixed networks had a 15-3-3 structure (15 inputs, 3 hidden, 3 outputs), whereas the 'large' fixed networks had a 15-10-10-3 structure (two hidden layers of 10). The '100' competitors were evolved for 100 generations, and the '300' competitors for 300 generations. So, for instance, fixedlarge100 was a 15-10-10-3 network evolved for 100 generations.

anheur100 and anheur300: standing for ANji HEuristic, these networks were evolved using NEAT against the heuristic AI for 100 and 300 generations respectively.

anse100 and anse300: standing for ANji Single Elimination, these networks were evolved against each other in single-elimination tournaments for 100 and 300 generations respectively.

ankr100: standing for ANji K Random opponents, this network was evolved by pitting each network in the population against k (here, 5) random opponents from the population to determine its fitness; ankr100 was the champion after 100 generations of evolution.

pancake100 and pancake300: These were evolved using a 'hybrid' approach. They were first evolved for 50 generations against the heuristic, then for 50 in single-elimination tournaments, and then 50 against the heuristic and so on; pancake100 was the champion after 100 generations of evolution total, and likewise pancake300 was the champion after 300 generations of evolution total.
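The k-random-opponents fitness used for ankr100 can be sketched as below. The class and method names are hypothetical, and playMatch is a stand-in for a full Jump'n'Bump game between two evolved networks:

```java
import java.util.List;
import java.util.Random;

// Sketch of k-random-opponents fitness: each individual plays k opponents
// drawn at random from the population; its fitness is its win count.
public class KRandomFitness {
    /** Stand-in for one full game; here the larger "skill" number wins. */
    static boolean playMatch(int individual, int opponent) {
        return individual > opponent;
    }

    /** Plays k random opponents and returns the number of wins. */
    static int evaluate(int individual, List<Integer> population, int k, Random rng) {
        int wins = 0;
        for (int i = 0; i < k; i++) {
            int opponent = population.get(rng.nextInt(population.size()));
            if (playMatch(individual, opponent)) wins++;
        }
        return wins;
    }

    public static void main(String[] args) {
        List<Integer> pop = List.of(1, 2, 3, 4, 5);
        System.out.println(evaluate(5, pop, 5, new Random(42)));
    }
}
```

Compared with a single-elimination tournament, this gives every individual the same number of evaluation games, at the cost of a noisier fitness signal (the draw of opponents is random).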

The Fights

These competitors then engaged in a round-robin: each competitor played each other competitor in 1000 games of Jump'n'Bump (each game was 1200 updates, or approximately one minute long if it were played in 'real-time' rather than as fast as the computer could do the calculations).
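The round-robin tally can be sketched as follows; playGame is a hypothetical stand-in for one 1200-update match, and the wins matrix corresponds to the table in the Results section (row beats column):

```java
// Sketch of the round-robin: every pair of competitors plays GAMES games,
// and wins are tallied into a matrix (wins[i][j] = games i won against j).
public class RoundRobin {
    static final int GAMES = 1000;

    /** Stand-in for one match: 0 if a wins, 1 if b wins, -1 for a tie. */
    static int playGame(int a, int b) {
        if (a == b) return -1;
        return a > b ? 0 : 1;
    }

    static int[][] runTournament(int[] competitors) {
        int n = competitors.length;
        int[][] wins = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                for (int g = 0; g < GAMES; g++) {
                    int result = playGame(competitors[i], competitors[j]);
                    if (result == 0) wins[i][j]++;
                    else if (result == 1) wins[j][i]++;
                    // ties count for neither side
                }
            }
        }
        return wins;
    }

    public static void main(String[] args) {
        int[][] w = runTournament(new int[] { 1, 2, 3 });
        System.out.println(w[2][0]); // competitor 3 beats competitor 1: prints 1000
    }
}
```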

You can fight the AIs against each other (or yourself) here.

Results

Full results in bigresults.txt.

Wins:

How many times the competitor to the left has won against the competitor in the given column (out of 1000).

Wins
  DumbCompetitor HandCodedCompetitor fixedsmall100 fixedsmall300 fixedlarge100 fixedlarge300 anheur100 anheur300 anse100 anse300 ankr100 pancake100 pancake300
DumbCompetitor * 0 0 0 0 0 17 8 22 9 14 27 13
HandCodedCompetitor 1000 * 949 965 980 984 468 232 984 989 851 993 996
fixedsmall100 1000 33 * 621 307 347 167 427 119 53 93 334 933
fixedsmall300 997 16 243 * 312 234 159 101 358 125 241 68 991
fixedlarge100 998 14 595 471 * 358 251 95 427 180 382 163 880
fixedlarge300 1000 9 558 587 552 * 298 241 587 380 305 260 552
anheur100 407 421 733 768 626 603 * 460 166 141 220 681 167
anheur300 729 681 435 831 845 655 378 * 266 13 399 300 136
anse100 208 10 830 563 451 306 146 456 * 403 352 75 401
anse300 189 5 934 808 724 501 191 937 409 * 355 72 460
ankr100 470 86 859 674 497 568 191 451 477 174 * 112 518
pancake100 232 4 577 901 769 656 215 602 48 23 70 * 70
pancake300 128 1 48 5 68 320 155 627 198 419 343 73 *

Total Wins/Losses/Ties per Competitor:

  Wins Losses Ratio
DumbCompetitor 110 7358 0.015
HandCodedCompetitor 10391 1280 8.12
fixedsmall100 4434 6761 0.66
fixedsmall300 3845 7194 0.53
fixedlarge100 4814 6131 0.78
fixedlarge300 5329 5532 0.96
anheur100 5393 2636 2.05
anheur300 5668 4637 1.22
anse100 4201 4061 1.03
anse300 5585 2912 1.92
ankr100 5077 3625 1.40
pancake100 4170 3158 1.32
pancake300 2385 6117 0.39

Discussion

Given that these are 1000 matches per pairing, the values are unlikely to be significantly influenced by random chance (although there is a chance element in the selection of champions). These results suggest that it has been difficult to evolve 'generally good' AIs because 'goodness' at Jump'n'Bump is not transitive. For instance, pancake100 beats anheur300 602/1000 times; in turn, anheur300 beats HandCodedCompetitor 681/1000 times; yet HandCodedCompetitor beats pancake100 a whopping 993/1000 times. That is, pancake100 beats anheur300, which beats HandCoded, which in turn beats pancake100. The most surprising result is that HandCoded is the clear overall winner, even though anheur300 beats HandCoded (as we would expect, since anheur300 was specifically evolved to beat HandCoded). The anheur networks might be considered the 'most general' of the evolved AIs, even though they have some surprising weaknesses (such as anheur300 being beaten by anse300 a ridiculous 93.7% of the time, and by both pancakes). These results also suggest that the best way to evolve a generally good AI is to evolve it against generally good opponents: a catch-22 of sorts.

Future Work

The best generally good opponents are, of course, humans: since NEAT evolves relatively fast (within 100 to 300 generations with a population of only 100), it is well within the range of possibility that a reasonably popular online Jump'n'Bump game could attract enough players to use actual humans as the opponents against which to evolve AIs.

Additionally, varied terrain could still be investigated: the best idea I have had is to present the network with a 'window' to the terrain surrounding its character.

Downloads

jbai.zip (readme.txt in the top level explains how to use the various components). You will need to download ANJI yourself, since I am not sure about the technicalities of distributing it (easier to just link to it than to figure out what the license terms would require of me).

finalpresentation.ppt : the final presentation

paperpresentation.ppt : presentation on a paper on NEAT

Proposal.doc : project proposal

Contact

You can contact me at pmatikainen AT cs.hmc.edu