The purpose of this study is to determine the effects of additional computing power on scalable 9x9 Computer Go programs. The programs being tested use a popular new technique called UCT, which combines Monte Carlo simulations (randomly played games in this case) with best-first tree search.
This study tests 13 versions of each of two different programs, which play matches against each other in a competition for Elo rating points. Elo ratings are a direct measure of playing strength. Each version of a particular program spends twice as much effort choosing its move as the previous version, as measured by the number of random games played to choose a move.
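To make the link between rating points and playing strength concrete, here is a minimal sketch of the conventional logistic Elo model on a 400-point scale. The study does not say here which tool or exact formula was used to compute its ratings, so the function name `expected_score` and the choice of model are assumptions for illustration only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (roughly, win probability) of player A against player B.

    Assumes the conventional logistic Elo model with a 400-point scale;
    this is not necessarily the computation used in the study.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Two equally rated players are expected to split their games.
    print(expected_score(1800, 1800))
    # A 200-point advantage corresponds to an expected score of about 0.76.
    print(round(expected_score(2000, 1800), 2))
```

Under this model, only the rating difference matters, which is why a single fixed reference point is enough to anchor the whole scale.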
Matches are scheduled between players using a random scheduling algorithm where players of similar strength are much more likely to play each other, but any pairing is still possible. This works far better for assessment purposes than, for instance, a round robin where matches between players of considerably different skill are common.
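The exact scheduling algorithm is not given here, but the idea can be sketched as a weighted random draw. In the sketch below, the exponential weighting, the `scale` parameter, the function name `pick_opponent`, and the example ratings are all hypothetical; the only property carried over from the text is that closer ratings are more likely to be paired while every pairing keeps a nonzero chance.

```python
import math
import random


def pick_opponent(player, ratings, scale=200.0, rng=random):
    """Pick a random opponent, favouring players with a similar rating.

    Hypothetical weighting: exp(-|rating difference| / scale). Every
    opponent keeps a positive weight, so any pairing is still possible.
    """
    candidates = [name for name in ratings if name != player]
    weights = [
        math.exp(-abs(ratings[player] - ratings[name]) / scale)
        for name in candidates
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Made-up ratings, for illustration only.
    ratings = {"A": 1800.0, "B": 1850.0, "C": 2600.0}
    rng = random.Random(0)
    picks = [pick_opponent("A", ratings, rng=rng) for _ in range(2000)]
    print({name: picks.count(name) for name in ("B", "C")})
```

A smaller `scale` concentrates games among near-equal players; a larger one moves the schedule closer to a uniform, round-robin-like mix.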
The amount of computing effort required to test these programs at higher levels is substantial and several people have contributed computing resources to this effort. On a single computer this study would require many months of computing effort in order to get enough data to be statistically meaningful.
The two programs playing in this study are Mogo and FatMan. Mogo is probably the strongest 9x9 Go-playing program in the world at this time (early 2008), and FatMan is a very simple and generic UCT-based program that plays on CGOS, a game server just for Go-playing programs. Additionally, a popular program called "Gnugo" plays in this study as a fixed point of reference at 1800 Elo.
In the following table, there are 13 versions of each program, labeled 01 - 13. Mogo_01 does only 64 Monte Carlo simulations in the evaluation portion of the tree search. Mogo_02 doubles this, and each subsequent version doubles the number of simulations of the previous one.
The same doubling rule applies to FatMan, except that the starting number of simulations is adjusted upward so that its strength corresponds (at least roughly) to that of the much stronger Mogo; thus FatMan_01 does 1024 simulations. To put it another way, FatMan_01 needs 1024 simulations to be roughly equal in strength to Mogo_01, which does 64 simulations.
We can compute the number of simulations for any level as:
Mogo simulations at level N = 64 * 2^(N-1)
FatMan simulations at level N = 1024 * 2^(N-1)
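As a quick check of the two formulas, the following sketch tabulates the simulation counts for all 13 levels. The constant and function names are illustrative and not taken from either program.

```python
MOGO_BASE = 64       # simulations at Mogo level 1
FATMAN_BASE = 1024   # simulations at FatMan level 1


def simulations(base: int, level: int) -> int:
    """Simulations per move at a given level: base * 2^(level - 1)."""
    if level < 1:
        raise ValueError("level must be 1 or greater")
    return base * 2 ** (level - 1)


if __name__ == "__main__":
    for level in range(1, 14):
        mogo = simulations(MOGO_BASE, level)
        fatman = simulations(FATMAN_BASE, level)
        print(f"level {level:02d}: Mogo {mogo:>7}  FatMan {fatman:>8}")
```

At every level FatMan runs 16 times as many simulations as Mogo, since 1024 / 64 = 16; at level 13 the counts are 262,144 for Mogo and 4,194,304 for FatMan.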
Command line program invocation (weakest level)
|Mogo||mogo --9 --nbTotalSimulations 64 --playsAgainstHuman 0|
|FatMan||FatMan -l 1 -r (level 1 = 1024 simulations)|
|Gnugo-3.7.11||gnugo --mode gtp --capture-all-dead --chinese-rules --min-level 8 --max-level 8 --positional-superko|
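The invocations for the stronger levels can presumably be derived from the weakest-level commands above. The sketch below builds the argument lists only and does not run anything. It assumes that Mogo's level is set purely through the `--nbTotalSimulations` value and that FatMan's `-l` option takes the level number, as the "level 1 = 1024 simulations" note suggests; the helper names are hypothetical.

```python
def mogo_command(level: int) -> list[str]:
    """Argument list for Mogo at the given level (64 simulations at level 1)."""
    sims = 64 * 2 ** (level - 1)
    return ["mogo", "--9", "--nbTotalSimulations", str(sims),
            "--playsAgainstHuman", "0"]


def fatman_command(level: int) -> list[str]:
    """Argument list for FatMan at the given level.

    Assumption: -l selects the level and FatMan maps it to simulations
    itself (level 1 = 1024). The -r flag is copied unchanged from the
    level-1 command.
    """
    return ["FatMan", "-l", str(level), "-r"]


if __name__ == "__main__":
    print(" ".join(mogo_command(1)))
    print(" ".join(mogo_command(13)))
    print(" ".join(fatman_command(13)))
```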
|X axis:||Each CPU Doubling|
|Y axis:||Elo Rating|