RLGO Code

Brought to you by: sliv

Tree [r9] / trunk /

History

HTTPS access

File	Date	Author	Commit
features	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
input	2009-10-09	sliv	[r9] Put extra distribution into subdir makefiles
learning	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
m4	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
policies	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
rlgo	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
rlgomain	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
scripts	2009-10-09	sliv	[r9] Put extra distribution into subdir makefiles
settings	2009-10-09	sliv	[r9] Put extra distribution into subdir makefiles
unittest	2009-10-09	sliv	[r3] Remove unittest deps
utils	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
AUTHORS	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
COPYING	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
COPYING.LESSER	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
ChangeLog	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
INSTALL	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
Makefile.am	2009-10-09	sliv	[r9] Put extra distribution into subdir makefiles
Makefile.in	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
NEWS	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
README	2009-10-09	sliv	[r4] update README
TODO	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
aclocal.m4	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
config.guess	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
config.sub	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
configure	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
configure.ac	2009-10-09	sliv	[r8] Fix problem with boost unit test framework macro
depcomp	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
install-sh	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn
missing	2009-10-08	sliv	[r1] Import RLGO 2.5 into sourceforge svn

Read Me

RLGO README

1. Contents of this file

1. Contents
2. Overview
3. Building RLGO
4. Running RLGO
5. Settings files
6. Running scripts

2. Overview

RLGO is a Go playing program based on reinforcement learning and simulation-based search.
It is built on top of the Fuego Computer Go library.

3. Building RLGO

RLGO requires the following packages in order to build:

a) Boost (version 1.33.1 or higher) must be installed.
b) Fuego 0.4 must be downloaded and built.

In addition, RLGO requires the following packages to run experiment scripts:

c) GoGui must be installed.
d) GnuGo must be installed.
e) BayesElo must be installed.

See the INSTALL file for further installation issues.

4. Running RLGO

RLGO uses the Go Text Protocol (GTP) (http://www.lysator.liu.se/~gunnar/gtp/gtp2-spec-draft2/gtp2-spec.html).
As with any GTP program, RLGO can be run from the command line, or from an interface such as GoGui.

RLGO supports the following command line options:

-settings filename
Use specified settings file (see section 5)
-Foo Bar
Override all settings with the name "Foo" found in the settings file with the value "Bar"
-Fee.Foo Bar
Override the setting with the name "Foo" in the object "Fee" with the value "Bar"

The value "Bar" must be a single alphanumeric string.
The tilde character '~' will be replaced by a space during the override
(to allow for multi-word strings).

If no settings file is specified then RLGO will use dyna2-settings.set by default.

Some particularly important settings to know about are:

OutputPath: where output files (log files, saved games, etc.) should be placed
BoardSize: size of board to use
SelfPlay: play a series of test-games using self-play, instead of waiting for GTP commands
MaxGames: how many games to simulate before each real move

By default RLGO is launched in GTP mode. After initialisation, it will wait to receive GTP commands on the standard input. To launch RLGO in self-test mode, launch RLGO with the option -SelfPlay SelfPlay.

RLGO must be initialised with the correct board size, using the BoardSize setting. It is not sufficient to set the boardsize using a GTP command.

5. Settings files

The setup of RLGO is controlled by a settings file. The settings file not only specifies the values of parameters, but also which objects are created by RLGO. The purpose of the settings file is to make it easy to run scripts that vary the setup of RLGO, for example to run RLGO with three different learning rules, or to run RLGO with four different step-sizes (see section 6).

Each object is listed in the settings file with the following format:

Object = <CLASSNAME>
{
ID = <OBJECTNAME>
<SETTING1> = <VALUE1>
<SETTING2> = <VALUE2>
...
}

Settings may refer to other objects by their identifier <OBJECTNAME>. For example, RLGO supports many different learning rules (for example the general-settings.set file includes TD0, TDLambda and LambdaReturn algorithms). When these rules are included in the settings file, objects of the specified classes are created and initialised on start-up. However, this does not imply that these objects are used! In this example, the trainer specifies which learning rule to use, for example by specifying LearningRule = TD0.

Settings are case-sensitive and order-sensitive. There must be whitespace on either side of the equals sign " = ". Comments may be included in settings files by prefixing with a hash #. The tilde character is converted into a space internally (this avoids problems in scripts with multi-word settings). Some settings expect vectors of objects, in the following format (whitespace must be included before, between and after each object name):

<SETTING> = <NUM_OBJECTS> [ <OBJECTNAME1> <OBJECTNAME2> ... ]

There are two special types of object. The RlInclude object includes all settings from another settings file. The RlOverride object overrides settings in any subsequent objects. This allows for different variants of a main settings file to be maintained, by including the general settings and overriding specific settings.

The meaning of each setting is documented in the corresponding class declaration. The order in which settings are expected is not currently documented (@TODO), but is easily identified from the LoadSettings method of the corresponding class.

Settings may be overridden in the following ways (highest priority listed first):

a) Command line (see section 4).
b) Environment variable. Any environment variables with the prefix "Rl" are assumed to be setting overrides. The prefix is stripped to give the setting.
c) RlOverride object (see above)

Standard settings files for RLGO include:

localshape-settings.set Local shape features with a long-term memory (no learning, no search)
tdlearn-settings.set Temporal-difference learning with a long-term memory
tdsearch-settings.set Temporal-difference search with a short-term memory
dyna2-settings.set Dyna-2 search with long and short-term memories
tourney-settings.set Settings for RLGO with tournament settings (time control, alpha-beta search, pondering)

6. Running scripts

Experiments with RLGO are executed by running shell scripts in the scripts subdirectory. Each script provides its own command line options, which can be viewed by invoking the script with no arguments. Some examples of using these scripts are provided below:

rlgomain/rlgo -settings tdlearn -Alpha 0.2 -Lambda 0.9 -LearningRule TDLambda
# Run RLGO with tdlearn-settings.set, but overriding the setting Alpha to have a value of 0.2, lambda to have a value of 0.9, and using the TD Lambda learning rule.

scripts/getprogram.sh tdsearch -MaxGames 10000
# Output the command line for RLGO using tdsearch-settings.set and running 10000 simulations per move

scripts/getprogram.sh gnugo0
# Output the command line for running GnuGo level 0 in GTP mode

scripts/multi-matches.sh tdlearn gnugo0 output/learningrules LearningRule "TD0 TDLamda LambdaReturn" 9 0 1000 submit-para.sh -MaxGames 10000
# Play three 1000-game matches in parallel between RLGO using tdlearn-settings.set and GnuGo level 0, using three different learning rules. The output will be placed into the output/learningrules directory, including RLGO's log files and .sgf game records of all games. RLGO will execute 10000 simulations per move.

analyze.sh ../output/alpha LearningRule "TD0 TDLamda LambdaReturn" 1600
# Analyze the results of the above multi-matches.sh invocation. Elo ratings will be generated relative to GnuGo level 0, which is assigned a 1600 Elo rating.

scripts/multi-run.sh tdlearn output/alpha Alpha "0.1 0.05 0.02 0.01" 9 0 1000 submit-para.sh -MaxGames 100000
# Run four training runs of 100000 games in parallel, using RLGO and tdlearn-settings.set, using four different values for the step-size. The output will be placed into the output/alpha directory.

RLGO Code

Tree [r9] / trunk / Download Snapshot History

Read Me

Tree [r9] / trunk /

History