[go: up one dir, main page]

Menu

Tree [r1] / trunk /
 History

HTTPS access


File Date Author Commit
 features 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 input 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 learning 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 m4 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 policies 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 rlgo 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 rlgomain 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 scripts 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 settings 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 utils 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 AUTHORS 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 COPYING 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 COPYING.LESSER 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 ChangeLog 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 INSTALL 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 Makefile.am 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 Makefile.in 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 NEWS 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 README 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 TODO 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 aclocal.m4 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 config.guess 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 config.sub 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 configure 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 configure.ac 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 depcomp 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 install-sh 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 missing 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn

Read Me

RLGO README

1. Contents of this file

  1. Contents
  2. Overview
  3. Building RLGO
  4. Running RLGO
  5. Settings files
  6. Running scripts


2. Overview

RLGO is a Go playing program based on reinforcement learning methods. It is built on top of the Fuego Computer Go library.


3. Building RLGO

The following steps are required to build RLGO on a Unix based system:

a) Boost must be installed.
b) Fuego 0.1 must be built.
c) Build RLGO with the following steps:

cd rlgo/linux
make

See the INSTALL file for further installation issues.


3. Running RLGO

RLGO can be run from the command line as follows:

cd rlgo/linux/build/release
rlgo

Alternatively, RLGO can be run from GoGui by specifying absolute paths for the following:

Program: rlgo/linux/build/release/rlgo
Working directory: rlgo/linux/build/release

RLGO supports the following command line options:

-settings filename  
    Use specified settings file (see section 5)
-Foo Bar
    Override all settings with the name "Foo" found in the settings file with the value "Bar"
-Fee.Foo Bar
    Override the setting with the name "Foo" in the object "Fee" with the value "Bar"

The value "Bar" must be a single alphanumeric string. 
The tilde character '~' will be replaced by a space during the override 
  (to allow for multi-word strings).

If no settings file is specified then RLGO will use dyna2-settings.set by default.

Some particularly important settings to know about are:

DataPath:   where to find the rlgo directory
OutputPath: where output files (log files, saved games, etc.) should be placed
BoardSize:  size of board to use
TestGames:  play a series of test-games using self-play, instead of waiting for GTP commands
MaxGames:   how many games to simulate before each real move

To run RLGO from a different location, override the DataPath and OutputPath settings to point to the rlgo/data and rlgo/output directories respectively. 

By default RLGO is launched in GTP mode. After initialisation, it will wait to receive GTP commands on  the standard input. To launch RLGO in self-test mode, specify some number of test games to play.

RLGO must be initialised with the correct board size, using the BoardSize setting. It is not sufficient to set the boardsize using a GTP command.


4. Settings files

The setup of RLGO is controlled by a settings file. The settings file not only specifies the values of parameters, but also which objects are created by RLGO. The purpose of the settings file is to make it easy to run scripts that vary the setup of RLGO, for example to run RLGO with three different learning rules, or to run RLGO with four different step-sizes (see section 6).

Each object is listed in the settings file with the following format:

Object = <CLASSNAME>
{
    ID = <OBJECTNAME>
    <SETTING1> = <VALUE1>
    <SETTING2> = <VALUE2>
	...
}

Settings may refer to other objects by their identifier <OBJECTNAME>. For example, RLGO supports many different learning rules (for example the dyna2-settings.set file includes TD0, TDLambda and LambdaReturn algorithms). When these rules are included in the settings file, objects of the specified classes are created and initialised on start-up. However, this does not imply that these objects are used! In this example, the trainer specifies which learning rule to use, for example by specifying LearningRule = TD0.

Settings are case-sensitive and order-sensitive. There must be whitespace on either side of the equals sign " = ". Comments may be included in settings files by prefixing with a hash #. The tilde character is converted into a space internally (this avoids problems in scripts with multi-word settings). Some settings expect vectors of objects, in the following format (whitespace must be included):

<SETTING> = <NUM_OBJECTS> [ <OBJECTNAME1> <OBJECTNAME2> ... ]

There are two special types of object. The RlInclude object includes all settings from another settings file. The RlOverride object overrides settings in any subsequent objects. This allows for different variants of a main settings file to be maintained, by including the general settings and overriding specific settings.

The meaning of each setting is documented in the corresponding class declaration. The order in which settings are expected is not currently documented (@TODO), but is easily identified from the LoadSettings method of the corresponding class.

Settings may be overridden in the following ways (highest priority listed first):

a) Command line (see section 4).
b) Environment variable. Any environment variables with the prefix "Rl" are assumed to be setting overrides. The prefix is stripped to give the setting.
c) RlOverride object (see above)

Standard settings files for RLGO include:

  localshape-settings.set Local shape features in long-term memory
  tdlearn-settings.set    Settings for temporal-difference learning with permanent memory only
  tdsearch-settings.set   Settings for temporal-difference search with permanent memory only
  dyna2-settings.set      Dyna-2 with both long and short-term memories
  tourney-settings.set    Dyna-2 with tournament settings (time control, alpha-beta search, pondering)

6. Running scripts

Experiments with RLGO are executed by running scripts. The working directory is assumed to be rlgo/scripts. It is assumed that the current directory . is in the search path. 

a) getprogram script

The getprogram script outputs the complete command line for a particular Go program. It is invoked as follows:

getprogram.sh <NAME>

The <NAME> of a Go program is a short abbreviation from the following list:

	gnugod    GnuGo set to default playing strength
	gnugo0    GnuGo set to level 0
	fuego     Fuego
	farm     Multi-process version of RLGO (invokes several different RLGO processes)

Otherwise, if the file <NAME>-settings.set exists, then RLGO will be invoked using this settings file. Note that additional arguments will be passed through to the command line invocation. For example:

getprogram.sh general -MaxGames 10000

will invoke RLGO using the general settings file, with 10000 simulations per move.

b) multi-play script

This is a script to run a series of matches between RLGO and a fixed opponent, varying the value of one setting in RLGO. This script uses the gogui-twogtp program (part of the GoGui package), and GnuGo. It assumes that they are available on the path. The command line arguments to multi-play are as follows:

multi-play.sh player opponent path setting values size minutes games submit [options]

player:   Tag used to identify player in the getprogram.sh script
opponent: Tag used to identify opponent in the getprogram.sh script
path:     Output path where match data should be saved
setting:  Setting to vary in player
values:   List of values to try for setting, e.g. "0.1 0.2 0.3"
size:     Board size to use
minutes:  Time control to use (0 for no time control)
games:    Number of games to play in each match
submit:   Script used to play matches, listed below
options:  Any set of options supported by RLGO

seqplay.sh:           Play matches sequentially
paraplay.sh:          Play matches in parallel
queueplay.sh hours:   Submit matches to queue, requesting this number of hours
hostplay.sh hostfile: Submit matches to hosts listed in file
testplay.sh:          Output match commands without executing them

For example, the following invocation will play three 1000-game matches in parallel between RLGO and GnuGo level 0, using three different learning rules. The output will be placed into the rlgo/output/alpha directory, including RLGO's log files and .sgf game records of all games. RLGO will execute 10000 simulations per move.

multi-play.sh general gnugo0 ../output/alpha LearningRule "TD0 TDLamda LambdaReturn" 9 0 1000 paraplay.sh -MaxGames 10000

c) analyze script

This script analyzes the output of a match run by the multi-play script. It outputs a table giving the results of each match for each setting, and a corresponding Elo rating. The command line arguments are as follows:

analyze.sh path setting values [elo]

path:     Path where match data is saved
setting:  Setting varied in player
values:   List of values tried for setting
elo:      Base Elo rating for comparison

For example the following command will analyze the results of the multi-play match from the previous example. Elo ratings will be generated relative to GnuGo level 0, which is assigned a 1600 Elo rating:

analyze.sh ../output/alpha LearningRule "TD0 TDLamda LambdaReturn" 1600

d) multi-run script

This script runs several different versions of RLGO in self-play mode, varying a single setting. This script allows training runs to be performed with different settings. The command line arguments are:

multi-run.sh player path setting values games submit [options]

player:   Tag used to identify player in the getprogram.sh script
path:     Path where match data should be saved
setting:  Setting to vary in player
values:   List of values to try for setting, e.g. "0.1 0.2 0.3"
games:    Number of games to run
submit:   Script used to play matches, listed below
options:  Any set of options supported by RLGO

seqplay.sh:           Play matches sequentially
paraplay.sh:          Play matches in parallel
queueplay.sh hours:   Submit matches to queue, requesting this number of hours
hostplay.sh hostfile: Submit matches to hosts listed in file
testplay.sh:          Output match commands without executing them

For example, the following invocation will run four training runs of 100000 games in parallel, using four different values for the step-size. The output will be placed into the rlgo/output/alpha directory.

multi-run.sh permanent ../output/alpha Alpha "0.1 0.05 0.02 0.01" 9 0 1000 paraplay.sh -MaxGames 10000