File Date Author Commit
 features 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 input 2009-10-09 sliv [r9] Put extra distribution into subdir makefiles
 learning 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 m4 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 policies 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 rlgo 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 rlgomain 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 scripts 2009-10-09 sliv [r9] Put extra distribution into subdir makefiles
 settings 2009-10-09 sliv [r9] Put extra distribution into subdir makefiles
 unittest 2009-10-09 sliv [r3] Remove unittest deps
 utils 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 AUTHORS 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 COPYING 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 COPYING.LESSER 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 ChangeLog 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 INSTALL 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 Makefile.am 2009-10-09 sliv [r9] Put extra distribution into subdir makefiles
 Makefile.in 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 NEWS 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 README 2009-10-09 sliv [r4] update README
 TODO 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 aclocal.m4 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 config.guess 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 config.sub 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 configure 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 configure.ac 2009-10-09 sliv [r8] Fix problem with boost unit test framework macro
 depcomp 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 install-sh 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn
 missing 2009-10-08 sliv [r1] Import RLGO 2.5 into sourceforge svn

RLGO README

1. Contents of this file

  1. Contents
  2. Overview
  3. Building RLGO
  4. Running RLGO
  5. Settings files
  6. Running scripts


2. Overview

RLGO is a Go-playing program based on reinforcement learning and simulation-based search.
It is built on top of the Fuego Computer Go library.


3. Building RLGO

RLGO requires the following packages in order to build:

a) Boost (version 1.33.1 or higher) must be installed.
b) Fuego 0.4 must be downloaded and built.

In addition, RLGO requires the following packages to run experiment scripts:

c) GoGui must be installed.
d) GnuGo must be installed.
e) BayesElo must be installed.

See the INSTALL file for further installation details.

4. Running RLGO

RLGO uses the Go Text Protocol (GTP) (http://www.lysator.liu.se/~gunnar/gtp/gtp2-spec-draft2/gtp2-spec.html).
As with any GTP program, RLGO can be run from the command line, or from an interface such as GoGui.

RLGO supports the following command line options:

-settings filename  
    Use specified settings file (see section 5)
-Foo Bar
    Override all settings with the name "Foo" found in the settings file with the value "Bar"
-Fee.Foo Bar
    Override the setting with the name "Foo" in the object "Fee" with the value "Bar"

The value "Bar" must be a single alphanumeric string.
Any tilde character '~' in the value is replaced by a space during the override,
which allows multi-word strings to be passed as a single argument.
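
The tilde convention can be sketched as a small shell helper for scripts that build RLGO command lines. This is only an illustration: the function name encode_setting_value and the example value are hypothetical and not part of RLGO. Only the rule itself (tilde stands for a space) comes from the text above.

```sh
#!/bin/sh
# Hypothetical helper (not part of RLGO): turn a multi-word value into a
# single command-line token by replacing each space with a tilde. RLGO
# converts the tildes back into spaces when it applies the override.
encode_setting_value() {
    printf '%s\n' "$1" | tr ' ' '~'
}

# Example: a two-word value becomes one token.
encode_setting_value "two words"
# prints: two~words
```

The encoded token can then be passed as the "Bar" part of a -Foo Bar override.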

If no settings file is specified, RLGO uses dyna2-settings.set.

Some particularly important settings to know about are:

OutputPath: where output files (log files, saved games, etc.) should be placed
BoardSize:  size of board to use
SelfPlay:   play a series of test-games using self-play, instead of waiting for GTP commands
MaxGames:   how many games to simulate before each real move

By default RLGO is launched in GTP mode. After initialisation, it waits for GTP commands on standard input. To launch RLGO in self-play mode instead, use the option -SelfPlay SelfPlay.

RLGO must be initialised with the correct board size, using the BoardSize setting. Setting the board size with a GTP command alone is not sufficient.


5. Settings files

The setup of RLGO is controlled by a settings file. The settings file not only specifies the values of parameters, but also which objects are created by RLGO. The purpose of the settings file is to make it easy to run scripts that vary the setup of RLGO, for example to run RLGO with three different learning rules, or to run RLGO with four different step-sizes (see section 6).

Each object is listed in the settings file with the following format:

Object = <CLASSNAME>
{
    ID = <OBJECTNAME>
    <SETTING1> = <VALUE1>
    <SETTING2> = <VALUE2>
    ...
}
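
As a rough illustration of this format, the sketch below writes a tiny sample file and lists the objects it declares. The class names, identifiers and setting names in the sample are placeholders and do not correspond to real RLGO classes, and list_objects is a hypothetical helper rather than an RLGO tool.

```sh
#!/bin/sh
# Write a small sample file in the object format described above.
# All class names, identifiers and settings here are placeholders.
settings_file=$(mktemp)
cat > "$settings_file" <<'EOF'
Object = ExampleClassA
{
    ID = FirstObject
    SomeSetting = 1
}
Object = ExampleClassB
{
    ID = SecondObject
    OtherSetting = FirstObject
}
EOF

# Print "<OBJECTNAME> <CLASSNAME>" for every object in a settings file.
# Relies on the whitespace around " = ", so each line of interest splits
# into three fields: name, "=", value.
list_objects() {
    awk '
        $1 == "Object" && $2 == "=" { class = $3 }
        $1 == "ID"     && $2 == "=" { print $3, class }
    ' "$1"
}

list_objects "$settings_file"
rm -f "$settings_file"
```

In the sample, OtherSetting refers to the first object by its identifier, which is how one object points at another.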

Settings may refer to other objects by their identifier <OBJECTNAME>. For example, RLGO supports many different learning rules (the general-settings.set file includes the TD0, TDLambda and LambdaReturn algorithms). When these rules are included in the settings file, objects of the specified classes are created and initialised on start-up. However, this does not imply that these objects are used! The trainer specifies which learning rule to use, for example by specifying LearningRule = TD0.

Settings are case-sensitive and order-sensitive. There must be whitespace on either side of the equals sign " = ". Comments may be included in settings files by prefixing them with a hash #. The tilde character is converted into a space internally (this avoids problems in scripts with multi-word settings).

Some settings expect vectors of objects, in the following format (whitespace must be included before, between and after each object name):

<SETTING> = <NUM_OBJECTS> [ <OBJECTNAME1> <OBJECTNAME2> ... ]
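
A script that generates settings files could produce such a line with a helper like the one sketched here. The function name format_vector and the setting and object names in the example are hypothetical; only the layout (count first, whitespace around every token) follows the format above.

```sh
#!/bin/sh
# Hypothetical helper: format a vector setting from a setting name and a
# list of object names. The count comes first, and every bracket and
# object name is surrounded by whitespace.
format_vector() {
    name=$1
    shift
    printf '%s = %d [' "$name" "$#"
    for object in "$@"; do
        printf ' %s' "$object"
    done
    printf ' ]\n'
}

format_vector ExampleVector ObjectA ObjectB ObjectC
# prints: ExampleVector = 3 [ ObjectA ObjectB ObjectC ]
```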

There are two special types of object. The RlInclude object includes all settings from another settings file. The RlOverride object overrides settings in any subsequent objects. This allows for different variants of a main settings file to be maintained, by including the general settings and overriding specific settings.

The meaning of each setting is documented in the corresponding class declaration. The order in which settings are expected is not currently documented (@TODO), but is easily identified from the LoadSettings method of the corresponding class.

Settings may be overridden in the following ways (highest priority listed first):

a) Command line (see section 4).
b) Environment variables. Any environment variable with the prefix "Rl" is treated as a setting override; the prefix is stripped to give the setting name.
c) RlOverride object (see above).
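
The environment-variable rule can be previewed from the shell before launching RLGO. The sketch below assumes only what is stated above (the "Rl" prefix is stripped to give the setting name); list_env_overrides is a hypothetical helper, not an RLGO tool, and it treats every output line of env as one variable, so values containing newlines would confuse it.

```sh
#!/bin/sh
# List the setting overrides that the environment would contribute:
# keep variables whose names start with "Rl" and strip that prefix.
list_env_overrides() {
    env | sed -n 's/^Rl//p'
}

# Example: export an override for MaxGames (a setting named in section 4),
# then show what the environment contributes.
RlMaxGames=10000
export RlMaxGames
list_env_overrides
```

With this variable exported, the listing includes the line MaxGames=10000. A command-line option such as -MaxGames would still take priority over it.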

Standard settings files for RLGO include:

  localshape-settings.set Local shape features with a long-term memory (no learning, no search)
  tdlearn-settings.set    Temporal-difference learning with a long-term memory
  tdsearch-settings.set   Temporal-difference search with a short-term memory
  dyna2-settings.set      Dyna-2 search with long and short-term memories
  tourney-settings.set    Tournament settings for RLGO (time control, alpha-beta search, pondering)


6. Running scripts

Experiments with RLGO are executed by running shell scripts in the scripts subdirectory. Each script provides its own command line options, which can be viewed by invoking the script with no arguments. Some examples of running RLGO and these scripts are provided below:

rlgomain/rlgo -settings tdlearn -Alpha 0.2 -Lambda 0.9 -LearningRule TDLambda
# Run RLGO with tdlearn-settings.set, overriding Alpha with the value 0.2 and Lambda with the value 0.9, and using the TDLambda learning rule.

scripts/getprogram.sh tdsearch -MaxGames 10000
# Output the command line for RLGO using tdsearch-settings.set and running 10000 simulations per move

scripts/getprogram.sh gnugo0
# Output the command line for running GnuGo level 0 in GTP mode

scripts/multi-matches.sh tdlearn gnugo0 output/learningrules LearningRule "TD0 TDLambda LambdaReturn" 9 0 1000 submit-para.sh -MaxGames 10000
# Play three 1000-game matches in parallel between RLGO using tdlearn-settings.set and GnuGo level 0, using three different learning rules. The output will be placed into the output/learningrules directory, including RLGO's log files and .sgf game records of all games. RLGO will execute 10000 simulations per move.

analyze.sh ../output/learningrules LearningRule "TD0 TDLambda LambdaReturn" 1600
# Analyze the results of the above multi-matches.sh invocation. Elo ratings will be generated relative to GnuGo level 0, which is assigned a 1600 Elo rating.

scripts/multi-run.sh tdlearn output/alpha Alpha "0.1 0.05 0.02 0.01" 9 0 1000 submit-para.sh -MaxGames 100000
# Run four training runs of 100000 games in parallel, using RLGO with tdlearn-settings.set and four different values for the step-size. The output will be placed into the output/alpha directory.