How Do I Run Sutton and Barto's "Reinforcement Learning" Lisp Code?

I have been reading a lot about Reinforcement Learning lately, and I have found "Reinforcement Learning: An Introduction" to be an excellent guide. The authors helpfully provide source code for a lot of their worked examples.

Before I begin the question I should point out that my practical knowledge of lisp is minimal. I know the basic concepts and how it works, but I have never really used lisp in a meaningful way, so it is likely I am just doing something incredibly n00b-ish. :)

Also, the author states on his page that he will not answer questions about his code, so I did not contact him, and figured Stack Overflow would be a much better choice.

I have been trying to run the code on a Linux machine, using both GNU CLISP and SBCL, but have not been able to run it. I keep getting a whole list of errors with either implementation. In particular, most of the code appears to rely on utilities contained in a file 'utilities.lisp', which contains the lines

(defpackage :rss-utilities
  (:use :common-lisp :ccl)
  (:nicknames :ut))

(in-package :ut)

The :ccl seems to refer to some kind of Mac-based version of lisp, but I could not confirm this. It could just be some other package of code. Loading the file in SBCL gives:

> * (load "utilities.lisp")
>
> debugger invoked on a SB-KERNEL:SIMPLE-PACKAGE-ERROR in thread #<THREAD "initial thread" RUNNING {100266AC51}>:
>   The name "CCL" does not designate any package.
>
> Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
>
> restarts (invokable by number or by possibly-abbreviated name):
>   0: [ABORT] Exit debugger, returning to top level.
>
> (SB-INT:%FIND-PACKAGE-OR-LOSE "CCL")

I tried removing this particular piece, changing the line to

  (:use :common-lisp)

but that just created more errors.

> ; in: LAMBDA NIL
> ;     (+ RSS-UTILITIES::*MENUBAR-BOTTOM*
> ;        (/ (- RSS-UTILITIES::MAX-V RSS-UTILITIES::V-SIZE) 2))
> ;
> ; caught WARNING:
> ;   undefined variable: *MENUBAR-BOTTOM*
>
> ;     (- RSS-UTILITIES::*SCREEN-HEIGHT* RSS-UTILITIES::*MENUBAR-BOTTOM*)
> ;
> ; caught WARNING:
> ;   undefined variable: *SCREEN-HEIGHT*
>
> ;     (IF RSS-UTILITIES::CONTAINER
> ;         (RSS-UTILITIES::POINT-H (RSS-UTILITIES::VIEW-SIZE RSS-UTILITIES::CONTAINER))
> ;         RSS-UTILITIES::*SCREEN-WIDTH*)
> ;
> ; caught WARNING:
> ;   undefined variable: *SCREEN-WIDTH*
>
> ;     (RSS-UTILITIES::POINT-H (RSS-UTILITIES::VIEW-SIZE RSS-UTILITIES::VIEW))
> ;
> ; caught STYLE-WARNING:
> ;   undefined function: POINT-H
>
> ;     (RSS-UTILITIES::POINT-V (RSS-UTILITIES::VIEW-SIZE RSS-UTILITIES::VIEW))
> ;
> ; caught STYLE-WARNING:
> ;   undefined function: POINT-V

Anybody got any idea how I can run this code? Am I just totally ignorant of all things lisp?

UPDATE [March 2009]: I installed Clozure, but was still not able to get the code to run.

At the CCL command prompt, the command

(load "utilities.lisp")

results in the following error output:

;Compiler warnings :
;   In CENTER-VIEW: Undeclared free variable *SCREEN-HEIGHT*
;   In CENTER-VIEW: Undeclared free variable *SCREEN-WIDTH*
;   In CENTER-VIEW: Undeclared free variable *MENUBAR-BOTTOM* (2 references)
> Error: Undefined function RANDOM-STATE called with arguments (64497 9) .
> While executing: CCL::READ-DISPATCH, in process listener(1).
> Type :GO to continue, :POP to abort, :R for a list of available restarts.
> If continued: Retry applying RANDOM-STATE to (64497 9).
> Type :? for other options.
1 >

Unfortunately, I'm still learning about lisp, so while I have a sense that something is not fully defined, I do not really understand how to read these error messages.

Schizont answered 10/2, 2009 at 20:9 Comment(0)
S
3

That code is for Macintosh Common Lisp (MCL) and will only run there as written. Using Clozure CL (CCL) will not help by itself. You would have to comment out the graphics code. The random state handling is also slightly special to MCL, so you have to port it to portable Common Lisp (make-random-state, etc.). The file names are Mac-specific as well.

Clozure CL is a fork of Macintosh Common Lisp, but it has been changed to Unix conventions (pathnames, ...) and does not include the special graphics code of MCL.
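
As a rough illustration of the porting described above (a sketch under stated assumptions, not the book's code): the package definition can pull in :ccl only where that package exists, and the MCL-printed random state can be swapped for a portable make-random-state call. The names *portable-random-state* and portable-random-unit are hypothetical, and the #+ccl test assumes the implementation puts :ccl on *features*.

```lisp
;; Sketch only: a package definition that loads on any Common Lisp.
;; The :ccl package is used only when the implementation advertises
;; the :ccl feature (an assumption about MCL and Clozure CL).
(defpackage :rss-utilities
  (:use :common-lisp #+ccl :ccl)
  (:nicknames :ut))

(in-package :ut)

;; Hypothetical variable name. In place of the MCL-specific
;; #.(RANDOM-STATE 64497 9) form, create a random state portably.
;; Passing T asks for a freshly seeded state.
(defvar *portable-random-state* (make-random-state t))

;; Hypothetical helper: draw a float in [0, 1) from that state.
(defun portable-random-unit ()
  (random 1.0 *portable-random-state*))
```

Because T requests a fresh seed, results will differ between sessions, and they will not match numbers produced from the original MCL state. The graphics-related definitions would still need to be commented out separately.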

Soddy answered 13/3, 2009 at 2:16 Comment(0)
M
4

My guess is that the code is CCL-dependent, so use CCL instead of CLISP or SBCL. You can download it from here: http://trac.clozure.com/openmcl

Musculature answered 11/2, 2009 at 0:45 Comment(0)
K
2

Using the latest version of CCL on Linux x86, with this file saved as foo.lisp:

#+ccl (defun random-state (x y)
        (ccl::initialize-random-state x y))

(load "utilities.lisp")
(use-package 'rss-utilities)


(load "testbed.lisp")

(setup)
(init)

(print (runs 10 10 .1))

Running

~/svn/ccl/lx86cl -l foo.lisp

prints a bunch of warning messages and the desired answer of:

(-0.77201915 0.59691894 0.78171235 0.41514033 0.6744591 0.26383805 0.8981678 1.1274683 0.50265205 0.4081622)

To figure out the required #'random-state defun, I guessed that the “#.(RANDOM-STATE 64497 9)” was a serialized random-state object from MCL. To see how CCL handles that, I checked what MAKE-RANDOM-STATE outputs in CCL:

$ ~/svn/ccl/lx86cl 
Welcome to Clozure Common Lisp Version 1.3-r11936  (LinuxX8632)!
? (make-random-state)
#.(CCL::INITIALIZE-RANDOM-STATE 64497 9)
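
To make the #. guess concrete, here is a small sketch in portable Common Lisp. It shows that the reader evaluates a #. form at read time, and that a random state printed readably can be read back by the same implementation. The variable names are hypothetical, and the printed text differs between implementations.

```lisp
;; #. evaluates the following form at read time (when *read-eval* is
;; true), so the reader returns the form's value, not the form itself.
(defparameter *read-time-sum*
  (let ((*read-eval* t))
    (values (read-from-string "#.(+ 1 2)"))))

;; Printing a random state readably yields implementation-specific
;; text that the same implementation can read back in.
(defparameter *printed-state*
  (let ((*print-readably* t))
    (prin1-to-string (make-random-state t))))

;; Round trip: the printed text is read back into a random-state object.
(defparameter *restored-state*
  (let ((*read-eval* t))
    (values (read-from-string *printed-state*))))
```

Text printed by one implementation (such as the MCL form in utilities.lisp) is not guaranteed to be readable by another, which is why a shim or a replacement form is needed.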
Kizzee answered 13/4, 2009 at 3:5 Comment(0)
D
2

If you have never used lisp in a meaningful way, note that there is also Matlab code for "Reinforcement Learning: An Introduction".

Dowson answered 13/5, 2013 at 8:23 Comment(0)
P
0

In addition to Rainer Joswig's answer: once you install Clozure, you'll have to update references to the function RANDOM-STATE in utilities.lisp to random-mrg31k3p-state.

More specifically, replace #.(RANDOM-STATE 64497 9) with #.(ccl::random-mrg31k3p-state)

random-mrg31k3p-state seems to have replaced random-state sometime after the code was written; see l1-numbers.lisp?rev=13327

Polly answered 29/11, 2015 at 2:54 Comment(0)
