As an exercise, I rewrote the example program in the blog post Gibbs sampler in various languages (revisited) by Darren Wilkinson.
The code appears below. This code runs on my (5 year old) machine in around 53 seconds, using SBCL 1.0.56, creating a core image using buildapp, and then running it with
time ./gibbs > gibbs.dat
Since this was how the timings were calculated for the other languages in the post, I thought I would do something comparable The C code in the post runs in around 25 seconds. I'd like to try and speed up the Lisp code if possible.
##############################
gibbs.lisp
##############################
(eval-when (:compile-toplevel :load-toplevel :execute)
(require :cl-rmath) (setf *read-default-float-format* 'double-float))
(defun gibbs (N thin)
(declare (fixnum N thin))
(declare (optimize (speed 3) (safety 1)))
(let ((x 0.0) (y 0.0))
(declare (double-float x y))
(print "Iter x y")
(dotimes (i N)
(dotimes (j thin)
(declare (fixnum i j))
(setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4))))
(setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0)) (/ 1.0 (sqrt (+ (* 2 x) 2))))))
(format t "~a ~a ~a~%" i x y))))
(defun main (argv)
(declare (ignore argv))
(gibbs 50000 1000))
Then I built the executable gibbs
with calling sh gibbs.sh
with gibbs.sh
as
##################
gibbs.sh
##################
buildapp --output gibbs --asdf-tree /usr/share/common-lisp/source/ --asdf-tree /usr/local/share/common-lisp/source/ --load-system cl-rmath --load gibbs.lisp --entry main
I get 6 compiler notes when compiling with SBCL 1.0.56, which reproduce below. I'm not sure what to do about them, but would be grateful for any hints.
; compiling file "/home/faheem/lisp/gibbs.lisp" (written 30 MAY 2012 02:00:55 PM):
; file: /home/faheem/lisp/gibbs.lisp
; in: DEFUN GIBBS
; (SQRT (+ (* 2 X) 2))
;
; note: unable to
; optimize
; due to type uncertainty:
; The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT))
; &OPTIONAL), not a (VALUES FLOAT &REST T).
; (/ 1.0d0 (SQRT (+ (* 2 X) 2)))
;
; note: unable to
; optimize
; due to type uncertainty:
; The second argument is a (OR (DOUBLE-FLOAT 0.0)
; (COMPLEX DOUBLE-FLOAT)), not a (COMPLEX
; DOUBLE-FLOAT).
;
; note: forced to do static-fun Two-arg-/ (cost 53)
; unable to do inline float arithmetic (cost 12) because:
; The second argument is a (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)), not a DOUBLE-FLOAT.
; The result is a (VALUES (OR (COMPLEX DOUBLE-FLOAT) (DOUBLE-FLOAT 0.0))
; &OPTIONAL), not a (VALUES DOUBLE-FLOAT &REST T).
; (CL-RMATH:RGAMMA 3.0d0 (/ 1.0d0 (+ (* Y Y) 4)))
;
; note: doing float to pointer coercion (cost 13)
; (SQRT (+ (* 2 X) 2))
;
; note: doing float to pointer coercion (cost 13)
; (CL-RMATH:RNORM (/ 1.0d0 (+ X 1.0d0)) (/ 1.0d0 (SQRT (+ (* 2 X) 2))))
;
; note: doing float to pointer coercion (cost 13)
;
; compilation unit finished
; printed 6 notes
; /home/faheem/lisp/gibbs.fasl written
; compilation finished in 0:00:00.073
UPDATE 1: Rainer Joswig's answer pointed out that the argument of the SQRT might be negative, which was the source of the obscure compiler notes I was seeing, namely
The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT))
; &OPTIONAL), not a (VALUES FLOAT &REST T).
The compiler was complaining that since it didn't know whether the value of the argument was positive,
the result could be a complex number. Since in the example, the value x
is a sample variate from the gamma distribution,
it is always greater than 0. It was helpfully pointed out by Stephan in the SBCL user mailing list,
(see second message in the thread "Optimizing a simple Common Lisp Gibbs sampler program", that this could be resolved by declaring x to be greater than or zero as follows,
(declare (type (double-float 0.0 *) x))
See the Common Lisp Hyperspec for the relevant documentation about FLOAT types and Interval Designators.
This seems to speed up the code a bit. It is now reliably below 52 seconds, but still, not much of a gain. This also leaves the notes about
note: doing float to pointer coercion (cost 13)
If this is not fixable for some reason, I'd like to know why. Also, an explanation of what the note means would be interesting, regardless.
Specifically, what does the word pointer
mean here? Is this related to the fact that C functions are being called? Also, cost 13 doesn't seem
very useful. What is being measured?
Also, Rainer suggested that it might be possible to reduce consing, which might reduce runtime. I don't know whether either consing reduction is possible, or whether it would reduce runtime, but I'd be interested in opinions and approaches. Overall, it seems not much can be done to improve the performance of this function. Maybe it is too small and simple.
OPTIMIZE.
I tried usingTHE
, but don't see where it would be useful except on the RHS of thex
andy
assignments, and in those cases, I would expect that the compiler could deduce that all those expressions, that derive from doubles, are themselves doubles. I did try it, but it doesn't make any noticeable difference. For lack of better ideas, I'd like to get rid of the compiler notes, but I don't understand what they are telling me. – Alost