Running UTF-8 encoded scripts with Steel Bank Common Lisp

I am trying to run a Common Lisp script from the command line on Ubuntu 12.04, using SBCL 1.1.7. I start the script with:

$ sbcl --script <my-script.lisp>

Since the script is UTF-8 encoded, I get some error messages:

; compiling (DEFUN PRINT-USAGE ...)unhandled SB-INT:STREAM-DECODING-ERROR in thread #<SB-THREAD:THREAD
                                               "main thread" RUNNING
                                                {1002A39983}>:
:ASCII stream decoding error on
#<SB-SYS:FD-STREAM
    for "file ... .lisp"
    {10045745E3}>:

    the octet sequence #(194) cannot be decoded.

I guess the solution is to tell SBCL to treat the source file as UTF-8, but I cannot find anything in the documentation or on Google about how to do this.

Any hint?

Weldonwelfare asked 2/4, 2014 at 20:52

I'm not much of an SBCL hacker, but looking at toplevel.lisp, it appears that the code that handles --script is:

(defun process-script (script)
  (flet ((load-script (stream)
           ;; Scripts don't need to be stylish or fast, but silence is usually a
           ;; desirable quality...
           (handler-bind (((or style-warning compiler-note) #'muffle-warning)
                          (stream-error (lambda (e)
                                          ;; Shell-style.
                                          (when (member (stream-error-stream e)
                                                        (list *stdout* *stdin* *stderr*))
                                            (exit)))))
             ;; Let's not use the *TTY* for scripts, ok? Also, normally we use
             ;; synonym streams, but in order to have the broken pipe/eof error
             ;; handling right we want to bind them for scripts.
             (let ((*terminal-io* (make-two-way-stream *stdin* *stdout*))
                   (*debug-io* (make-two-way-stream *stdin* *stderr*))
                   (*standard-input* *stdin*)
                   (*standard-output* *stdout*)
                   (*error-output* *stderr*))
               (load stream :verbose nil :print nil)))))
    (handling-end-of-the-world
      (if (eq t script)
          (load-script *stdin*)
          (with-open-file (f (native-pathname script) :element-type :default)
            (sb!fasl::maybe-skip-shebang-line f)
            (load-script f))))))

It looks like the file is opened with (with-open-file (f (native-pathname script) :element-type :default) …). No :external-format is passed there, so the stream uses SBCL's default external format. According to the answer to "usockets: How do I specify the external format when I open a socket", the default encoding should be UTF-8, and a quick interactive test seems to confirm that:

CL-USER> sb-impl::*default-external-format*
:UTF-8

Your error message reports an :ASCII stream, though, so the default is evidently different in your environment. Depending on the order in which options are processed, you might be able to use an --eval option to set sb-impl::*default-external-format* before the script is processed, with a command line like:

$ sbcl --eval '(setf sb-impl::*default-external-format* …)' --script my-script.lisp
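If the option ordering turns out to be a problem, one alternative is to skip --script and hand the encoding to load directly, since the standard load function accepts an :external-format argument. This is only a sketch: the helper name run_utf8_script is made up for illustration, and it does not install the stream-error handling that process-script sets up, so it is not a drop-in replacement for --script.

```shell
# Hypothetical helper (the name is an assumption, not part of SBCL):
# load a Lisp source file as UTF-8 without depending on the locale or on
# SBCL's internal default external format.
# Assumes the path contains no double quotes or backslashes, because it is
# spliced into a Lisp string literal.
run_utf8_script() {
    script=$1
    # LOAD's :external-format keyword is standard Common Lisp.
    # --non-interactive disables the debugger and exits once loading finishes.
    sbcl --noinform --non-interactive \
         --eval "(load \"$script\" :external-format :utf-8)"
}

# Usage (requires sbcl on PATH):
#   run_utf8_script my-script.lisp
```

Passing the format explicitly keeps the fix local to this one invocation instead of changing a global default.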

That said, I'm not sure whether setting this internal variable is supported. According to a thread on comp.lang.lisp, "How to change external-format in SBCL (c-string encoding error)", the default encoding is determined by examining the environment, so there may be something you can set in the environment to get the encoding you need as the default. One response in that thread indicates that the following may work:

$ LC_CTYPE=en_US.UTF-8 
$ export LC_CTYPE
$ sbcl --script my-script.lisp
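One thing that can undermine the LC_CTYPE approach: under POSIX locale resolution, a non-empty LC_ALL overrides LC_CTYPE, which in turn overrides LANG, so a stray LC_ALL=C would still leave the encoding as ASCII. Assuming SBCL picks up its default through the C library's locale, a small check like the following shows which setting actually applies before you launch sbcl (the function name effective_ctype is made up for illustration):

```shell
# Hypothetical helper: print the locale setting that governs character
# encoding, following POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing set: the implementation default applies, typically POSIX/C.
        printf '%s\n' "POSIX"
    fi
}

# Show the value in effect for the current shell.
effective_ctype
```

If this prints C or POSIX even after exporting LC_CTYPE, check whether LC_ALL is set, and whether the UTF-8 locale you named is actually installed (locale -a lists the available ones).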
Mahmoud answered 2/4, 2014 at 21:58 Comment(2)
Setting sb-impl::*default-external-format* before loading the main application file works. Thanks a lot! - Weldonwelfare
I ran sbcl --eval '(setf sb-impl::*default-external-format* :UTF-8)' successfully on Mac OS X Yosemite, where the default was US-ASCII to start with. Org-babel with SLIME did this for me automatically, too. - Civilian
