How to dump an executable SBCL image that uses osicat
Asked Answered
L

2

5

I have a simple common lisp server program, that uses the osicat library to interface with the posix filesystem. I need to do this because the system creates symbolic links to files, and uses the POSIX stat metadata, and neither of those things are straightforward to do in portable lisp.

I am managing the dependencies with quicklisp, and I have all of this pinned to a working distribution. The app is portable between CCL and SBCL, and I tend to build it in the first and deploy it using the latter. I declare the dependencies for the app with an asdf defsystem, and I can use quicklisp to load it for easy development from local projects.

For deployment I was just using some ansible playbooks that replicated a developer environment on a remote (.e. setting up quicklisp, pushing code into local projects, running out of a user home directory) which was hacky, but mostly ok. More recently, as it's becoming more stable I have been compiling it using sb-ext:save-lisp-and-die, using a simple compile script. This means I get an executable that I can run more like a server, with service management scripts, and an anonymous user account.

This has been working very well, and so I recently moved this step to the next level, and I'm building .deb packages with my compile script, so I can bundle up everything into a relocatable binary. This also sort of works, but the resultant binaries are not relocatable from the original build host. They refuse to start up, and it appears that they try to dynamically load a shared library for osicat

Unhandled SIMPLE-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING
Mar 15 12:47:14 annie [479]:                                     {10005C05B3}>:
Mar 15 12:47:14 annie [479]:   Error opening shared object "libosicat.so":
Mar 15 12:47:14 annie [479]:   libosicat.so: cannot open shared object file: No such file or directory.

it looks like the image expects to find this in the original build tree's quicklisp archives

(ERROR "Error opening ~:[runtime~;shared object ~:*~S~]:~%  ~A." "/home/builder/buil...quicklisp/dists/quicklisp/software/osicat-20180228-git/posix/libosicat.so
(SB-SYS:DLOPEN-OR-LOSE #S(SB-ALIEN::SHARED-OBJECT :PATHNAME #P"

so poking around the source, I realise that when quicklisp fetches osicat and exercise its build operation, it compiles this DLL to wrap it's interface with the system libaries, rather than just ffi to them directly - possibly because it's using cffi groveller, I don't really know much about cffi (yet). This is fine, but rather than linking to a .so using the system linker it's trying to dlopen it from a fixed path, which is not very portable, and kind of breaks the usefulness of save-image

I'm a bit stumped at this point, but before I go diving any much further into QL and cffi builds, I wondered if there's some build or compile configuration I'm missing that would make it bootstrap in a more 'static' fashion or influence the production of the wrapped library. Ideally I just want a single blob I can wrap in an installer, and link it against system libraries, but if I have to deploy some additional artefacts that's probably alright. I don't know how to make the autogenerated shared objects occur at a more controlled path.

At that point though, I may as well write a .so for my posix calls and distribute this alongside the app and try and FFI to it more directly. That would be a bit of a pain, so I would prefer to not do this.

Lithopone answered 15/3, 2019 at 13:3 Comment(0)
T
4

You're right, when a dumped image is starting up, it is trying to reload the shared libraries. Which, as you are experiencing, is not working if the image is not starting on the machine it was dumped on.

This is almost what static-program-op set out to solve. A simple system definition like this should help you compile a static program:

(defsystem "foo"
  :defsystem-depends-on ("cffi-grovel")
  :build-operation "static-program-op" ; "asdf" package is implied
  :build-pathname "foo" ; path of the generated binary
  :entry-point "foo:main" ; function to use as the entry point
  ;; ... everything else ...
  )

If your system depends on grovel files (defined by :cffi-wrapper-file, :c-file or :o-file), such as the ones provided by osicat, then it will statically link those to your dumped image.

However, it's not perfect.

Essentially, there are still some issues. Some are being fixed upstream by CFFI itself (e.g. not reloading shared libraries of the statically embedded libraries), others are a bit harder. (E.g. SBCL default compilation options don't let you use static-program-op by default. This is being fixed in Debian builds of SBCL, but other distributions are being less responsive.)

This is obviously a problem that the community at large has met, and there are several libraries that are around to help:

  • The first one, that has been around for a while, is Deploy. The approach it takes is that it embeds the dumped image and the libraries into an archive, and rearranges things for the binary to load them from wherever it is extracted to.
  • The second one, which I am biased towards because I made it, is linux-packaging. It takes the approach of fixing static-program-op by extending it, but requires you to build a custom SBCL. However, it generates distribution packages like .deb and .rpm, in order to be able to specify dependencies for system shared libraries (e.g. if you depend on sqlite, it will figure out which package provides it and add it as a dependency in the .deb). I highly recommend looking at the .gitlab-ci.yml for examples.

I recommend reading the webpages of both of those libraries to make your choice, they both have their advantages and drawbacks. <joke>Obviously, linux-packaging is superior.</joke>

Tarantella answered 8/8, 2020 at 12:53 Comment(0)
G
2

Maybe you can use sb-posix:symlink and sb-posix:fstat on SBCL instead, removing the osicat dependency by feature toggle.

Gudgeon answered 15/3, 2019 at 14:18 Comment(2)
wow, i had no idea about sb-posix. that's a good tip. I would lose the portability for ccl, but I only use that in local dev, so I could flip it with reader conditionals. I guess that will also work for the dependencies in the defsystemLithopone
it was pretty simple to get the app to build with this change, and it's relocatable, so this is a useful work-around. It's a shame that it is not portable, though, there's already cruft happening in the build scripts to handle the differences.Lithopone

© 2022 - 2024 — McMap. All rights reserved.