Idiomatic way to ship command line tools written in Erlang

The problem

Most of the articles and books about Erlang that I could find focus on creating long-running, server-like applications and leave the creation of command line tools uncovered.

I have a multi-app rebar3 project consisting of 3 applications:

  • myweb - a Cowboy-based web service;
  • mycli - a command line tool to prepare assets for myweb;
  • mylib - a library used by both myweb and mycli, depends on a NIF.

As a result of the build, I want to get the following artifacts:

  1. an executable for the web part that is going to serve HTTP requests;
  2. an executable command line tool for the assets preparation;
  3. a set of libraries used by the above.

Requirements

  • CLI should behave like a sane non-interactive command line tool: handle arguments, deal with stdin/stdout, return non-zero exit code on error, etc;
  • both server and CLI should be able to use NIFs;
  • it should be easy to package the artifacts as a set of deb/rpm packages, so both server and CLI should reuse common dependencies.

Things tried so far

Building an escript

One of the ways I've seen in the wild is to create a self-contained escript file. At least rebar3 and relx do so. So I gave it a try.

Pros:

  • has support for command line arguments;
  • in case of errors, it returns non-zero exit code.

Cons:

  • embeds all the dependencies in a single file making it impossible to reuse mylib;
  • since *.so files get embedded into the resulting escript file, they cannot be loaded at runtime, thus NIFs don't work (see erlang rebar escriptize & nifs);
  • rebar3 escriptize doesn't handle dependencies well (see bug 1139).

Unknowns:

  • Should the CLI app become a proper OTP application?
  • Should it have a supervision tree?
  • Should it be started at all?
  • If so, how do I stop it when the assets have been processed?

Building a release

Another way to build a command line tool is described in the "How I Start: Erlang" article by Fred Hebert.

Pros:

  • each dependency application gets its own directory, making it easy to share and package them.

Cons:

  • there's no defined entry point like escript's main/1;
  • as a consequence both command line arguments and exit code must be handled manually.

Unknowns:

  • How to model the CLI OTP app in a non-interactive way?
  • How to stop the app when the assets have been processed?

Neither of the approaches above seems to work for me.

It would be nice to get the best of both worlds: get the infrastructure that is provided by escript such as main/1 entry point, command line parameters, and exit code handling while still having a nice directory structure that is easy to package and which doesn't hinder the use of NIFs.

Aegyptus answered 1/4, 2016 at 14:34 Comment(0)

Regardless of whether you are starting a long-running daemon-like application in Erlang or a CLI command, you always need the following:

  1. erts application - the VM and kernel in a particular version
  2. Erlang OTP applications
  3. Your applications' dependencies
  4. CLI entry point

In either case, the CLI entry point has to start the Erlang VM and execute the code that it is supposed to execute in a given situation. Then it will either exit or continue running, the latter for a long-running application.

The CLI entry point can be anything that starts an Erlang VM, e.g. an escript script, sh, bash, etc. The obvious advantage of escript over generic shell is that escript is already being executed in the context of an Erlang VM, so no need to handle starting/stopping the VM.

You can start Erlang VM in two ways:

  1. Use system-wide Erlang VM
  2. Use an embedded Erlang release

In the first case you supply neither erts nor any OTP application with your package; you only make a particular Erlang version a dependency of your application. In the second case you supply erts and all required OTP applications along with your application's dependencies in your package.

In the second case you also need to handle setting the code root properly when starting the VM. But this is quite easy, see the erl script that Erlang uses to start the system-wide VM:

# location: /usr/local/lib/erlang/bin/erl
ROOTDIR="/usr/local/lib/erlang"
BINDIR=$ROOTDIR/erts-7.2.1/bin
EMU=beam
PROGNAME=`echo $0 | sed 's/.*\///'`
export EMU
export ROOTDIR
export BINDIR
export PROGNAME
exec "$BINDIR/erlexec" ${1+"$@"}

This can be handled by scripts, for example the node_package tool that Basho uses to package their Riak database for all major operating systems. I maintain my own fork of it, which I use with my own build tool called builderl. I mention this only so you know that if I managed to customize it, you should be able to do so as well :)
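
As a rough sketch of what such a launcher has to do for an embedded release, the variables from the erl script above can be derived from the launcher's own location instead of being hard-coded. The helper names find_rootdir and erts_bindir, and the assumption that the launcher lives in <root>/bin, are made up for illustration; this is not what node_package or builderl actually generate.

```shell
#!/bin/sh
# Hypothetical helper: resolve the release root from the path of a launcher
# that is assumed to live in <root>/bin/.
find_rootdir() {
    # $1 is the path of the launcher script (normally "$0")
    script_dir=$(cd "$(dirname "$1")" && pwd)
    # The release root is the parent directory of bin/
    (cd "$script_dir/.." && pwd)
}

# Hypothetical helper: compute BINDIR from the root and the erts version,
# mirroring the <root>/erts-<vsn>/bin layout used by the erl script above.
erts_bindir() {
    # $1 = release root, $2 = erts version
    printf '%s/erts-%s/bin\n' "$1" "$2"
}

# Typical use inside a launcher (commented out so the sketch has no side effects;
# the erts version is a placeholder):
# ROOTDIR=$(find_rootdir "$0")
# BINDIR=$(erts_bindir "$ROOTDIR" "7.3")
# EMU=beam
# PROGNAME=$(basename "$0")
# export ROOTDIR BINDIR EMU PROGNAME
# exec "$BINDIR/erlexec" ${1+"$@"}
```

Because everything is computed relative to the launcher, the whole release directory can be unpacked anywhere by a deb/rpm package without editing the script.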

Once the Erlang VM is started, your application should be able to load and start any application, either supplied with Erlang or with your application (and that includes the mylib library that you mentioned). Here are some examples of how this could be achieved:

escript example

See this builderl.esh example of how I handle loading other Erlang applications from builderl. That escript assumes that the Erlang installation is relative to the folder from which it's executed. When it's a part of another application, for example humbundee, the load_builderl.hrl include file compiles and loads bld_load, which in turn loads all remaining modules with bld_load:boot/3.

Notice how I can use standard OTP applications without specifying where they are: builderl is executed by escript, so all the applications are loaded from where they were installed (/usr/local/lib/erlang/lib/ on my system). If libraries used by your application, e.g. mylib, are installed somewhere else, all you need to do is add that location to the Erlang code path, e.g. with code:add_path. Erlang will automatically load modules used in the code from folders added to the code path.

embedded Erlang

However, the same would hold if the application was a proper OTP release installed independently from the system-wide Erlang installation. That's because in that case the script is executed by the escript belonging to that embedded Erlang release rather than the system-wide version (even if one is installed). So it knows the location of all applications belonging to that release (including your applications). For example, riak does exactly that: in their package they supply an embedded Erlang release that contains its own erts and all dependent Erlang applications. That way riak can be started without Erlang even being installed on the host operating system. This is an excerpt from a riak package on FreeBSD:

% tar -tf riak2-2.1.1_1.txz
/usr/local/sbin/riak
/usr/local/lib/riak/releases/start_erl.data
/usr/local/lib/riak/releases/2.1.0/riak.rel
/usr/local/lib/riak/releases/RELEASES
/usr/local/lib/riak/erts-5.10.3/bin/erl
/usr/local/lib/riak/erts-5.10.3/bin/beam
/usr/local/lib/riak/erts-5.10.3/bin/erlc
/usr/local/lib/riak/lib/stdlib-1.19.3/ebin/re.beam
/usr/local/lib/riak/lib/ssl-5.3.1/ebin/tls_v1.beam
/usr/local/lib/riak/lib/crypto-3.1/ebin/crypto.beam
/usr/local/lib/riak/lib/inets-5.9.6/ebin/inets.beam
/usr/local/lib/riak/lib/bitcask-1.7.0/ebin/bitcask.app
/usr/local/lib/riak/lib/bitcask-1.7.0/ebin/bitcask.beam
(...)

sh/bash

This doesn't differ much in principle from the above apart from having to explicitly call the function that you want to execute when starting the Erlang VM (the entry point or the main function as you called it).

Consider this script that builderl generates to start an Erlang application just to execute a specified task (generate the RELEASES file), after which the node shuts down:

#!/bin/sh
START_ERL=`cat releases/start_erl.data`
APP_VSN=${START_ERL#* }
run_erl -daemon ../hbd/shell/ ../hbd/log "exec erl ../hbd releases releases/start_erl.data -config releases/$APP_VSN/hbd.config -args_file ../hbd/etc/vm.args -boot releases/$APP_VSN/humbundee -noshell -noinput -eval \"{ok, Cwd} = file:get_cwd(), release_handler:create_RELEASES(Cwd, \\\"releases\\\", \\\"releases/$APP_VSN/humbundee.rel\\\", []), init:stop()\""

This is a similar script, but it doesn't run any specific code or application. Instead, it starts a proper OTP release, so which applications are started and in what order depends on the release (specified by the -boot option).

#!/bin/sh
START_ERL=`cat releases/start_erl.data`
APP_VSN=${START_ERL#* }
run_erl -daemon ../hbd/shell/ ../hbd/log "exec erl ../hbd releases releases/start_erl.data -config releases/$APP_VSN/hbd.config -args_file ../hbd/etc/vm.args -boot releases/$APP_VSN/humbundee"
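
Both scripts rely on start_erl.data holding the erts version and the release version separated by a single space, and use shell parameter expansion to pick out the second field. A small sketch of that step in isolation; the sample value "7.3 0.0.1" is made up for illustration:

```shell
#!/bin/sh
# Sample content of releases/start_erl.data: "<erts version> <release version>".
# The concrete versions here are illustrative only.
START_ERL="7.3 0.0.1"

# Strip the shortest prefix matching "* " (everything up to the first space),
# leaving the release version, as the scripts above do.
APP_VSN=${START_ERL#* }

# Strip the shortest suffix matching " *" (the last space and what follows),
# leaving the erts version.
ERTS_VSN=${START_ERL% *}

echo "erts=$ERTS_VSN release=$APP_VSN"
```

The release version is then used to locate the boot and config files under releases/$APP_VSN/.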

In the vm.args file you can provide additional paths to your applications if required, e.g.:

-pa lib/humbundee/ebin lib/yolf/ebin deps/goldrush/ebin deps/lager/ebin deps/yajler/ebin

In this example the paths are relative, but they could be absolute if your application is installed into a standard well-known location. Also, this is only required if you are using the system-wide Erlang installation and need to add paths to locate your Erlang applications, or if your Erlang applications are located in a non-standard location (e.g. not in the lib folder, as Erlang OTP requires). In a proper embedded Erlang release, where the applications are located in the lib folder under the code root, Erlang is able to load those applications without any additional paths being specified.
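
When the applications do live outside the code root, the -pa entries can also be generated rather than maintained by hand. A minimal sketch, assuming every application sits in its own directory with an ebin subfolder; the function name pa_args and the directory names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print "-pa <libdir>/<app>/ebin" for every application
# directory under the given library directories that contains an ebin folder.
pa_args() {
    for libdir in "$@"; do
        for ebin in "$libdir"/*/ebin; do
            # Skip the unexpanded glob pattern when nothing matches
            [ -d "$ebin" ] || continue
            printf '%s %s ' -pa "$ebin"
        done
    done
}

# Example use (directory names are placeholders; the unquoted substitution
# relies on word splitting, so paths containing spaces are not supported):
# erl $(pa_args lib deps) -noshell -eval "..."
```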

Summing up and other considerations

The deployment of Erlang applications doesn't differ much from other projects written in scripting languages, e.g. ruby or python projects. All those projects have to deal with similar issues and I believe each operating system's package management deals with them one way or another:

  1. Get to know how your operating system deals with packaging projects that have run-time dependencies.

  2. See how other Erlang applications are packaged for your operating system. Plenty of them are distributed by all major systems: RabbitMQ, Ejabberd, and Riak, among others. Just download the package and unpack it to a folder, and you will see where all the files are placed.

EDIT - reference the requirements

Coming back to your requirements, you have the following choices:

  1. Install Erlang as an OTP release system-wide, as an embedded Erlang, or as a bag with applications in some random folders (sorry Rebar)

  2. You can have multiple entry points in the form of sh or escript scripts executing a selection of applications from the installed release. Both will work as long as you configured the code root and paths to those applications correctly (as outlined above).

Then each of your applications, myweb and mycli, would need to be executed in its own new context, i.e. start a new VM instance and execute the required application (from the same Erlang release). In the case of myweb the entry point can be an sh script that starts a new node according to the release (similar to Riak). In the case of mycli the entry point can be an escript, which finishes executing once the task is completed.

But it's entirely possible to create a short-running task that exits the VM even if it's started from sh - see the example above. In that case mycli would require separate release files - the script and boot to boot the VM. And of course it's also possible to start a long-running Erlang VM from escript.
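
To make such an sh entry point behave like an ordinary command line tool, the wrapper can pass its arguments after -extra and let the Erlang side choose the exit status. The following is only a sketch under assumptions: mycli:main/1 is a hypothetical function that is assumed to return an integer status, and the ERL variable exists solely so that the erl binary can be swapped (for an embedded release it would point at the release's own bin/erl).

```shell
#!/bin/sh
# Path to erl; defaults to whatever is on PATH.
ERL=${ERL:-erl}

run_mycli() {
    # -noshell: do not start the interactive Erlang shell.
    # -eval:    call the hypothetical mycli:main/1 with the plain arguments
    #           and halt the VM with the integer it returns.
    # -extra:   everything after it is left uninterpreted and is returned
    #           by init:get_plain_arguments/0.
    "$ERL" -noshell \
        -eval 'halt(mycli:main(init:get_plain_arguments()))' \
        -extra "$@"
}

# Typical use at the end of a launcher script:
# run_mycli "$@"
# exit $?
```

The exit status of erl becomes the exit status of the wrapper, so a non-zero return value from the Erlang code is visible to the calling shell.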

I provided an example project that uses all these methods at once, humbundee. Once it's compiled it provides three access points:

  1. The cmd release.
  2. The humbundee release.
  3. The builderl.esh escript.

The first one is used to start the node for installation and then shut it down. The second is used to start a long-running Erlang application. The third is a build tool to install/configure the node. This is what the project looks like once the release has been created:

$:~/work/humbundee/tmp/rel % ls | tr " " "\n"
bin
erts-7.3
etc
lib
releases

$:~/work/humbundee/tmp/rel % ls bin | tr " " "\n"   
builderl.esh
cmd.boot
humbundee.boot
epmd
erl
escript
run_erl
to_erl
(...)

$:~/work/humbundee/tmp/rel % ls lib | tr " " "\n"
builderl-0.2.7
compiler-6.0.3
deploy-0.0.1
goldrush-0.1.7
humbundee-0.0.1
kernel-4.2
lager-3.0.1
mnesia-4.13.3
sasl-2.7
stdlib-2.8
syntax_tools-1.7
yajler-0.0.1
yolf-0.1.1

$:~/work/humbundee/tmp/rel % ls releases/hbd-0.0.1 | tr " " "\n"
builderl.config
cmd.boot
cmd.rel
cmd.script
humbundee.boot
humbundee.rel
humbundee.script
sys.config.src

The cmd entry point will use the applications deploy-0.0.1 and builderl-0.2.7 as well as the release files cmd.boot and cmd.script, and some OTP applications. The standard humbundee entry point will use all applications apart from builderl and deploy. The builderl.esh escript will use the applications deploy-0.0.1 and builderl-0.2.7. All of them run from the same embedded Erlang OTP installation.

Babettebabeuf answered 1/4, 2016 at 17:15 Comment(5)
Thank you for the detailed answer. From what I understood, you suggest a kind of mixed approach: build a release so that each app gets its own directory and then use a small escript as an entry point. Did I get it right? - Aegyptus
Not really, I outlined all the ways known to me of achieving what you want :-) You can have a proper OTP release, a mixed bag of applications created by Rebar, or something in between. In each case you can achieve what you want by starting erl with the code root set correctly and with paths to all your applications. And I outlined how to do that using either escript or sh. An OTP release (embedded Erlang) is the easiest because its defaults require the least configuration. But it can also be used with either escript or sh in the way I outlined. I hope that makes sense? - Babettebabeuf
Yes, that makes sense, thank you. Though I'm still not sure what the idiomatic way is. Is there one at all? - Aegyptus
The idiomatic way depends on the operating system, but in general it follows the way Erlang itself is installed. Consider that Erlang is the VM (erts) + Erlang applications. Your application is exactly the same: Erlang VM + Erlang applications. And Erlang itself is a standard OTP release installed system-wide (with the Erlang executables available in the PATH). Riak follows suit, and I bet other Erlang applications are installed in a similar way. Riak then uses sh scripts to start the node, but you could use escript (as I proposed) if it suits you better. - Babettebabeuf
I have just added an edit to the answer. Maybe the rewording will help you understand it better. - Babettebabeuf

A small escript that then goes into code from 'conventional' modules could be a solution.

As an example, Concuerror is expected to be used as a command line tool and uses an escript as its entry point. It handles command-line arguments via getopt. All the main code is in regular Erlang modules, which are included in the path with simple arguments to the escript.

As far as I understand, NIFs can then be loaded with regular -on_load attributes (Concuerror does not use NIFs).

Oviform answered 1/4, 2016 at 16:28 Comment(0)
