RPC from C++ code to Common Lisp code

I have two codebases: one written in C++ and the other in Common Lisp. There is a particular functionality implemented in the Lisp codebase that I would like to access from my C++ code. I searched for Foreign Function Interfaces to call Lisp functions from C++, but couldn't seem to find any (I found FFIs for the other direction mostly). So I decided to implement some form of RPC that fits my requirements, which are:

  • both codes are going to run on the same machine, so extensibility to remote machine calls is not important.

  • the input from C++ is going to be a Lisp-style list, which is what the function from the Lisp code is going to take as input.

  • this call is going to be made 1000s of times per execution of the code, so performance per remote call is critical.

So far, I've learnt from various resources on the web that possible solutions are:

  • Sockets - set up an instance of the Lisp code that will listen for function calls from the C++ code, run the function on the given input, and return the result to the C++ code.

  • XML-RPC - set up an XML-RPC server on the Lisp side (which will be easy since I use Allegro Common Lisp, which provides an API that supports XML-RPC) and then use one of the many XML-RPC libraries for C++ to make the client-side call.

The pros and cons I see with these approaches seem to be the following:

  • Sockets are a low-level construct, so it looks like I would need to do most of the connection management, reading and parsing of the data on the sockets, etc. on my own.

  • XML-RPC seems to suit my needs much better, but I read that it always uses HTTP, and there is no way to use UNIX domain sockets. So, it feels like XML-RPC might be overkill for what I have in mind.

Does anyone have any experience in achieving a similar integration of codebases? Are there significant differences in performance between sockets and XML-RPC for local RPC? Any advice on which approach might be better would be extremely helpful. Suggestions for a different technique would also be appreciated.

EDIT: Here are a few more details on the shared functionality. There is a function f available in the Lisp code (which is complex enough to make reimplementation in C++ prohibitively expensive). It takes as input two lists L1 and L2. How I envision this happening is the following:

  • L1 and L2 are constructed in C++ and sent over to the Lisp side, and the C++ side waits for the results,
  • f is invoked on the Lisp side on inputs L1 and L2 and returns results back to the C++ side,
  • the C++ side takes in the results and continues with its computation.

The sizes of L1 and L2 are typically not big:

  • L1 is a list containing typically 100s of elements, each element being a list of at most 3-4 atoms.

  • L2 is also a list, containing < 10 elements, each element being a list of at most 3-4 atoms.

So the total amount of data per RPC is probably a string of 100s/1000s of bytes. This call is made at the start of each while loop in my C++ code, so it's hard to give concrete numbers on the number of calls per second. But from my experiments, I can say that it's typically done 10s-100s of times per second. f is not a numerical computation: it's symbolic. If you're familiar with AI, it's essentially doing symbolic unification in first-order logic. So it is free of side effects.

Reproval answered 19/11, 2013 at 20:6 Comment(4)
You should explain a bit more about the shared functionality....Handed
You may find this of use: common-lisp.net/projects/cffi/manual/html_node/…Scientism
Even with the edit, you don't explain enough about the shared functionality. What does it really do (in a few words); what are the actual data types (of remotely passed arguments, of received results)... How often do you call it...? Is it idempotent...? If the types are lists, what is the type of their elements?Handed
you can check cl-cxxMosul
H
1

There are many other ways to make two processes communicate. You could read the inter-process communication wikipage.

One parameter is whether the exchange is synchronous or asynchronous. Is your remote processing a remote procedure call (every request from the client gets exactly one response from the server), or is it asynchronous message passing (both sides send messages, with no notion of request and response; each side handles incoming messages as events)?

The other parameter is latency and bandwidth, i.e. the volume of data exchanged (per message and, e.g., per second).

Bandwidth does matter, even on the same machine. Of course, pipes or Unix sockets give you very high bandwidth, e.g. 100 megabytes/second. But there are scenarios where that might not be enough. In the pipe case, the data is usually copied (often twice) from memory to memory (e.g. from one process address space to another).

But you might consider e.g. CORBA (see e.g. CLORB on the lisp side, and this tutorial on OmniORB), or RPC/XDR, or XML-RPC (with S-XML-RPC on the lisp side), or JSON-RPC etc...

If you don't have a lot of data and don't need a lot of bandwidth (or many requests or messages per second), I would suggest using a textual protocol (perhaps serializing with JSON, YAML, or XML) because it is easier than a binary protocol (BSON, protobuf, etc.).

The socket layer is probably the simplest, as long as you have a multiplexing syscall like poll(2) on both the C++ and Lisp sides. It could use unix(7) AF_UNIX sockets, plain anonymous or named pipe(7)-s, or tcp(7), i.e. TCP/IP, which has the advantage of letting you distribute the computation across two machines communicating over a network. You need to buffer messages on both sides.
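To make the "buffer messages on both sides" point concrete, here is a minimal C++ sketch of the client-side buffering. It assumes one message per newline-terminated line and a blocking file descriptor (an AF_UNIX socket or a pipe), which is enough for a synchronous call; with poll(2) you would feed the same buffer whenever the descriptor becomes readable. The names `LineBuffer`, `send_message` and `receive_message` are made up for illustration and are not from any library.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Accumulates raw bytes from a stream and hands back complete,
// newline-terminated messages one at a time. A single read(2) may
// deliver half a message or several messages, hence the buffer.
class LineBuffer {
public:
    // Append bytes exactly as they arrived from read(2).
    void feed(const std::string& bytes) { pending_ += bytes; }

    // Extract the next complete message (without its '\n') into out.
    // Returns false if only a partial message is buffered so far.
    bool next_message(std::string& out) {
        const std::size_t nl = pending_.find('\n');
        if (nl == std::string::npos) return false;
        out.assign(pending_, 0, nl);
        pending_.erase(0, nl + 1);
        return true;
    }

private:
    std::string pending_;
};

// Write msg followed by '\n', retrying on short writes and EINTR.
bool send_message(int fd, const std::string& msg) {
    // Assumed framing convention: a message never contains a newline.
    assert(msg.find('\n') == std::string::npos);
    const std::string framed = msg + '\n';
    std::size_t off = 0;
    while (off < framed.size()) {
        const ssize_t n = ::write(fd, framed.data() + off, framed.size() - off);
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        off += static_cast<std::size_t>(n);
    }
    return true;
}

// Block until one complete message is available on fd.
// Returns false on a read error or if the peer closed the connection.
bool receive_message(int fd, LineBuffer& buf, std::string& out) {
    for (;;) {
        if (buf.next_message(out)) return true;
        char chunk[4096];
        const ssize_t n = ::read(fd, chunk, sizeof chunk);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) return false;
        buf.feed(std::string(chunk, static_cast<std::size_t>(n)));
    }
}
```

A remote call is then `send_message(fd, request)` followed by `receive_message(fd, buf, reply)`, keeping one `LineBuffer` per connection so that bytes read past the end of one reply are not lost.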

Maybe you want MPI (with CL-MPI on the lisp side).

We can't help you more unless you explain in much more detail what the "functionality" to be shared from C++ to Lisp is (what it does, how many remote calls per second, what volume and kind of data, what computation time, etc.). Is the remote function call idempotent or nullipotent? Does it have side effects? Is it a stateless protocol?

The actual data types involved in the remote procedure call matter a lot: it is much more costly to serialize a complex [mathematical] cyclic graph with shared nodes than a plain human-readable string.

Given your latest details, I would suggest using JSON. It is well suited to transmitting abstract-syntax-tree-like data. Alternatively, transmit just s-expressions (you may be left with the small issue of parsing them in C++, which is really easy once you have specified and documented your conventions; if your leaf or symbol names can contain arbitrary characters, you just need to define a convention to encode them).
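As a rough illustration of how small the C++ side of the s-expression route can be, here is a sketch of a printer and a recursive-descent parser for nested lists of atoms. It assumes a deliberately simple convention: atoms contain no whitespace and no parentheses, so no escaping is needed (arbitrary characters would need the extra encoding convention mentioned above). `SExpr`, `make_atom`, `make_list`, `to_text` and `parse_sexpr` are hypothetical names for this example. On the Lisp side, the standard `read-from-string` can read such text.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Either an atom (a bare token) or a list of nested expressions.
struct SExpr {
    bool is_atom = true;
    std::string atom;          // used when is_atom is true
    std::vector<SExpr> items;  // used when is_atom is false
};

SExpr make_atom(const std::string& name) {
    // Assumed convention: atoms never contain whitespace or parentheses.
    for (char c : name) {
        assert(!std::isspace(static_cast<unsigned char>(c)) && c != '(' && c != ')');
    }
    SExpr e;
    e.atom = name;
    return e;
}

SExpr make_list(const std::vector<SExpr>& items) {
    SExpr e;
    e.is_atom = false;
    e.items = items;
    return e;
}

// Print as text, e.g. (f ((p ?x a)) ()).
std::string to_text(const SExpr& e) {
    if (e.is_atom) return e.atom;
    std::string out = "(";
    for (std::size_t i = 0; i < e.items.size(); ++i) {
        if (i > 0) out += ' ';
        out += to_text(e.items[i]);
    }
    return out + ")";
}

static void skip_spaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
}

// Parse one expression starting at pos and advance pos past it.
// Returns false on malformed input (unbalanced parentheses, empty text).
bool parse_sexpr(const std::string& s, std::size_t& pos, SExpr& out) {
    skip_spaces(s, pos);
    if (pos >= s.size() || s[pos] == ')') return false;
    out = SExpr();
    if (s[pos] == '(') {
        ++pos;
        out.is_atom = false;
        for (;;) {
            skip_spaces(s, pos);
            if (pos >= s.size()) return false;  // missing ')'
            if (s[pos] == ')') { ++pos; return true; }
            SExpr child;
            if (!parse_sexpr(s, pos, child)) return false;
            out.items.push_back(child);
        }
    }
    const std::size_t start = pos;
    while (pos < s.size() && !std::isspace(static_cast<unsigned char>(s[pos])) &&
           s[pos] != '(' && s[pos] != ')') {
        ++pos;
    }
    out.atom = s.substr(start, pos - start);
    return true;
}
```

For the question's case, the request could be built as `make_list({make_atom("f"), L1, L2})` and printed with `to_text`, and the reply parsed with `parse_sexpr`.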

Handed answered 19/11, 2013 at 20:15 Comment(6)
Yes, I did see the IPC page. The kind of remote processing I'm interested in is a remote procedure call (some data from C++ is sent over to the Lisp server, the computation is done there and then sent back to the C++ code, which is waiting for the result). Bandwidth is of no importance since all computations are on the same machine. Thanks, I'll take a look at CORBA and JSON/YAML. The data sent across each time is not too large and like I said previously, bandwidth is of no importance since both codes run on the same machine.Reproval
To comment on cl-mpi - I tried it literally a couple of days ago. Unfortunately, the project seems to be abandoned. It still has bugs and it doesn't play well with recent versions of OpenMPI (at the time the project was started these were experimental). Even though in the end I could make it work, I only tried very basic things, and I'm not sure whether it is safe to use it for large-scale complex projects.Crammer
Yes, the socket-layer solution and XML-RPC (and S-XML-RPC) are what I've considered so far. The MPI approach also sounds promising. I'm not familiar with the usage, so I would need to read that in more detail though. @wvxvw: Ah, I see! Thanks for the heads-up.Reproval
@Basile: When you say transmit s-expressions, do you mean by any particular RPC protocol or just over plain UNIX sockets?Reproval
Just over plain Unix sockets (or even pipes). It is quite simple, and it is even simpler if you add the convention that the s-expression fits on a single (perhaps long) line ended with a newline. FWIW, MELT is doing nearly that. You can look inside the code (it is free software), or ask me for more details on [email protected] forum.Handed
Ah, all right. Thanks! I think I'm going to give the UNIX socket approach a shot first.Reproval
F
4

If you look at some Common Lisp implementations, their FFIs allow calling Lisp from the C side. That's not remote, but local. Sometimes it makes sense to include Lisp directly, and not call it remotely.

Commercial Lisps like LispWorks or Allegro CL can also deliver shared libraries, which you can use from your application code.

For example, define-foreign-callable allows a LispWorks function to be called from foreign (C) code.

Franz ACL can do it: http://www.franz.com/support/documentation/9.0/doc/foreign-functions.htm#lisp-from-c-1

Also something like ECL should be usable from the C side.

Fleury answered 19/11, 2013 at 20:27 Comment(2)
This works if both Lisp and C++ code are in the same process.Handed
@Rainer Joswig: The ACL documentation says: "The C functions must have been loaded into Lisp, and must have been called from Lisp." But this is problematic for me since I want to use the Lisp code simply as a server and do not want to have to couple the code bases any further, let alone call the C++ executable from Lisp. I did look into ECL, and I think the same difficulties come in there. Also, the Lisp code uses several ACL-specific constructs, which might break if I try to port it to ECL.Reproval
C
3

I've started working recently on a project that requires similar functionality. Here are some things I've researched so far with some commentary:

  • cl-mpi would in principle allow direct (albeit very low-level) inter-process communication, but encoding data is a nightmare! You have a very uncomfortable design on the C/C++ side (very limited, and there's no way around sending variable-length arrays). And on the other side, the Lisp library is both dated and seems to be at a very early stage in its development.

  • Apache Thrift, which is more of a language than a program. Slow, memory hog. Protobuf and BSON are the same. Protobuf might be the most efficient in this group, but you'd need to roll your own communication solution; it's only the encoding/decoding protocol.

  • XML, JSON, S-expressions. S-expressions win in this category because they are more expressive and one side already has a very efficient parser. Alas, this is even worse than Thrift / Protobuf in terms of speed / memory.

  • CFFI. Sigh... Managing pointers on both sides will be a nightmare. It is possible in theory, but must be very difficult in practice. This will also inevitably tax the performance of the Lisp garbage collector, because you would have to get in its way.

  • Finally, I switched to ECL. So far so good. I'm researching mmapped files as a means of sharing data. The conclusion I've reached so far is that this will be the way to go. At least I can't think of anything better at the moment.
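For readers wondering what the mmapped-file idea looks like on the C++ side, here is a minimal POSIX sketch. It is not the code from this answer, and `SharedRegion`, `put_text` and `get_text` are hypothetical names. It maps a file with MAP_SHARED so that another process mapping the same path sees the same bytes. It deliberately leaves out synchronization: the two processes would still need some way (a semaphore, a pipe, a socket) to tell each other when the data is ready.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// A file-backed region mapped with MAP_SHARED. Every process that maps
// the same path this way sees the same bytes.
class SharedRegion {
public:
    SharedRegion(const std::string& path, std::size_t size) : size_(size) {
        fd_ = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
        if (fd_ < 0) return;
        // Make sure the file is large enough to back the whole mapping.
        if (::ftruncate(fd_, static_cast<off_t>(size)) != 0) return;
        void* p = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
        if (p != MAP_FAILED) data_ = static_cast<char*>(p);
    }
    ~SharedRegion() {
        if (data_ != nullptr) ::munmap(data_, size_);
        if (fd_ >= 0) ::close(fd_);
    }
    SharedRegion(const SharedRegion&) = delete;
    SharedRegion& operator=(const SharedRegion&) = delete;

    bool ok() const { return data_ != nullptr; }
    char* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    int fd_ = -1;
    char* data_ = nullptr;
    std::size_t size_ = 0;
};

// Store text as a NUL-terminated string at the start of the region.
bool put_text(SharedRegion& r, const std::string& text) {
    if (!r.ok() || text.size() + 1 > r.size()) return false;
    std::memcpy(r.data(), text.c_str(), text.size() + 1);
    return true;
}

// Read back the NUL-terminated string stored at the start of the region.
std::string get_text(SharedRegion& r) {
    assert(r.ok());
    const void* nul = std::memchr(r.data(), '\0', r.size());
    const std::size_t len =
        nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - r.data())
            : r.size();
    return std::string(r.data(), len);
}
```

The appeal of this design is that the payload is not copied through the kernel on every call; the cost is that message layout and signalling become your own responsibility.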

Crammer answered 19/11, 2013 at 20:53 Comment(0)
