LLVM as base compiler for different languages

I am new to the LLVM compiler infrastructure, and I have the following thought. Clang is the LLVM front end for C/C++, and rustc plays the same role for the Rust programming language. Both can emit LLVM IR, and the emitted IR can be compiled into an executable application.

My question: is it possible to link code written in different programming languages? An example is shown below.

/* Code in C */
#include <stdio.h>

int add(int, int);  /* defined in the Rust source */

int main(void)
{
  printf("%d\n", add(5, 6));
  return 0;
}

The function would be defined in Rust, for example:

// Code in Rust
fn main()
{
  println!("{}", add(5, 6));
}

fn add (x: i32, y: i32) -> i32
{
  x + y
}

Once the IR is generated from both the source files, is it possible to link them and create a single application?

I am just curious to know whether this works. Please let me know.

Lecompte answered 7/7, 2016 at 14:33 Comment(3)
I think you need to have a Rust runtime somewhere... but the problem is different depending on whether you call C from Rust or the converse.Lengthways
Each source file has to be compiled to bitcode by its own front end first (clang for C, rustc for Rust). Afterwards, the llvm-link command can merge multiple bitcode files. If the call in one source file matches the definition in the other, it could work.Katlin
Both languages have to be ABI (Application Binary Interface) compatible.Katlin

Short answer: Yes.


Long answer: Yes, as long as some requirements are fulfilled.

There are two kinds of compatibility: API (Application Programming Interface) and ABI (Application Binary Interface). Essentially, the API dictates whether your program compiles, whereas the ABI dictates whether it links, loads, and runs.

Since Rust has a C FFI, Rust can emit code that interacts with C in the usual way (it follows the proper C ABI for the platform in question). This is evident in that a Rust binary can call a C library.
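
As a rough illustration of that last point, here is a minimal sketch of Rust calling a function from the platform's C library. It assumes the C standard library is linked in by default (as it is for ordinary Rust programs on mainstream targets); the wrapper name `c_abs` is made up for this example.

```rust
use std::os::raw::c_int;

// Declaration of `abs` from the platform's C standard library.
// The "C" string selects the C calling convention for these functions.
// (Assumption: pre-2024 edition syntax; the 2024 edition spells this
// `unsafe extern "C" { ... }`.)
extern "C" {
    fn abs(input: c_int) -> c_int;
}

// Hypothetical safe wrapper. rustc cannot verify the foreign signature,
// so the call itself has to sit inside an `unsafe` block.
pub fn c_abs(x: i32) -> i32 {
    unsafe { abs(x) }
}
```

Calling `c_abs(-5)` from a Rust `main` should give 5; the point is only that the two sides agree on how arguments and return values are passed.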

If you take the LLVM IR of that Rust binary, the LLVM IR of that C library, merge both together, and use LLVM to produce a new binary, then you'll get a single binary (no dependency).
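
For the direction asked about in the question (C calling Rust), the Rust side could look roughly like the sketch below. The file names and the build commands in the comments are assumptions for illustration, not a tested recipe; in particular, the LLVM version behind rustc and the one behind clang/llvm-link need to be compatible, and the final link may still need Rust's standard library.

```rust
// add.rs (hypothetical file name)

// `#[no_mangle]` keeps the symbol name `add`, so that the C declaration
// `int add(int, int);` resolves to it. `extern "C"` selects the C ABI.
// (On the 2024 edition the attribute is written `#[unsafe(no_mangle)]`.)
#[no_mangle]
pub extern "C" fn add(x: i32, y: i32) -> i32 {
    // wrapping_add sidesteps the overflow-panic path that `+` has in
    // debug builds, which keeps the emitted IR free of panic machinery.
    x.wrapping_add(y)
}

// Possible build steps, assuming a C file main.c that calls `add`:
//   rustc --crate-type=staticlib --emit=llvm-bc add.rs   (writes add.bc)
//   clang -c -emit-llvm main.c -o main.bc
//   llvm-link main.bc add.bc -o app.bc
//   clang app.bc -o app
```

Note that there is no Rust `main` here: the C side provides the entry point, and the Rust file only contributes the `add` symbol.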

So, the "only" requirement is that your two pieces of code must be able to link/load/run independently first.


Another way to obtain a single binary, which is independent of LLVM, is static linking; in Rust you can, for example, statically link against the musl implementation of the C standard library. The main advantage of merging at the LLVM IR level is that you can then run LLVM optimization passes on the merged IR, and therefore benefit from cross-language inlining (and other optimizations).

Standford answered 7/7, 2016 at 15:12 Comment(7)
I think the long answer should be sometimes, not necessarily yes, because an ABI is required in order to have interop. An ABI certainly requires some work to implement, so you can't simply link two different languages' IR outputs. Other than that, I think you gave a pretty good answer to the question.Knucklehead
@BennetLeff: That's exactly what the long answer actually says: if both pieces of code can load/run when in separate libraries, then you can merge their IRs and have them run (since you validated the ABI). Or do you mean that I should edit something because it's not clear enough?Standford
I think your answer can remain as is. However, to me it would be more clear if instead of "Long answer: yes..." it said "Long answer: sometimes..."Knucklehead
@BennetLeff: Updated; I kept the yes to make it clear it is possible, but immediately qualified that it was not automatic.Standford
@Matthieu M. But what about languages that do not have an FFI? Say, linking Go and Rust: both have an LLVM front end, so what then? Also, if possible, can you please show me the above example working?Lecompte
@Bharadwaj: If the languages do not have FFI, then no it is not possible. In essence, merging at LLVM IR level is only an optimization compared to building a static library/binary from multiple static libraries coming from different languages.Standford
@MatthieuM.: Thank you for putting it this way. This is correct and concise.Knucklehead

First, Rust and C can talk to each other, but only through Rust's FFI (Foreign Function Interface). For very basic functions, I imagine it would be possible to compile both languages to LLVM IR and get something working, but we're talking about hello-world-sized programs (and maybe not even that). In general, there must be some sort of ABI in place to do what you're suggesting. However, even with an ABI, the implementation is done at the front-end level.

Concisely put, LLVM can't represent all language-specific constructs. So you can't just link two programs' LLVM IR and hope it works. There must be some work done at the front end to ensure compatibility between the two languages.
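
One way to picture that front-end work, as a sketch only: a Rust `&str` is a (pointer, length) pair with no direct C counterpart, so a function meant to be called from C has to take a NUL-terminated `char *` and convert it explicitly. The function name `name_len` is hypothetical.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Callable from C as: size_t name_len(const char *name);
// The conversion from a C string to Rust's view of it is written by hand
// at the boundary; nothing in the IR does it automatically.
// (On the 2024 edition the attribute is written `#[unsafe(no_mangle)]`.)
#[no_mangle]
pub unsafe extern "C" fn name_len(name: *const c_char) -> usize {
    if name.is_null() {
        return 0;
    }
    // Caller must pass a valid NUL-terminated string; this scans up to
    // the terminator and reports the length without it.
    unsafe { CStr::from_ptr(name).to_bytes().len() }
}
```

The same kind of explicit agreement is needed for structs (`#[repr(C)]`), enums, and anything else whose layout the two languages would otherwise choose differently.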

Knucklehead answered 7/7, 2016 at 14:49 Comment(0)
