Using LLVM bytecode for libraries (instead of native object files)

What are the implications for

  • portability (calling convention: does it really matter at the LLVM level when only calling into C or OS library functions?)
  • link time
  • optimizations

I would like to compile a toy language with LLVM, since the hard parts (optimization, object code generation) are already there. However, I am struggling with one concept that I'd like to keep if it is worth it: library files should be portable, redistributable, and usable as both static and shared libraries (in the shared case, a real .so or DLL would be generated when the final app is linked). I believe this would cut compilation time, since native code generation, and perhaps optimization, would be done only once, at final binary link time. I envision the linker taking care of calling conventions (if possible) and of the conversion to a shared library if requested. As a more far-fetched addition, perhaps the LLVM JIT could run the generated bitcode directly without linking at all, completely removing link times while writing code.

Does this sound

  1. Doable?
  2. Worth it? I know C/C++ link time is comparatively long, which is problematic when rebuilding frequently. What about free link-time optimization (cf. /GL and -flto), since it will essentially be LLVM bitcode being linked together and then turned into a native binary?

This may be a vague question; if I need to clarify something, please ask.

Theressathereto answered 18/2, 2012 at 14:33 Comment(2)
Link time longer than what? AFAIK link time depends mostly on the number of symbols per compilation unit that must be resolved (i.e. that originate in other compilation units), not on the language. I'm also not sure that LLVM bitcode automagically brushes away calling conventions; if it did, C++ code compiled with LLVM couldn't access anything not compiled with LLVM. - Annapurna
@MarcovandeVoort: LLVM takes care of the calling convention when it generates native object code. If I only call LLVM code (so no OS libraries), everything will follow the same LLVM-generated calling convention. C/C++ code is compiled with a certain calling convention in mind in the first place; Clang-generated LLVM bitcode is not platform-independent. I don't see why my toy language needs to care about C calling conventions internally. - Theressathereto

I have done something similar to this in the past. One thing you should realize is that LLVM bitcode is not "portable", in that it is not completely machine independent. Bitcode files encode details, such as the size of pointers, that are specific to the processor being targeted.
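
To make the pointer-size point concrete, here is a minimal C sketch. The names `struct node`, `node_size`, `value_offset`, and `check_layout` are made up for illustration. A C front end evaluates `sizeof` and `offsetof` before it emits any IR, so the bitcode for these functions ends up holding plain integer constants that are only valid for the target it was compiled for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: its size depends on the target's pointer
   width and on the width of long. */
struct node {
    struct node *next;
    long value;
};

/* The front end folds sizeof/offsetof into integer constants, so the
   generated IR simply returns a target-specific number. */
size_t node_size(void) { return sizeof(struct node); }
size_t value_offset(void) { return offsetof(struct node, value); }

/* Relations that hold on every target, unlike the concrete numbers. */
void check_layout(void) {
    assert(node_size() >= sizeof(struct node *) + sizeof(long));
    assert(value_offset() >= sizeof(struct node *));
}
```

On a typical 64-bit Linux target `node_size()` is 16, while on 32-bit x86 it is 8. Bitcode produced for one of those targets keeps that number, which is one reason it cannot simply be reused for the other.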

Having said that, I have compiled programs and their support libraries to bitcode and linked the bitcode files together before generating an assembly file for the whole program. You're right that calling conventions aren't important for internal calls, but calls made to the outside (or from outside) still require that the ABI be followed.
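
As a rough illustration of the internal/external distinction, here it is in C terms. The function names are hypothetical; in a toy language the same split would exist between functions private to a module and those it exports or imports.

```c
#include <assert.h>
#include <stdio.h>

/* Internal linkage: nothing outside this translation unit can call it,
   so the optimizer is free to inline it or to use a non-C calling
   convention for it. */
static int square(int x) { return x * x; }

/* External linkage, plus a call into libc: the incoming call and the
   outgoing snprintf call both cross a boundary where the platform C ABI
   has to be followed. */
int format_square(char *buf, size_t len, int x) {
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%d", square(x));
}
```

Only `format_square` and `snprintf` are visible to code outside the module, so only those calls are constrained by the ABI.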

You may be able to design your toy language in such a way that you can avoid processor-dependent bitcode, but you'll have to be very careful.

I noticed that linking the bitcode files together took quite a while, especially at high optimization levels. That may have sped up since; I did this with the LLVM of 2 or 3 years ago.

One final point: depending on the target processor, you'll probably need the equivalent of libgcc.a or compiler-rt to handle operations, such as floating point or 64-bit integer arithmetic, for which the processor has no instructions.
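
A small C sketch of where such a runtime dependency comes from (the function name `div_u64` is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit unsigned division. On a target without a native 64-bit divide
   instruction (32-bit x86, for example) compilers typically lower this
   to a call to a runtime helper such as __udivdi3, which libgcc or
   compiler-rt must supply at link time. */
uint64_t div_u64(uint64_t a, uint64_t b) {
    assert(b != 0); /* division by zero is undefined behavior in C */
    return a / b;
}
```

The source contains no library call at all, yet the final link can still fail with an undefined symbol if the runtime library is missing for the chosen target.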

Architectural answered 18/2, 2012 at 17:24 Comment(2)
Thanks for sharing your experience. I do indeed plan for the linker to "know" about the calling conventions of OS libraries (which will be C), and I have considered the compiler-rt stuff too (for exception support, for example), but that's still a long way off. It also seems that the speed of llvm-ld is being actively improved over time (example). - Theressathereto
I figured the type system rewrite would help. I should try whole-program compilation again; the concept is pretty cool. :-) - Architectural
