Source-to-source compilation with LLVM [closed]

I need to convert x86 assembly source code to a human-readable LLVM .ll file (i.e. LLVM assembly language). How can I do this? If there is no direct solution, would it be possible to implement one within the LLVM infrastructure with as little effort as possible?

I guess the solution I'm looking for would be some kind of counterpart to llc that converts a .s file back to the .ll representation.

Chiapas answered 26/1, 2012 at 5:20 Comment(5)
This question has already been asked and answered. There is no direct solution, for many reasons (e.g. indirect branches). You might find projects like llvm-qemu and libcpu useful. In any case, this question is a duplicate of #6982310. - Nakada
Thank you. I've already taken a look at the projects you mentioned. Unfortunately, llvm-qemu looks dead, and libcpu seems to go its own way in parsing assembly rather than using LLVM's infrastructure (so its support for the x86 ISA appears to be incomplete). Actually, I thought the tool I'm looking for should do the work of LLVM's AsmPrinter in the reverse direction, translating native ISA instructions into LLVM's MachineInstr or LLVM-MC's MCInst. - Chiapas
And what about LLVM's subproject llvm-mc? It has an AsmParser class that can consume a .s file and generate a representation of it based on the MCInst class. In that case, the only remaining part is to reverse what the MCLowering class does and go back to LLVM's MachineInstr-based representation. - Chiapas
MachineInstr != LLVM IR. MI is still machine code. Consider, for example, the instruction "jmp [eax]": which LLVM IR instruction(s) would you convert it into? - Nakada
For example, I would be interested in a restricted x86/x86_64 -> LLVM converter that can disassemble only a limited set of x86/x86_64 instructions, but enough to reassemble a hello world program and some computational algorithms. - Punishment

For those who are still looking for more information on this topic, I want to share an ongoing project (http://dslab.epfl.ch/proj/s2e) that I found on the web. The project has two components:

  1. x86-to-LLVM backend for dynamic translation of x86 machine code to LLVM IR
  2. RevGen tool for static analysis of x86 binaries, capable of translating inline x86 assembly to LLVM IR

Here is a description of the RevGen prototype: RevGen takes as input an x86 binary and outputs an equivalent LLVM module in three steps. First, RevGen looks for all executable blocks of code and converts them to LLVM translation blocks. Second, when there are no more translation blocks to cover, RevGen transforms them into basic blocks and rebuilds the control flow graph of the original binary in LLVM format. Third, RevGen resolves external function calls to build the final LLVM module. For dynamic analysis, a last step links the LLVM module with a run-time library that allows the execution of the LLVM module.
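
To make the second step more concrete, here is a minimal Python sketch of splitting a linear instruction stream into basic blocks and rebuilding a control flow graph. This is not RevGen's code: the `Insn` tuple, the toy mnemonics ("jmp", "jcc", "ret") and the function names are all assumptions made up for illustration, and no LLVM IR is emitted. It only shows the classic "leader" technique under those assumptions.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Insn(NamedTuple):
    """Hypothetical decoded instruction (not an LLVM or RevGen type)."""
    addr: int
    mnemonic: str            # toy mnemonics: "jmp", "jcc", "ret", or anything else
    target: Optional[int]    # static branch target; None if absent or indirect


def find_leaders(insns: List[Insn]) -> List[int]:
    """Return the addresses that start a basic block."""
    addrs = {i.addr for i in insns}
    leaders: Set[int] = {insns[0].addr}          # the entry point is a leader
    for pos, insn in enumerate(insns):
        if insn.mnemonic in ("jmp", "jcc", "ret"):
            if insn.target in addrs:             # a known branch target is a leader
                leaders.add(insn.target)
            if pos + 1 < len(insns):             # so is whatever follows a branch
                leaders.add(insns[pos + 1].addr)
    return sorted(leaders)


def build_cfg(insns: List[Insn]) -> Dict[int, List[int]]:
    """Map each block's start address to its successor block addresses."""
    leader_set = set(find_leaders(insns))
    cfg: Dict[int, List[int]] = {}
    current = insns[0].addr
    for pos, insn in enumerate(insns):
        if insn.addr in leader_set:
            current = insn.addr
            cfg[current] = []
        nxt = insns[pos + 1].addr if pos + 1 < len(insns) else None
        if nxt is not None and nxt not in leader_set:
            continue                             # still in the middle of a block
        if insn.mnemonic in ("jmp", "jcc") and insn.target in leader_set:
            cfg[current].append(insn.target)     # taken-branch edge
        if insn.mnemonic not in ("jmp", "ret") and nxt is not None:
            cfg[current].append(nxt)             # fall-through edge
    return cfg


program = [
    Insn(0x00, "cmp", None),
    Insn(0x02, "jcc", 0x08),
    Insn(0x04, "mov", None),
    Insn(0x06, "jmp", 0x0A),
    Insn(0x08, "mov", None),
    Insn(0x0A, "ret", None),
]
print(build_cfg(program))
```

Note that in this sketch a jump whose `target` is `None` (an indirect jump) simply gets no successor edge, which is exactly where a purely static approach loses information.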

Chiapas answered 30/1, 2012 at 6:2 Comment(2)
Those tools are for working with programs that are already assembled, unless you're saying either of them can produce LLVM bitcode/IR from an x86 .ASM file? - Kilbride
@techzilla Have you found anything that starts from x86 source code? - Tumblebug