For anyone still looking for more information on this topic, I want to share an ongoing project that I found on the web (http://dslab.epfl.ch/proj/s2e). The project has two components:
- an x86-to-LLVM backend for dynamic translation of x86 machine code to LLVM IR
- the RevGen tool for static analysis of x86 binaries, capable of translating inline x86 assembly to LLVM IR
Here is a description of the RevGen prototype:
RevGen takes as input an x86 binary and outputs an equivalent LLVM module in three steps. First, RevGen looks for all executable blocks of code and converts them to LLVM translation blocks. Second, when there are no more translation blocks to cover, RevGen transforms them into basic blocks and rebuilds the control flow graph of the original binary in LLVM format. Third, RevGen resolves external function calls to build the final LLVM module. For dynamic analysis, a last step links the LLVM module with a run-time library that allows the execution of the LLVM module.
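To make the second step more concrete, here is a minimal sketch in Python of how overlapping translation blocks could be split into basic blocks with control-flow edges. This is a toy model and not RevGen's actual code: instructions are represented only by their integer addresses, and the names `Block` and `split_into_basic_blocks` are hypothetical. It assumes that a translation block may start in the middle of another one, so a block has to be cut wherever another block begins.

```python
from dataclasses import dataclass


@dataclass
class Block:
    start: int    # address of the first instruction
    addrs: list   # instruction addresses covered, in order
    succs: list   # start addresses of successor blocks


def split_into_basic_blocks(tbs):
    """Cut translation blocks at every address where another block starts."""
    # A "leader" is any address that begins a block or is a jump target.
    leaders = {tb.start for tb in tbs}
    for tb in tbs:
        leaders.update(tb.succs)

    blocks = {}
    for tb in tbs:
        current = None
        for addr in tb.addrs:
            if current is None or addr in leaders:
                if current is not None:
                    current.succs = [addr]  # fall-through edge to the next block
                if addr in blocks:
                    # The rest of this translation block is already covered.
                    current = None
                    break
                current = Block(addr, [], [])
                blocks[addr] = current
            current.addrs.append(addr)
        if current is not None:
            # The last piece inherits the translation block's own successors.
            current.succs = list(tb.succs)
    return blocks


# Hypothetical input: the second block starts in the middle of the first.
tbs = [
    Block(0, [0, 1, 2, 3, 4], [10]),
    Block(2, [2, 3, 4], [10]),
    Block(10, [10, 11], []),
]
blocks = split_into_basic_blocks(tbs)
for start in sorted(blocks):
    b = blocks[start]
    print(start, b.addrs, b.succs)
```

In this example the first translation block is cut at address 2, which leaves three basic blocks with no overlapping instructions. A real tool would additionally have to decode instructions, handle indirect jumps, and emit LLVM IR for each block, none of which is modelled here.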