How to write a custom intermodular pass in LLVM?
I've written a standard Analysis pass in LLVM, by extending the FunctionPass class. Everything seems to make sense.

Now I'd like to write a couple of intermodular passes, that is, passes that allow me to analyze more than one module at a time. The purpose of one such pass is to construct a call graph of the entire application. The other pass would implement an idea I have for an optimization involving function calls and their parameters.

I know about interprocedural passes in LLVM, via extending the ModulePass class, but that only allows analysis within a single module.

I know about link time optimization (LTO) in LLVM, but (a) I'm not quite clear if this is what I want and (b) I've found no examples or documentation on how to actually write an LTO pass.

How can I write an intermodular pass, i.e., a pass that has access to all the modules in an application, in LLVM?

Conundrum answered 12/5, 2015 at 18:1 Comment(0)

I found one way to achieve my goal: write a simple program that uses llvm::parseBitcodeFile() to read in a bitcode file and create a Module object that can be traversed and analyzed. It's not ideal, because it's not a Pass that can be run within the LLVM framework. However, it is a way to achieve my goal of analyzing multiple modules at once.

For future readers, here's what I did.

Create a simple tool to read in a bitcode file and produce a Module

//ReadBitcode.cpp
#include <iostream>
#include "llvm/IR/Module.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/Bitcode/ReaderWriter.h"

using namespace llvm;

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        std::cerr << "Usage: " << argv[0] << " bitcode_filename" << std::endl;
        return 1;
    }
    StringRef filename = argv[1];
    LLVMContext context;

    ErrorOr<std::unique_ptr<MemoryBuffer>> fileOrErr = MemoryBuffer::getFileOrSTDIN(filename);
    if (std::error_code ec = fileOrErr.getError())
    {
        std::cerr << "Error opening input file: " + ec.message() << std::endl;
        return 2;
    }

    ErrorOr<llvm::Module *> moduleOrErr = parseBitcodeFile(fileOrErr.get()->getMemBufferRef(), context);
    if (std::error_code ec = moduleOrErr.getError())
    {
        std::cerr << "Error reading Module: " + ec.message() << std::endl;
        return 3;
    }

    Module *m = moduleOrErr.get();
    std::cout << "Successfully read Module:" << std::endl;
    std::cout << " Name: " << m->getName().str() << std::endl;
    std::cout << " Target triple: " << m->getTargetTriple() << std::endl;

    for (auto iter1 = m->getFunctionList().begin(); iter1 != m->getFunctionList().end(); iter1++)
    {
        Function &f = *iter1;
        std::cout << "  Function: " << f.getName().str() << std::endl;
        for (auto iter2 = f.getBasicBlockList().begin(); iter2 != f.getBasicBlockList().end();
             iter2++)
        {
            BasicBlock &bb = *iter2;
            std::cout << "    BasicBlock: " << bb.getName().str() << std::endl;
            for (auto iter3 = bb.begin(); iter3 != bb.end(); iter3++)
            {
                Instruction &i = *iter3;
                std::cout << "      Instruction: " << i.getOpcodeName() << std::endl;
            }
        }
    }
    return 0;
}

Compile the tool

$ clang++ ReadBitcode.cpp -o reader `llvm-config --cxxflags --libs --ldflags --system-libs`

Create a bitcode file to analyze

$ cat foo.c 
int my_fun(int arg1){
    int x = arg1;
    return x+1;
}

int main(){
    int a = 11;
    int b = 22;
    int c = 33;
    int d = 44;
    if (a > 10){
        b = c;
    } else {
        b = my_fun(d);
    }
    return b;
}

$ clang -emit-llvm -o foo.bc -c foo.c

Run the reader tool on the bitcode

$ ./reader foo.bc
Successfully read Module:
 Name: foo.bc
 Target triple: x86_64-pc-linux-gnu
  Function: my_fun
    BasicBlock: 
      Instruction: alloca
      Instruction: alloca
      Instruction: store
      Instruction: load
      Instruction: store
      Instruction: load
      Instruction: add
      Instruction: ret
  Function: main
    BasicBlock: 
      Instruction: alloca
      Instruction: alloca
      Instruction: alloca
      Instruction: alloca
      Instruction: alloca
      Instruction: store
      Instruction: store
      Instruction: store
      Instruction: store
      Instruction: store
      Instruction: load
      Instruction: icmp
      Instruction: br
    BasicBlock: 
      Instruction: load
      Instruction: store
      Instruction: br
    BasicBlock: 
      Instruction: load
      Instruction: call
      Instruction: store
      Instruction: br
    BasicBlock: 
      Instruction: load
      Instruction: ret
Conundrum answered 13/5, 2015 at 13:5 Comment(2)
So can you perform analysis using this method?Browband
@VanushVee Yes. It's not quite the same as writing a standard opt pass, because the analysis is not scheduled by a PassManager, and there is no way to pass the result to other built-in opt passes. But, it still works and you have access to all the LLVM backend classes to do whatever you'd like.Conundrum

This can be done using a module pass. Below is my code, and if you need help running it you can look here.

bar.c

int your_fun(int arg2) {
    int x = arg2;
    return x+2;
}

Skeleton.cpp

#include "llvm/Pass.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"
using namespace llvm;

namespace {
  struct SkeletonPass : public ModulePass {
    static char ID;
    SkeletonPass() : ModulePass(ID) {}

    virtual bool runOnModule(Module &M) {
        for (auto& F : M) {
            errs() << "\tFunction: " << F.getName() << "\n";

            for (auto& BB : F) {
                errs() << "\t\tBasic Block: " << BB.getName() << "\n";

                for (auto& I : BB) {
                    errs() << "\t\t\tInstruction: " << I.getOpcodeName() << "\n";
                }
            }
        }

        return false;
    }
  };
}

char SkeletonPass::ID = 0;

// Automatically enable the pass.
// http://adriansampson.net/blog/clangpass.html
static void registerSkeletonPass(const PassManagerBuilder &,
                         legacy::PassManagerBase &PM) {
  PM.add(new SkeletonPass());
}

static RegisterStandardPasses RegisterMyPass(PassManagerBuilder::EP_ModuleOptimizerEarly,
                                                registerSkeletonPass);

static RegisterStandardPasses RegisterMyPass1(PassManagerBuilder::EP_EnabledOnOptLevel0,
                                                registerSkeletonPass);

Output:

| => clang -Xclang -load -Xclang build/skeleton/libSkeletonPass.so foo.c bar.c
Module: foo.c!
        Function: my_fun!
            Basicblock: entry!
            Instruction: alloca
            Instruction: alloca
            Instruction: store
            Instruction: load
            Instruction: store
            Instruction: load
            Instruction: add
            Instruction: ret
        Function: main!
            Basicblock: entry!
            Instruction: alloca
            Instruction: alloca
            Instruction: alloca
            Instruction: alloca
            Instruction: alloca
            Instruction: store
            Instruction: store
            Instruction: store
            Instruction: store
            Instruction: store
            Instruction: load
            Instruction: icmp
            Instruction: br
            Basicblock: if.then!
            Instruction: load
            Instruction: store
            Instruction: br
            Basicblock: if.else!
            Instruction: load
            Instruction: call
            Instruction: store
            Instruction: br
            Basicblock: if.end!
            Instruction: load
            Instruction: ret
Module: bar.c!
        Function: your_fun!
            Basicblock: entry!
            Instruction: alloca
            Instruction: alloca
            Instruction: store
            Instruction: load
            Instruction: store
            Instruction: load
            Instruction: add
            Instruction: ret

Output if you include a header file linking to bar.c:

Module: foo.c!
        Function: your_fun!
            Basicblock: entry!
            Instruction: alloca
            Instruction: alloca
            Instruction: store
            Instruction: load
            Instruction: store
            Instruction: load
            Instruction: add
            Instruction: ret
        Function: my_fun!
            Basicblock: entry!
            Instruction: alloca
            Instruction: alloca
            Instruction: store
            Instruction: load
            Instruction: store
            Instruction: load
            Instruction: add
            Instruction: ret
        Function: main!
            Basicblock: entry!
            Instruction: alloca
            Instruction: alloca
            Instruction: alloca
            Instruction: alloca
            Instruction: alloca
            Instruction: store
            Instruction: store
            Instruction: store
            Instruction: store
            Instruction: store
            Instruction: load
            Instruction: icmp
            Instruction: br
            Basicblock: if.then!
            Instruction: load
            Instruction: store
            Instruction: br
            Basicblock: if.else!
            Instruction: load
            Instruction: call
            Instruction: store
            Instruction: load
            Instruction: call
            Instruction: store
            Instruction: br
            Basicblock: if.end!
            Instruction: load
            Instruction: ret
Biographer answered 29/3, 2016 at 17:55 Comment(4)
I don't think this is what I'm looking for. I want a pass that can run on multiple modules at the same time; what you've provided only runs on a single module at a time. (Think: what if I added bar.c?)Conundrum
You can run llvm-link on all of the bitcode files to link them together. I've never tried to run the pass on them together, but I can try it and see what the results look like.Biographer
I made edits to my answer that will hopefully answer your question. You can either link the LLVM bitcode files yourself beforehand using llvm-link, or compile the project the way you would compile any project with multiple files. Both will result in the same output as above, but bar.c won't be treated as a separate module, because a module represents a source file.Biographer
What is the command line/invocation of the last example? Thanks!Bernardabernardi

In LTO all the modules are combined and you can see the whole program IR in one module.

You need to write a module pass like any other module pass and add it to the list of LTO passes in the populateLTOPassManager function in PassManagerBuilder.cpp. Here is the doc for PassManagerBuilder: http://llvm.org/docs/doxygen/html/classllvm_1_1PassManagerBuilder.html

When you do this, your pass will be executed with other LTO passes.

Jedidiah answered 13/12, 2016 at 1:45 Comment(1)
how would you do that with a dynamically loaded pass? Is it possible?Bernardabernardi
