Adding intrinsics using an LLVM pass

I've added an intrinsic call to some input code using an LLVM pass. I can see the intrinsic call in the output, but I can't figure out how to compile the code for my target architecture (x86_64). I'm running the following command:

clang++ $(llvm-config --ldflags --libs all) ff.s -o foo

But the linker complains about undefined references:

/tmp/ff-2ada42.o: In function `fact(unsigned int)':
/home/rubens/Desktop/ff.cpp:9: undefined reference to `llvm.x86.sse3.mwait.i32.i32'
/tmp/ff-2ada42.o: In function `fib(unsigned int)':
/home/rubens/Desktop/ff.cpp:16: undefined reference to `llvm.x86.sse3.mwait.i32.i32'
/home/rubens/Desktop/ff.cpp:16: undefined reference to `llvm.x86.sse3.mwait.i32.i32'
/home/rubens/Desktop/ff.cpp:16: undefined reference to `llvm.x86.sse3.mwait.i32.i32'

Despite using ldflags from llvm-config, the compilation does not proceed. Any ideas on what should be done for the code to compile properly?

To generate the assembly code, I've done the following:

# Generating optimized code
clang++ $(llvm-config --cxxflags) -emit-llvm -c ff.cpp -o ff.bc
opt ff.bc -load path/to/mypass.so -mypass > opt_ff.bc

# Generating assembly
llc opt_ff.bc -o ff.s

I'm currently using llvm version 3.4.2; clang version 3.4.2 (tags/RELEASE_34/dot2-final); gcc version 4.9.2 (GCC); and Linux 3.17.2-1-ARCH x86_64.


Edit: adding the IR with the intrinsic:

File ~/llvm/include/llvm/IR/IntrinsicsX86.td:

...
// Thread synchronization ops.
let TargetPrefix = "x86" in {  // All intrinsics start with "llvm.x86.".
    def int_x86_sse3_monitor : GCCBuiltin<"__builtin_ia32_monitor">,
              Intrinsic<[], [llvm_ptr_ty,
                         llvm_i32_ty, llvm_i32_ty], []>;
    def int_x86_sse3_mwait : GCCBuiltin<"__builtin_ia32_mwait">,
              Intrinsic<[], [llvm_i32_ty,
                         llvm_i32_ty], []>;
}
...

And calls (from file ff.s):

...
.Ltmp2:                                       
    callq   llvm.x86.sse3.mwait.i32.i32   
    movl    $_ZStL8__ioinit, %edi         
    callq   _ZNSt8ios_base4InitC1Ev       
    movl    $_ZNSt8ios_base4InitD1Ev, %edi
    movl    $_ZStL8__ioinit, %esi         
    movl    $__dso_handle, %edx           
    callq   __cxa_atexit                  
    popq    %rax                          
    ret                                   
...

Edit 2: Here's how I'm adding the intrinsic during the opt pass:

Function *f(bb->getParent());
Module *m(f->getParent());

std::vector<Type *> types(2, Type::getInt32Ty(getGlobalContext()));
Function *mwait = Intrinsic::getDeclaration(m, Intrinsic::x86_sse3_mwait, types);

std::vector<Value *> args;
IRBuilder<> builder(&bb->front());
for (uint32_t i : {1, 2}) args.push_back(builder.getInt32(i));

ArrayRef<Value *> args_ref(args);
builder.CreateCall(mwait, args_ref);
Margarettmargaretta answered 19/12, 2014 at 16:21 Comment(3)
Can you share the LLVM IR containing both the call to the intrinsic and its declaration?Mallarme
@MichaelHaidl I've added the requested info. I was expecting the intrinsic calls to be expanded into the associated builtins, but the call remains in the assembly file after compilation.Margarettmargaretta
I was talking about the LLVM IR. You can use llvm-dis to make your .bc files readable, or pass -S to opt. It would also be interesting to see how you add the intrinsic and the call in your opt pass. Currently it looks like the function being called is not an intrinsic, just a function with the same name as the LLVM intrinsic.Mallarme

EDIT: I am currently writing an LLVM pass that does basically what you tried to do in this question. The problem with your code is the following:

std::vector<Type *> types(2, Type::getInt32Ty(getGlobalContext()));
Function *mwait = Intrinsic::getDeclaration(m, Intrinsic::x86_sse3_mwait, types);

Because you pass a type list, you are requesting the declaration of an intrinsic named llvm.x86.sse3.mwait.i32.i32, and no such intrinsic exists. However, llvm.x86.sse3.mwait does exist, so you have to write this instead:

Function *mwait = Intrinsic::getDeclaration(m, Intrinsic::x86_sse3_mwait);

Notice the missing type argument in the call. It is omitted because llvm.x86.sse3.mwait is not overloaded.
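The effect of the extra type list can be illustrated with a small standalone sketch. This is a simplified model written for this answer, not LLVM's actual implementation: the helper name mangleIntrinsicName is hypothetical, and it assumes only that each overload type passed in is appended to the base name as a ".<type>" suffix, which matches the symbol seen in the linker errors.

```cpp
#include <string>
#include <vector>

// Simplified model (NOT LLVM's code): build the name of an intrinsic
// declaration by appending one ".<type>" suffix per overload type.
// mangleIntrinsicName is a hypothetical helper used only for illustration.
std::string mangleIntrinsicName(const std::string &base,
                                const std::vector<std::string> &overloadTypes) {
    std::string name = base;
    for (const std::string &ty : overloadTypes) {
        name += "." + ty;  // each overload type becomes a name suffix
    }
    return name;
}
```

Under this model, calling the helper with {"i32", "i32"} yields the unknown name llvm.x86.sse3.mwait.i32.i32 from the question, while an empty list leaves llvm.x86.sse3.mwait untouched, which is the name the backend knows how to lower.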

I hope you figured it out in the meantime.


OK, since I won't be able to answer you for a while, here is a wild guess.

The problem is the way you add the intrinsic in your optimizer pass. It looks like you are just creating a function with the same name as the intrinsic, not the intrinsic itself.

Here is a small piece of C++ code that uses the clang built-in to get the intrinsic into the IR (I use clang 3.5, but this should not make a difference).

int main ()
{
    __builtin_ia32_mwait(4,2);
}

Compiling it with clang -emit-llvm -S I get:

; ModuleID = 'intrin.cpp'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind uwtable
define i32 @main() #0 {
  call void @llvm.x86.sse3.mwait(i32 4, i32 2)
  ret i32 0
}

; Function Attrs: nounwind
declare void @llvm.x86.sse3.mwait(i32, i32) #1

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 "}

Please note that the SSE3 intrinsic has no type overloads, unlike in your version.

Running llc on the generated file gives me:

.Ltmp2:
        .cfi_def_cfa_register %rbp
        movl    $4, %ecx
        movl    $2, %eax
        mwait
        xorl    %eax, %eax
        popq    %rbp
        retq

Proper assembly was created.
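A quick way to tell the two outcomes apart is to look for call instructions in the llc output whose target still starts with "llvm.". The sketch below is only an illustrative text scan under that assumption; the function name findUnloweredIntrinsicCalls is hypothetical and it is not part of any LLVM tool.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: scan assembly text line by line and collect the
// targets of call/callq instructions whose symbol starts with "llvm.".
// Such a call suggests the symbol was treated as an ordinary external
// function instead of being lowered as an intrinsic.
std::vector<std::string> findUnloweredIntrinsicCalls(const std::string &asmText) {
    std::vector<std::string> hits;
    std::istringstream lines(asmText);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream tokens(line);
        std::string mnemonic, operand;
        if (!(tokens >> mnemonic >> operand)) {
            continue;  // blank line, label, or instruction without operand
        }
        const bool isCall = (mnemonic == "call" || mnemonic == "callq");
        if (isCall && operand.compare(0, 5, "llvm.") == 0) {
            hits.push_back(operand);
        }
    }
    return hits;
}
```

Applied to the ff.s excerpt from the question, this would report llvm.x86.sse3.mwait.i32.i32; applied to the listing above, where the mwait instruction was emitted directly, it would report nothing.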

So I assume the way you are introducing the intrinsic into the function is wrong in your opt pass.

Get the intrinsic function and call it:

// mwait is not overloaded, so no type list is passed to getDeclaration.
Function* func = Intrinsic::getDeclaration(/* module */, Intrinsic::x86_sse3_mwait);
CallInst* call = CallInst::Create(func, /* arguments */);

Mallarme answered 19/12, 2014 at 23:32 Comment(8)
Thanks for the reply. I'm using pretty much the same method you pointed out to insert the intrinsic function. Do you see any pitfall in my code that may be preventing the intrinsic from being expanded in the final assembly? I feel like I'm missing some flag or argument when running llc, since the function llvm.x86.sse3.mwait.i32.i32 appears in my assembly.Margarettmargaretta
Well, this is strange. Try to get the intrinsic declaration with nothing in the types vector. Maybe the type overloading is the problem. If it's not, you can look at the -mcpu or -mattr command line flags of llc.Mallarme
Would you mind adding which flags you've used to generate the assembly code with llc? That may shed some light on why I'm not getting the function body expanded.Margarettmargaretta
I just used llc input.ll -o input.SMallarme
@Margarettmargaretta I updated my answer, I hope you have figured it out in the meantime on your own.Mallarme
Works like a charm! Thank you very much for the reply! :D Btw, I'm seriously considering proposing an llvm entry to stackexchange. What do you think?Margarettmargaretta
@Margarettmargaretta +1 for that idea!Mallarme
Proposed area51.stackexchange.com/proposals/82389/… Hope you join the group! :DMargarettmargaretta