I want to design an IR (like LLVM IR) for my toy compiler, and I don't understand what the purpose of the alloca
instruction is for later analysis. In which optimizations is the information from alloca
used?
TL;DR
The purpose of the alloca
instruction is to make generating code for imperative languages with mutable variables easier. Without it, you would have to construct SSA form for all your assignments yourself, and with mutable variables this can become quite complex.
Example
Let's look at this simple C program (taken from the LLVM docs) and focus on the variable X and its assignments. As you can see, its final value (which we return) depends on which branch of the if is taken.
int G, H;
int test(_Bool Condition) {
  int X;
  if (Condition)
    X = G;
  else
    X = H;
  return X;
}
Translating the example program to LLVM IR without alloca
Now, if we were to translate this program to LLVM IR, we would get something like:
@G = weak global i32 0 ; type of @G is i32*
@H = weak global i32 0 ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  br label %cond_next

cond_next:
  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ret i32 %X.2
}
Because there are two possible values for X before the return instruction, we have to insert a phi node to merge them. Why is that? Well, LLVM requires all register values to be in SSA form, and the phi node is the only way to merge values arriving from different predecessor blocks. However, SSA construction with phi nodes requires non-trivial algorithms and is inconvenient and wasteful to re-implement in every compiler front end.
So, you might wonder how we solve this problem. And the answer, as you might have already guessed, is alloca!
Translating the example program to LLVM IR with alloca
The ‘trick’ here is that while LLVM does require all register values to be in SSA form, it does not require (or permit) memory objects to be in SSA form. With this in mind, the high-level idea is that we want to make a stack variable (which lives in memory) for each mutable object in a function. Using that, our code now becomes:
@G = weak global i32 0 ; type of @G is i32*
@H = weak global i32 0 ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  %X = alloca i32 ; type of %X is i32*
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  store i32 %X.0, i32* %X ; Update X
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  store i32 %X.1, i32* %X ; Update X
  br label %cond_next

cond_next:
  %X.2 = load i32, i32* %X ; Read X
  ret i32 %X.2
}
With this, we have discovered a way to handle arbitrary mutable variables without the need to create Phi nodes at all!
Further (for the curious):
While this solution has solved our immediate problem, it introduced another one: we have now apparently introduced a lot of stack traffic for very simple and common operations, a major performance problem. Fortunately for us, the LLVM optimizer has a highly-tuned optimization pass named “mem2reg” that handles this case, promoting allocas like this into SSA registers, inserting Phi nodes as appropriate.
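To see this concretely, running the alloca version through mem2reg (for example, opt -passes=mem2reg -S on the file; exact value names may differ) yields IR essentially equivalent to the phi version we wrote by hand above:

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  br label %cond_next

cond_next:
  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] ; phi inserted by mem2reg
  ret i32 %X.2
}

So a front end can always take the easy road (one alloca plus loads/stores per mutable variable) and rely on mem2reg to produce the efficient SSA form.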
From the language reference: the ‘alloca’ instruction allocates memory on the stack frame of the currently executing function, to be automatically released when this function returns to its caller. The object is always allocated in the address space for allocas indicated in the datalayout.
The ‘alloca’ instruction is commonly used to represent automatic variables that must have an address available.
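For reference, a few forms the instruction can take (illustrative examples in the style of the language reference; the names are arbitrary):

%ptr = alloca i32                 ; one i32 slot
%arr = alloca i32, i32 4          ; room for four i32 values
%big = alloca i32, align 1024     ; one i32 with 1024-byte alignment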
Some notes from the LLVM IR book:
The contents of an entire LLVM file, either assembly or bitcode, are said to define an LLVM module. The module is the LLVM IR top-level data structure. Each module contains a sequence of functions, which contains a sequence of basic blocks that contain a sequence of instructions. The module also contains peripheral entities to support this model, such as global variables, the target data layout, and external function prototypes as well as data structure declarations.
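A rough sketch of that structure might look like this (the identifiers here are made up purely for illustration):

; one module: a global, an external prototype, and a function
@counter = global i32 0              ; global variable

declare i32 @putchar(i32)            ; external function prototype

define void @bump() {                ; a function ...
entry:                               ; ... containing one basic block ...
  %v = load i32, i32* @counter       ; ... which contains instructions
  %v1 = add i32 %v, 1
  store i32 %v1, i32* @counter
  ret void
}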
So the alloca instruction (in my understanding) is just there to support local variables in this model.
For example, take the following code:
int sum(int a, int b) {
  return a + b;
}
In IR (unoptimized, compiled with debug info) it looks like:
; Function Attrs: noinline nounwind uwtable
define i32 @sum(i32, i32) #0 !dbg !6 {
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  store i32 %0, i32* %3, align 4
  call void @llvm.dbg.declare(metadata i32* %3, metadata !10, metadata !11), !dbg !12
  store i32 %1, i32* %4, align 4
  call void @llvm.dbg.declare(metadata i32* %4, metadata !13, metadata !11), !dbg !14
  %5 = load i32, i32* %3, align 4, !dbg !15
  %6 = load i32, i32* %4, align 4, !dbg !16
  %7 = add nsw i32 %5, %6, !dbg !17
  ret i32 %7, !dbg !18
}
The alloca instruction reserves space on the stack frame of the current function. The amount of space is determined by the element type size, and it respects the specified alignment. The first instruction, %3 = alloca i32, align 4, allocates a 4-byte stack slot with 4-byte alignment; a pointer to that slot is stored in the local identifier %3. The alloca instruction is commonly used to represent local (automatic) variables.
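To tie this back to mem2reg from the answer above: running this function through the pass (for example, opt -passes=mem2reg -S; value numbers may come out differently, and debug metadata is omitted here) removes the allocas, stores, and loads entirely:

define i32 @sum(i32, i32) {
  %3 = add nsw i32 %0, %1   ; the arguments %0 and %1 are used directly
  ret i32 %3
}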
(Source of the quoted notes: Introducing LLVM Intermediate Representation)