What is LLVM IR?
-
LLVM IR stands for low-level virtual machine intermediate representation.
-
An universal and architecture-independent IR for compiler optimization, code generation, and program analysis
-
All LLVM family compilers first compiler source programs into LLVM IR, then performs optimizations and target code generations.
Advantages of Using LLVM IR
-
Utilize existing code optimization passes in LLVM framework to generate fast code
-
Utilize existing backends in LLVM framework to generate binary code for different architectures
-
Utilize existing program analysis tools built on LLVM IR for your compiler
LLVM IR At a Glance
C Program Language | LLVM IR |
Scope:file, function | module, function |
Type:bool, char, int, struct{int, char} | i1, i8, i32, {i32, i8} |
A statement with multiple expressions | A sequence of instructions each of which is in a form of “x = y op z” |
Data-flow: a sequence of reads/writes on variables | 1. load the values of memory addresses(variables) to registers; 2. Compute the values in registers; 3. Store the values of registers to memory addresses each register must be assigned exactly once (SSA) |
Control-flow in a function: if, for, while, … | A set of basic blocks each of which ends with a conditional jump (or return) |
LLVM IR Example
Source Code | IR | Note |
#include <stdio.h> |
||
int x, y; | @x = common global i32 0, align 4 @x = common global i32 0, align 4 |
@ stands for global variable In C int is in 32 bit thus they are assigned with i32 Initialize them with 0 4 bytes alignment |
int main(){ | define i32 @main() #0 entry: |
define: A function with body entry: The entry basic block the name does have to be entry declare: A function with declaration only |
int t; | %t = alloca i32, align 4 |
% stands for local alloca: instruction for allocating a temporary register |
scanf(“%d %d”, &x, &x); | %call = call i32 (i8*, ...)* @__isoc99_scanf (...i32* @x, i32* @y) |
call: command for calling a function call follows function signature and then arguments |
t = x - y; | %0 = load i32* @x, align 4 %1 = load i32* @y, align 4 %sub = sub nsw i32 %0 %1 store i32 %sub, i32* %t, align 4 |
load: Instruction for loading register value sub: Instruction for subtraction nsw: No signed wrap store: Instruction for storing value into register |
if (t > 0) | %2 = load i32* %t, align 4 %cmp = icmp sgt i32 %2, 0 br i1 %cmp, label %if.then, label %if.end |
icmp:Instruction for performing integer comparison sgt: signed greater than br: Instruction for branching br condition: Conditional jump based on condition |
printf(“x > y”) | if.then: %call1 = call i32 ... br label %if.end |
|
return 0; | if.end: ret i32 0 |
ret: Instruction for returning |
Content
LLVM IR Architecture
- RISC-like instruction set
- Only 31 op-codes exist
- Most instructions are in three-address form
- Load/store architecture
- Memory can be accessed via load/store instruction
- Computational instructions operate on registers
- Infinite and typed virtual registers
- It is possible to declare a new register any point
- (The backend maps virtual registers to physical ones).
- A register is declared with a primitive type (boolean, int, float, pointer)
- It is possible to declare a new register any point
Static Single Assignment (SSA)
-
In SSA, each variable is assigned exactly once, and every variable is defined before its uses.
- Conversion
- For each definition, create a new version of the target variable (left-hand side) and replace the target variable with the new variable
- For each use, replace the original referred variable with the versioned variable reaching the use point
Original Code SSA form x = y + x $x_1$ = $y_0$ + $x_0$ y = x + y $y_1$ = $x_1$ + $y_0$ if (y > 0) if $(y_1 > 0)$ x = y; $x_2$ = $y_1$ else else x = y + 1 $x_3$ = $y_1 + 1$ - LLVM IR follows the SSA form:
- Every virtual register can be only assigned once in the program.
- Memory loads/stores are not affected by this rule.
- Multiple assignments to a same variable have to mapped to multiple virtual registers.
- Therefore, llvm::Value is the base class of all LLVM IR elements.
- Every instruction is uniquely associated with the value it defines (assigns to)
- How to handle local variables being modified in different branches
- Option1: Use alloca instruction to allocate stack memory space to hold local variables and then use memory loads/stores
- Option2: Use phi instruction
SSA and Phi Functions
- Use $\phi$ function if two versions of a variable are reaching one use point at a joining basic block
-
$\phi(x_1, x_2)$ returns either $x_1$ or $x_2$ depending on which block was executed
Original Code SSA form x = y + x $x_1$ = $y_0$ + $x_0$ y = x + y $y_1$ = $x_1$ + $y_0$ if (y > 0) if $(y_1 > 0)$ x = y; $x_2$ = $y_1$ else else x = y + 1 $x_3$ = $y_1 + 1$ y = x - y $x_4$ = $\phi(x_2, x_3)$
$y_2 = x_4 - y_1$
-
Data Representation
Primitive Types
-
Language independent primitive types with predefined size
void void bool i1 integers i[N] where N is to $2^{23} - 1$
i8, i16, i32, i1942652floating-point types half (16-bit floating point value)
float (32-bit floating point value)
double (64-bit floating point value) -
Pointer type is a form of <type>* (e.g. i32*, (i32*)*)
Constants
-
Boolean(i1): true and false
-
Integer: standard integers including negative numbers
-
Floating point: decimal notation, exponential notation, or hexadecimal notation (IEEE753 Std.)
-
Pointer: null is treated as a special value
Registers
-
Named registers & Unnamed registers
- A register has a function-level scope.
- Two registers in different functions may have the same identifier
- A register is assigned for a particular type and a value at its first (and the only) definition (SSA form)
Variables
-
In LLVM, all addressable objects are explicitly allocated
- Global variables
- Each variable has a global scope symbol that point to the memory address of the object
- Local variables
- The alloca instruction allocates memory in the stack frame.
- Deallocated automatically if the function returns
- Heap variables
- The malloc function call allocates memory on the heap
- The free function call frees the memory allocated by malloc
Load and Store Instructions
- Load
- <result> = load <type>* <ptr>
- result: The target register
- type: The type of the data (a pointer type)
- ptr: The register that has the address of the data
- Store
- store <type> <value>, <type>* <ptr>
- type: The type of the value
- value: Either a constant, or a register that holds the value
- ptr: The register that has the address where the data should be stored
Variable Example
Source Code | IR |
#include <stdlib.h> int g = 0; int main() { int t = 0; int * p; p = malloc(sizeof(int)); free(p); } |
@g = global i32 0 define i32 @main() { %t = alloca i32 store i32 0, i32* %t %p = alloca i32* %call = call noalias i8* @malloca(i64 4) %0 = bitcast i8* %call to i32* store i32* %0, i32** %p %1 = load i32** %p |
Aggregate Types and Function Type
- Array: [<# of elements> x <type>]
- Single Dimensional Array: [40 x i32], [4, x i8]
- Multi Dimensional Array: [3 x [4 x i8]], [12 x [10 x float]]
- Structure: type {<a list of types>}
- E.g., type{i32, i32, i32}, type{i8, i32}
- Function: <return type> (a list of parameter types)
- E.g. i32 (i32), float (i16, i32*)*
Getelementptr Instruction
-
A memory in an aggregate type variable can be accessed by load/store instruction and getelementptr instruction that obtains the pointer to the element
-
Syntax:
<res> = getelementptr <pty>* <ptrval> {, <t> <idex>}*
- res: the target register
- pty: the register that defines the aggregate type
- ptrval: the register that points to the data variable
- t: the type of index
- idx: the index value
Aggregate Example
Source Code | IR |
struct pair{ int first; int second;} int main(){ int arr[10]; struct pair a; a.first = arr[1] } |
%struct.pair = type {i32, i32} define i32 @main () { entry: $arr = alloca [10 x i32] $a = alloca %struct.pair %arrayidx = get elementptr [10 x i32]* %arr, i32 0, i64 1 %0 = load i32* %arraridx %first = getelementptr %struct.pair* %a, i32 0, i32 0 %store i32 %0. i32* %first |
Aggregate Example2
Source Code | IR |
struct RT { char A; int B[10][20]; char C}; struct ST {int X; double Y; struct RT Z;}; int *foo(struct ST* s) {return &s[1].Z.B[5][13]} |
%struct.RT = type {i8, [10 x [20 x i32]], i8} %struct.ST = type {i32, double, %struct.RT} define i32* @foo(%struct.ST* %s){ enrty: %arrayidx = getelementptr inbounds %struct.ST* %s i64 1, i32 2, i32 1, i64 5, i64 13 |
Integer Conversion
Truncate
Syntax: <res> = trunc <iN1> <value> to <iN2> where iN1 and iN2 are of integer type, and N1 > N2 Example:
%X = turnc i32 257 to i8
%8 becomes i8:1%Y = trunc i32 123 to i1
%Y becomes i1:true%Z = turnc i32 122 to i1
%Z becomes i1:false
Zero Extension
Syntax: <res> = zext <iN1> <value> to <iN2> where iN1 and iN2 are interger types and N1 < N2
Fill the remaining bits with zero
Examples:
%X = zext i32 257 to i64
%Y = zext i1 true to i32
Sign Extension
Syntax: <res> = sext <iN1> <value> to <iN2> where iN1 and iN2 are interger types and N1 < N2
Fill the remaining bits with the sign bit(the highest order bit) of value
Examples:
%X = sext i8 -1 tp i16
%X becomes i16:65535%Y = sext i1 true to i32
%Y becomes i32
Other Conversions
- Float-to-float
- fptrunc .. to
- fpext .. to
- Float-to-integer
- fptoui .. to
- tptosi .. to
- uitofp .. to
- sitofp .. to
- Pointer-to-integer
- ptrtoint .. to
- inttoptr .. to
- Bitcast
- res = bitcast <t1> value to <t2>
- where t1 and t2 should be different types and have the same size
Computational Instructions
- Binary Operations:
- Add: add, sub, fsub
- res = add [nuw] [nsw] <iN> <op1>, <op2>
- nuw (no unsigned wrap): if unsigned overflow occurs, the result value becomes a poison value(undefined)
- E.g:
add nuw i8 255, i8 1
- E.g:
- nsw (no signed wrap): if signed overflow occurs, the result value becomes a poison value
- E.g:
add nsw i8 127, i8 1
- E.g:
- Multiplication: mul, fmul
- Division: udiv, sdiv, fdiv
- Remainder: urem, srem, frem
- Add: add, sub, fsub
- Bitwise binary operations:
- Shift operations: shl, lshl, ashr
- Logical operations: and, or, xor
Control Flow Representation
- The LLVM front-end constructs the control flow graph(CFG) of every function explicitly in LLVM iR
- A function has a set of basic blocks each of which is a sequence of instructions
- A function has exactly one entry basic block
- Every basic block is ended with exactly one terminator instruction which explicitly specifies its successor basic blocks if there exist
- Terminator instructions: branches (conditional, unconditional), return. unwind, invoke
- Due to its simple control flow structure, it is convenient to analyze, transform the target program in LLVM IR
Label, Return, and Unconditional Branch
- A label is located at the start of a basic block
- Each basic block is addresses as the start label
- A label x is referenced as register %x whose type is label
- The label of the entry block of a function “entry”
-
Return: ret <type> <value> | ret void
- Unconditional branch: br label <dest>
- At the end of a basic block, this instruction makes a transition to the basic block starting with label <dest>
- E.g: br label %entry
Conditional Branch
- Syntax: <res> = icmp <cmp> <ty> <op1> <op2>
- Returns either true of false based on comparison of two variables (op1 and op2) of the same type
- cmp: comparison option
- eq(equal), ne(not equal), ugt(unsigned greater than)
- uge (unsigned greater or equal), ult(unsigned less than)
- ule (unsigned less or equal), sgt(signed greater than)
- sge (signed greater or equal), slt(signed less than), sle(signed less or equal)
- br i1 <cond> label <thenbb>, label <elsebb>
- Causes the current execution to transfer to the basic block <thenbb> if the value of <cond> is true; to the basic block <elsebb> otherwise
Source Code | IR |
if(x > y) return 1; return 0; |
%0 = load i32* %x %1 = load i32* %y $cmp = icmp sgt i32 %0, %1 br i1 $cmp, label %if.then. label %if.end |
Switch
- Syntax: switch <iN> <value>, label <defaultdest> [<iN> <val>, label <dest>]
- Transfer control flow to one of many possible destinations
- If the value is found(val), control flow is transfered to the corresponding destination(dest); or to the default destination(defaultdest)
Source Code | IR |
switch(x) { case 1: break; case 2: break; default: break; |
%0 = load i32* %x switch i32 %0, label %sw.default [i32 1, label $sw.bb i32 2, label %sw.bb1] sw.bb: br label $sw.epilog sw.bb1: be label $sw.epilog sw.default: br label %sw.epilog |
PHI($\phi$) Instruction
- Syntax: <res> = phi <t> [ <val0>, <label 0>], [<val1>, <label1>]
- Return a value val of type t such that the basic block executed right before the current one if of label
Source Code | IR |
y = (x > 0)? x : 0 |
%0 = load i32* x %c = icmp sgt i32 %0 0 br i1 %c, label %c.t, %c.f c.t: %1 = load i32* %x br label %c.end c.f: br label %c.end c.end: %cond = phi i32 [%1, %c.t], [0, %c.f] |
Function Call
- Syntax: <res> = call <t> [<fnty>*] <fnptrval> (<fn args>)
- t: the type off the call return value
- fnty: the signature of the pointer to the target function (optional)
- fnptrval: an LLVM value containing a pointer to a target function
- fn args: argument list whose types match the function signature
Source Code | IR |
print("%d", abs(x))' |
@.str = [3 x i8] c"%d\00" %0 = load i32* %x $call = call i32 @abs(i32 %0) %call1 = call i32 (i8*, ...)* @prints(i8* getelementptr ([3 x i8]* @.str, i32 0, i32 0), i32 %call) |
How to Generate LLVM IR?
- Option 1: Directly generate LLVM IR as texts following the syntax
- Quick to get started
- Lack syntax checking and verification
- Lack semantic checking
- Option 2: Use LLVM framework APi to build llvm modules
- Built-in syntax and semantic checking
- Recommended way for building serious compilers
- Steps:
- Create a new Module object(llvm::Module)
- Create a Builder object associated with the module (llvm::IRBuilder)
- Create global variable (llvm::GlobalVariable) objects for the module.
- Create new Function objects (llvm::Function) for the Module
- Create the entry basic block for the function (llvm::BasicBlock)
- Set the insertion point of the builder to the entry basic block.
- Call CreateXXX() methods in IRBuilder to insert new instructions.
- Create additional basic blocks and change the insertion pont of the builder when needed