LLVM Transform Pass
User-Use-Value relation
Suppose that we have the following code to optimize:
1
2
3
4
5
6
%2 = add %1, 0 ; Algebraic Identity
%3 = mul %2, 2
-- Suppose we omit the identity
; Algebraic Identity
%3 = mul ???, 2
-
Program crashes because we did not update the references properly.
-
How to make sure that all references (i.e. uses) are updated properly?
- LLVM User-Use-Value Relationship
1 2 3
----- ---- ----------- |Value| ------> |User| -------> |Instruction| ----- ---- -----------
- We are going to show that the Instruction can play the role both as an User and as an Usee(Value).
- LLVM User-Use-Value Relationship
Value
-
The Value class is the most important base class in LLVM, as almost all object types inherit from it.
-
A Value has a type (e.g., integer, floating point): getType.
-
A Value might or might not have a name: hasName, getName.
-
Most importantly, a Value has a list of Users that are using itself.
Instruction as an User
An Instruction is an User
- Each User(Instruction) has a list of values it is using.
- Those values are known as the Operands, of type Value.
1 2 3 4 5
User &Inst = ... for (auto Iter = Inst.op_begin(); Iter != Inst.op_end(); ++Iter) Value *Operand = *Iter; // %2 = add %1, 0 -> operand %1 & 0
- Those values are known as the Operands, of type Value.
Instruction as an Usee
-
Why is an Instruction is an Usee?
- The answer lies in our interpretation of LLVM Instruction:
-
%2 = add %1, 0
-
%2 is the Value representation of instruction add %1, 0.
-
Don’t treat %2 as the result of a computation.
-
Instead, treat it as the alias of an instruction.
-
- Therefore, wherever in later text we use the value %2, we mean to use the instruction
add %1, 0
Summary
Back to the problem we have at the very beginning:
1
2
%2 = add %1, 0 ; Algebraic Identity
%3 = mul %2, 2
- Let Inst be a reference to instruction
%2 = add %1, 0
1 2 3
for (auto Iter = Inst.op_begin(); Iter != Inst.op_end(); ++Iter) Value *Operand = *Iter;
- Operand %1, 0
1 2 3
for (auto Iter = Inst.user_begin(); Iter != Inst.user_end(); ++Iter) User *InstUser = *Iter;
-
Instruction
mul %2, 2
(alternatively, Value %3) -
Since %3 is using %2, that is why %3 is an user of it
Example Code
Source Code for transformation
1
2
3
4
5
6
7
int foo(int e, int a){
int b = a + 1;
int c = b * 2;
b = e << 1;
int d = b / 4;
return c * d;
}
LLVM Transform Pass
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
#include <llvm/IR/Constants.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/Instructions.h>
#include <llvm/IR/Module.h>
#include <llvm/Pass.h>
#include <llvm/Support/raw_ostream.h>
using namespace llvm;
namespace {
class Transform final : public ModulePass {
private:
bool runOnBasicBlock(BasicBlock &B) {
// get the first and second instruction
Instruction &Inst1st = *B.begin(), &Inst2nd = *(++B.begin());
// The address of the 1st instruction is equal to that of the 1st operand of
// the 2nd instruction
assert(&Inst1st == Inst2nd.getOperand(0));
// print the 1st instruction
outs() << Inst1st << "\n";
// print the 1st instruction as an operand
Inst1st.printAsOperand(outs(), false);
outs() << "\n";
// User-Use-Value
for (auto *Iter = Inst1st.op_begin(); Iter != Inst1st.op_end(); ++Iter) {
Value *Operand = *Iter;
if (Argument *Arg = dyn_cast<Argument>(Operand)) {
outs() << "I am function " << Arg->getParent()->getName() << "\'s #"
<< Arg->getArgNo() << " argument"
<< "\n";
}
if (ConstantInt *C = dyn_cast<ConstantInt>(Operand)) {
outs() << "I am a constant integer of value " << C->getValue() << "\n";
}
}
for (auto Iter = Inst1st.user_begin(); Iter != Inst1st.user_end(); ++Iter) {
outs() << *(dyn_cast<Instruction>(*Iter)) << "\n";
}
for (auto Iter = Inst1st.use_begin(); Iter != Inst1st.use_end(); ++Iter) {
outs() << *(dyn_cast<Instruction>(Iter->getUser())) << "\n";
}
// Instruction Manipulation
Instruction *NewInst = BinaryOperator::Create(
Instruction::Add, Inst1st.getOperand(0), Inst1st.getOperand(0));
NewInst->insertAfter(&Inst1st);
Inst1st.user_begin()->setOperand(0, NewInst);
return true;
}
bool runOnFunction(Function &F) {
bool Transformed = false;
for (auto Iter = F.begin(); Iter != F.end(); ++Iter) {
if (runOnBasicBlock(*Iter)) {
Transformed = true;
}
}
return Transformed;
}
public:
static char ID;
Transform() : ModulePass(ID) {}
virtual ~Transform() override {}
virtual void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.setPreservesCFG();
}
virtual bool runOnModule(Module &M) override {
bool Transformed = false;
for (auto Iter = M.begin(); Iter != M.end(); ++Iter) {
if (runOnFunction(*Iter)) {
Transformed = true;
}
}
return Transformed;
}
};
char Transform::ID = 0;
RegisterPass<Transform> X("transform", "Example Transform Pass");
} // anonymous namespace
Result
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
@ Before Transformation
define i32 @foo(i32 %0, i32 %1){
%3 = add nsw i32 %1, 1
%4 = mul nsw i32 %3, 2
@ Note since the SSA, the same variable b is assigned with a new address %5 instead of original %3
%5 = shl i32 %0, 1
%6 = sdiv i32 55, 4
%7 = mul nsw i32 %4, %6
ret i32 %7
}
@ After Transormation
define i32 @foo(i32 %0, i32 %1){
%3 = add nsw i32 %1, 1 ;
%4 = add nsw i32 %3, %3 ; Pass insert a line of code
%5 = mul nsw i32 %4, 2 ; Pass changes the operand of the instruction
%6 = shl i32 %0, 1 ;
%7 = sdiv i32 %6, 4 ;
%8 = mul nsw i32 %5, %7 ;
ret i32 %8
}
LLVM Pass Manager
Recall from the previous section:
- The optimization passes can be categorized into analysis and transform passes.
- This is a form of isolation.
We have the isolation, how should we build the connection?
- LLVM Pass Manager
- Where is it?
1
virtual void getanalysisUsage(AnalysisUsage &AU)
What does the Pass Manager do?
- Requires & Preserves Information
- If a pass A requires pass B, bt the time A initiates, LLVM will automatically run B if B has not been run before
-
Suppose A requires B
-
If we run
opt -A
-
The actual order for running is
B -> A
-
Think about dependency chain in the soft development
-
- if a pass A preserves pass B, by the time pass A terminates, LLVM will preserve B, and B will not
be rerun unless subsequent passes invalidate it.
-
Suppose both A & C requires B and A preserve B
-
If we run
opt -A -C
-
The actual order for running is
B -> A -> C
since B is preserved it will not terminate since A preserves B -
If A does not perverse B
-
The actual order for running is
B -> A -> B -> C
since A does not preserve B, then before running C the B will be run again since C requires the B
-
- If a pass A requires pass B, bt the time A initiates, LLVM will automatically run B if B has not been run before
-
To use the pass manager, the
getAnalysisUsage
must be overridden to specify the required and invalidated sets. - Some of common usages
-
setPreservesAll
: Preserves all previous passes. -
setPreservesCFG
: Preserves the CFG -
addPreserved<A>
: Preserves pass A after this pass. -
addRequired<A>
: Requires pass A to run before this pass. -
getAnalysis<A>
: Gets the information from the required pass A.
-