Introduction to LLVM 2

LLVM Transform Pass

User-Use-Value relation

Suppose that we have the following code to optimize:

%2 = add %1, 0 ; Algebraic Identity
%3 = mul %2, 2

-- Suppose we omit the identity
               ; Algebraic Identity
%3 = mul ???, 2

Program crashes because we did not update the references properly.

How to make sure that all references (i.e. uses) are updated properly?

LLVM User-Use-Value Relationship

   -----            ----            -----------
  |Value|  ------> |User| -------> |Instruction|
   -----            ----            -----------

We are going to show that the Instruction can play the role both as an User and as an Usee(Value).

Value

The Value class is the most important base class in LLVM, as almost all object types inherit from it.
A Value has a type (e.g., integer, floating point): getType.
A Value might or might not have a name: hasName, getName.
Most importantly, a Value has a list of Users that are using itself.

Instruction as an User

An Instruction is an User

Each User(Instruction) has a list of values it is using.

Those values are known as the Operands, of type Value.

 User &Inst = ...
 for (auto Iter = Inst.op_begin();
           Iter != Inst.op_end(); ++Iter)
      Value *Operand = *Iter;
// %2 = add %1, 0 -> operand %1 & 0

Instruction as an Usee

Why is an Instruction is an Usee?
The answer lies in our interpretation of LLVM Instruction:
- %2 = add %1, 0
- %2 is the Value representation of instruction add %1, 0.
- Don’t treat %2 as the result of a computation.
- Instead, treat it as the alias of an instruction.
Therefore, wherever in later text we use the value %2, we mean to use the instruction add %1, 0

Summary

Back to the problem we have at the very beginning:

%2 = add %1, 0 ; Algebraic Identity
%3 = mul %2, 2

Let Inst be a reference to instruction %2 = add %1, 0

  for (auto Iter = Inst.op_begin();
            Iter != Inst.op_end(); ++Iter)
      Value *Operand = *Iter;

Operand %1, 0

  for (auto Iter = Inst.user_begin();
            Iter != Inst.user_end(); ++Iter)
      User *InstUser = *Iter;

Instruction mul %2, 2 (alternatively, Value %3)
Since %3 is using %2, that is why %3 is an user of it

Example Code

Source Code for transformation

int foo(int e, int a){
    int b = a + 1;
    int c = b * 2;
    b = e << 1;
    int d = b / 4;
    return c * d;
}

LLVM Transform Pass

#include <llvm/IR/Constants.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/Instructions.h>
#include <llvm/IR/Module.h>
#include <llvm/Pass.h>
#include <llvm/Support/raw_ostream.h>

using namespace llvm;

namespace {

class Transform final : public ModulePass {
private:
  bool runOnBasicBlock(BasicBlock &B) {
    // get the first and second instruction
    Instruction &Inst1st = *B.begin(), &Inst2nd = *(++B.begin());

    // The address of the 1st instruction is equal to that of the 1st operand of
    // the 2nd instruction
    assert(&Inst1st == Inst2nd.getOperand(0));

    // print the 1st instruction
    outs() << Inst1st << "\n";
    // print the 1st instruction as an operand
    Inst1st.printAsOperand(outs(), false);
    outs() << "\n";

    // User-Use-Value
    for (auto *Iter = Inst1st.op_begin(); Iter != Inst1st.op_end(); ++Iter) {
      Value *Operand = *Iter;

      if (Argument *Arg = dyn_cast<Argument>(Operand)) {
        outs() << "I am function " << Arg->getParent()->getName() << "\'s #"
               << Arg->getArgNo() << " argument"
               << "\n";
      }
      if (ConstantInt *C = dyn_cast<ConstantInt>(Operand)) {
        outs() << "I am a constant integer of value " << C->getValue() << "\n";
      }
    }

    for (auto Iter = Inst1st.user_begin(); Iter != Inst1st.user_end(); ++Iter) {
      outs() << *(dyn_cast<Instruction>(*Iter)) << "\n";
    }

    for (auto Iter = Inst1st.use_begin(); Iter != Inst1st.use_end(); ++Iter) {
      outs() << *(dyn_cast<Instruction>(Iter->getUser())) << "\n";
    }

    // Instruction Manipulation
    Instruction *NewInst = BinaryOperator::Create(
        Instruction::Add, Inst1st.getOperand(0), Inst1st.getOperand(0));

    NewInst->insertAfter(&Inst1st);
    Inst1st.user_begin()->setOperand(0, NewInst);

    return true;
  }

  bool runOnFunction(Function &F) {
    bool Transformed = false;

    for (auto Iter = F.begin(); Iter != F.end(); ++Iter) {
      if (runOnBasicBlock(*Iter)) {
        Transformed = true;
      }
    }

    return Transformed;
  }

public:
  static char ID;

  Transform() : ModulePass(ID) {}
  virtual ~Transform() override {}

  virtual void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.setPreservesCFG();
  }

  virtual bool runOnModule(Module &M) override {
    bool Transformed = false;

    for (auto Iter = M.begin(); Iter != M.end(); ++Iter) {
      if (runOnFunction(*Iter)) {
        Transformed = true;
      }
    }
    return Transformed;
  }
};

char Transform::ID = 0;
RegisterPass<Transform> X("transform", "Example Transform Pass");

} // anonymous namespace

Result

@ Before Transformation
define i32 @foo(i32 %0, i32 %1){
    %3 = add nsw i32 %1, 1
    %4 = mul nsw i32 %3, 2
    @ Note since the SSA, the same variable b is assigned with a new address %5 instead of original %3
    %5 = shl i32 %0, 1
    %6 = sdiv i32 55, 4
    %7 = mul nsw i32 %4, %6
    ret i32 %7
}

@ After Transormation
define i32 @foo(i32 %0, i32 %1){
    %3 = add nsw i32 %1, 1 ;
    %4 = add nsw i32 %3, %3 ; Pass insert a line of code
    %5 = mul nsw i32 %4, 2 ; Pass changes the operand of the instruction
    %6 = shl i32 %0, 1 ; 
    %7 = sdiv i32 %6, 4 ;
    %8 = mul nsw i32 %5, %7 ;
    ret i32 %8 
}

LLVM Pass Manager

Recall from the previous section:

The optimization passes can be categorized into analysis and transform passes.
This is a form of isolation.

We have the isolation, how should we build the connection?

LLVM Pass Manager

Where is it?

virtual void getanalysisUsage(AnalysisUsage &AU) 

What does the Pass Manager do?

Requires & Preserves Information
- If a pass A requires pass B, bt the time A initiates, LLVM will automatically run B if B has not been run before
  - Suppose A requires B
  - If we run opt -A
  - The actual order for running is B -> A
  - Think about dependency chain in the soft development
- if a pass A preserves pass B, by the time pass A terminates, LLVM will preserve B, and B will not be rerun unless subsequent passes invalidate it.
  - Suppose both A & C requires B and A preserve B
  - If we run opt -A -C
  - The actual order for running is B -> A -> C since B is preserved it will not terminate since A preserves B
  - If A does not perverse B
  - The actual order for running is B -> A -> B -> C since A does not preserve B, then before running C the B will be run again since C requires the B
To use the pass manager, the getAnalysisUsage must be overridden to specify the required and invalidated sets.
Some of common usages
- setPreservesAll: Preserves all previous passes.
- setPreservesCFG: Preserves the CFG
- addPreserved<A>: Preserves pass A after this pass.
- addRequired<A>: Requires pass A to run before this pass.
- getAnalysis<A>: Gets the information from the required pass A.

Introduction to LLVM 2

CSCD70 tut2

LLVM Transform Pass

User-Use-Value relation

Value

Instruction as an User

Instruction as an Usee

Summary

Example Code

LLVM Pass Manager

What does the Pass Manager do?

CATALOG

FEATURED TAGS