Introduction to LLVM

CSCD70 tut1

Posted by Dino on January 12, 2021

C++ Review

Pass by Reference

1
void foo(int &a);
  • What is there an & before variable a? Pass by Reference

  • Review: What is the purpose of “Pass by Pointer”?
    • Modify the values and/or
    • Avoid the overhead of copying large objects.
  • However, need to take the pain of indirect access:
    1
    2
    3
    
      int *a;
      a = &...;
      ... = *a;
    
    • Reference allow us to access the variables as is.
  • Reference must be initialized by a variable. There is no null reference.

  • Example

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    
      #include <iostream>
        
      int main(){
          int x = 100;
          int *x_ptr = &x; // `x_ptr` is a pointer to `x`
          int &x_ref = x; // `x_ref` is a *reference* to `x`
          int x_copy = x; // `x_copy` is a copy of x
            
          x = 50; // When we modify the value of `x`, we are modifying the value of
                  // `x_ref` and the value that `x_ptr` is pointing to and at the same
                  // time, but NOT the value of `x_copy`
          std:cout << "x: " << x ", "
                   << "ptr_x: " << *x_ptr << ", "
                   << "ref_x: " <<  x_ref << ", "
                   << "copy_x: " << x_copy << std::endl;  // `endl` is the end-line.
      }
    

Public Inheritance

  • C++ class: C struct with methods.
    1
    2
    3
    
    class A{
      void DoSomething();
    }
    
  • Similar to Java & Python, public inheritance allow us to leverage the members and methods of the base class

    java python c++
    class Fox extends Animal class Fox(Animal) class Fox : public Animal{};
  • Public inheritance can implement abstract methods of the base class
    1
    2
    3
    4
    5
    6
    7
    
      class Animal {
          virtual void Run() = 0; // Abstract Method;
      };
        
      class Fox: public Animal {
         virtual void Run() override;
      };
    
  • Base class pointer/reference can point to any child class instances.
    1
    2
    
      Fox fox = Fox();
      Animal &animal = fox;
    
  • Dynamic casting can be used to downcast pointers/references.
    1
    
      Fox &fox = dynamic_cast<Fox &>(animal);
    
  • Example
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    
      #include <iostream>
      #include <typeinfo>
      
      
      /*! @brief Abstract Base Class @p Animal
       */
      class Animal {
          public:
          /*! @ note If a base class method is marked as **virtual**, then all the
           *         inherited methods will also be virtual.  Furthermore, when invoking a
           *         virtual method from a base class pointer/reference, the decision on
           *         which to call is made based on the type that the pointer/reference
           *         is pointing to, rather than the pointer/reference iteself.  Clearly,
           *         an abstract method should always be virtual.
           */
          virtual void Run() = 0;
      }
      
      class Fox: public Animal{
          public:
              Fox() = default; // Constructor
              virtual ~Fox() = default; // Destructor
                
              virtual void Run() override {
                  std::cout << "Fox is runnign" << std::endl;
              }  
      };
      
        
      class Dog: public Animal{
          public:
              Dog() = default; // Constructor
              virtual ~Dog() = default; // Destructor
      
              virtual void Run()  override {
                  std::cout << "Dog is running" << std::endl;
              }
         
      }
      
      int main(){
          Fox fox = Fox();
          Animal &animal = fox;
          animal.Run();
          Fox &fox_ref = dynamic_cast<Fox &>(animal);
          try{
              Dog &dog_ref = dynamic_cast<Dog &>(animal);
          } catch (std::bad_cast &except){
              std::cerr << except.what() << std::endl;
          }
          return 0;
      }
    

Standard Template Library (STL)

  • Provides easy access to common data structures. E.g.,
    • vector - List-like Data Structure

      vector<unsigned> a = {1, 2, 3, 4, 5};

    • unordered_map - Dict-like Data Structure

      unordered_map<string, unsigned> b = { {"Red", 0}, {"Green", 1}, {"Blue", 2}};

  • We traverse through STL containers using iterators:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    
      vector<unsigned> a = {1, 2, 3, 4, 5};
        
      for (auto iter = a.begin(); iter != a.end(); ++iter){
          // dereference the iterator just as pointer
          unsigned &a_elem = *iter; // 1, 2, 3, 4, 5
      }
        
      unordered_map<string, unsigned> b = {
              {"Red", 0}, {"Green", 1}, {"Blue", 2}
      };
      
      for (std::unordered_map<std::string, unsigned>::iterator iter = b.begin(); 
           iter != b.end(); ++iter) {
          std::pair<const std::string, unsigned> &key_value_pair = *iter;
        
          std::cout << "(" << key_value_pair.first << ", "
                    << key_value_pair.second << "), ";
      }
      std::cout << std::endl; // newline character
    
  • Dereferencing an iterator gives reference to the stored elements.

LLVM Analysis Pass

Intermediate Representation

  • Many optimizations are invariant w.r.t. frontend and backend

  • Want to have an intermediate representation that is portable to multiple programming languages and target machines and have optimization passes act on those representations.

  • Intermediate Representation (IR) has syntax and semantics similar to the assembly language

  • We categorize optimization passes into analysis and transform passes.
    • Analysis passes gather information about the program

    • Transform passes mutate the program

  • Why such isolation exists?
    • Better Readability

    • Very frequently, multiple passes might require the same information.

      • The isolation avoids redundant analysis

How to write an LLVM Analysis pass?

  • LLVM Module: How is our program translated in LLVM
  • Iterators: How to traverse through the module?
  • Downcasting: How to get more information out of the Iterators?
  • LLVM Pass Interfaces: What interfaces does LLVM provide us with?

LLVM Module

Iterators

1
2
3
4
5
Module &M = ...;
for (auto iter = M.begin(); iter != M.end(); ++iter){
    Function &F = *iter;
    // do some stuff for each function
}
  • Similar syntax with STL container vector.

Downcasting

Why do we need Downcasting ?

  • Suppose that we have an Instruction, how do we know whether this is an unary instruction? or a binary operator? or a branch instruction? etc.

  • Downcasting helps us retrieve more information from the iterators.

  • Example

    1
    2
    3
    4
    
      Instruction *inst = ...;
      if (CallInst *call_inst = dynamic_cast<CallInst>(inst)){
          // Do something with it
      }
    

LLVM Pass Interfaces

  • LLVM provides multiple pass interfaces that target at different module levels (e.g., FunctionPass) and even more (e.g., LoopPass).

  • How to use those interfaces? => Public Inheritance

  • Example

    1
    2
    3
    4
    5
    6
    7
    
      class ModulePass {
          virtual bool runOnModule (Module &M) = 0;
      };
        
      class MyModulePass : public ModulePass {
          bool runOnModule (Module &M) {for (iter = ...)}
      };
    

    sdsdsd

    sdsdsd