Dataflow Analysis

CSCD70 Lec2

Posted by Dino on January 18, 2021

What is Data Flow Analysis

  • Local analysis (e.g., value numbering)
    • analyze effect of each instruction
    • compose effects of instructions to derive information from beginning of basic block to each instruction
  • Data flow analysis
    • analyze effect of each basic block
    • compose effects of basic blocks to derive information at basic block boundaries
    • from basic block boundaries, apply local technique to generate information on instructions
    • Flow-sensitive: sensitive to the control flow in a function
    • intra-procedural analysis
  • Examples of optimizations:
    • Constant propagation
    • Common subexpression elimination
    • Dead code elimination

  • Static Program vs. Dynamic Execution
    • Statically: Finite program
    • Dynamically: Can have infinitely many possible execution paths
    • Data flow analysis abstraction:
      • for each point in the program: combines information of all the instances of the same program point
    • Example of a data flow question:
      • Which definition defines the value used in statement “b=a”

Reaching Definition

  • Effect of a statement: a = b + c
    • Uses variables (b,c)
    • Kills an old definition (old definition of a)
    • new definition (a)
  • Compose effects of statements -> Effect of a basic block
    • A locally exposed use in a b.b is a use of a data item which is not preceded in the b.b by a definition of the data item

    • Any definition of a data item in the b.b kills all definitions of the same data item reaching the basic block.

    • A locally available definition = last definition of data item in b.b

  • Every assignment is a definition

  • A definition d reaches a point p if there exists path from the point immediately following d to p such that d is not killed (overwritten) along that path.

Problem Setup

  • Problem statement
    • For each point in the program, determine if each definition in the program reaches the point
    • A bit vector per program point, vector-length = #defs
  • Build a flow graph(nodes = basic blocks, edges = control flow)

  • Set up a set of equations between in and out for all basic block
    • Effect of code in the basic block:
      • Transfer function $f_b$ relates in[$b$] and out[$b$] for same b
    • Effect of flow of control:
      • relates out[$b_1$], in[$b_2$] if $b_1$ and $b_2$ are adjacent
    • Find a solution to the equations

    • $f_s$: A transfer function of a statement
      • Abstracts the execution with respect to the problem of interest
    • For a statement s (d: x = y + z)
      • $outs[s]$ = $f_s(in[s])$ = Gen[s] $\cup$ (IN[s] - Kill[s])
      • Gens[s]: definitions generated: Gen[s] = d
      • Propagated definitions: $in[s] - kill[s]$
        • Where Kill[s] = set of all other definitions to x in the rest of program
    • Transfer function of a statement s:
      • out[s] = $f_s(in[s])$ = $Gen[s] \cup (In[s] - Kill[s])$
    • Transfer function of a basic block B
      • Composition of transfer functions of statements in B
      • Eg: $f_b$ = $f_{d2} \cdot f_{d1} \cdot f_{d0}$
    • out[B] = $f_B(in[B])$ = $f_{d_2} \cdot f_{d_1} \cdot f_{d_0} (in[B])$
      • $Gen[d_2] \cup (IN[d_2] - KILL[d_2])$
      • $Gen[d_2] \cup ((Out[d_1]) - KILL[d_2])$
      • $Gen[d_2] \cup ((Gen[d_1] \cup (In[d_1] - Kill[d_1])) - Kill[d_2])$
      • $Gen[d_2] \cup ((Gen[d_1] \cup (Out[d_0] - Kill[d_1])) - Kill[d_2])$
      • $Gen[d_2] \cup ((Gen[d_1] \cup ((Gen[d_0] \cup (In[d_0] - Kill[d_0])) - Kill[d_1])) - Kill[d_2])$
      • Gen[$B$] $\cup$ (in[$B$] - Kill[$B$])

Effects of the Edges(acyclic)

  • out[b] = $f_b$(in[b])

  • Join node: A node with multiple predecessors

  • meet operator:

    • in[b] = $out[p_1] \cup out[p_2] \cup … \cup out[p_n]$ where $p_1, …, p_n$ are all predecessors of b

Reaching definitions: Iterative Algorithm

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
input: control flow graph CFG = (N, E, Entry, Exit)

// Boundary condition
  out[Entry] = empty
  
// Initialization for iterative algorithm
  For each basic block B other than Entry
    out[B] = empty
    
// Iterate
  While (Changes to any out[] occur) {
    For each basic block B other than Entry{
      in[B] = union (out[p]), for all predecessors p of B
      out[B] = transfer function(in[B]) // out[B] = gen[B] union (in[B] - kill[B])
    }
  }

Reaching definitions: Worklist Algorithm

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
input: control flow grapg CFG = (N, E, Entry, Exit)

// Initialize
  out[Entry] = empty
  
  For all nodes i 
    out[i] = empty
    
  ChangeNodes = N // All nodes
  
// Itearte
  While ChangedNodes != empty:
    Remove i from ChangedNodes
    in[i] = union (out[p]), for all predecessors p of i
    oldout = out[i]
    out[i] = transfer function(in[i]) // out[i] = gen[i] union (in[i] - kill[i])
    if (oldout != out[i]){
      for all successors s of i
        add s to ChagnedNodes
    }

Example

  First Iteration Second Iteration
IN[B1] 0000000 0000000
Gen[B1] 1, 2, 3 1, 2, 3
Kill[B1] 4, 5, 6, 7 4, 5, 6, 7
Out[B1] 1110000 1110000
IN[B2] 1110000 1110111
Gen[B2] 4, 5 4, 5
Kill[B2] 1, 2, 7 1, 2, 7
Out[B2] 0011100 0011110
IN[B3] 0011100 0011110
Gen[B3] 6 6
Kill[B3] 3 3
Out[B3] 0001110 0001110
IN[B4] 0011110 0011110
Gen[B4] 7 7
Kill[B4] 1,4 1,4
Out[B4] 0010111 0010111

Live Variable Analysis

  • Definition
    • A variable v is live at point p if
      • The value of v is used along some path in the flow graph starting at p
  • Problem statement
    • For each basic block
      • Determine if each variable is live in each basic block
    • Size of bit vector: one bit for each variable

Transfer Function

  • A basic block b can
    • generate live variables: Use[b]
      • set of locally exposed uses in b
    • propagate incoming live variables: Out[b] - Def[b]
      • where Def[b] = set of variables defined in b.b
  • Transfer function for the block b:
    • in[b] = Use[b] $\cup$ (out(b) - Def[b])

Flow Graph

  • in[b] = $f_b(out[b])$
  • Join Node: A node with multiple successors
  • meet operator:
    • $out[b] = in[s_1] \cup in[s_2] \cup … \cup in[s_n]$ where $s_1, s_2, …., s_n$ are all successors of b

Liveness: Iterative Algorithm

1
2
3
4
5
6
7
8
9
10
11
12
13
input: control flow graph CFG = (N, E, Entry, Exit)

// Boundary condition
  in[Exit] = Empty
  
// Initializeation for iterative 
  For each basic block B other than Exit
    in[B] = Empty
    
// Iterate
  while (Changes to any in[] occue)
    out[B] = union (in[s]) // For all successor s of B
    in[B] = transfer function(out[B]) // in[B] = Use[B] union (Out[B] minus Def[B])

Example

  First Iteration Second Iteration
Out[B4] $\emptyset$ {i, j, u2, u3}
Use[B4] {u3} {u3}
Def[B4] {i} {i}
In[B4] {u3} {j, u2, u3}
Out[B3] {u3} {j, u2, u3}
Use[B3] {u2} {u2}
Def[B3] {a} {a}
In[B3] {u2, u3} {j, u2, u3}
Out[B2] {u2, u3} {j, u2, u3}
Use[B2] {i, j} {i, j}
Def[B2] {i, j} {i, j}
In[B2] {i, j, u2, u3} {i, j, u2, u3}
Out[B1] {i, j, u2, u3} {i, j, u2, u3}
Use[B1] {m, n, u1} {m, n, u1}
Def[B1] {i, j, a} {i, j, a}
In[B1] {u2, u3, m, n, u1} {u2, u3, m, n, u1}