Semantic Analysis

CSC488 Lec5

Posted by Dino on February 8, 2021

Semantic Analysis

  • Validation of non-syntactic language constraints
    • Static semantic analysis - during compilation
    • Dynamic semantic analysis - run time checks
  • Semantic analysis
    • Visibility and Accessibility analysis
    • Type Checking
    • Proper Usage analysis
    • Range and Value Analysis
    • Range and Value Propagation
  • The compilers symbol table is usually built during semantic analysis as a side effect of declaration processing.

Type Equivalence, Compatibility, Suitability

  • Every language definition includes rules about when objects of different types can be used together in various constructs.
  • Many languages have several rules concerning types:
    • Type Equivalence: Conditions under which objects of two different types are considered to be equivalent. Equivalence is usually required when the addresses of data objects are being manipulated.
      • Example: whether two types are equal
    • Type Compatibility: Conditions under which objects of two different types are considered to be compatible. Compatibility is usually required for assignment
      • Example: Auto boxing, Upward reference, assign int for float
    • Type Suitability: Conditions under which objects of two different types are considered suitable to be used together.
      Suitability is usually required for operands in expressions.
      • Example: Expression operation
  • The two most widely used type equivalence rules are structural equivalence and name equivalence

Type Equivalence Rules

  • Define: Name Type Equivalence
    • Two types are name equivalent iff they ultimately derive from a common definition.
    • Ultimately derive allows type renaming, e.g., type S : T
    • In implementation terms, two named types are equivalent if they refer to the same type table entry.
  • Define: Structural Type Equivalence
    • Types are structurally equivalent if their definitions have the same structure and corresponding values are equal
    • Structural equivalence is isomorphism for types
    • In implementation terms a parallel walk of two type trees is required to establish structural equivalence.
  • Algorithm
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    
    function isEquivalent(type T1, type T2): Boolean{
    /* Test type T1 and T2 for structural equivalence*/
    if T1.typeKind != T2.typeKind then
      return false;
    for each value field in T1, T2
      if T1.field != T2.field
          return false;
    for each subtree of T1, T2
      if not isEquivalent(T1.subtree, T2.subtree)
          return false
    return true;
    }
    

Type Equivalence Example

1
2
3
4
5
6
7
8
9
10
typedef struct {
  int B;
  int C;
} A;
typedef A D;

typedef struct {
  int X;
  int Y;
} F;
  • A & F are structurally equivalent but not name equivalent.
  • A & D are name equivalent and thus structurally equivalent.

Type equivalence checking

  • Type equivalence checking is used to guarantee that the type of data object that a pointer points at is the same as the declared type for the pointer.

  • This check is necessary to ensure that when the pointer is used to access parts of the object that access will yield the correct result.

  • Type equivalence implies memory image equivalence, i.e., the two types have identified memory images.

  • Usually type equivalence checking is done in two cases
    • When the address of a data object is assigned to a pointer
    • When a variable is passed as a reference (i.e. address) parameter
  • Memory image equivalence guarantees that when an address is used to access a type, the internal parts of the type(e.g. fields in a structure) will be accessed correctly.

Memory Equivalence Example

1
2
3
4
5
6
7
8
9
10
11
typedef struct {
   ...
   int V1;
   ...
} T1;

typedef struct {
  ...
  int V2;
  ...
} T2;
  • Then structural equivalence guarantees that for all corresponding fields Fi in T1 and T2 the address of the field relative to the start of the type is equal

  • Implementing memory equivalence constrains the way that record fields within a record/structure

Why Memory Equivalence Matters

1
2
3
4
5
6
T1 a;
T2* b;
void foo(T1 x, T1*y){
  ...
  y.V1 = x.V1;
}

Defined only if memory equivalence: foo(a, (T1*)b)

Type Compatibility and Suitability Checking

  • Type compatibility checking is used to determine when a value of some given type is compatible with the type of some variable.
  • Compatibility checking is typically used to check
    • That the expression on the right side of an assignment statement may be legally assigned to the variable on the left side.
    • That an expression being passed as a value parameter to a function can be legally assigned to the corresponding formal parameter variable.
  • Type suitability checking is used to determine when one or more values can be used together or with an operator
  • Typical instances of suitability checking include
    • Checking that the operand of a unary operator is of a suitable type for the operator.
    • Checking that both operands of a binary operator are of suitable types for the operator and for each other.

Type Checking in MiniC

  • MiniC only has primitive and array types.
  • We are going to maintain the type of each expression in the program.
  • Types must be identical for parameter passing and assignments.
  • The return type and the returned expression must match.
  • Arithmetic opcode must have int operand.
  • Boolean opcode must have bool operand.
  • If statement and for statement conditional expressions must have boolean type.
  • Index expression of an array must have int type.

Visibility and Accessibility

  • Visibility analysis determines whether a given reference to an identifier at some point it the program is legal according to the scope rules of the language.

  • The perform visibility analysis, the compiler must keep track of declared symbols behaves (logically at least) in a way that is consistent with the scope rules of the language.

  • Semantic analyzer must also track the scope structure of the program as it is being processed.

  • Accessibility analysis determines whether a visible identifier can be accessed in a given way at some point in a program.

    • Example: const in C prevents from being modified

Usage Analysis

  • Verify the appropriate use of constants, variables, types, functions.
    • Is an identifier used as a constant actually a constant?
    • Is an identifier used as a variable actually a variable?
    • Is an identifier used as a type actually a type?
    • Is a variable used as a scalar variable actually scalar?
    • Is a variable used as an array actually an array of the right dimensionality?
    • Is an identifier used as a function actually a function?
    • Are all variables and constants being used in a way that is consistent with their declared type?
    • Are the operands of all operators compatible with the operator?
  • Detection of potential runtime faults?

Usage Analysis Examples

1
2
3
typedef int T1;
int x, y[10];
int foo (int p);
Statement Error
T1 = 17 Assignment to a type
f(1) = 1 Assignment to a function
y = x Assign array to scalar variable
y[0] = x.a Field selection on a scalar variable
x = foo(1, y[0]) Wrong number and type of parameters of foo

Runtime Fault Detection Example

1
2
3
4
5
6
7
8
void foo(int x, int y){
  int a, b, c, d[10];
  if (x == 1) 
    a = 0;
  else
    a = y;
 ...
}
Expression Error
x / b Divide by zero
d[c] Array out of bound
d[a] Maybe array out of bound?

Programming Language Influences

  • Definition of the programming language being compiled can have a major influence on the way semantic analysis is implemented.
  • Languages that do not require declaration before use will usually require a separate semantic analysis pass.
  • Language with a weak or non-existent declaration structure require that most semantic analysis be done dynamically.
  • Object-Oriented languages that allow dynamic object binding may require extensive run time checking.
  • The presence of dynamically sized objects in a language (e.g. arrays whose bounds are determined at run time) may require more run time checking.

Semantic Analysis of Declarations

  • The canonical declaration associates one or more identifiers with type, structure and size attributes. The syntax of the language determines how these associations are made.

  • Declaration processing involves collecting attribute information and applying defaults for missing attributes.

  • Essentially declaration processing involves filling in a symbol table entry for each declared item.

Basic declaration Processing

  • Accumulate list of identifiers being declared
  • Lookup each identifier in the current scope to check for multiple declaration in the current scope.
  • Enter each symbol in the symbol table for the current scope.
  • Associate attributes from declaration with each identifier. Apply language-specific defaults as required.
  • Process initial value if one is present.

Expression Processing

  • Semantic analysis expression processing involves type and usage checking of expressions. It assumes that references to identifiers are processed as described above.

  • Due to the embedding of expressions within expressions, stacks are often used to save type and symbol information during expression processing.

  • Conceptually expression processing is a DFS of the abstract syntax tree for each expression.

  • Frequently expression processing if operator driven, i.e. the arithmetic operator at each node in the expression tree determines what checks are performed on the operands attached to that node.

Operator(s) Action(s)
constant Tage node with type of constant
variable Add link to symbol table entry for variable
Tag node with type of variable
+, -, *, / Verify left and right operands are of a suitable arithmetic type.
Record arithmetic result type of operator
<, <=, ==, !=, >=, > Verify operands are of a comparable type
Verify comparison is legal
Record result type is boolean
++, – Check operand is variable compatible with arithmetic
record result type and non-variable status
and, or, not Check operands are boolean type
Record result type is boolean