GRP 1: Semantics Analysis PDF
Document Details
Uploaded by CozyPoplar
University of the Philippines Cebu
Tags
Summary
This document provides an overview of semantic analysis in compiler design. It covers various concepts and techniques related to interpreting and analyzing program code, including different types of scoping, type checking, and attribute grammars. The included examples illustrate the application of these concepts and the use of syntax and parse trees.
Full Transcript
GRP 1: SEMANTICS ANALYSIS Where we currently are source code Lexical Analysis Token Syntax Analysis Syntax...
GRP 1: SEMANTICS ANALYSIS Where we currently are source code Lexical Analysis Token Syntax Analysis Syntax Tree Semantic Analysis WHAT IS SEMANTIC ANALYSIS? THE TOOTHBRUSH SANG A LULLABY Predicate Subject Verb Object THE TOOTHBRUSH SANG A LULLABY This is syntactically correct but semantically wrong! AGENDA HIGH-LEVEL PROGRAM REPRESENTATIONS SCOPE AND BINDING RESOLUTION TYPE CHECKING ATTRIBUTE GRAMMARS HIGH-LEVEL PROGRAM REPRESENTATIONS A SIMPLIFIED CODE FORM USED BY COMPILERS FOR OPTIMIZATION COVERAGE WHAT IS HIGH-LEVEL INTERMEDIATE REPRESENTATION (HLIR)? TAXONOMY OF IRS GRAPHICAL IRS WHAT IS HIGH-LEVEL INTERMEDIATE REPRESENTATION? Representation for all of the facts that a compiler derives from a program High Level: It is almost similar to the source language. TAXONOMY OF INTERMEDIATE REPRESENTATIONS 1st axis: Structural Graphical IR: nodes and edges Linear IR: sequence of instructions executed in order of appearance Hybrid IR: combination of graphical and linear 2nd axis: Abstraction Level Near-source (high-level) Low-level 3rd axis: Naming Discipline Naming scheme for distinct values Translating source-code to lower-level GRAPHICAL INTERMEDIATE REPRESENTATION GRAPHICAL IR Intermediate representations which makes use of nodes and edges Two types Syntax trees Graphs source code Lexical Analysis Token Rules/Grammar Syntax Analysis Syntax Tree Semantic Analysis PARSE TREE Classic Expression Grammar Parse Tree Derivation of: ax2+ax2xb ABSTRACT SYNTAX TREE Eliminates extraneous nodes DIRECTED ACYCLIC GRAPH Contraction of AST Can have multiple parents Identical subtrees are reused SCOPING AND BINDING RESOLUTIONS DETERMINES HOW NAMES (VARIABLES AND FUNCTIONS) ARE RESOLVED IN A PROGRAM. COVERAGE INTRODUCTION TO SCOPE AND BINDING TYPES OF SCOPES STATIC VS DYNAMIC SCOPING SYMBOL TABLE RESOLUTIONS BINDING SCOPE AND SCOPE Scope in programming defines where a variable or function can be accessed within a codebase. e.g., { } in C, or begin-end in Pascal. SCOPE CHECKING When we try to use a variable in a program, the program checks if that identifier is accessible at that point. BINDING Binding is the association between identifiers (like variable names) and the entities (like values or functions) they represent. SCOPE TYPES OF GLOBAL SCOPE Variables defined outside of any function or class have a global scope. These can be accessed from anywhere in the code after their declaration. LOCAL SCOPE Variables defined within a function or block (e.g., within { } ) are only accessible within that block. BLOCK (NESTED) SCOPE Variables declared in nested blocks (e.g., loops or inner if statements) are only accessible within that block. GLOBAL SCOPE LOCAL SCOPE BLOCK SCOPE STATIC VS DYNAMIC SCOPING STATIC SCOPING Static scoping is also called lexical scoping. In this scoping, a variable always refers to its top-level environment. This is a property of the program text and is unrelated to the run-time call stack. In simpler terms, in static scoping the compiler first searches in the current block, then in global variables, then in successively smaller scopes. In most programming languages including C, C++, and Java, variables are always statically (or lexically) scoped i.e., binding of a variable can be determined by program text and is independent of the run-time function call stack. GeeksforGeeks. (2024, September 14). Static and dynamic scoping. GeeksforGeeks. https://www.geeksforgeeks.org/static-and-dynamic-scoping/ HOW STATIC SCOPING WORKS? DYNAMIC SCOPING With dynamic scope, a global identifier refers to the identifier associated with the most recent environment and is uncommon in modern languages. In technical terms, this means that each identifier has a global stack of bindings and the occurrence of an identifier is searched in the most recent binding. In simpler terms, in dynamic scoping, the compiler first searches the current block and then successively all the calling functions. GeeksforGeeks. (2024, September 14). Static and dynamic scoping. GeeksforGeeks. https://www.geeksforgeeks.org/static-and-dynamic-scoping/ HOW DYNAMIC SCOPING WORKS? SYMBOL TABLE IN SEMANTIC ANALYSIS SYMBOL TABLE A semantic analyzer builds and maintains a symbol table data structure Identifier Type Scope that maps each identifier to the information known about it, such as the identifier’s type, internal structure, and scope With the symbol table, the semantic analyzer can enforce a large variety of rules to check for errors KESHAV PINGALI, ADVANCED TOPICS IN COMPILERS, HTTPS://WWW.CS.UTEXAS.EDU/~PINGALI/CS38 0C/2013/LECTURES/INTRO.PDF ROLE IN SCOPE AND Example: BINDING RESOLUTION Simple Symbol Table Entry after encountering int x = 10; in C++: The symbol table binds the identifier name to this metadata, so when the identifier is used, Identifier Type Scope Value the compiler can verify its usage. x int local 10 The symbol table helps the compiler determine where each variable or function is visible and accessible. It ensures that each variable name refers to the correct memory location based on the variable’s scope. TYPE CHECKING COVERAGE WHAT IS TYPE CHECKING? HOW TYPE CHECKING WORKS? TYPE COERCION STATIC VS. DYNAMIC CHECKING WHAT IS TYPE CHECKING? The process of verifying that each operation executed in a program respects the type system of the language. Ensures that all operands in any expression are of appropriate types and number. Sometimes rules regarding operations are defined by other parts of the code (as in function prototypes), and sometimes such rules are a part of the definition of the language itself (as in "both operands of a binary arithmetic operation must be of the same type"). Much of what we do in the semantic analysis phase is type checking. HOW TYPE CHECKING WORKS? C code: int x = 3; x = x+5; HOW TYPE CHECKING WORKS? = C code: int x = 3; x + x = x+5; x 5 HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x+5; x 5 STATUS: CHECKING HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x+5; x 5 int STATUS: CHECKING HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x+5; x 5 int int STATUS: CHECKING HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int int x = x+5; x 5 int int STATUS: CHECKING HOW TYPE CHECKING WORKS? = int C code: int x = 3; x + int int x = x+5; x 5 int int STATUS: VALID HOW TYPE CHECKING WORKS? C code: int x = 3; x = x + ‘c’; HOW TYPE CHECKING WORKS? = C code: int x = 3; x + x = x + ‘c’; x ‘c’ HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x + ‘c’; x ‘c’ STATUS: CHECKING HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x + ‘c’; x ‘c’ int STATUS: CHECKING HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x + ‘c’; x ‘c’ int char STATUS: CHECKING HOW TYPE CHECKING WORKS? = C code: int x = 3; x + int x = x + ‘c’; x ‘c’ int char STATUS: ERROR! TYPE MISMATCH. TYPE COERCION a compiler finds a type error and then changes the type of the variable to an appropriate type. also known as Implicit type conversion. HOW TYPE COERCION WORKS? C code: int x = 3; x = x + true; HOW TYPE COERCION WORKS? = C code: int x = 3; x + x = x + true; x true HOW TYPE COERCION WORKS? = C code: int x = 3; x + int x = x + true; x true STATUS: CHECKING HOW TYPE COERCION WORKS? = C code: int x = 3; x + int x = x + true; x true int STATUS: CHECKING HOW TYPE COERCION WORKS? = C code: int x = 3; x + int x = x + true; x true int bool STATUS: CHECKING HOW TYPE COERCION WORKS? = C code: int x = 3; x + int x = x + true; x true int bool STATUS: CHECKING HOW TYPE COERCION WORKS? = C code: int x = 3; x + int x = x + true; x 1 int int STATUS: CHECKING HOW TYPE COERCION WORKS? = C code: int x = 3; x + int int x = x + true; x 1 int int STATUS: CHECKING HOW TYPE COERCION WORKS? = int C code: int x = 3; x + int int x = x + true; x 1 int int STATUS: VALID STATIC VS DYNAMIC TYPE CHECKING C++ Python STATIC TYPE CHECKING DYNAMIC TYPE CHECKING Done during run time Done at compile time Uses type tags to reference It uses the symbol table to the type reference the type Permits programmers to be enforces strict, rigid less concerned with types programming rules and Mandatory in some guidelines from the start situations, such as, array Can catch many common bounds check errors More flexible but complex Desirable when faster and expensive execution is important e.g. Python, JavaScript, PHP e.g. Java, C, C++, G0 DECLARATIVE SPECIFICATIONS SUCH AS ATTRIBUTE GRAMMARS COVERAGE ATTRIBUTE GRAMMARS SYNTHESIZED AND INHERITED ATTRIBUTES ATTRIBUTE GRAMMAR - EXAMPLE DEPENDENCE GRAPH ATTRIBUTE EVALUATION INTRODUCTION Static semantics of PL can be specified using attribute grammars Semantic analyzers can be generated semi-automatically from attribute grammars Attribute grammars are extensions of context-free grammars Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE GRAMMARS Let G = (N, T, P, S) be a CFG and let V = N ∪ T Every symbol X of V has associated with it a set of attributes (denoted by X.a, X.b, etc.) Two types of attributes: inherited (denoted by AI(X))and synthesized (denoted by AS(X)) Each attribute takes values from a specified domain (finite or infinite), which is its type Typical domains of attributes are, integers, reals, characters, strings, booleans, structures, etc. New domains can be constructed from given domains by mathematical operations such as cross product, map, etc. Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE COMPUTATION RULES A production p ∈ P has a set of attribute computation rules (functions) Rules are provided for the computation of Synthesized attributes of the LHS non-terminal of p Inherited attributes of the RHS non-terminals of p These rules can use attributes of symbols from the production p only Rules are strictly local to the production p (no side effects) Restrictions on the rules define different types of attribute grammars Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. SYNTHESIZED AND INHERITED ATTRIBUTES An attribute cannot be both synthesized and inherited, but a symbol can have both types of attributes Attributes of symbols are evaluated over a parse tree by making passes over the parse tree Synthesized attributes are computed in a bottom-up fashion from the leaves upwards Always synthesized from the attribute values of the children of the node Leaf nodes (terminals) have synthesized attributes (only) initialized by the lexical analyzer and cannot be modified Inherited attributes flow down from the parent or siblings to the node in question ATTRIBUTE GRAMMAR - EXAMPLE 1 The following CFG S → A B C, A → aA | a, B → bB | b, C → cC | c generates: L(G) = { | m, n, p ≥ 1} We define an AG (attribute grammar) based on this CFG to generate L = { |n≥ 1} All the non-terminals will have only synthesized attributes AS(S) = {equal ↑: {T, F}} AS(A) = AS(B) = AS(C) = {count ↑: integer} ATTRIBUTE GRAMMAR - EXAMPLE 1 Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE GRAMMAR - EXAMPLE 1 Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE GRAMMAR - EXAMPLE 1 Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE GRAMMAR - EXAMPLE 1 Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE DEPENDENCE GRAPH Let T be a parse tree generated by the CFG of an AG, G. The attribute dependence graph (dependence graph for short) for T is the directed graph, DG(T) = (V, E), where V = {b|b is an attribute instance of some tree node}, and E = {(b, c)|b, c ∈ V, b and c are attributes of grammar symbols in the same production p of B, and the value of b is used for computing the value of c in an attribute computation rule associated with production p} Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE DEPENDENCE GRAPH An AG G is non-circular, iff for all trees T derived from G, DG(T) is acyclic Non-circularity is very expensive to determine (exponential in the size of the grammar) Therefore, our interest will be in subclasses of AGs whose non-circularity can be determined efficiently Assigning consistent values to the attribute instances in DG(T) is attribute evaluation Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. ATTRIBUTE EVALUATION STRATEGY Construct the parse tree Construct the dependence graph Perform topological sort on the dependence graph and obtain an evaluation order Evaluate attributes according to this order using the corresponding attribute evaluation rules attached to the respective productions Multiple attributes at a node in the parse tree may result in that node to be visited multiple number of times Each visit resulting in the evaluation of at least one attribute DEPENDENCE GRAPH FOR EXAMPLE 1 REFERENCES Barry Brown. (2023, March 3). Type checking in C [Video]. YouTube. https://www.youtube.com/watch?v=0oZBuqMK908 CS106X Handout #01 Semantic Analysis. (2012, July 16). Stanford. https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/handouts/180%20Semantic%20Analysis.pdf Jackson-Barnes, S. (2024, April 3). Dynamic Typing vs Static Typing, Explained. Orient Software. https://www.orientsoftware.com/blog/dynamic-typing-vs-static- typing/#:~:text=In%20static%20typing%2C%20developers%20must,%2C%20C%2B%2B%2C%20and%20Go. Keshav Pingali, Advanced Topics in Compilers, https://www.cs.utexas.edu/~pingali/CS38 0C/2013/lectures/intro.pdf https://www.geeksforgeeks.org/static-and-dynamic-scoping/ Principles of Compiler Design by Prof. Y.N. Srikanth,Department of Computer Science and Engineering,IISc Bangalore. https://nptel.ac.in/courses/106108113 QUESTIONS?