Compiler Construction - Introduction Lecture Slides PDF
Amity University
Neetu Narayan
Summary
This document presents lecture slides on Compiler Construction from Amity University. The slides cover the phases of a compiler, lexical analysis, top-down and operator-precedence parsing, and the different types of LR parsers. The course is taught by Neetu Narayan.
Full Transcript
Amity School of Engineering and Technology

Module I, Lecture 1: Introduction to Compilers
Neetu Narayan, ASET (CSE)

Module I: Introduction to Compilers
Cousins of the Compiler
Phases of a Compiler
Grouping of Phases

Recommended Reading
Textbooks:
Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, "Compilers: Principles, Techniques, and Tools", Second Edition, Pearson, 2007.
A. A. Putambekar, "Compiler Construction", Technical Publications, 2009.
Reference Book:
Des Watson, "A Practical Approach to Compiler Construction", First Edition, Springer, 2017.

Objectives
After completing this section, you will be able to:
1.1 Understand the working of a compiler.
1.2 Understand the different phases of a compiler.

Module Assessment
Quiz (conceptual and numerical) and Assignment

Translator
A translator is a program that takes as input a program written in one programming language (the source program) and translates it into an equivalent program in another language (the target language).
Source program -> Translator -> Target program
A compiler is a translator.

Compiler
A compiler is software that converts a program written in a high-level language (the source language) into a low-level language (the object/target/machine language).

Language Processing Systems (using a Compiler)

Cousins of the Compiler (other translators)
1. Interpreter: a program that executes other programs. Instead of producing a target program as a translation, it executes the source program statement by statement. It is slower than a compiler but provides better error diagnostics.
2.
Preprocessor: It translates a program written in one high-level language into an equivalent program in another high-level language.
3. Assembler: It translates an assembly-language program into a machine-language program.

Structure of a Compiler
Compilation is a complex process, so it is partitioned into a series of sub-processes called phases. A phase is logically an operation that takes as input one representation of the source program and produces as output another representation. There are six phases of a compiler.

Analysis-Synthesis Model of a Compiler
A compiler basically has two parts:
1. The analysis part creates an intermediate representation from the given source code: Lexical Analyzer, Syntax Analyzer, Semantic Analyzer.
2. The synthesis part creates an equivalent target program from the intermediate representation: Intermediate Code Generator, Code Optimizer, Code Generator.

Grouping of Phases
The phases of a compiler are grouped into two:
1. Front End: the phases, or parts of phases, that depend primarily on the source language and are independent of the target machine. It includes lexical analysis, syntax analysis, creation of the symbol table, semantic analysis and generation of intermediate code, together with the error handling that goes along with each of these phases.
2. Back End: the phases, or parts of phases, that depend on the target machine and are independent of the source language. It includes code optimization and code generation, along with the necessary error handling and symbol-table operations.
Example

Introduction to Lexical Analysis
Lexical analysis is the first phase of the compiler; the lexical analyzer is also known as a scanner. Its main task is to read the input characters and produce a sequence of tokens that the parser (the next phase) uses for syntax analysis. Lexical analysis can be implemented with a deterministic finite automaton.

Some Important Terms
1. Pattern: a rule that describes the set of lexemes that can represent a particular token in the source program. For example, an identifier can be described as a letter followed by letters or digits.
2. Lexeme: the smallest logical unit of a program; a sequence of characters in the source program for which a token is produced. For example: 10, int, +.
3. Token: a sequence of characters that can be treated as a unit in the grammar of the programming language. Classes of similar lexemes are identified by the same token. For example: identifier, keyword, operator, constant, delimiter.

A token may have two parts: the token name, which is the token's type, and the token value, which points to an entry in the symbol table for that token.
Examples of non-tokens: comments, preprocessor directives, macros, blanks, tabs, newlines.

Symbol Table in the Compiler
The symbol table is an important data structure created and maintained by the compiler to keep track of the semantics of variables: it stores scope and binding information about names, and information about instances of various entities such as variable and function names, classes, and objects. It is built in the lexical and syntax analysis phases.
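The pattern/lexeme/token terminology above can be made concrete with a small scanner sketch. The token-class names in `TOKEN_SPEC` and the function `tokenize` are our own illustrative choices, not from the slides; each entry pairs a token name with a regular expression describing its lexemes.

```python
import re

# Illustrative token specification: (token-name, pattern) pairs.
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:int|return)\b"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),  # a letter followed by letters or digits
    ("CONSTANT",   r"\d+"),
    ("OPERATOR",   r"[+\-*/=]"),
    ("DELIMITER",  r"[;,(){}]"),
    ("SKIP",       r"[ \t\n]+"),              # blanks, tabs, newlines are non-tokens
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return the sequence of (token-name, lexeme) pairs for `source`."""
    tokens = []
    for m in MASTER.finditer(source):
        if m.lastgroup != "SKIP":             # whitespace produces no token
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("int a = 10;"))
# [('KEYWORD', 'int'), ('IDENTIFIER', 'a'), ('OPERATOR', '='), ('CONSTANT', '10'), ('DELIMITER', ';')]
```

Note how the KEYWORD alternative is listed before IDENTIFIER so that `int` is classified as a keyword rather than an identifier, mirroring how classes of similar lexemes map to one token name.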
The information is collected by the analysis phases of the compiler and used by the synthesis phases to generate code. The compiler uses it to achieve compile-time efficiency.

Items stored in the symbol table:
Variable names and constants
Procedure and function names
Literal constants and strings
Compiler-generated temporaries
Labels in the source language

Information the compiler uses from the symbol table:
Data type and name
Declaring procedure
Offset in storage
For a structure or record, a pointer to the structure table
For parameters, whether parameter passing is by value or by reference
Number and types of arguments passed to a function
Base address

Specification of Tokens (Patterns)
Regular expressions are an important notation for specifying patterns. A regular expression is a grammar: it defines the rules. For the grammar, we first have to define the alphabet. An alphabet is a finite, non-empty set of symbols; we use the symbol Σ (sigma) to denote an alphabet. Examples:
Binary: Σ = {0, 1}
All lower-case letters: Σ = {a, b, c, …, z}
Digits: Σ = {0-9}

Regular Expressions
A string (or word) is a finite sequence of symbols chosen from Σ. The empty string is ε ("epsilon"). The length of a string w, denoted |w|, equals the number of (non-ε) characters in the string. E.g., for x = 010100, |x| = 6. For x = 01ε0ε1ε00, |x| = ?
xy denotes the concatenation of the two strings x and y.

Example of Minimization of DFA

Transition Diagrams for Different Types of Tokens
1. Identifiers
2. Relational operators
3.
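The identifier transition diagram above (a letter, then a loop on letters and digits) can be sketched directly as a two-state DFA. The state numbering and the function name `is_identifier` are our own; this is a minimal simulation, not production scanner code.

```python
# DFA sketch for identifiers: state 0 = start, state 1 = accepting.
def is_identifier(s):
    state = 0
    for ch in s:
        if state == 0 and ch.isalpha():
            state = 1                 # first character must be a letter
        elif state == 1 and ch.isalnum():
            state = 1                 # loop on letters or digits
        else:
            return False              # no transition: reject
    return state == 1                 # accept only if at least one letter was read

print(is_identifier("count1"))  # True
print(is_identifier("1count"))  # False
```

The relational-operator diagram could be handled the same way, with extra states for two-character lexemes such as <=.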
Keyword

Introduction to Lexical Analysis: Examples
Example 1: Valid tokens: 'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';' 'a' '=' '10' ';' 'return' '0' ';' '}'
Example 2: There are 5 valid tokens in this printf statement.
Example 3: int max(int i);
The lexical analyzer first reads int, finds it valid and accepts it as a token; max is read and found to be a valid function name; after reading (, int is also a token, then i is another token, then ), and finally ;.
Answer: total number of tokens is 7: int, max, (, int, i, ), ;

Scanning the Input
The lexical analyzer scans the input from left to right, one character at a time. It uses two pointers: 1. the begin pointer (bp), marking the beginning of the lexeme, and 2. the forward pointer (fp), tracking how far the input has been scanned. Initially both pointers point to the first character of the input string. Reading characters from secondary storage is very costly, so a buffering technique is used: a block of data is first read into a buffer and then scanned by the lexical analyzer. Two methods are used in this context: 1. the one-buffer scheme and 2. the two-buffer scheme.

One-Buffer Scheme
In this scheme only one buffer is used to store the input string. The problem is that if a lexeme is very long, it crosses the buffer boundary; to scan the rest of the lexeme the buffer has to be refilled, which overwrites the first part of the lexeme.

Two-Buffer Scheme
To overcome the problem of the one-buffer scheme, two buffers are used to store the input string. The two buffers are scanned alternately; when the end of the current buffer is reached, the other buffer is filled.
To identify the boundary of the first buffer, an end-of-buffer character is placed at its end; similarly, the end of the second buffer is recognized by the end-of-buffer mark at its end. When fp encounters the first eof, the end of the first buffer is recognized and filling of the second buffer begins; in the same way, the second eof indicates the end of the second buffer. Alternating like this, the buffers can be refilled until the end of the input program, and the stream of tokens is identified. This eof character introduced at the end is called a sentinel and is used to identify the end of a buffer.

LEX
Lex is a program that generates lexical analyzers. It is used with the YACC parser generator. The lexical analyzer is a program that transforms an input stream into a sequence of tokens; Lex reads a specification and produces C source code implementing the lexical analyzer. The workflow of Lex is as follows:
First, the user writes a program lex.l in the Lex language.
Then the Lex compiler runs on lex.l and produces a C program lex.yy.c.
Finally, the C compiler compiles lex.yy.c and produces an object program a.out; a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.

Bootstrapping
Bootstrapping is widely used in compiler development. It is used to produce a self-hosting compiler, i.e. a compiler that can compile its own source code. A bootstrap compiler is used to compile the compiler; you can then use this compiled compiler to compile everything else, as well as future versions of itself.
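The two-buffer sentinel scheme described above can be simulated in a few lines. This is a sketch under our own assumptions: the buffer size `HALF`, the sentinel value, and the generator `scan` are illustrative, and the input is assumed not to contain the sentinel character itself.

```python
EOF = "\0"               # sentinel placed at the end of each buffer half
HALF = 4                 # size of one buffer half (kept tiny for illustration)

def scan(source):
    """Yield the characters of `source` using a simulated two-buffer scheme.
    Hitting the sentinel exactly at position HALF means "end of this half:
    refill the other"; hitting it earlier means the real end of the input."""
    buf = [source[0:HALF] + EOF, EOF]    # fill the first half; second half empty
    pos = HALF                           # next unread position of the input
    cur, fwd = 0, 0                      # active half, forward pointer within it
    while True:
        ch = buf[cur][fwd]
        if ch == EOF:
            if fwd == HALF:              # sentinel at the boundary: switch halves
                cur = 1 - cur
                buf[cur] = source[pos:pos + HALF] + EOF
                pos += HALF
                fwd = 0
                continue
            return                       # sentinel before the boundary: true eof
        yield ch
        fwd += 1

print("".join(scan("int a = 10;")))  # int a = 10;
```

Each character is tested only against the sentinel; the check "was this a real eof or just a buffer boundary?" happens once per refill, which is the point of the sentinel trick.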
A compiler can be characterized by three languages: the source language, the target language, and the implementation language.

Bootstrapping
A T-diagram describes a compiler with source S and target T, implemented in I. To produce a compiler for a new language L on machine A:
1. Create a compiler for a subset S of the desired language L, written in language A, that runs on machine A (an S-to-A compiler implemented in A).
2. Create a compiler for L written in the subset S (an L-to-A compiler implemented in S).
3. Compile the L compiler written in S using the S-to-A compiler, obtaining an L-to-A compiler implemented in A: a compiler for language L that runs on machine A and produces code for machine A.

Module II, Lecture 1: Syntax Analyser
Neetu Narayan, ASET (CSE)

Module II: Role of the parser; Writing Grammars; Context-Free Grammars

Objectives
After completing this section, you will be able to:
1.1 Analyze and implement the role of the parser.
1.2 Analyze and implement context-free grammars.
1.3 Analyze and implement top-down parsing and various other parsing techniques.

Role of the Parser
The parser obtains a string of tokens from the lexical analyzer and verifies that the string can be generated by the grammar for the source language.
The parser reports any syntax errors in the source program.

Context-Free Grammars
Languages generated by context-free grammars are context-free languages. Context-free grammars are more expressive than finite automata: if a language L is accepted by a finite automaton, then L can be generated by a context-free grammar. Beware: the converse is NOT true.

Context-Free Grammar
Definition. A context-free grammar is a 4-tuple (Σ, NT, R, S), where:
Σ is an alphabet (each character in Σ is called a terminal);
NT is a set (each element of NT is called a nonterminal);
R, the set of rules, is a subset of NT × (Σ ∪ NT)*;
S, the start symbol, is one of the symbols in NT.
If (A, β) ∈ R, we write the production A -> β. Any string of terminals and nonterminals derivable from S is called a sentential form.

Context-Free Grammar (alternative notation)
Many textbooks use different symbols and terms to describe CFGs: G = (V, Σ, P, S), where V is a finite set of variables, Σ is the alphabet (terminals), P is a finite set of productions, and S ∈ V is the start variable. Productions have the form A -> α, where A ∈ V and α ∈ (V ∪ Σ)*.

Parse Tree
A parse tree of a derivation is a tree in which each internal node is labeled with a nonterminal; if a rule A -> A1A2…An occurs in the derivation, then A is a parent node of nodes labeled A1, A2, …, An.

Parse Tree example
Grammar: S -> A | AB; A -> ε | a | Ab | AA; B -> b | bc | Bc | bB
Sample derivations of aabb:
S => AB => AAB => aAB => aaB => aabB => aabb
S => AB => AbB => Abb => AAbb => Aabb => aabb
These two derivations use the same productions, but in different orders. This ordering difference is often uninteresting; derivation trees give a way to abstract away ordering differences. In a derivation tree:
Root label = start symbol.
Each interior label = variable.
Each parent/child relation = derivation step.
Each leaf label = terminal or ε.
All leaf labels together = derived string = yield.
Leftmost, Rightmost Derivations
Definition. A leftmost derivation of a sentential form is one in which rules transforming the leftmost nonterminal are always applied.
Definition. A rightmost derivation of a sentential form is one in which rules transforming the rightmost nonterminal are always applied.
For the grammar S -> A | AB; A -> ε | a | Ab | AA; B -> b | bc | Bc | bB, the two sample derivations
S => AB => AAB => aAB => aaB => aabB => aabb
S => AB => AbB => Abb => AAbb => Aabb => aabb
are special: the first is leftmost (it always picks the leftmost variable), the second is rightmost (it always picks the rightmost variable).

Derivation Trees
For the same grammar, are there other derivation trees for the string w = aabb? Yes: infinitely many others are possible.

Ambiguous Grammar
Definition. A grammar G is ambiguous if there is a word w ∈ L(G) having at least two different parse trees. Example: S -> A; S -> B; S -> AB; A -> aA; B -> bB; A -> ε; B -> ε. Notice that the string a has at least two leftmost derivations.

Ambiguity
A CFG is ambiguous iff any of the following equivalent statements holds:
there is a string w with multiple derivation trees;
there is a string w with multiple leftmost derivations;
there is a string w with multiple rightmost derivations.
We are defining ambiguity of a grammar, not of a language.

Rules for Disambiguating a Grammar
Generally, productions are ambiguous when they have more than one occurrence of a given nonterminal on their right-hand side. Rules:
1. Check the precedence of the operators involved.
2. Operators of different precedence are treated differently when removing ambiguity.
3.
First remove the ambiguity for the minimum-precedence operator, and so on.

Rules for Disambiguating a Grammar: Example
If a production rule is S -> ASBSy | a1 | a2 | … | an, then the ambiguity can be removed by rewriting it as:
S -> ASBS'y | S'
S' -> a1 | a2 | … | an
If the operator is left-associative, change the rightmost symbol. E.g., E -> E*E | id becomes E -> E*E' | E', E' -> id.
If the operator is right-associative, change the leftmost symbol. E.g., E -> E↑E | id becomes E -> E'↑E | E', E' -> id.

Elimination of Left Recursion
A grammar is left-recursive if the first symbol on the right-hand side of a rule is the same nonterminal as on the left-hand side, e.g. S -> Sa. Rule to remove left recursion from a grammar of the form
S -> Sα1 | Sα2 | … | Sαn | β1 | β2 | … | βm:
S -> β1S' | β2S' | … | βmS'
S' -> α1S' | α2S' | … | αnS' | ε

Elimination by Left Factoring
Left factoring isolates the common parts of two or more productions into a single production. After left factoring, the produced grammar is suitable for top-down parsing. Rule to left-factor a grammar of the form A -> αβ1 | αβ2 | … | αβm:
A -> αA'
A' -> β1 | β2 | … | βm

Role of a Parser
The parser takes an input string w and a context-free grammar G and produces as output:
a) a parse tree for w, if w is a sentence of the grammar G, i.e. w ∈ L(G);
b) an error message, if w is not a sentence of G, together with information such as the type and location of the error.
The output of a parser is a representation of a parse tree for the input if the input is syntactically correct.

Parsing
Parsing is the process of deriving a sentence from the given grammar, i.e. the process of generating a parse tree for a sentence.
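The left-recursion elimination rule above is mechanical enough to code directly. A minimal sketch, with our own representation: each production body is a list of symbol strings, ε written as the empty list, and the helper name `remove_left_recursion` is ours.

```python
def remove_left_recursion(nonterminal, productions):
    """Remove immediate left recursion for A -> Aα1|...|Aαn | β1|...|βm.
    Bodies are lists of symbols; [] denotes ε."""
    A = nonterminal
    A2 = A + "'"                                  # the new nonterminal A'
    alphas = [p[1:] for p in productions if p and p[0] == A]
    betas  = [p for p in productions if not p or p[0] != A]
    if not alphas:
        return {A: productions}                   # no left recursion to remove
    return {
        A:  [beta + [A2] for beta in betas],      # A  -> βi A'
        A2: [alpha + [A2] for alpha in alphas] + [[]],  # A' -> αi A' | ε
    }

# E -> E + T | T  becomes  E -> T E',  E' -> + T E' | ε
print(remove_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

This handles only immediate left recursion, matching the rule on the slide; indirect left recursion needs the fuller substitution algorithm.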
Parsing is of two types:
1) Top-down parsing: the parse tree is generated from the root to the leaves.
2) Bottom-up parsing: the parse tree is generated from the leaves to the root.

Top-Down Parsing
Top-down parsing cannot handle ambiguity, left recursion, or grammars in need of left factoring. If the grammar has any of the three:
1. Remove the ambiguity, if possible, by rewriting the grammar.
2. Remove left recursion; otherwise parsing may enter an infinite loop.
3. Apply left factoring.
Top-down parsing constructs the parse tree from root to leaves using a leftmost derivation. It can be done in two ways: a) with backtracking, b) without backtracking.

Top-Down Parsing with Backtracking
If a leftmost nonterminal has more than one production, i.e. A -> a1 | a2 | … | an, then there is a problem deciding which A-production should be used for the derivation. Example: S -> cAd, A -> ab | a, for the string cad.

Disadvantages of Backtracking
Backtracking is not a good way of parsing: when backtracking occurs, changes must be undone, e.g. entries made in the symbol table have to be removed, which creates overhead. A top-down parser with no backtracking is therefore preferred.

Top-Down Parsing without Backtracking
It is of two types: 1) predictive parsing, 2) recursive-descent parsing.
Predictive parsing: also called non-recursive predictive parsing, because it implements recursive-descent parsing by maintaining an explicit stack of activation records.
Recursive-descent parsing: a top-down method of syntax analysis that uses a collection of recursive procedures to process the input.
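The slides' backtracking example (S -> cAd, A -> ab | a, input cad) can be sketched as two hand-written parse functions. The function names and the "list of reachable positions" trick are our own illustrative choices.

```python
def parse_A(s, i):
    """Positions A can reach from i, trying A -> ab first, then A -> a."""
    results = []
    if s[i:i+2] == "ab":
        results.append(i + 2)
    if s[i:i+1] == "a":
        results.append(i + 1)
    return results

def parse_S(s):
    """Recognize S -> cAd, backtracking over the A-alternatives."""
    if s[:1] != "c":
        return False
    for j in parse_A(s, 1):      # try each A-alternative; backtrack on failure
        if s[j:] == "d":
            return True
    return False

print(parse_S("cad"))    # True: A -> ab fails, so the parser backs up to A -> a
print(parse_S("cabd"))   # True: A -> ab succeeds
```

On cad, the first alternative A -> ab does not match, so the parser falls back to A -> a and then matches d: exactly the backtracking step the slide describes.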
Predictive Parser
It has four main parts:
1) Input buffer: contains the input string to be parsed, followed by the symbol $.
2) Stack: contains a sequence of grammar symbols with $ at the bottom. Initially the start symbol is pushed onto the stack.
3) Parsing table: a two-dimensional array M[A, a], where A is a nonterminal and a is a terminal.
4) Output: the action performed.

Construction of the Predictive Parsing Table
For the predictive parsing table, find FIRST and FOLLOW of the grammar symbols. FIRST and FOLLOW allow us to choose which production to apply based on the next input symbol. Rules to calculate FIRST(X):
1. If X is a terminal symbol, then FIRST(X) = {X}.
2. If X -> ε is a production, then ε ∈ FIRST(X).
3. If X is a nonterminal and X -> aα, then a ∈ FIRST(X).
4. If X -> Y1Y2…Yk is a production for some k >= 1, then a ∈ FIRST(X) if a ∈ FIRST(Yi) for some i and ε is in all of FIRST(Y1), …, FIRST(Yi-1); if ε is in every FIRST(Yj), then ε ∈ FIRST(X).

Example: finding FIRST of the symbols of a grammar
Ques 1: E -> E+E | E*E | (E) | id
Solution:
Step 1: The given grammar is ambiguous; remove the ambiguity:
E -> E + T | T
T -> T*F | F
F -> (E) | id
Step 2: Remove left recursion:
E -> T E'
E' -> +T E' | ε
T -> F T'
T' -> *F T' | ε
F -> (E) | id
Then FIRST(E) = FIRST(T) = FIRST(F) = {(, id}; FIRST(E') = {+, ε}; FIRST(T') = {*, ε}.

Questions on FIRST(X)
Ques 1: S -> iEtSS' | a; S' -> eS | ε; E -> b. Find FIRST of all the nonterminals in the grammar.
Ques 2: Exp -> Exp Addop Term | Term; Addop -> + | -; Term -> Term Mulop Factor | Factor; Mulop -> *; Factor -> (Exp) | number. Find FIRST of all the nonterminals in the grammar.
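The FIRST rules above can be run to a fixed point mechanically. A sketch under our own conventions: the grammar is a dict mapping each nonterminal to its list of bodies, any symbol not in the dict is a terminal, and ε is the string "ε".

```python
EPS = "ε"

def first_sets(grammar):
    """Iterate the FIRST rules until nothing changes."""
    first = {A: set() for A in grammar}

    def first_of(X):                       # terminal: FIRST(X) = {X}
        return first[X] if X in grammar else {X}

    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                add = set()
                for X in body:             # rule 4: walk Y1 Y2 ... Yk
                    fx = first_of(X)
                    add |= fx - {EPS}
                    if EPS not in fx:
                        break
                else:
                    add.add(EPS)           # every Yi can derive ε
                if not add <= first[A]:
                    first[A] |= add
                    changed = True
    return first

# The slides' grammar after removing ambiguity and left recursion:
G = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
print(first_sets(G))
```

Running this reproduces the sets computed on the slide: FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε}, FIRST(T') = {*, ε}.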
Ques 3: S -> AaAb | BbBa; A -> ε; B -> ε.
Ques 4: Lexp -> number | (op Lexp-seq); op -> + | * | -; Lexp-seq -> Lexp-seq Lexp | Lexp.

Computation of FOLLOW(X)
Rules to calculate FOLLOW(X):
1. If S is the start symbol, then $ ∈ FOLLOW(S).
2. If there is a production A -> αBβ, then FOLLOW(B) includes FIRST(β) − {ε}.
3. If there is a production A -> αB, or A -> αBβ where FIRST(β) contains ε, then FOLLOW(B) includes FOLLOW(A).

Example of FOLLOW(X)
For E -> T E'; E' -> +T E' | ε; T -> F T'; T' -> *F T' | ε; F -> (E) | id:
FOLLOW(E) = {), $}
FOLLOW(E') = {), $}
FOLLOW(T) = (FIRST(E') − {ε}) ∪ FOLLOW(E') = {+, ), $}
FOLLOW(T') = FOLLOW(T) = {+, ), $}
FOLLOW(F) = (FIRST(T') − {ε}) ∪ FOLLOW(T) = {*, +, ), $}

Construction of the Predictive Parsing Table
Example
Ques 1: E -> E+E | E*E | (E) | id
Solution: transform to E -> T E'; E' -> +T E' | ε; T -> F T'; T' -> *F T' | ε; F -> (E) | id, and build the predictive parsing table from FIRST and FOLLOW.

Predictive Parser
Predictive parsers are LL(1) parsers; they can be constructed for the class of LL(1) grammars.

LL(1) Grammar
An ambiguous grammar or a left-recursive grammar cannot be LL(1).

Predictive Parser
The parser starts in the configuration with w$ in the input buffer and the start symbol S of G on top of the stack, above $. The predictive parser is controlled by a program that uses the parsing table to produce a predictive parse for the input. The program determines X, the top element of the stack, and a, the current input symbol, and performs the following actions:
If X = a = $, the parser halts and reports successful completion of parsing.
If X = a ≠ $, the parser pops X from the stack and advances the input pointer to the next input symbol.
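The FOLLOW rules above also reach a fixed point by iteration. This sketch takes precomputed FIRST sets as input; the grammar-as-dict representation and the function name `follow_sets` are our own, and the FIRST sets are the ones stated on the slides.

```python
EPS = "ε"

def follow_sets(grammar, first, start):
    """Iterate the FOLLOW rules until nothing changes."""
    follow = {A: set() for A in grammar}
    follow[start].add("$")                          # rule 1

    def first_of_string(symbols):
        out = set()
        for X in symbols:
            fx = first[X] if X in grammar else {X}
            out |= fx - {EPS}
            if EPS not in fx:
                return out
        out.add(EPS)                                # whole suffix can vanish
        return out

    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                for i, B in enumerate(body):
                    if B not in grammar:            # only nonterminals get FOLLOW
                        continue
                    rest = first_of_string(body[i+1:])
                    add = rest - {EPS}              # rule 2
                    if EPS in rest:
                        add |= follow[A]            # rule 3
                    if not add <= follow[B]:
                        follow[B] |= add
                        changed = True
    return follow

G = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
print(follow_sets(G, FIRST, "E"))
```

The result matches the slide: FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $}, FOLLOW(F) = {*, +, ), $}.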
If X is a nonterminal, the program consults the parsing table: if M[X, a] = {X -> UVW}, the parser replaces X on top of the stack by WVU (so that U ends up on top).

Predictive Parsing Example for input: id+id*id

Practice Questions
Ques 1: Consider the grammar Declaration -> type var_list; type -> int | float; var_list -> identifier, var_list | identifier. Show that the grammar is LL(1). Check whether the sentence int x,y,z is syntactically correct.
Ques 2: Consider the grammar S -> AaAb | BbBa; A -> ε; B -> ε. Check whether the sentence ba is syntactically correct.

Types of Parsing

Bottom-Up Parsing

Practice Questions
Ques 1: Consider the grammar S -> TL;; T -> int | float; L -> L,id | id. Parse the input string int id, id; using a shift-reduce parser.
Ques 2: For the same grammar, check whether the sentence "float id,id,id;" is syntactically correct using predictive parsing.
Ques 3: Consider the grammar E -> E+T | T; T -> id | id[] | id[X]; X -> E, E | E. Find FIRST and FOLLOW of all the nonterminals.
Ques 4: Consider the grammar S -> (A) | 0; A -> SB; B -> ,SB | ε. Is the above grammar LL(1)? Justify your answer.
Ques 5: Consider the grammar E -> +EE | *EE | num. Construct the LL(1) parsing table for it.
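The table-driven algorithm just described (match terminals, expand nonterminals via M[X, a], accept on X = a = $) can be sketched end to end for the expression grammar. The hard-coded `TABLE` below is the standard LL(1) table for that grammar; the dict-of-tuples encoding and the function name `parse` are our own.

```python
EPS = "ε"
# LL(1) table for E -> TE'; E' -> +TE'|ε; T -> FT'; T' -> *FT'|ε; F -> (E)|id
TABLE = {
    ("E",  "id"): ["T", "E'"],      ("E",  "("): ["T", "E'"],
    ("E'", "+"):  ["+", "T", "E'"], ("E'", ")"): [EPS], ("E'", "$"): [EPS],
    ("T",  "id"): ["F", "T'"],      ("T",  "("): ["F", "T'"],
    ("T'", "+"):  [EPS],            ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"):  [EPS],            ("T'", "$"): [EPS],
    ("F",  "id"): ["id"],           ("F",  "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def parse(tokens):
    """Table-driven predictive parse; True iff the token list is accepted."""
    stack = ["$", "E"]                       # start symbol on top, $ below
    input_ = tokens + ["$"]
    i = 0
    while stack:
        X, a = stack.pop(), input_[i]
        if X == a == "$":
            return True                      # successful completion
        if X not in NONTERMINALS:
            if X != a:
                return False                 # terminal mismatch
            i += 1                           # pop X, advance the input pointer
        else:
            body = TABLE.get((X, a))
            if body is None:
                return False                 # blank table entry: syntax error
            if body != [EPS]:
                stack.extend(reversed(body)) # push U V W with U on top
    return False

print(parse(["id", "+", "id", "*", "id"]))  # True
```

Pushing the body reversed is exactly the "replace X by WVU with U on top" step from the algorithm.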
Practice Questions
Ques 7: Construct the predictive parser for the grammar S -> a | ^ | (T); T -> T,S | S. Show the behavior of the parser on the sentences: i) (a,(a,a)) ii) (((a,a),^,(a),a)
Ques 8: Construct the predictive parser for the grammar S -> (L) | a; L -> L,S | S. Parse the sentence (a,a) using predictive parsing.

Operator Precedence Parsing
It is a form of shift-reduce parsing used to parse operator grammars; this parsing is needed for parsing expressions. Rules for a grammar to be an operator grammar:
1) No ε-production is present in the grammar.
2) No two adjacent nonterminals appear on the right-hand side of any production.
E.g. E -> EAE | (E) | -E | id; A -> + | - | * | / | ^ is not an operator grammar, but one can be obtained by substituting the values of A into the E-productions:
E -> E+E | E-E | E*E | E/E | E^E | (E) | -E | id

Operator Precedence Parsing
Operator precedence parsing uses two pieces of information: the precedence of operators and their associativity. Three disjoint precedence relations are used between pairs of terminals:
1. a ⋖ b: a yields precedence to b.
2. a ≐ b: a has the same precedence as b.
3. a ⋗ b: a takes precedence over b.
There are two methods of generating the precedence relation table:
1) by determining the associativity and precedence of operators;
2) by using the standard method.
Rules to calculate operator precedence relations

Operator Precedence Relational Table

Operator Precedence Parsing Table
For the construction of the operator precedence parsing table, LEADING and TRAILING of all the nonterminals are required.
Rules to find LEADING(A):
1. a is in LEADING(A) if A -> γaδ, where γ is ε or a single nonterminal.
2. If a is in LEADING(B) and A -> Bα, then a is in LEADING(A).
Example: consider the grammar
1. E -> E + T
2. E -> T
3. T -> T * F
4. T -> F
5. F -> (E)
6. F -> id
LEADING(F) = {(, id}
LEADING(T) = {*, (, id}
LEADING(E) = {+, *, (, id}
Rules to find TRAILING(A):
1. a is in TRAILING(A) if A -> γaδ, where δ is ε or a single nonterminal.
2. If a is in TRAILING(B) and A -> αB, then a is in TRAILING(A).
Solution for the same grammar:
TRAILING(E) = {+} ∪ TRAILING(T) = {+, *, ), id}
TRAILING(T) = {*} ∪ TRAILING(F) = {*, ), id}
TRAILING(F) = {), id}

Examples
Q1. S -> a | ^ | (T); T -> T,S | S
Q2. P -> SbP | SbS | S; S -> WbS | W; W -> L*W | L; L -> id
Q3. S -> aSSb | c
Q4. S -> A | B; A -> A+B | B+B; B -> B*n | n

Operator Precedence Relational Table
After computing LEADING and TRAILING, the table is constructed between all the terminals of the grammar, including the '$' symbol.
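The two LEADING/TRAILING rules above are another fixed-point computation. A sketch with our own grammar-as-dict representation (bodies are symbol lists; anything not a key of the dict is a terminal):

```python
def leading_trailing(grammar):
    """Iterate the LEADING/TRAILING rules until nothing changes."""
    NT = set(grammar)
    lead = {A: set() for A in NT}
    trail = {A: set() for A in NT}
    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                new_l, new_t = set(), set()
                if body[0] in NT:
                    new_l |= lead[body[0]]            # rule 2: A -> Bα
                    if len(body) > 1 and body[1] not in NT:
                        new_l.add(body[1])            # rule 1 with γ = B
                else:
                    new_l.add(body[0])                # rule 1 with γ = ε
                if body[-1] in NT:
                    new_t |= trail[body[-1]]          # rule 2: A -> αB
                    if len(body) > 1 and body[-2] not in NT:
                        new_t.add(body[-2])           # rule 1 with δ = B
                else:
                    new_t.add(body[-1])               # rule 1 with δ = ε
                if not new_l <= lead[A] or not new_t <= trail[A]:
                    lead[A] |= new_l
                    trail[A] |= new_t
                    changed = True
    return lead, trail

G = {"E": [["E", "+", "T"], ["T"]],
     "T": [["T", "*", "F"], ["F"]],
     "F": [["(", "E", ")"], ["id"]]}
lead, trail = leading_trailing(G)
print(lead["E"], trail["E"])
```

On the slides' grammar this reproduces LEADING(E) = {+, *, (, id} and TRAILING(E) = {+, *, ), id}.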
Operator Precedence Relational Table
Example: E -> E + T; E -> T; T -> T * F; T -> F; F -> (E); F -> id

Parsing Algorithm

Example of Parsing
Grammar: E -> E+T; E -> T; T -> T*F; T -> F; F -> (E); F -> id

Precedence Functions

Algorithm for Constructing Precedence Functions
1. Create functions fa and gb for each grammar terminal a and b and for the end-of-string symbol.
2. Partition the symbols into groups so that fa and gb are in the same group if a ≐ b (there can be symbols in the same group even if they are not connected by this relation).
3. Create a directed graph whose nodes are the groups; for each pair of symbols a and b: if a ⋖ b, place an edge from the group of gb to the group of fa; if a ⋗ b, place an edge from the group of fa to that of gb.
4. If the constructed graph has a cycle, then no precedence functions exist. When there are no cycles, let f(a) and g(b) be the lengths of the longest paths from the groups of fa and gb respectively.

Questions
Ques 1: Consider the following grammar; construct the operator precedence parsing table and check whether the input strings (i) *id=id and (ii) id*id=id are successfully parsed or not.
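The four steps of the precedence-function algorithm can be sketched directly. The relation table `R` below is the standard one for the expression grammar over {id, +, *, $}; the encoding of relations as a dict and the function name `precedence_functions` are our own assumptions.

```python
def precedence_functions(relations, terminals):
    """Longest-path construction of f and g. `relations` maps (a, b) to
    '<', '=' or '>'. Returns (f, g), or None if the graph has a cycle."""
    parent = {}                                    # steps 1-2: union-find groups
    def find(x):
        while parent.get(x, x) != x:
            x = parent[x]
        return x
    for (a, b), rel in relations.items():
        if rel == "=":
            parent[find(("f", a))] = find(("g", b))
    nodes = {find(("f", a)) for a in terminals} | {find(("g", b)) for b in terminals}
    edges = {n: set() for n in nodes}
    for (a, b), rel in relations.items():          # step 3
        fa, gb = find(("f", a)), find(("g", b))
        if rel == "<":
            edges[gb].add(fa)                      # a < b: edge g_b -> f_a
        elif rel == ">":
            edges[fa].add(gb)                      # a > b: edge f_a -> g_b
    depth, state = {}, {}                          # step 4: longest paths by DFS
    def longest(n):
        if state.get(n) == 2:
            return depth[n]
        if state.get(n) == 1:
            raise ValueError("cycle: no precedence functions exist")
        state[n] = 1
        depth[n] = max((1 + longest(m) for m in edges[n]), default=0)
        state[n] = 2
        return depth[n]
    try:
        f = {a: longest(find(("f", a))) for a in terminals}
        g = {b: longest(find(("g", b))) for b in terminals}
    except ValueError:
        return None
    return f, g

R = {("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
     ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
     ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
     ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<"}
f, g = precedence_functions(R, ["id", "+", "*", "$"])
print(f, g)
# f = {'id': 4, '+': 2, '*': 4, '$': 0}; g = {'id': 5, '+': 1, '*': 3, '$': 0}
```

The defining property to check is that f(a) < g(b) whenever a ⋖ b and f(a) > g(b) whenever a ⋗ b, which holds for every entry of R here.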
S→L=R S→R L→*R L→id R→L 73 Amity School of Engineering and Technology Questions Ques 2: Consider the following grammar, and remove its ambiguity R= R&R|RⴲR|RꕕR|R*|a|b|c 74 Amity School of Engineering and Technology Operator Precedence Parsing 75 Amity School of Engineering and Technology 76 Amity School of Engineering and Technology Module III LR PARSERS Neetu Narayan ASET(CSE) 1 Module III: Different types of LR Parsers Simple LR(0) collection of items SLR parsing table SLR parsing LR(1) collection of items CLR parsing table CLR parsing LALR parsing table LALR parsing 2 Amity School of Engineering and Technology Recommended Reading Textbooks: Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, “Compilers: Principles Techniques and Tool”, Second Edition, Pearson Publication, 2007. A.A Putambekar, “Compiler Construction”, Technical Publications, 2009. Reference Book: Des Watson, A Practical Approach to Compiler Construction, First Edition, Springer, 2017 3 Amity School of Engineering and Technology OBJECTIVES After completing this section, you will be able to 1.1 Analyze & implement SLR parsing techniques. 1.2 Analyze & implement CLR parsing techniques. 1.3 Analyze & implement LALR parsing techniques. 4 Amity School of Engineering and Technology Module Assessment Quiz (Conceptual and Numerical Based) Assignment 5 Amity School of Engineering and Technology LR Parsers The most powerful shift-reduce parsing (yet efficient) is: LR parsing is attractive because: LR parsing is most general non-backtracking shift-reduce parsing, yet it is still efficient. The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers. LL(1)-Grammars LR(1)-Grammars An LR-parser can detect a syntactic error as soon as it is possible to do so a left-to- right scan of the input. 6 Amity School of Engineering and Technology LR Parsers Advantages of the LR parser Recognize the most programming languages. 
More general than LL parser Detect errors during parsing 7 Amity School of Engineering and Technology LR Parsers Different types of LR-Parsers 1. SLR – simple LR parser 2. CLR – Canonical LR parser( Most powerful LR parser) 3. LALR – Look-head LR parser(Intermediate LR parser ) SLR, LR and LALR work same (they used the same algorithm), only their parsing tables are different. 8 Amity School of Engineering and Technology LR Parsing Algorithm 9 Amity School of Engineering and Technology A Configuration of LR Parsing Algorithm A configuration of a LR parsing is: Sm and ai decides the parser action by consulting the parsing action table. (Initial Stack contains just So ) A configuration of a LR parsing represents the right sentential form: X1... Xm ai ai+1... an $ 10 Amity School of Engineering and Technology Actions of a LR-Parser 1. shift s -- shifts the next input symbol and the state s onto the stack ( So X1 S1... Xm Sm, ai ai+1... an $ ) ( So X1 S1... Xm Sm ai s, ai+1... an $ ) 2. reduce A (or rn where n is a production number) pop || (=r) items from the stack; then push A and s where s=goto[sm-r ,A] ( So X1 S1... Xm Sm, ai ai+1... an $ ) ( So X1 S1... Xm-r Sm-r A s, ai... an $ ) Output is the reducing production reduce A 3. Accept – Parsing successfully completed 4. Error -- Parser detected an error (an empty entry in the action table) 11 Amity School of Engineering and Technology Reduce Action pop || (=r) items from the stack; let us assume that = Y1Y2...Yr then push A and s where s=goto[sm-r,A] ( So X1 S1... Xm-r Sm-r Y1 Sm-r+1...Yr Sm, ai ai+1... an $ ) ( So X1 S1... Xm-r Sm-r A s, ai... an $ ) In fact, Y1Y2...Yr is a handle. X1... Xm-r A ai... an $ X1... Xm Y1...Yr ai ai+1... an $ 12 Amity School of Engineering and Technology Constructing SLR Parsing Tables – LR(0) Item An LR(0) item of a grammar G is a production of G a dot at the some position of the right side. Ex: A aBb Possible LR(0) Items: A.aBb (four different possibility) A a.Bb A aB.b A aBb. 
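Enumerating the dotted items of a production, as in the A -> aBb example above, is mechanical; a small sketch (the encoding is illustrative):

```python
# A production with n body symbols yields n + 1 LR(0) items; generating
# them just means placing the dot in each possible position.

def items_of(head, body):
    return [(head, body, dot) for dot in range(len(body) + 1)]

for head, body, dot in items_of("A", ("a", "B", "b")):
    print(head, "->", "".join(body[:dot]) + "." + "".join(body[dot:]))
# Prints the four items: .aBb  a.Bb  aB.b  aBb.
```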
Sets of LR(0) items will be the states of the action and goto tables of the SLR parser. A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers.

Augmented Grammar: G' is G with a new production rule S' -> S, where S' is the new start symbol.

The Closure Operation
If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A -> α.Bβ is in closure(I) and B -> γ is a production rule of G, then B -> .γ will be in closure(I). We apply this rule until no more new LR(0) items can be added to closure(I).

The Closure Operation – Example
closure({E' -> .E}) =
{ E' -> .E        (kernel item)
  E -> .E+T
  E -> .T
  T -> .T*F
  T -> .F
  F -> .(E)
  F -> .id }
for the grammar E' -> E, E -> E+T, E -> T, T -> T*F, T -> F, F -> (E), F -> id.

Computation of Closure
function closure(I)
begin
  J := I;
  repeat
    for each item A -> α.Bβ in J and each production B -> γ of G
        such that B -> .γ is not in J do
      add B -> .γ to J
  until no more items can be added to J
end

GOTO Operation
If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I, X) is defined as follows: if A -> α.Xβ is in I, then every item in closure({A -> αX.β}) will be in goto(I, X).
If I is the set of items that are valid for some viable prefix γ, then goto(I, X) is the set of items that are valid for the viable prefix γX.
Example:
I = { E' -> .E, E -> .E+T, E -> .T, T -> .T*F, T -> .F, F -> .(E), F -> .id }
goto(I, E)  = { E' -> E., E -> E.+T }
goto(I, T)  = { E -> T., T -> T.*F }
goto(I, F)  = { T -> F. }
goto(I, ()  = { F -> (.E), E -> .E+T, E -> .T, T -> .T*F, T -> .F, F -> .(E), F -> .id }
goto(I, id) = { F -> id. }

Practice Questions
Ques 1: Consider the grammar S -> Aa | bAc | Bc | bBa, A -> d, B -> d. Compute closure(I) and goto(I).
Ques 2: Consider the grammar S -> AS | b, A -> SA | a. Compute closure(I) and goto(I).

Construction of the Canonical LR(0) Collection
To create the SLR parsing tables for a grammar G, we create the canonical LR(0) collection of the augmented grammar G'.
Algorithm:
  C is { closure({S' -> .S}) }
  repeat the following until no more sets of LR(0) items can be added to C:
    for each I in C and each grammar symbol X,
      if goto(I, X) is not empty and not in C, add goto(I, X) to C
The goto function is a DFA on the sets in C.

The Canonical LR(0) Collection – Example
I0:  E' -> .E, E -> .E+T, E -> .T, T -> .T*F, T -> .F, F -> .(E), F -> .id
I1:  E' -> E., E -> E.+T
I2:  E -> T., T -> T.*F
I3:  T -> F.
I4:  F -> (.E), E -> .E+T, E -> .T, T -> .T*F, T -> .F, F -> .(E), F -> .id
I5:  F -> id.
I6:  E -> E+.T, T -> .T*F, T -> .F, F -> .(E), F -> .id
I7:  T -> T*.F, F -> .(E), F -> .id
I8:  F -> (E.), E -> E.+T
I9:  E -> E+T., T -> T.*F
I10: T -> T*F.
I11: F -> (E).

Transition Diagram (DFA) of the Goto Function

Constructing the SLR Parsing Table (of an augmented grammar G')
1. Construct the canonical collection of sets of LR(0) items for G': C = {I0, ..., In}.
2. Create the parsing action table as follows:
   - If a is a terminal, A -> α.aβ is in Ii and goto(Ii, a) = Ij, then action[i, a] is "shift j".
   - If A -> α. is in Ii, then action[i, a] is "reduce A -> α" for all a in FOLLOW(A), where A ≠ S'.
   - If S' -> S. is in Ii, then action[i, $] is "accept".
   - If any conflicting actions are generated by these rules, the grammar is not SLR(1).
3. Create the parsing goto table: for all non-terminals A, if goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing S' -> .S.

(SLR) Parsing Tables for the Expression Grammar

Actions of the (S)LR Parser – Example

Practice Question
Ques 1: Construct the SLR parsing table for the grammar:
E -> E + T | T
T -> TF | F
F -> F* | a | b
Ques 2: Construct the SLR parsing table for the grammar:
S -> L = R
S -> R
L -> *R
L -> id
R -> L

SLR(1) Grammar
An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G. If a grammar G has an SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar for short). Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR.

Shift/Reduce and Reduce/Reduce Conflicts
If a state does not know whether it will make a shift operation or a reduction for a terminal, we say that there is a shift/reduce conflict. If a state does not know whether it will make a reduction using production rule i or j for a terminal, we say that there is a reduce/reduce conflict. If the SLR parsing table of a grammar G has a conflict, the grammar is not an SLR grammar.

Conflict Example 1
Conflict Example 2

LR(1) Item
To avoid some invalid reductions, the states need to carry more information. Extra information is put into a state by including a terminal symbol as a second component of an item. An LR(1) item is:
A -> α.β, a
where a is the lookahead of the LR(1) item (a is a terminal or the end-marker $). The "1" refers to the length of the second component. The lookahead has no effect in an item of the form [A -> α.β, a] where β is not ε; but an item of the form [A -> α., a] calls for a reduction by A -> α only if the next input symbol is a. The set of such a's will be a subset of FOLLOW(A), but it could be a proper subset.

LR(1) Item (cont.)
When β (in the LR(1) item A -> α.β, a) is not empty, the lookahead does not have any effect. When β is empty (A -> α., a), we do the reduction by A -> α only if the next input symbol is a (not for every terminal in FOLLOW(A)). A state will contain items
A -> α., a1
...
A -> α., an
where {a1, ..., an} ⊆ FOLLOW(A).

Canonical Collection of Sets of LR(1) Items
The construction of the canonical collection of sets of LR(1) items is similar to that of the canonical collection of sets of LR(0) items, except that the closure and goto operations work a little differently.
closure(I) is (where I is a set of LR(1) items):
- every LR(1) item in I is in closure(I);
- if A -> α.Bβ, a is in closure(I) and B -> γ is a production rule of G, then B -> .γ, b will be in closure(I) for each terminal b in FIRST(βa).

Goto Operation
If I is a set of LR(1) items and X is a grammar symbol (terminal or non-terminal), then goto(I, X) is defined as follows: if A -> α.Xβ, a is in I, then every item in closure({A -> αX.β, a}) will be in goto(I, X).

Construction of the Canonical LR(1) Collection
Algorithm:
  C is { closure({S' -> .S, $}) }
  repeat the following until no more sets of LR(1) items can be added to C:
    for each I in C and each grammar symbol X,
      if goto(I, X) is not empty and not in C, add goto(I, X) to C
The goto function is a DFA on the sets in C.
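The LR(1) closure rule just described can be sketched directly in Python, here for the grammar S' -> S, S -> CC, C -> cC | d used in the example that follows (the item encoding is mine, not from the slides). Because this grammar has no ε-productions, FIRST of a string is decided by its first symbol:

```python
# The LR(1) closure rule above, for S' -> S, S -> CC, C -> cC | d.
# An item is encoded (head, body, dot, lookahead); the encoding is illustrative.

GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")),
           ("C", ("c", "C")), ("C", ("d",))]
NONTERMS = {h for h, _ in GRAMMAR}

def first_of(symbols):
    """FIRST of a string. No production here derives ε, so the first
    symbol alone decides the answer."""
    sym = symbols[0]
    if sym not in NONTERMS:
        return {sym}
    out = set()
    for h, b in GRAMMAR:
        if h == sym:
            out |= first_of(b)
    return out

def closure(items):
    result = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot, look in list(result):
            if dot < len(body) and body[dot] in NONTERMS:
                # Add B -> .γ, b for each b in FIRST(βa).
                for b in first_of(body[dot + 1:] + (look,)):
                    for h, prod in GRAMMAR:
                        if h == body[dot] and (h, prod, 0, b) not in result:
                            result.add((h, prod, 0, b))
                            changed = True
    return result

# I0 = closure({[S' -> .S, $]}) has six items:
# [S' -> .S, $], [S -> .CC, $], [C -> .cC, c/d], [C -> .d, c/d]
I0 = closure({("S'", ("S",), 0, "$")})
```

The computed I0 matches the first state of the canonical LR(1) collection shown below.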
Canonical LR(1) Collection – Example
For the augmented grammar S' -> S, S -> CC, C -> cC | d:
I0: S' -> .S, $     S -> .CC, $     C -> .cC, c/d   C -> .d, c/d
I1: S' -> S., $
I2: S -> C.C, $     C -> .cC, $     C -> .d, $
I3: C -> c.C, c/d   C -> .cC, c/d   C -> .d, c/d
I4: C -> d., c/d
I5: S -> CC., $
I6: C -> c.C, $     C -> .cC, $     C -> .d, $
I7: C -> d., $
I8: C -> cC., c/d
I9: C -> cC., $

An Example

(CLR) Parsing Tables for the Above Grammar

The Core of LR(1) Items
The core of a set of LR(1) items is the set of their first components (i.e., LR(0) items). The core of the set of LR(1) items
{ (C -> c.C, c/d), (C -> .cC, c/d), (C -> .d, c/d) }
is
{ C -> c.C, C -> .cC, C -> .d }

Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of LR(1) items for G': C = {I0, ..., In}.
2. Create the parsing action table as follows:
   - If a is a terminal, A -> α.aβ, b is in Ii and goto(Ii, a) = Ij, then action[i, a] is "shift j".
   - If A -> α., a is in Ii, then action[i, a] is "reduce A -> α", where A ≠ S'.
   - If S' -> S., $ is in Ii, then action[i, $] is "accept".
   - If any conflicting actions are generated by these rules, the grammar is not LR(1).
3. Create the parsing goto table: for all non-terminals A, if goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing S' -> .S, $.

LALR Parsing Tables
1. LALR stands for Lookahead LR.
2. LALR parsers are often used in practice because LALR parsing tables are smaller than LR(1) parsing tables.
3. The numbers of states in the SLR and LALR parsing tables for a grammar G are equal.
4. But LALR parsers recognize more grammars than SLR parsers.
5. yacc creates an LALR parser for the given grammar.
6. A state of an LALR parser is again a set of LR(1) items.
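The LALR shrinking step merges LR(1) states whose items share a core. A minimal sketch, using states I3 and I6 of the C-grammar above (the (head, body, dot, lookahead) item encoding is mine):

```python
# Merging LR(1) states that share a core, as in the I3/I6 pair of the
# C-grammar (S -> CC, C -> cC | d). The state encoding is illustrative.

def core(state):
    """The core: first components only, i.e. the underlying LR(0) items."""
    return frozenset((h, b, d) for h, b, d, _ in state)

def merge_states(states):
    """Union all states that have the same core."""
    merged = {}
    for s in states:
        merged.setdefault(core(s), set()).update(s)
    return [frozenset(s) for s in merged.values()]

I3 = {("C", ("c", "C"), 1, "c"), ("C", ("c", "C"), 1, "d"),
      ("C", ("c", "C"), 0, "c"), ("C", ("c", "C"), 0, "d"),
      ("C", ("d",), 0, "c"),     ("C", ("d",), 0, "d")}
I6 = {("C", ("c", "C"), 1, "$"),
      ("C", ("c", "C"), 0, "$"),
      ("C", ("d",), 0, "$")}

I36 = merge_states([I3, I6])   # one merged state with lookaheads c/d/$
```

I3 and I6 collapse into the single LALR state often written I36, which is exactly why the LALR table below uses states named 36, 47 and 89.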
Creating LALR Parsing Tables
Canonical LR(1) parser → LALR parser: shrink the number of states.
- This shrinking process may introduce a reduce/reduce conflict in the resulting LALR parser (in which case the grammar is not LALR).
- But this shrinking process cannot produce a shift/reduce conflict.

The Core of a Set of LR(1) Items
The core of a set of LR(1) items is the set of its first components. Ex:
S -> L.=R, $     core:   S -> L.=R
R -> L., $               R -> L.
We find the states (sets of LR(1) items) in a canonical LR(1) parser that have the same core, then merge them into a single state. For example,
I1: L -> id., =
I2: L -> id., $
have the same core; merging them gives a new state
I12: L -> id., =/$
We do this for all states of a canonical LR(1) parser to get the states of the LALR parser. In fact, the number of states of the LALR parser for a grammar will be equal to the number of states of the SLR parser for that grammar.

Creation of LALR Parsing Tables
1. Create the canonical LR(1) collection of the sets of LR(1) items for the given grammar.
2. For each core present, find all sets having that same core; replace those sets with a single set which is their union: C = {I0, ..., In} → C' = {J1, ..., Jm}, where m ≤ n.
3. Create the parsing tables (action and goto tables) in the same way as for the LR(1) parser.
   - Note that if J = I1 ∪ ... ∪ Ik, then since I1, ..., Ik have the same cores, the cores of goto(I1, X), ..., goto(Ik, X) must also be the same.
   - So goto(J, X) = K, where K is the union of all sets of items having the same core as goto(I1, X).
4. If no conflict is introduced, the grammar is LALR(1). (We may only introduce reduce/reduce conflicts; we cannot introduce a shift/reduce conflict.)

LALR Parse Table
state |  c    d    $   |  S   C
  0   | s36  s47       |  1   2
  1   |           acc  |
  2   | s36  s47       |      5
  36  | s36  s47       |      89
  47  | r3   r3   r3   |
  5   |           r1   |
  89  | r2   r2   r2   |

Shift/Reduce Conflict
We said that we cannot introduce a shift/reduce conflict during the shrinking process for the creation of the states of an LALR parser. Assume that we could. In that case, a state of the LALR parser must have:
A -> α., a   and   B -> β.aγ, b
This means that a state of the canonical LR(1) parser must have:
A -> α., a   and   B -> β.aγ, c
But this state already has a shift/reduce conflict, i.e., the original canonical LR(1) parser had the conflict. (The reason: the shift operation does not depend on lookaheads.)

Reduce/Reduce Conflict
But we may introduce a reduce/reduce conflict during the shrinking process:
I1: A -> α., a      I2: A -> α., b
    B -> β., b          B -> β., c
I12: A -> α., a/b   — reduce/reduce conflict
     B -> β., b/c

Using Ambiguous Grammars
All grammars used in the construction of LR parsing tables must be unambiguous. Can we create LR parsing tables for ambiguous grammars?
- Yes, but they will have conflicts.
- We can resolve these conflicts in favor of one of the actions to disambiguate the grammar.
- In the end, we effectively have an unambiguous grammar again.
Why would we want to use an ambiguous grammar?
- Some ambiguous grammars are much more natural, and a corresponding unambiguous grammar can be very complex.
- Using an ambiguous grammar may eliminate unnecessary reductions.
Ex:
E -> E+E | E*E | (E) | id
instead of
E -> E+T | T
T -> T*F | F
F -> (E) | id

Sets of LR(0) Items for the Ambiguous Grammar

SLR Parsing Tables for the Ambiguous Grammar

Error Recovery in LR Parsing
An LR parser detects an error when it consults the parsing action table and finds an error entry. All empty entries in the action table are error entries; errors are never detected by consulting the goto table. An LR parser announces an error as soon as there is no valid continuation for the scanned portion of the input. A canonical LR parser (LR(1) parser) will never make even a single reduction before announcing an error. The SLR and LALR parsers may make several reductions before announcing an error, but all LR parsers (LR(1), LALR and SLR) will never shift an erroneous input symbol onto the stack.

Panic-Mode Error Recovery in LR Parsing
- Scan down the stack until a state s with a goto on a particular non-terminal A is found (discard everything on the stack above this state s).
- Discard zero or more input symbols until a symbol a is found that can legitimately follow A. The symbol a is simply in FOLLOW(A), but this may not work for all situations.
- The parser pushes the non-terminal A and the state goto[s, A], and resumes normal parsing.
- This non-terminal A is normally a basic programming construct (there can be more than one choice for A): stmt, expr, block, ...

Phrase-Mode Error Recovery in LR Parsing
- Each empty entry in the action table is marked with a specific error routine.
- An error routine reflects the error that the user most likely made in that case.
- An error routine inserts symbols into the stack or the input (or deletes symbols from the stack and the input, or does both insertion and deletion), e.g., for a missing operand or an unbalanced right parenthesis.

Module IV
Syntax Directed Translation and Intermediate Code Generation
Neetu Narayan, ASET (CSE)

Syntax Directed Translation
As we know, after syntax analysis, semantic analysis of the input program is needed. To perform the semantic analysis in a syntax-directed fashion, we attach semantic rules to each production of the context-free grammar. A syntax directed translation scheme is a process of allowing semantic actions to be attached to the productions of a context-free grammar. Two notations are mainly used for associating semantic rules with the productions:
- Syntax Directed Definition
- Translation Scheme

Syntax Directed Definition (SDD)
A syntax directed definition is a generalization of a context-free grammar in which each grammar symbol (terminal or non-terminal) has an associated set of attributes, and each production A -> α is associated with a set of semantic rules. An attribute can be anything: a string, a number, a type, a memory location, etc. The semantic rules are of the form b := f(c1, c2, ..., ck), where f is a function of attribute values.

Attributes are of two types:
- Inherited attributes: an inherited attribute at node N is defined only in terms of attribute values at N's parent, N itself, and N's siblings. E.g., for A -> XYZ: { Y.VAL = 2 * A.VAL }.
- Synthesized attributes: a synthesized attribute is defined in terms of attribute values at the children of the node and at the node itself. Synthesized attributes can be evaluated during a single bottom-up traversal of the parse tree.

Annotated Parse Tree
A parse tree showing the values of the attributes at each node is called an annotated parse tree.

Example: Construct the annotated parse tree and also show the sequence of moves for the expression (4*7+19)*2$.

Intermediate Code Generation

Infix to Postfix Algorithm
Read the symbols one by one from left to right in the given infix expression.
1. Push '(' onto the stack and append ')' at the end of the infix expression.
2. If the symbol is an operand, add it directly to the final postfix expression (output).
3. If the symbol is a left parenthesis '(', push it onto the stack.
4. If the symbol is a right parenthesis ')', pop symbols from the stack until a left parenthesis appears. Discard the left parenthesis and add the remaining popped symbols to the postfix expression in the order in which they are popped.
5. If the symbol is an operator (+, -, *, /, etc.): while the operator on top of the stack has higher or equal precedence than the one being read, pop it and add it to the postfix expression. When a lower-precedence operator (or a parenthesis) appears at the top of the stack, push the current operator onto the stack.

Infix to Postfix
Infix expression: A*B+C

Postfix Notation
Syntax Tree
Directed Acyclic Graph (DAG)
Three Address Code
Implementation of Three Address Code
Quadruples
Triples
Indirect Triples
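The five steps of the infix-to-postfix algorithm above can be sketched as follows (operands limited to single letters for simplicity; instead of pushing the sentinel '(' of step 1, this sketch checks for an empty stack explicitly):

```python
# The five steps above, with single-letter operands.

PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def infix_to_postfix(expr):
    out, stack = [], []
    for sym in expr:
        if sym == "(":                       # step 3
            stack.append(sym)
        elif sym == ")":                     # step 4
            while stack[-1] != "(":
                out.append(stack.pop())
            stack.pop()                      # discard the '('
        elif sym in PREC:                    # step 5
            while stack and stack[-1] != "(" and PREC[stack[-1]] >= PREC[sym]:
                out.append(stack.pop())
            stack.append(sym)
        else:                                # step 2: operand
            out.append(sym)
    while stack:                             # flush the remaining operators
        out.append(stack.pop())
    return "".join(out)

print(infix_to_postfix("A*B+C"))   # AB*C+
```

The A*B+C example on the slide traces exactly through this loop: * is pushed, then popped when + arrives (higher precedence on top), giving AB*C+.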
Introduction to the Optimization Phase of a Compiler
- Sources of Optimization
- Basic Blocks and Flow Analysis Representation
- Local and Loop Optimization
- DAG
By: Neetu Narayan, Dept. of CSE, ASET, AUUP

Introduction to the Code Optimization Phase
Source code (C, C++, Java, Verilog) → lexical analyzer → syntax analyzer → semantic analyzer → intermediate code generation → code optimization → code generation → target machine code, with the symbol table and error handler shared by all phases.

Code Optimization
Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code. It includes various transformations on the code, but the semantics of the original program must be preserved during the transformations.

Criteria for applying transformations:
- A transformation should capture most of the potential improvement without an unreasonable amount of effort.
- A transformation must not change the meaning of the original program.
- A transformation should reduce the time and space taken by the object code.

Methods to perform optimization:
- Local optimization: transformation of straight-line code to reduce the time and space it takes.
- Loop optimization: transformation of code within a loop to reduce the time and space it takes.
- Data-flow analysis: propagation of useful relationships from all parts of the program to the places where the information can be used.

Basic Blocks and Flow Graph
Basic blocks are maximal sequences of consecutive three-address instructions such that:
- the flow of control can only enter the basic block through the first instruction in the block (no jumps into the middle of the block);
- the flow of control can leave the block only at the end, without halting or branching except at the end.

Flow graph: a graph representation of the TAC. Nodes in the flow graph represent computations and edges represent the flow of control: the basic blocks become the nodes of the flow graph, whose edges indicate which blocks can follow which other blocks.

Steps to Construct Basic Blocks in TAC
Step 1: Determine the set of leaders; each leader is the first statement of a basic block. Rules for finding leaders:
- The first statement of the TAC is a leader.
- Any statement that immediately follows a conditional or unconditional jump (goto) is a leader.
- Any statement that is the target of a jump is a leader.
Step 2: For each leader, its basic block consists of the instructions up to, but not including, the next leader.

Example
The given sequence of TAC creates a single basic block (its first statement is the only leader):
T1 = a*a
T2 = a*b
T3 = a*T2
T4 = T1+T3
T5 = b*b
T6 = T4+T5

Example: Basic Block and Flow Graph
Consider the given program:
prod = 0
i = 1
do
  prod = prod + a[i]*b[i]
  i = i + 1
while (i
Its TAC begins:
B1: prod = 0
    i = 1
B2: T1 = 4*i
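The leader rules of Step 1 can be sketched as follows, with jump targets written as instruction indices (the (op, target) TAC encoding is illustrative only):

```python
# The leader rules of Step 1 applied to a list of TAC instructions.

def find_leaders(tac):
    leaders = {0}                            # rule: first statement is a leader
    for i, (op, target) in enumerate(tac):
        if op in ("goto", "if"):             # conditional/unconditional jump
            leaders.add(target)              # rule: jump target is a leader
            if i + 1 < len(tac):
                leaders.add(i + 1)           # rule: statement after a jump
    return sorted(leaders)

# prod=0; i=1; <loop body: 3 instructions>; if <cond> goto 2
TAC = [("assign", None), ("assign", None),
       ("assign", None), ("assign", None), ("if", 2)]
print(find_leaders(TAC))   # [0, 2]  ->  B1 = {0, 1}, B2 = {2, 3, 4}
```

On the do-while example above this yields exactly the two blocks on the slide: B1 holding prod=0 and i=1, and B2 holding the loop body that the backward jump targets.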