Bottom-Up Parsing

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to Lesson

Podcast

Play an AI-generated podcast conversation about this lesson
Download our mobile app to listen on the go
Get App

Questions and Answers

In bottom-up parsing, how is the parse tree constructed?

  • Starting at the root and proceeding towards the leaves.
  • Using a top-down approach with backtracking.
  • Constructing all nodes simultaneously.
  • Starting at the leaves and proceeding towards the root. (correct)

What is the primary goal of shift-reduce parsing?

  • To reduce the input string directly to a set of grammar productions.
  • To shift all input tokens onto the stack before reducing.
  • To resolve shift-reduce conflicts by prioritizing shifts.
  • To construct a parse tree for an input string by beginning at the leaves and working up towards the root. (correct)

What is a 'handle' in the context of shift-reduce parsing?

  • A substring that matches the left side of a production.
  • The start symbol of the grammar.
  • The next input symbol to be processed.
  • A substring that matches the right side of a production. (correct)

Which data structure is commonly used to implement a shift-reduce parser?

<p>Stack (D)</p> Signup and view all the answers

In a shift-reduce parser, what is the purpose of the $ symbol?

<p>To mark the bottom of the stack and the end of the input string. (C)</p> Signup and view all the answers

Which of the following is NOT a possible action in a shift-reduce parser?

<p>Reject (B)</p> Signup and view all the answers

What does the 'Shift' action do in a shift-reduce parser?

<p>Moves the next input symbol onto the top of the stack. (A)</p> Signup and view all the answers

What does the 'Reduce' action do in a shift-reduce parser?

<p>Replaces the handle with a nonterminal. (D)</p> Signup and view all the answers

What causes conflicts during shift-reduce parsing?

<p>The use of ambiguous grammars. (B)</p> Signup and view all the answers

How can conflicts in shift-reduce parsing be resolved?

<p>By rewriting the grammar or making an appropriate choice of action during parsing. (A)</p> Signup and view all the answers

What are the two main types of conflicts that can occur in shift-reduce parsing?

<p>Shift/Reduce and Reduce/Reduce (B)</p> Signup and view all the answers

In a Shift/Reduce conflict, what decision must the parser make?

<p>Whether to shift an input symbol onto the stack or reduce a portion of the stack. (D)</p> Signup and view all the answers

In a Reduce/Reduce conflict, what decision must the parser make?

<p>Whether to reduce using one production rule or reduce using another production rule. (D)</p> Signup and view all the answers

For what does the 'L' stand in LR parsing?

<p>Left to right scanning of input (B)</p> Signup and view all the answers

In LR(k) parsing, what does 'k' represent?

<p>The number of input symbols of lookahead used in making parsing decisions. (D)</p> Signup and view all the answers

What is a key advantage of LR parsers?

<p>They can recognize virtually all programming language constructs for which context-free grammars can be written. (B)</p> Signup and view all the answers

Which of the following is NOT a component of an LR parser?

<p>Lexical Analyzer (D)</p> Signup and view all the answers

Which of the following is the LEAST powerful LR parser?

<p>Simple LR (SLR) (D)</p> Signup and view all the answers

Which of the following LR parsers is the MOST powerful and most expensive?

<p>Canonical LR (A)</p> Signup and view all the answers

Which of the following describes the power and cost of LALR parsers in relation to SLR and Canonical LR parsers?

<p>LALR parsers are intermediate in both power and cost between SLR and Canonical LR parsers. (D)</p> Signup and view all the answers

What is the primary function of the semantic analysis phase in a compiler?

<p>To check the meaning and correctness of the program, including type checking and flow of control. (A)</p> Signup and view all the answers

What is the purpose of "static type checking"?

<p>Checking types during compilation. (A)</p> Signup and view all the answers

Which of the following is an example of static type checking?

<p>Reporting an error if a string is added to an integer. (A)</p> Signup and view all the answers

What is 'code motion' in loop optimization?

<p>Moving code inside a loop that computes the same value each iteration to outside the loop. (A)</p> Signup and view all the answers

Which optimization technique replaces expensive operations, such as multiplication, with cheaper ones, such as addition?

<p>Strength reduction (A)</p> Signup and view all the answers

Identify the correct order of compiler phases:

<p>Semantic Analysis, Intermediate Code Generation, Code Optimization, Code Generation (B)</p> Signup and view all the answers

Flashcards

Bottom-Up Parsing

Construction of parse tree starts at leaves and proceeds towards the root.

Shift-Reduce Parsing

Attempts to construct a parse tree for an input string, starting from leaves to the root.

Handle

A substring that matches the right side of a production.

Shift Action (Parsing)

Place the next input symbol onto the top of the stack.

Signup and view all the flashcards

Reduce Action (Parsing)

Replace the handle with a nonterminal symbol.

Signup and view all the flashcards

Accept Action (Parsing)

Parsing is completed successfully.

Signup and view all the flashcards

Error Action (Parsing)

Syntax error detected, calls error recovery.

Signup and view all the flashcards

Shift-Reduce Parsing Conflicts

Ambiguous grammars leading to parsing conflicts.

Signup and view all the flashcards

Shift/Reduce Conflict

Should shift or reduce?

Signup and view all the flashcards

Reduce/Reduce Conflict

Which production should we reduce with?

Signup and view all the flashcards

LR Parsing

Efficient bottom-up syntax analysis using lookahead.

Signup and view all the flashcards

k in LR(k) Parsing

The number of input symbols of lookahead that are used in making parsing decisions.

Signup and view all the flashcards

Advantage of LR Parsing

LR parsers recognize virtually all programming language constructs.

Signup and view all the flashcards

Simple LR (SLR) Parser

easiest to implement but the least powerful of the three.

Signup and view all the flashcards

Canonical LR Parser

Most powerful, most expensive LR parser.

Signup and view all the flashcards

Lookahead LR (LALR) Parser

Intermediate in power and cost between SLR and Canonical LR.

Signup and view all the flashcards

Input Stage (SLR Parser)

Reads grammar, specifies terminals/nonterminals, numbers productions.

Signup and view all the flashcards

Compute First & Follow Stage

Detects First & Follow sets for each nonterminal.

Signup and view all the flashcards

Construct DFA Stage

SLR-parser knows when to shift and when to reduce.

Signup and view all the flashcards

Construct SLR Table

The SLR-table is a data structure consists of rows and columns equal to the number of states in DFA

Signup and view all the flashcards

Sn (Parsing Table)

Shift into state n

Signup and view all the flashcards

gn(Parsing Table)

Goto State n

Signup and view all the flashcards

rk (Parsing Table)

Reduce by production k

Signup and view all the flashcards

Semantic Analysis

Connects variable definitions to uses + checks expression types.

Signup and view all the flashcards

Static Type Checking

Checks types during compile time.

Signup and view all the flashcards

Type Checks

Compiler reports error when operator applies to incompatible operand.

Signup and view all the flashcards

Flow Of Control Checks

Statements leave constructs must have a place to transfer control.

Signup and view all the flashcards

Uniqueness Checks

Enforces unique object definitions.

Signup and view all the flashcards

Type System

Type + notation + assigning rules for language constructs.

Signup and view all the flashcards

Basic Type

Atomic types (no internal structure): Boolean, Integer, Real, etc.

Signup and view all the flashcards

Study Notes

Bottom-Up Parsing

  • Parsing begins at the leaves of the parse tree and proceeds towards the root
  • It is capable of handling a large class of grammars

Shift-Reduce Parsing

  • Aims to construct a parse tree for an input string by starting at the leaves and moving up to the root
  • Reduces a string to the grammar's start symbol
  • At each reduction step, a substring matching the right side of a production is replaced by the symbol on the left side of that production
  • An example grammar is S -> aABe, A -> Abc | b, B -> d
  • An example input abbcde can be parsed bottom-up

Handle

  • Is a substring that matches the right side of a production

Stack Implementation of Shift-Reduce Parsing

  • Employs a stack for grammar symbols and an input buffer for the string to be parsed
  • '$' marks the bottom of the stack and the end of the input string
  • There are four possible actions: Shift, Reduce, Accept, and Error

Shift

  • The next input symbol gets shifted onto the top of the stack

Reduce

  • The handle is replaced with a nonterminal symbol

Accept

  • The parser declares successful completion of parsing

Error

  • The parser identifies a syntax error and invokes an error recovery routine

Conflicts During Shift-Reduce Parsing

  • Shift-reduce parsing is not suitable for all context-free grammars
  • Ambiguous grammars lead to parsing conflicts
  • Resolution involves rewriting the grammar or making appropriate action choices during parsing
  • Two types of conflicts: Shift/Reduce and Reduce/Reduce

Shift/Reduce Conflicts

  • Must decide whether to shift or reduce, as shown in previous example

Reduce/Reduce Conflicts

  • Must decide which production to reduce with
  • Example productions include: stmt -> id(param), param -> id, expr -> id(expr) | id

LR Parsers

  • An efficient bottom-up syntax analysis technique suitable for a large class of context-free grammars
  • "L" stands for left-to-right scanning of input
  • "R" stands for constructing a rightmost derivation in reverse
  • "k" represents the number of input symbols of lookahead (if omitted, k is assumed to be 1)
  • LR parsing is attractive because LR parsers can recognize virtually all programming language constructs for which context-free grammars can be written
  • The LR parsing method is a general nonbacktracking shift-reduce parsing method
  • The class of grammars parsable by LR methods is a proper superset of those parsable by predictive parsers
  • Consists of an Input, an Output, a Stack, a Driver program, and a Parsing table that has two parts (action and goto)

Types of LR Parsers

  • Simple LR parser (SLR) is the easiest to implement but least powerful
  • Canonical LR parser is the most powerful and most expensive
  • Lookahead LR parser (LALR) is intermediate in power and cost
  • The LALR method works on programming-language grammars and can be implemented efficiently

LR Parsing Algorithm

  • The LR program is the same for all LR parsers
  • Only the parsing table changes from one parser to another

Implementation of SLR parser

  • The SLR-parser is extremely tedious to build by hand, so needs a generator
  • Here are the stages used to build a system for implementing the SLR-parser

Input Stage

  • The grammar is read, and grammar symbols (terminals and nonterminals) are specified
  • Each grammar production must be on one straight line
  • The productions are numbered

Compute First & Follow Stage

  • First & Follow are detected for each nonterminal

Construct DFA Stage

  • Using a deterministic finite automaton (DFA), the SLR-parser determines when to shift and reduce
  • DFA edges are labeled by grammar symbols (terminals & nonterminals)
  • The input begins with S'(root), indicating that it begins with any possible right-hand side of an S-production

State0/State1

  • A production combined with the dot(.) shows a position of the parser
  • For each production, there are three cases to examine for the symbol after the dot:

Symbol is Null

  • If dot has occurred at the end of the right side of a production, there is no new state

Symbol is '$' Sign

  • If the symbol is the '$' sign, then there is no new state

Symbol is a Terminal or Nonterminal

  • Start a new state with current production after the dot has been proceeded one step forward
  • If the symbol that has been occurred after the dot(in new position) is nonterminal such as A, add all possible right hand side of A to a new state
  • Any newly built state must be stored in a buffer and compared with previous states in DFA
  • If there is no similarity, the new state is added to DFA and given a new number

Construct SLR Table Stage

  • The SLR-table is a data structure of rows equal to the number of states in DFA, plus columns equal to grammar symbols and the "$" sign symbol
  • Data structures provide fast treatment and retrieving of information
  • Two subtables are created

Action Table

  • Consists of rows equal to the number of states in DFA, and columns equal to the terminals plus "$" symbol

Goto Table

  • Consists of the same number of rows as the action table, and columns equivalent to nonterminals

Kinds of Actions

  • The elements in the SLR-table are labeled with four kinds of actions:
  • Sn: Shift into state n
  • gn: goto state n
  • rk: reduce by production k
  • a: accept
  • error (denoted by blank entry in the table) To construct this table, the actions on the table's cells must pass to each state in DFA individually

Shift Action and Goto Action

  • The shift & goto action can be labeled according the edge which has been moved from the current state(n) to the new state
  • If the edge was terminal symbol (t), the Cell[n-1,t]= sn
  • If the edge was nonterminal symbol (N), the Cell[n-1,N]= gn
  • If there are production in current state has the form A -> β. (the dot in the end of right hand side, β is any string ), then the action is reduce
  • Reduce means Cell[n-1,f]=rk { f in Follow(A), k is the no. of production}
  • If there are production in current state has the form A -> β.$ {the dot occurred before $ sign, β is any string },then the action is accept
  • Accept means Cell[n-1,$]=a
  • Final step indicates, any free cell in row n-1 means an error action
  • Repeat the steps for each state in DFA

Implement LR Algorithm

  • Suppose the input string is "x+x"
  • After inserting the input string, the LR-program is executed

Semantic Analysis

  • This compiler phase connects variable definitions to their uses and checks for correct expression types
  • Static type checking distinguishes this from dynamic type checking during target program execution
  • This phase maintains symbol tables and their identifiers' types and locations

Type Checks

  • Report an error if an operator is applied to an incompatible operand

Flow of Control Checks

  • Ensure statements causing control flow to transfer have a destination
  • A "break" statement in 'C' must have an enclosing while, for, or switch statement or else an error arises

Uniqueness Checks

  • Objects must be defined exactly once
  • 'Pascal' requires unique declaration of identifiers
  • Ensure that a name is the same in multiple cases
  • 'Ada', a loop or block name must match at the beginning and end

Type System

  • Based on language's syntactic constructs, notation of types, and rules for assigning types
  • When both operands of arithmetic operators "addition", "subtraction", and "multiplication" are of type integer then the result is of type integer
  • When the type of operand is T, the type of result is ' pointer to T '

Classifying Types

  • Classify type into 3 categories
    • Basic types include atomic types with no internal structure like: Boolean, Integer, Real, Char, Subrange, Enumerated, and special types such as "type-error, void "
  • Construct, constructs types from basic or complex types: array, struct, set
  • Complex types like: link list, tree, pointer
  • The Type system is a collection of rules for assigning type expressions to the various parts of a program
  • The type checker implements a type system

Specification of a Simple Type Checker

  • The type checker follows a translation scheme synthesizing an expression's type from its subexpressions
  • Each identifier's type has to be declared before its use
  • The grammar represents a represented by P (nonterminal), a queue of declarations D as well as a single expression E
  • Type checker (translation scheme) produce parts that save identifier types

Type Checking

  • Semantic rules include
    • literal is character
    • num is an integer
  • Lookup( e ) to fetch saved types in ST (symbol table), when identifier "e" appears in an expression
  • Applies (mod) to two subexpressions
    • E.type is an integer when E1.type and E2.type are integers, throw an error if not
  • An array reference checks the following
  • E.type is t when E.type is an array with parameters and the other type is an integer, throw an error if not

Statements

  • the following is true
    • S.type is void if the types are equal, else error
    • S.type is boolean if the types are valid, or an error
    • S.type is void if all S types are valid or void, else error

Intermediate Code Generation (IR)

  • IR is an internal form of a program created by the compiler while translating the program from a H.L.L to L.L.L.(assembly or machine code).
  • The backend of compiler uses IR to generate target code
  • A machine independent IR offers the following benefits
    • Compiler for machine created by combining backend for the new machine into the existing backend
    • Optimization strategy better performed on IR than language program
  • IR becomes a more attractive form of target code

Tow Kinds of Intermediate Representations

  • Syntax tree, postfix notation
  • DAG carries same data as tree, expressed in a compact way (common subexpression identified)
  • Postfix notation linearizes the syntax tree
  • The representations above are as follows
  • The expression and representations

Three Address code

  • The code is carried out as a statement
  • The arithmetic functions utilize X-Y op Z
  • The components in the code are compiler generated temporary

Three Address code Statements

  • There are assignment, arthimetic and logic operators in a binary format
    • Unary operators are expressed also
  • Copy statement are present
  • Unconditional, conditional jumps included with param call and code
  • The code is called recursively
  • The memory, data are managed

How the Code gets Implemented

  • The quadruples, the triples, are coded in a four field record, these values help the code work fast

Code Optimization

  • Compilers produce code as good as from the human hand
  • Accomplishes goals with a proper function with a compiler that improves code functions as well
  • Code optimization structures syntax, simplifies and takes advantage of specific machines, its aim is to
    • Run faster
  • Utilize less space
  • Improve space and run time
  • Optimizer should not alter an output
  • Transformations must accelerate processing
  • Optimizations happen at various positions in the machine
    • High level
  • Middle level
  • Low level
  • Optimizer implementations are the following
    • Implement high level explicit operations in memory
  • Middlecode operates independently with all targets

Basic Blocks

  • The code is in sequence, has instructions

Block setting the following

  • Instruction set start
  • Code is first
  • Jumps operate on targets and include everything
    • There's a block beginning
  • Repeats until the end with instructions

Data Flow Analysis Algorithm

  • Needs compiler optimization and to collect data from across the language
  • In order to optimize a compiler, information about a program as a whole needs to be collected and distributed to each block in the flow graph, DFA provides information about how program execution will manipulate data.
  • DFAs help data and information gathering
    • Definations are used and so forth

The Analysis

  • Definations lead to what follows

  • A block that is not defined in B

  • Assignments are in B then it contains all assignments

    • The set of defs hitting start
  • The definations reach the end

    • The algorithm for calculation
  • Examples

    • Shows the defs in blocks

Loop Information

  • Iteration causes repetition that becomes prime point of optimization
    • This determines codes in languages
  • Blocks, headers, are the prime factors

Code optimization methods

  • The transformation occurs locally when the statements are considered

Local Transformation

  • Code is deleted
  • Expressions in code is restructured
    • Equations are expressed cheaply

Global transformations

  • The data is expressed cheaply
  • The code is deleted
  • Data is expressed as code and is removed from the program, what code can get reused
  • The loop is the key optimization step and decreases in memory and processing. Code gets reused and is reused with all steps

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Quiz Team

Related Documents

More Like This

Use Quizgecko on...
Browser
Browser