Lexical Analyzer Minimizing DFA

TidyWildflowerMeadow avatar
TidyWildflowerMeadow
·
·
Download

Start Quiz

Study Flashcards

18 Questions

What is the approach used to recognize the longest possible string in a lexical analyzer?

Lookahead technique

What is the purpose of minimizing a DFA in compiler design?

To reduce the number of states in the DFA

What is the primary function of a lexical analyzer in a compiler?

To recognize tokens in the source code

How does a lexical analyzer typically handle comments in the source code?

It skips a comment and returns the next token

What is the purpose of partitioning in minimizing a DFA?

To merge equivalent states

What is the purpose of lookahead in a lexical analyzer?

To recognize the end of a token

What is the language denoted by a regular expression called as?

Regular set

What is the result of concatenating the empty string to a string s?

The string s remains unchanged

What is the Kleene closure of a language L?

The set of all possible strings that can be formed by concatenating zero or more strings from L

What is the result of the positive closure of a language L?

The set of all possible strings that can be formed by concatenating one or more strings from L

What is the operation that combines two languages L1 and L2 to form a new language that contains all possible concatenations of strings from L1 and L2?

Concatenation

What is the set of all possible identifiers considered in the context of languages?

Set of well-formed identifiers

What is the purpose of the function nullable(n) in lexical analysis?

To check if the empty string is a member of strings generated by the sub-expression rooted by n

What is the set of positions of the last symbols of strings generated by the sub-expression rooted by n called?

lastpos(n)

How are the followpos of each position computed in the syntax tree?

By making one depth-first traversal of the syntax tree

What type of node in the syntax tree has a child node that is a concatenation node?

Star-node

What is the purpose of the function firstpos(n) in lexical analysis?

To find the set of positions of the first symbol of strings generated by the sub-expression rooted by n

What is the two-rule definition for the function followpos?

One rule for concatenation-node and one rule for star-node

Study Notes

Minimizing DFA

  • Given two groups: G1 = {2} and G2 = {1,3}, G2 cannot be partitioned because move(1,a)=2 and move(3,a)=2
  • Minimized DFA: {1,3} and {2}
  • Example 36: groups {1,2} and {3,4}, minimized DFA: {1,2,3} and {4}

Lexical Analyzer

  • Recognizes the longest possible string
  • Issues:
    • What is the end of a token?
    • No character marks the end of a token
    • Lookahead is needed
  • Skipping comments:
    • Normally, comments are not returned as tokens
    • Lexical analyzer skips comments and returns the next token to the parser

Terminology of Languages

  • Alphabet: a finite set of symbols (ASCII characters)
  • String: a finite sequence of symbols on an alphabet
  • Language: a set of strings over some fixed alphabet
  • Operators on Strings:
    • Concatenation: xy represents the concatenation of strings x and y
    • |s| is the length of string s
  • Examples:
    • The set of well-formed C programs is a language
    • The set of all possible identifiers is a language

Operations on Languages

  • Concatenation: L1L2 = {s1s2 | s1 ∈ L1 and s2 ∈ L2}
  • Union: L1 ∪ L2 = {s | s ∈ L1 or s ∈ L2}
  • Exponentiation: L0 = {ε}, L1 = L
  • Kleene Closure: L* = ⋃Li i=0
  • Positive Closure: L+ = ⋃Li i=1
  • Example: L1 = {a,b,c,d}, L2 = {1,2}
    • L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}
    • L1 ∪ L2 = {a,b,c,d,1,2}
    • L13 = all strings with length three using a,b,c,d
    • L1* = all strings using a,b,c,d and the empty string
    • L1+ = doesn't include the empty string

Regular Expressions

  • Used to describe tokens of a programming language
  • A regular expression is built up of simpler regular expressions
  • Each regular expression denotes a language
  • A language denoted by a regular expression is called a regular set
  • lastpos(n) - the set of positions of the last symbols of strings generated by the sub-expression rooted by n
  • nullable(n) - true if the empty string is a member of strings generated by the sub-expression rooted by n

Evaluating firstpos, lastpos, nullable

  • nullable(n) - true if the empty string is a member of strings generated by the sub-expression rooted by n
  • firstpos(n) = {i} if n is a leaf labeled with position i
  • lastpos(n) = {i} if n is a leaf labeled with position i
  • firstpos(c1) ∪ firstpos(c2) if c2 is nullable
  • lastpos(c1) ∪ lastpos(c2) if c2 is nullable
  • firstpos(c) =[firstpos(c1) if nullable(c1) else firstpos(c1)] if c = c1*
  • lastpos(c) = [lastpos(c1) if nullable(c2) else lastpos(c2)] if c = c1c2

Evaluating followpos

  • Two rules define the function followpos:
    1. If n is a concatenation-node with left child c1 and right child c2, and i is a position in lastpos(c1), then all positions in firstpos(c2) are in followpos(i)
    2. If n is a star-node, and i is a position in lastpos(n), then all positions in firstpos(n) are in followpos(i)
  • Example: (a|b)*a#

This quiz focuses on minimizing DFA in lexical analyzer, with examples and explanations of the process. It covers the partitioning of states and the resulting minimized DFA.

Make Your Own Quizzes and Flashcards

Convert your notes into interactive study material.

Get started for free
Use Quizgecko on...
Browser
Browser