02-lecture-notes PDF
Document Details
Uploaded by RestoredAzurite2265
TUM
Tags
Summary
These lecture notes cover topics in theoretical computer science, focusing on the concepts of regular expressions, finite automata, and context-free languages. The notes also include administrative information, like schedule and readings.
Full Transcript
Administrivia Lectures, Mondays, start 9:30 sharp Exercises, 90 minutes every week, starting this week Comprehensive final exam: February 11, 14:00-15:30 Moodle page: more info, lecture notes, exercise sheets, etc. Reading: Chapters 0 and 1 Last time: DFAs,...
Administrivia Lectures, Mondays, start 9:30 sharp Exercises, 90 minutes every week, starting this week Comprehensive final exam: February 11, 14:00-15:30 Moodle page: more info, lecture notes, exercise sheets, etc. Reading: Chapters 0 and 1 Last time: DFAs, NFAs, properties of regular languages Today: more properties of regular languages, Regular Expressions, non-regular languages Textbook and Other Reading Course textbook available online from the TUM Library. Information about accessing TUM ebooks. Recall Finate Automata A finite automaton is a 5-tuple (Q, ⌃, , qs , F ), where Q is a finite set called the states, ⌃ is a finite set called the alphabet, is the transition function, qs 2 Q is the start state, F ✓ Q is the set of accept states. Also recall the two special symbols: the empty string ✏, and the empty language ; Recall Regular Languages The language of an automaton is the set of strings it accepts For now computation simply means determining whether a given string is in the language or not (decision problem) Two variants: deterministic and non-deterministic DFAs and NFAs are equivalent in power A language regular language if some finite automaton recognizes it There are 3 regular operations: union, concatenation and Kleene star Recall Simulations of one or more FAs by another We showed how to simultaneously simulate two machines with one, to prove that regular languages are closed under the union operation. We showed how to simultaneously simulate all the branches of a NFA with just one DFA. Closure under Union Theorem The class of regular languages is closed under the union operation. Proof. We proved this with DFAs; now we redo it with NFAs. Closure under Union Theorem The class of regular languages is closed under the union operation. Proof. Let N1 = (Q1 , ⌃, 1 , q1 , F1 ) recognize A1 , and N2 = (Q2 , ⌃, 2 , q2 , F2 ) recognize A2. Construct N = (Q, ⌃, , q0 , F ) to recognize A1 [ A2. Q = {q0 } [ Q1 [ Q2. start state q0 F = F1 [ F2. 8 > > 1 (q, a) : q 2 Q1 > > < (q, a) : q 2 Q 2 2 (q, a) = > > {q1 , q2 } : q = q0 , a = ✏ > > :; : q = q , a 6= ✏ 0 Closure under Concatenation Recall concatenation: A1 A2 = {xy |x 2 A1 and y 2 A2 }. Theorem The class of regular languages is closed under the concatenation operation. Proof. Closure under Concatenation Theorem The class of regular languages is closed under the concatenation operation. Proof. Let N1 = (Q1 , ⌃, 1 , q1 , F1 ) recognize A1 , and N2 = (Q2 , ⌃, 2 , q2 , F2 ) recognize A2. Construct N = (Q, ⌃, , q1 , F2 ) to recognize A1 A2. Q = Q 1 [ Q2. start state q1 accept states F2 8 > > 1 (q, a) : q 2 Q1 , q 6= F1 > > < (q, a) : q 2 F , a 6= ✏ 1 1 (q, a) = > > 1 (q, a) [ {q2 } : q 2 F1 , a = ✏ > > : (q, a) : q 2 Q 2 2 Closure under Star Recall star: A⇤ = {x1 x2... xk |k 0 and each xi 2 A}. Theorem The class of regular languages is closed under the star operation. Proof. Closure under Star Recall star: A⇤ = {x1 x2... xk |k 0 and each xi 2 A}. Theorem The class of regular languages is closed under the star operation. Proof. Closure under Star Theorem The class of regular languages is closed under the star operation. Proof. Let N1 = (Q1 , ⌃, 1 , q1 , F1 ) recognize A1. Construct N = (Q, ⌃, , q0 , F ) to recognize A⇤1. Q = {q0 } [ Q1 start state q0 F = {q0 } [ F1. 8 > > 1 (q, a) : q 2 Q1 , q 6= F1 > > > > < 1 (q, a) : q 2 F1 , a 6= ✏ (q, a) = 1 (q, a) [ {q1 } : q 2 F1 , a = ✏ > > > > {q1 } : q = q0 , a = ✏ > > : ; : q = q0 , a 6= ✏ Regular Expressions R is a regular expression if R is 1 a for some a in the alphabet ⌃, 2 ✏, 3 ;, 4 (R1 [ R2 ), where R1 and R2 are regular expressions, 5 (R1 R2 ), where R1 and R2 are regular expressions, or 6 (R1⇤ ), where R1 is a regular expression. RE Examples Let ⌃ = {0, 1} 0⇤ 10⇤ = {w |w contains a single 1} ⌃⇤ 1⌃⇤ = {w |w has at least one 1} 0+ 1⇤ = {w |w starts with one or more 0s and ends in zero or more 1s} ⌃⇤ 001⌃⇤ = {w |w contains the string 001 as a substring} (⌃⌃)⇤ = {w |w is a string of even length} (⌃⌃⌃)⇤ = {w |w has length that is a multiple of 3} 01 [ 10 = {01, 10} 0⌃⇤ 0 [ 1⌃⇤ 1 [ 0 [ 1 = {w |w starts and ends with the same symbol} (0 [ ✏)1⇤ = {01⇤ [ 1⇤ } (0 [ ✏)(1 [ ✏) = {✏, 0, 1, 01} 1⇤ ; = ; ;⇤ = {✏} RE Identities R [;=R R ✏=R R [ ✏ may NOT equal R – why? if R = 0, then L(R) = {0}, but L(R [ ✏) = {0, ✏} R ; may not equal R – why? if R = 0, then L(R) = {0}, but L(R ;) = ; Regular expressions are useful tools in the design of compilers Programming language tokens (such as variable names and constants) are usually described by REs a numerical constant that may include a fractional part and/or a sign may be described as a member of the language (+ [ [ ✏)(D + [ D ⇤.D ⇤ [ D ⇤.D + ), where D = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} is the alphabet of decimal digits examples of valid strings are 42, 3.14159, +7., and -.01 RE () FAs Theorem A language is regular if and only if some regular expression describes it. Proof. ( Consider some regular expression R that describes language A. We show how to convert R into an NFA recognizing A. 1 R = a for some a 2 ⌃. Then L(R) = {a}, and the following NFA recognizes L(R) RE () FAs Theorem A language is regular if and only if some regular expression describes it. Proof. ( Consider some regular expression R that describes language A. We show how to convert R into an NFA recognizing A. 1 R = a for some a 2 ⌃. Then L(R) = {a}, and the following NFA recognizes L(R) 2 R = ✏. Then L(R) = {✏}, and the following NFA recognizes L(R) RE () FAs Theorem A language is regular if and only if some regular expression describes it. Proof. ( Consider some regular expression R that describes language A. We show how to convert R into an NFA recognizing A. 1 R = a for some a 2 ⌃. Then L(R) = {a}, and the following NFA recognizes L(R) 2 R = ✏. Then L(R) = {✏}, and the following NFA recognizes L(R) 3 R = ;. Then L(R) = ;, and the following NFA recognizes L(R) RE () FAs Theorem A language is regular if and only if some regular expression describes it. Proof. ( Consider some regular expression R that describes language A. We show how to convert R into an NFA recognizing A. 1 R = a for some a 2 ⌃. Then L(R) = {a}, and the following NFA recognizes L(R) 2 R = ✏. Then L(R) = {✏}, and the following NFA recognizes L(R) 3 R = ;. Then L(R) = ;, and the following NFA recognizes L(R) 4 R = R1 [ R2 5 R = R1 R2 6 R = R1⇤ Example of RE to NFA conversion Let R = (ab [ a)⇤ Example of RE to NFA conversion Let R = (ab [ a)⇤ RE () FAs Theorem A language is regular if and only if some regular expression describes it. Proof. ) Given some regular language A, there exists a DFA that recognizes A. We show how to extract an equivalent regular expression, by means of a generalized nondeterministic finite automaton (GNFA). Example of DFA to RE conversion Start with the given NFA and convert into GNFA: start state has transition arrows to every other state start state has no arrows coming in from any other state there is only a single accept state, and it has arrows coming in from every other state accept state has no arrows going to any other state except for start and accept states, one arrow goes from every state to every other state; also from each state to itself Example of DFA to RE conversion This is easy to do: add new start state with ✏ arrow to the old start state add new accept state with ✏ arrows from old accept states If any arrows have multiple labels (or if there are multiple arrows going between the same two states in the same direction), we replace each with a single arrow whose label is the union of the previous labels. add arrows labeled ; between states that had no arrows Example of DFA to RE conversion Now we remove one state at a time from GNFA while ensuring it still recognizes the same language, until only start and accept states remain: let qrip be the state we remove consider every pair of states (qi , qj ) update (qi , qj ) transition label to account for removal of qrip Example of DFA to RE conversion RE () FAs Theorem A language is regular if and only if some regular expression describes it. Proof. ) Start with an NFA for A and convert into an equivalent GNFA. Remove all states one by one, except the start and accepts states. The expression on the arc from the start to the accept state is the regular expression that describes A. ( We showed this by using the inductive definition of a regular expression (a, ✏, ;, R1 [ R2 , R1 R2 , R ⇤ ) to convert the given regular expression for A into an NFA for A. What Makes a Language Regular Consider the languages L = {0⇤ 1⇤ } L = {01} L = {0011} L = {✏, 01, 0011} L = {0n 1n |n 0} Let’s try to build FAs for them... The Pumping Lemma (PL) for Regular Languages Theorem (PL) If A is a regular language, then there exists a number p (the pumping length) so that if s is any string in A of length at least p, then s may be divided into three pieces, s = xyz, satisfying the following conditions: 1 xy i z 2 A, for each i 0, 2 |y | > 0, and 3 |xy | p. The Pumping Lemma (PL) for Regular Languages Theorem If A is regular, then 9p so that if s 2 A and |s| > p, then s may be divided into three pieces, s = xyz, satisfying the following: 1 xy i z 2 A, for each i 0, 2 |y | > 0, and 3 |xy | p. Proof. Let M = (Q, ⌃, , qs , F ) be a DFA for language A. Then we set p = |Q|. Consider some string s 2 A such that |s| > p. While computing, M must repeat some state (by the pigeonhole principle). Let qx be the first state that is repeated. Then this picture completes the proof: Proving a Language Non-Regular - To prove a language is regular: build a FA or give a RE Proving a Language Non-Regular - To prove a language is regular: build a FA or give a RE - To prove a language is non-regular, we use the PL Proving a Language Non-Regular - To prove a language is regular: build a FA or give a RE - To prove a language is non-regular, we use the PL Claim L = {0n 1n |n 0} is not a regular language Proof. Suppose L is regular. Then the PL applies. Let p be the pumping length. Then consider the string s = 0p 1p and all possible ways to break s down into s = xyz subject to the 3 PL conditions. Clearly, y must contain one or more zeros (from conditions 2 and 3). Then pumping y up or down results in strings that are not in L: xz 2/L (fewer 0s than 1s), xyyz 2 / L (more 0s than 1s), etc. Proving a Language Non-Regular Use PL to prove a language is non-regular. Think of it as a game: You are given the language L You are given a value p You choose a string s longer than p Choose carefully in order to break the claim that s satisfies the 3 conditions of the PL Note that you do not choose how s is broken into 3 parts s = xyz You must argue that no matter how s is broken into 3 parts there is always a contradiction. Choosing s poorly can make your job harder or impossible. Proving a Language Non-Regular Claim L = {0n 1n |n 0} is not a regular language Now, let’s choose a di↵erent string s. Proof. Suppose L is regular. Then the PL applies. Let p be the pumping length. Then consider the string s = 0dp/2e 1dp/2e and all possible ways to break s down into s = xyz subject to the 3 PL conditions. y contains one or more 0s; then xy 2 z contains more 0s than 1s y contains one or more 1s; then xy 2 z contains more 1s than 0s y contains 0s and 1s; then xy 2 z alternates b/n 0s and 1s more than once. Proving a Language Non-Regular Claim L = {ww |w 2 {0, 1}⇤ } is not a regular language Proof. Suppose L is regular. Then the PL applies. Let p be the pumping length. Then consider the string s = 0p 10p 1 and all possible ways to break s down into s = xyz subject to the 3 PL conditions. Because of condition 3, y must contain one or more 0s. Then xy 2 z has more 0s in on the left side of the first 1 than on the right, which means that xy 2 z 2 / L. Again note how the careful choice of s reduces the number of cases we need to consider. Proving a Language Non-Regular Claim L = {0i 1j |i > j} is not a regular language Proof. Suppose L is regular. Then the PL applies. Let p be the pumping length. Then consider the string s = 0p+1 1p and all possible ways to break s down into s = xyz subject to the 3 PL conditions. Because of condition 3, y must contain one or more 0s. What can we say about xy 2 z and its membership in L? Better to consider xy 0 z = xz, as this string has less than or equal number of 0s than 1s, which means that xy 0 z 2 / L. Note that here “pumping up” didn’t work, but “pumping down” does indeed work. The Pumping Lemma (PL) for Regular Languages Observe the PL is not a necessary and sufficient condition for regularity! that is, the pumping lemma is not a complete characterization of regular languages. PL: A regular ) p : s 2 A and |s| > p... The opposite is not necessarily true Context Free Languages Regular languages are characterized by FAs and REs CFLs are characterized by pushdown automata (PDAs) and Context Free Grammars (CFGs) Example of a CFL: L = {0n 1n |n 0} PDAs are machines with memory (stack) Unlike FAs, DPDAs and NPDAs are not equivalent in power CFGs are Type 2 grammars in Chomsky’s hierarchy