Finite State Machines and Context-Free Grammars Lecture Notes PDF
Document Details
Uploaded by Deleted User
2008
Prof. Ganesh Ramakrishnan
Tags
Summary
This document presents lecture notes on Finite State Machines (FSMs) and Context-Free Grammars. The notes cover various aspects of these topics, including different types of FSMs, closure properties, and their relationship to programming languages. The document is useful for understanding the theoretical foundations of computer science.
Full Transcript
Lecture Notes: Finite State Machines and Context Free Grammars Prof. Ganesh Ramakrishnan August 4, 2008 2 Contents 1 Introduction 3 2 Finite State Machines...
Lecture Notes: Finite State Machines and Context Free Grammars Prof. Ganesh Ramakrishnan August 4, 2008 2 Contents 1 Introduction 3 2 Finite State Machines 9 2.1 Deterministic Finite State Machines................ 11 2.2 Closure: A Brief Introduction.................... 15 2.3 Examples of Non Regular Sets................... 19 2.4 Non Determinism........................... 20 2.5 Computational Comparison of Deterministic and Non-deterministic Machines............................... 23 2.6 Conversion From Non-deterministic to Equivalent Deterministic Machine................................ 24 2.7 Closure................................ 27 2.7.1 Closure under Union..................... 31 2.7.2 Cosure under Concatenation................ 32 2.7.3 Closure under Complement................. 33 2.7.4 Closure under Intersection.................. 33 2.7.5 Alternating Finite State Machines............. 35 2.8 Equivalent Representations of Regular Sets............ 36 2.9 Non-regular Sets and Introduction to Pumping Lemma...... 36 2.10 The Pumping Lemma........................ 39 2.11 Linear Grammar........................... 44 2.11.1 From Finite State Machine to Linear Grammar...... 44 2.11.2 From Left Linear Grammar to Finite State Machine... 47 2.12 Traingle of Equivalance for Finite State Machines......... 48 2.13 Mimization of Finite State Machines................ 48 2.13.1 Intuition for Minimization.................. 49 2.13.2 An Efficient Algorithm for Minimization.......... 51 2.13.3 Comutational Complexity.................. 58 2.14 Decision Problems about FSMs................... 59 2.14.1 Membership Question.................... 59 2.14.2 Equivalence of Two Finite State Machines......... 59 2.14.3 Is the language of an FSM Infinite?............ 60 2.14.4 Is Regular Set A contained in regular set B?....... 61 2.15 Moving Forward........................... 61 3 CONTENTS 1 3 Context Free Languages 63 3.1 Context Free Grammars and Derivations.............. 64 3.1.1 Example 1........................... 64 3.1.2 Example 1........................... 65 3.1.3 Example 2........................... 68 3.1.4 Example 3........................... 69 3.2 Techniques for Writing Context Free Grammars.......... 70 3.2.1 Example 4........................... 71 3.2.2 Example 5........................... 72 3.2.3 Example 6........................... 75 3.2.4 Example 7........................... 77 3.2.5 Closure under Intersection and Complement........ 78 3.2.6 Closure under Union..................... 78 3.3 Relationship to Compiling and Programming Languages..... 79 3.4 Normal Forms............................. 80 3.4.1 Chomsky Normal Form................... 80 3.5 Does a CFG Generate any String.................. 87 3.6 Pushdown Machines......................... 88 3.6.1 Example 1........................... 89 3.6.2 Example 2........................... 90 3.6.3 Example 3........................... 92 3.6.4 Simulating a stack with a queue.............. 94 3.6.5 Example 4........................... 94 3.7 Conversion from CFG to PDM................... 95 2 CONTENTS Chapter 1 Introduction Theory of computation (TOC) is perhaps one of the most abstract areas in the computer science curriculum, but one of the most fundamental area any computer scientist can know. The really interesting thing in TOC will neither help you write a program nor build a computer per say. TOC will help you understand what people have thought for the past 50 years about computer science as a science. And it is about 1. what kind of things we can compute mechanically, 2. how fast can we do it and 3. how much space in memory does it take us to do it. And because of that it is called the theory of computation. Like lots of abstract areas, and areas that have a lot of cool thinking and neat stuff in it. Especially in computer science, there are lots of applications that come out of TOC. Things that are pretty much serendipitous. All of compiler design and theory about building compilers and writing programs to translate languages comes from theory of computation. It turns out that some of the models of computation we discuss happen to correspond to programs that help us write compilers. There are lots of other applications as well. When you talk about computer architecture, you model a particular process with a finite state machine. The idea of finite state machine comes from the theory of computation. The idea of string matching in any word processor or any kind of editor - that is a finite state machine. When you do any kind of compiling or representation of a language like XML, you describe it with a grammar. And that grammar is the next level of model of computation. This area talks about different models of computations - different kinds of things you can use to compute stuff. WHat does it mean to compute something? It is simply to abstract away from the specifics of the machine and all the specifics of the problem in the following style. You imagine that all you are dealing with is programs or computations that take an input, the program sits 3 4 CHAPTER 1. INTRODUCTION and thinks and calculates and computes and then says a ‘yes’ or ‘no’. Do I accept this input or do not accept this input. Whatever number crunching is involved in only to decide a yes or no. In order to do that, we think of computing in terms of set or alphabets of elements. Example alphabets are the binary alphabet {0, 1} or even the uniary alphabet {1} - consisting on just one symbol. And we consider strings along that alphabet. An example of strings you might want to compute is A set of binary strongs that end in 0. This is not a very hard thing to compute. Given a string, you just need to see the last symbol in the string. If the string is a 0, your answer is ‘yes’, else it is ‘no’. And it is very easy to write a program that does this for you. There are other things that are harder. Suppose every piece of java code is converted into a binary file. And you are asked to determine if a given binary file is a legitimate piece of java code. Or is it just a junky binary file that represents something else. So the question is Does the given binary file represent a legitimate piece of java code or not? How easy is to check if the given binary strings represents a legal Java program. It must be possible - that is why we have syntax checkers that give error messages if the java code is not legitimate. The program that checks syntax in fact, does something more that say ‘yes’ or ‘no’. It actually spits out the error in the code. And that is what compiler design is largely about. Going one step further, we can identify an even more challenging problem: Identify the set of strings that represent legitimate Java programs that never go into an infinite loop. Is there a program that will look at your Java program and tell you ‘yes’ iff it is legitimate Java code and will never go into an infinite loop and a ‘no’ otherwise? In fact, there isn’t any such program. It is impossible. There is no way to compute this, no matter how hard you try - it is not going to work. Even if someone attempts such a program, it might either give you the wrong answer or it might never give you any answer - it might simply go into an inifite loop. Your program that you write to try and do this does not exist. That is what we will be talking about in the following two chapters. What can you compute? And what can’t you compute? And what kind of models do you need to do this kind of computations? Your own sense of how to do com- putation now might be a programming language - C++, scheme, java, etc.. We are going to abstract away all details and get to the root of what a programming language really is. The most popular way to view a programming language with everything stripped away is called a Turing machine, invented by Alan Turing. The Turing machine is a very simple machine which we will not delve into right now. In fact, we will build up to the Turing machine. But Alan Turing, 5 in the 1936’s, when he invented the Turing Machine (he died in the 1950’s), invented a mathematical abstraction that he thought was a complete represen- tation of how we might do computation. To the extent that, anything you do on a normal computer with a normal programming language could be done on his machine – you could write a program for his machine that does the same thing. The machine is very very abstract and simple. But because of its power, we can prove really interesting things about it. Like the fact that you can never have a program to solve the problem stated above. When Alan Turing tried to abstract away what we mean by computation, it was the first attempt to describe it rigorously. He showed the difference between thinking and logical computation. The word computation got a rigorous meaning only in the 30’s and 40’s of the 20th century. What we are going to do in this tutorial is work our way up from a much lower level. One question to be asked when you define something very abstractly with arms and legs and head, etc. is: what if you cut off one of its arms or legs. Will it be able to do everything it did before but just taking a bit longer? Or is it actually handicapped and will not be able to somethings it used to be able to do. For example, consider again the problem of constructing a program to answer the question Does the given binary file represent a legitimate piece of java code or not? We know that we can write a program to accept such strings. Therefore, we know that we can write a program for a Turing machine that would accept those. What if we cut off the left arm of the Turing machine (metaphorically)? Will it be that we cannot do the above computation any more? Is it that we have cut off actual power. Or is it that it will take machine much longer to do the computation? How much can I chop off from a Turing machine and still be able to do the computation. Those are the kind of questions people ask in the Theory of computation. We will start off from the lowest level of abstraction - the Finite state machine and will build our way up to the Turing machine. And then start exploring problems in the twilight zone which cannot be even computed by a Turing machine. Figure 1.1 gives a blue-print of the area of theory of computation. We will summarize the set of machines we will cover using the picture in Figure 1.1, typically referred to as the ‘Bull’s eye’. The Bull’s eye is more or less the Chomsky hierarchy, which Noam Chomsky presented in the mid 50’s. The simplest kind of machine we are going to consider is one that does not have as much power as programming languages - one that is simpler, called the Finite State Machine (FSM). That is the smallest and simplest level lying the center of the bull’s eye. FSMs can compute can compute certain kinds of sets and decide certain kind of things - like whether there is a zero at the end of a string. But they cannot do things mopre complicated, like figure out whether something is a legal Java program. There are many other interesting things 6 CHAPTER 1. INTRODUCTION Figure 1.1: The Bull’s eye: Blue-print of TOC 7 they cannot do. If you add some power to FSM, you get a class of sets called Context Free Language1 (CFL) that can be identified only by a more powerful machine. Context free language includes the language corresponding to FSMs2 The second circle (from inside) that includes the first one, represents CFLs. All the set of strings that correspond to legal Java programs are inside the CFL circle. Then we bump up another level to Turing machines. Turning machines represent full-level computation. Anything that can be really computed by a computer using any normal programming language can be computed by a Turing machine. And problems in the twilight zone (corresponding to the last ring) are called undecidable. There is not program anywhere that figure out the answer, mechanically, to undecidable problems in general - they are no algorithms to figure out the answers to these problems. The work on FSMs was primarily done in the 1950’s as well as early 1960’s, more than a decade after Alan Turing’s work on Turing Machines. The work on context free grammars was done primarily in the late 1960’s and early 70’s. These topics developed more out of mathematical and intellectual interest. And at the same time there was the serendipitous application in the form of compilers and programming languages. This is a blue-print for the area of theory of computation. As we progress, the blue-print will become more refined. There are sub-circles, overlapping circles, all sorts of relationships between circles. Within the turning machine level itself there are hierarchies. P and N P are both inside the turing machine level - P would be an inner sub-circle while N P would be an outside sub-circle. There is an entire hierarchy of complexity theory that takes place in the turning machine world. We will also see the there are different ways of describing languages. One way is description using machines with different amounts of power. Another way is by specifying grammars. A third way that works only on some levels is by expressions. So there are different ways of describing these classes of languages. In particular, the circle of context free languages can be split into two. by throwing in extra power. By throwing in the same extra power into the FSM and Turing Machine circles, though, we do not get a more expressive language. 1 A set of strings is often called a language. This should not be confused with a programming language. A language is nothing more or less than a set of strings. 2 We will later study the language corresponding to FSMs called the Regular language. 8 CHAPTER 1. INTRODUCTION Chapter 2 Finite State Machines We will get down to some details of theory of computation by first addressing the innermost circle of the bulls eye. While trying to get the ideas accross, we might compromise on rigour so that ideas do not seem to be really more complicated than they are. We will see what a finite state machine is by looking at an example and then pointing out in the example what the real traits are. Let us start with an example - the set of binary1 strings that have an even number of 0’s. We will describe finite state machines by actually implementing one for this problem. Problem 1 Construct an FSM that accepts only strings that have an even num- ber of 0’s. An intuitive definition of the class of finite state machines is “Anything you can solve by remembering a finite amount of information”. If you use that gut instinct rule, you will be almost able to identify whether or not a given problem can be solved by a finite state machine. An FSM has states or memory. We will construct our first FSM by giving a semantic meaning to each state. Imagine that we also have a tape on which characters get streamed. The machine will have a head that will look at one symbol at a time and strictly move left to right across the tape. The movement of the head from one symbol to the next, corresponds to the movement of the machine from one state to another. And the movement from one state to another is based on a transition function of the observed symbol. This is the simplest possible computation you could have. The following figure shows our first finite state machine. The state A will capture strings with an even number of ‘0’s. The state B will capture strings with an odd number of ‘0’s. Before we have seen any characters in the string, we have zero (an even number of) ‘0’s. Therefore, we 1 The alphabets can be very large in general. for example, Java programs are over an alphabet of A... Z, 0... 9, punctuations, etc. But for the purpose of understanding the abstractions to be discussed in the class, we will not need to consider alphabets that have more than two symbols. Two symbols give you kind of an exponential jump over an alphabet of one symbol and after that it is kind of convenience. 9 10 CHAPTER 2. FINITE STATE MACHINES Figure 2.1: Our first finite state machine will start at the state ‘A’. Every finite state machine should have one arrow coming out of each state for each symbol in the alphabet. For binary strings, we should therefore have two arrows coming out of each state. When we see a ‘0’ in state A, we switch over to state B. When we see a ‘1’ at state A, we remain at that state (indicated by a self loop). In fact, if you write the structure and design of finite state machines in this way by (a) writing the states down first without any arrows, modeling the problem semantically by giving information in the states and (b) then only later throwing in the arrows, you have a much easier time doing it. And if you have trouble deciding what information is to be thrown into the states, that is where you need to stop and think, not after putting in arrows which could perhaps get you on the wrong track. The state you could begin with is the start state. In Figure 2.1, the start state A is marked with an incoming arrow −→. The subset of states where the machine ends up when it finishes reading the string and tells us that the string has been accepted is termed as the set of final states. In the above example, state A is a final state. And if the machine ends up in any non-final state, the string is not accepted by the finite state machine. We will represent each such state by putting a double circle round that state. That gives us a complete finite state machine - a machine that accepts all binary strings with an even number of 0’s and rejects every binary string that does not have an even number of 0’s. We could run the FSM on any string to determine its membership to the class for which the FSM was designed. Note that this FSM gives only ‘yes’ and ‘no’ answers. It is hard enough a computation to deal with ‘yes’ and ‘no’ questions for many problems. At the Turing machine level, we will discuss the problem of producing an output. In computer architecture, you will find versions of FSMs that have outputs, called Mealy machines, Moore machines, etc. We will find some such examples in exercises. For the most part, we will not need to discuss such machines in our tutorial. 2.1. DETERMINISTIC FINITE STATE MACHINES 11 2.1 Deterministic Finite State Machines We will now provide a formal definition of deterministic finite state machines. Deterministic Finite State Machine A deterministic finite state machine consists of: 1. A finite set. of states, often denoted Q. 2. A finite set of input symbols, often denoted E. 3. A transition function that takes as arguments a state and an input symbol and returns a state. The transition function will commonly be denoted δ. In our informal graph representation of automata, δ was represented by arcs between states and the labels on the arcs. If q is a state, and a is an input symbol, then δ(q, a) is that state p such that there is an arc labeled a from q to p. 4. A start state, one of the states in Q. 5. A set, of final or accepting states F. The set F is a subset of Q. A de- terministic finite automaton will often be referred to by its acronym: DFA. The most succinct representation of a DFA is a listing of the five components above that is a ”five-tuple”: A = (Q, Σ, δ, q0 , F ) where A is the name of the DFA, Q is its set of states, Σ its in- put symbols, δ its transition function, q0 its start state, and F its set of accepting states. Let us look at a few more examples of FSMs. Problem 2 Construct an FSM that accepts every string with an even number of 0’ and even number of 1’s. Let us design such a machine by first outlining our plan. What should the states of such a machine be? One way to go about it is design four states, each for an even/odd combination of 0’s and 1’s. We will represent the even (e)/odd (o) nature of 0 and 1 in a string as a tuple < x, y >, where x, y ∈ {e, o}. Thus, < e, e > represents an even number of both 0’s and and 1’s. Figure 2.2 shows the corresponding FSM. When the FSM gets a 0 in state < e, e >, it toggles to < o, e >. When it receives a 0 in < o, e >, it toggles back to < e, e >. Similarly, it toggles between < e, e > and < e, o > on receiving the symbol 1, between < e, o > and < o, o > on receiving symbol 0 and between < o, o > and < o, e > on receiving symbol 1. The state < e, e > is the start state as well as final state. Notice that there need not always be a single final state. Suppose, we had to design an FSM that accepts only strings having an even number of 0’s and an even number of 1’s or having an odd number of 0’s and an odd number of 1’s. Such an FSM can be easily designed from the previous one by assigning the status of final state to the < o, o > state in addition to the < e, e > state. This yields the FSM in Figure 2.3. 12 CHAPTER 2. FINITE STATE MACHINES Figure 2.2: Finite state machine that accepts only strings with an even number of 0’s and an even number of 1’s Figure 2.3: Finite state machine that accepts only strings having an even number of 0’s and an even number of 1’s or having an odd number of 0’s and an odd number of 1’s 2.1. DETERMINISTIC FINITE STATE MACHINES 13 Figure 2.4: Finite state machine that accepts only strings having an even number of 0’s and an even number of 1’s or having an odd number of 0’s and an even number of 1’s It is interesting to note that insisting on a single final state takes away power from a deterministic machine2 (it is almost akin to cutting off half the brain of a finite state machine). There is no obvious characterization of the subset of the regular language that has finite state machines with the number of final states restricted to 1. Let us consider another variant of the above problem: Problem 3 Design an FSM that accepts only strings having an even number of 0’s and an even number of 1’s or having an odd number of 0’s and an even number of 1’s. Again, such an FSM can easily leverage the FSM in Figure 2.1 by making state < o, e > a final state in addition to state < e, e >. This yields the FSM in Figure 2.3. Is this the smallest machine? What is the logically equivalent description of the set of strings that have: An even number of 0’s and an even number of 1’s or an odd number of 0’s and an even number of 1’s ? It is simply, the set of strings that have an even number of 1’s! Thus, the FSM in Figure 2.4 is not the smallest FSM. The states < e, e > and < o, e > could be compressed into a single state < −, e > to generate the FSM in Figure 2.5. This example helps us realize us that given a set of strings, there could be a set of more than one FSMs that accept the strings. If this set had cardinality more than 1, there will be a canonical minimal FSM. And this is a very nice thing to have about a machine. Because there is no canonical minimal program for the set of programs you write in your computer architecture course or your programming language course. However, for finite state machines, you have this 2 Exercise. 14 CHAPTER 2. FINITE STATE MACHINES Figure 2.5: Finite state machine that is smaller than the FSM in Figure 2.3 although equivalent. nice property. And there is a very rigorous dynamic programming algorithm that does that minimization (we will talk about that later in the tutorial). Next, we will look at another problem: Problem 4 Design a finite state machine that accepts only binary strings that are divisible by 4. How do we solve this problem in daily practice? One bad way is to convert the binary number to base 10, divide the number in base 10 by 4, observe the remainder and if the remainder happens to be 0, we know that the binary number is divisible by 4. It is like tying one’s hands behind and then trying to tie the shoe. It is much easier to look at the binary number directly and decide divisibility by 4 - in fact any division by a power of 2 in binary is obvious. It is as easy as determining if a number in base 10 is divisible by 100 - you just look at the last two digits and if they both happen to be 0’s, you know that the number is divisible by 100. Similarly, you have to just look at the last two symbols of the binary number to determine its divisibility by 4 - if the last two symbols are 0’s, then you know that the binary number is divisible by 4. Thus, we are looking for an FSM that identifies binary strings ending in 00. We will try to derive some hint from a slightly different problem; it is easier determining an FSM that accepts only binary strings beginning in 00 (Fig- ure 2.6). Problem 5 Design a finite state machine that accepts only binary strings be- ginning with 00. You need four states - one for no zeros (A), another for the first zero (B), the third (accept state) for the second zero (C) and a fourth dead state where you go if you encounter a ‘1’ as one of the first two symbols. To construct an FSM that accepts only binary strings ending in 00, we will start by retaining states A and B above, while slightly changing their semantics; state A captures all strings that do not end in 0, state B captures strings that 2.2. CLOSURE: A BRIEF INTRODUCTION 15 Figure 2.6: Finite state machine that accepts only binary strings beginning in 00. Figure 2.7: Finite state machine that accepts only binary strings divisible by 4. (i.e., ending in 00). end in one 0 and state C captures strings that end in two 0’s. If the FSM encounters a 1 in any of the three states, it goes back to state A. Figure 2.7 shows the FSM that accepts only binary strings that are divisible by 4. Note that state A is an accept state because every empty string (represents 0) is divisible by 4. Also note that since we want the empty string as well as the string with a singleton 0 to be accepted by the FSM, we have made state C the start state3. 2.2 Closure: A Brief Introduction We could have done the design of the last FSM faster if we had known something in advance. Let us say you knew that if you could accept a particular language, you knew how to accept its reverse, that is, any time you knew how to design a machine for a language, we had a mechanical process for designing an FSM that 3 One might think - how will one know to design FSMs for problems harder ones than these? The answer we will give straight up in the beginning. This is a design process. There is no algorithm for developing an FSM for a given description of strings. There is practice involved in designing FSMs! The discovery process could take more time for some problems and less time for some others. 16 CHAPTER 2. FINITE STATE MACHINES Figure 2.8: Finite state machine that accepts only binary strings containing the pattern 110110. accepts only strings that are reverse of strings in the first language. For example, we were able to easily design an FSM for the set of strings that start with 00. Is there a mechanical process for designing an FSM for the set of strings that end in 00? The answer is fortunately yes. In fact there a bunch of operations, called closure operations such that if there is an FSM for a set of strings, there is an FSM for the set of strings that are obtained by applying the operation on each member of the set. Examples of closure operations are (a) string reversal (b) complement (c) union and (d) intersection. The closure operation is a key property that is studied under models of computation, not just because it is interesting mathematically, but because it gives an entire repertoire of tools to determine if a set of strings will be accepted by any model of computation and how to construct such a model. For instance, what if toggle all the final and non-final states of the FSM for problem 5? Whatever used to be not accepted will now be accepted and vice versa. That actually is a good enough truth - FSMs are closed under complement. This is an informal closure proof pertaining to complementation. Some of the big applications of FSMs are in string searching tools. Once the FSM is designed, it could be implemented as a program. There are no obvious ways to do the programming - there are clever and bad ways to do it. There are a host of programs to string matching, the first of which was by Knuth, Morris and Pratt. Next, we will consider a problem in this domain. Problem 6 Design a finite state machine that accepts only binary strings con- taining the pattern 110110. We can have a series of states that represent how much of the string we have seen. When we have seen the entire string, we will put a double circle (representing the accept state). We will use an to represent an empty string (some books instead use λ), which means we have no string at all. State A represents empty strings, state B represents the fact that the FSM has seen a 1, state C a 11 and so on. Figure 2.8 shows the corresponding FSM. Given, this how do we construct an FSM that accepts only those strings that do not contain the string 110110? Problem 7 Design a finite state machine that accepts only binary strings that do not contain the pattern 110110. The way to do this is to flip the accept and non-accept states of the FSM for problem 6. It is very hard to explicitly construct an FSM for problem 7 from 2.2. CLOSURE: A BRIEF INTRODUCTION 17 Figure 2.9: FSM that accepts only strings that have at least two 0’s following every 1. scratch. And it actually happens to contain 6 accept states. If we try giving meanings to each of these accept states, it would take a lot more time! We will keep this example in mind, because when we later add on an ex- tra arm called non-determinism to the above machine, we are going to give it power4. If we give it power, it turns out not to help it all, but just helps us write easier machines. We will need a proof that a machine with the power of non-determinism can always be converted back to something without the power. There is always a way to write some of the fancier machines to something with- out the fancy stuff. Let us do another problem. Problem 8 Design a finite state machine that accepts only binary strings such that every 1 is followed by at least two 0’s. We are doing this example, because in this machine, it is difficult to give semantics to the memory of each state right away. It is much more of a real time, left to right processing machine - it is not a recursive idea. In Figure 2.9, we give a first cut solution to this problem. Notices that states A and D, are for all purposes, identical; state A embodies the semantics of state D which is: the machine has not yet seen a 1 that has not been followed by two 0’s. We are pointing it out because that is the idea of minimizing machines (which we will discuss later). Minimization involves noticing groups of states that actually do the same thing, irrespective of where you start and merging them. Thus merging states A and D, we get he minimized machine as in Figure 2.10. Keep in mind that there is no single finite state machines that alone accepts a set; there are infinite number of machines that will accept a set if at all there is one. There is a unique minimum machine though. We will do one last example, before moving on to the next topic of non- determinism. 4 Exercise 18 CHAPTER 2. FINITE STATE MACHINES Figure 2.10: Minimized FSM that accepts only strings that have at least two 0’s following every 1. Problem 9 Design a finite state machine that accepts only binary strings that are not divisible by 3. Now you really have to do division in some sense on the binary string. Typ- ically, in division, you have to store a lot of information; you have to remember results that are proportional to the size of the string. And that does not seem like a constant! Real division of a longer string will take a larger amount of space and hence cannot be done with a finite state machine. So how do we go about solving this problem? We really do not have to perform division. We just need to figure out whether or not the string is divisible by 3. In order to do that, we just have to figure out what the remainder is (which is not generally taught). This essentially implies that you need as many states as there are remainders. But do you need to convert the binary number to base ten to get the remainder? It turns out that the answer is ‘no’. This is a much harder problem and you might sit on it for a few hours if you do not get the idea. Is the binary number ‘1’ divisible by 3? So if the machine sees a 1 and the input stopped with that, the remainder will be 1. So that is what we will remember. Continuing from there, let us say we see a 0. So what we have seen now is 10, which is 2 in base 10 and is not divisible by 3. Let us not calculate this way. Rather, let us calculate how the remainder changes when we put a 0 at the end of a string which already leaves a remainder 1 when divisible by 3. Putting a 0 at end of a string, the value doubles and so does it remainder! Thus, the remainder of the new string will be 2, which we will store in another state. Let us move on to the third spot. Suppose the next symbol we see is a 1. The new remainder (for 101) is obtained by doubling the previous remainder and adding 1. This yields 2 × 2 + 1 = 5 which implies the new remainder will 2.3. EXAMPLES OF NON REGULAR SETS 19 Figure 2.11: FSM that accepts only strings that are not divisible by 3. be 2 (2 = 5 mod 3). Now let us say the fourth symbol is a 1. The remainder for 1011 can be again determined by only using information about the previous remainder, which happens to be (2 × 2 + 1) mod 3 = 2. Thus 1011 is also not divisible by 35. In Figure 2.11, we implement the above logic in an FSM that will determine when an input binary string is not divisible by 3. There is beautiful symmetry in this machine. We did a whole bunch of examples that are accepted by finite state machines. A set of strings that can be accepted by a finite state machine is called regular. They are called regular because they tend to grow in regular intervals (we will learn more about the regularity property in the following sections). Regular sets are really well understood. There are machines and grammars that describe them. And there are also regular expressions that describe them. You encounter regular expressions very often in web development tools where you need to search for patterns such as specific urls, etc. Thus there are different ways of describing these machines. We will show equivalence between these different representations and nail down regular sets as best as they can be. 2.3 Examples of Non Regular Sets Let us try to come with some simple examples that are not regular. The reason we are touching upon this topic is because we will not be able to appreciate moving up to context free grammars or turing machines, if we are under the impression that finite state machines could do everything. Why have unbounded memory if all you need is finite memory? Why make up a turning machine if finite state machines represent a computer? Finite state machine is actually like a computer with half its brain cut out. What kind of strings cannot be accepted by finite state machines that are easier to describe than just Java programs? Let us look at some famous ones. What we cannot do with finite machines is count! We cannot count to an arbitrary number. In the FSM for problem 9, we counted up to 3. Even in the case of substring matches, we had a finite number of conditions and therefore had a finite number of counts. But determining a machine that accepts only binary strings having an equal number of 0’s and 1’s requires counting and hence cannot 5 You could confirm that the number 1011 in base 10 is 11, which leaves a remainder 2 when divided by 3. 20 CHAPTER 2. FINITE STATE MACHINES Figure 2.12: The strange FSM that accepts only binary strings containing the reverse of pattern 110110. be modeled by a finite state machine. Even a simpler set of strings represented by 1n 0n n ≥ 0 cannot be modeled by a finite state machine6. How do we prove that there is no FSM for a given set of strings? Do we keep trying to come with one till we are exhausted? We need to prove this formally. And there is a really wonderful proof that there are sets that are not acceptable by finite state machines. This is proof technique is called diagonalization and it is used over and over again at the higher levels. The proof technique comes with a very abstract set. We will get to that in great details some time later in this chapter. However, for finite state machines, there is an even easier method of proof that even strings as simple as 1n 0n cannot be accepted by any FSM. This proof will be very constructive7. 2.4 Non Determinism What if we reverse all the arrows of the FSM in Figure 2.8? And switch the roles of the start state and accept state? We observe that there are two outgoing arrows on a symbol ‘0’ for state G. We get a machine in Figure 2.12 that does not even make any sense! And how wrong can it be when it does not even make any sense? This nature of a machine can be actually ascribed semantics, and is called non-determinism. Similarly, where would the start state for the FSM in Figure 2.11 be if we reverse the machine? The start state could be either of the two existing accept states B and C! This brings up another difficulty with reversing machines. We could define a dummy start state D and have transitions from that state to B and C. That is, you could go from the start state to either of states B or C without seeing any symbols. These are called -transitions (or sometimes λ-transitions). They also make a machine non-deterministic, because you do not know what the machine is going to do when the machine is at such a state. The above idea of reversing really begs us to ask the question: What does it mean to have a choice on the same symbol? If we can make some sense of what 6 Exercise: Another example set of strings that cannot have an FSM representation is the set of Fibonacci numbers. The nth Fibonacci number is computed as F (n) = F (n − 1) + F (n − 2). Since determining the nth Fibonacci number entails some computation, these numbers cannot be modeled by any FSM. 7 In general, the proofs in this tutorial are very constructive. They do not involve a lot of difficult mathematics but instead use very straightforward arguments. 2.4. NON DETERMINISM 21 Figure 2.13: The strange FSM that accepts only reverse of binary strings that are not divisible by 3. that means, may be the method actually does work? And it turns out that it does work. We just need to explain what it means to have the same symbols on two outgoing arcs on the same state. This idea again leads to non-determinism. Non-determinism helps us realize that we can compute things faster than we used to realize without it. Non-determinism however does not give any new power; you could take any non-deterministic machine, and convert it back to a deterministic machine without having these funny duplicates. This is a nice thing to know because it lets us use these facilities without taking us out of the universe of finite state machines. It is the equivalent to learning machine language as your first programming language. And you understand program- ming through that, to the extent that you can completely program anything. And then somebody teaches you Scheme. And then you realize that you can do recursion much easier using Scheme. But there was not anything you could not have done with machine language that you can do with Scheme, though it is a lot easier to use. They are equivalent in power, though not equivalent in ease of use. That is the same case here. If we tack on non-determinism to a finite state machine, they have equivalent power, but one is much easier to use and can save us a lot of time. We will now provide a formal definition of non-deterministic finite state machines. Non-deterministic Finite State Machine An NFA is represented essentially like a DFA: A = (Q, Σ, δ, q0 , F ) where: 1. Q is a finite set of states. 2. Σ is a finite set of input symbols. 3. q0 , a member of Q, is the start state. 4. F , a subset of Q, is the set of final (or accepting) states. 22 CHAPTER 2. FINITE STATE MACHINES Figure 2.14: The non-deterministic FSM that accepts only binary strings that contain the pattern 110110. 5. δ, the transition function is a function that takes a state in Q and an input symbol in δ as arguments and returns a subset of Q. Notice that the only difference between an NFA amt a DFA is in the type of value that δ returns: a set of states in the case of an NFA and a single state in the case of a DFA.. As an example, we will design a non-deterministic machine for problem 6 with much ease. We restate problem 6 here for convenience. Problem 10 Design a finite state machine that accepts only binary strings that contain the pattern 110110. Somewhere in this string, if there is an occurrence of 110110, we should accept it (without explicit knowledge of where in the string this pattern occurs). We we will keep reading in the symbols till we get to the beginning of that pattern. When when we do get to that pattern, we will read it using states B through G and accept that string in state G. That is how simple it can become using non-determinism! Figure 2.14 shows the non-deterministic machine for problem 6. We need to rigorously define what this strange machine means and what it really accepts. How do we really interpret what strings coming into the machine should be accepted or rejected? And then we will convince ourselves that we can turn such machines into regular deterministic machines. The definition is: If you look at any string that is given to you as a candidate and figure out some choice through the machine that ends up in a final state, then you accept that state. There can be lots of choices that do not end up in a final state. But if you can find one set of choices that end up in a final state, then we accept the string. For instance, for an input string, 1111011000. What if you tried the following sequence of state transitions: AAAAABCD - you are dead when you reach the 8th symbol. That was a bad choice and in general, you can potentially hit upon many bad choices. But that does not mean that the string should be rejected; ending up in a rejection state on one choice of moves does not mean you reject. Here is one choice of the sequence of state transitions that will accept this string: AABCDEF GG. The only way you reject a string is if none of the choices get you to the final state; if a choice gets you to the final state, you accept the string. If none of the choices get you to the final state, you reject the string. In the above example, 2.5. COMPUTATIONAL COMPARISON OF DETERMINISTIC AND NON-DETERMINISTIC MACHINES23 since there was a transition of states AAAABCD that lead to the final state, the machine can be understood to accept the string 1111011000. In general for any string that contains the pattern 110110, the machine can keep looping around in state A, till the pattern is observed and transition through the sequence of states B through G, when the sequence 110110 is observed. Thereafter, the machine keeps looping in the state G. On the other hand, if any string does not contain the sequence 110110, the machine cannot sneak to the final state G, using any possible sequence of state transitions and hence will not accept the string. 2.5 Computational Comparison of Determinis- tic and Non-deterministic Machines Non-determinism takes a little time to get used to. Lot of things that work with determinism, do not get quite analogous with non-determinism. Here is one. What if we toggled all the states in a non-deterministic finite state machine? That is, we change all final states to non-final states and all non-final states to final states? When we do this in deterministic machines, we get a machine that accepts only the complement of the set that was accepted by the original machine. What does a non-deterministic machine such as in Figure 2.14 accept if you toggle the final and non-final states? It will accept every possible string! Therefore complementing in non-deterministic machines does not work by toggling the states. The only way you could get the complement for a non- deterministic machine is to first convert it into a deterministic machine and then complement it. The reason is that the semantics of a non-deterministic machine is not symmetric; you accept something if there exists a way to get to the final state and do not reject it if there is a way that does not lead to the final state. So it is this asymmetry in a non-deterministic definition that makes this complementing not work. For the example problem 6, we will pictorially compare the computations on the string 1110110 of the deterministic finite state machine in Figure 2.8 with the equivalent non-deterministic finite state machine in Figure 2.14). The com- parison is shown in Figure 2.15. Note that the string has 7 symbols. The finite state machine moves sequentially from one state to another through exactly 7 states till it reaches the accept state. The equivalent computation for the non- deterministic machine can be represented as a tree, where a 2-way branching at each node represents the choice of move that the machine has. The com- putation for the non-deterministic finite state machine looks like a tree that represents all the parallel computations that can be going on at the same time; non-determinism lets you do ’or’s in parallel. The string will be accepted if there is a path somewhere down that ends up in a final state. You do not accept the string when all the paths from the root have no accept state at the bottom. In all this non-deterministic machine need to perform 20 computations. 24 CHAPTER 2. FINITE STATE MACHINES Figure 2.15: Comparison of the number of computations by the deterministic (left) and non-deterministic (right) finite state machines (for problem 6) on the input string 1110110. 2.6 Conversion From Non-deterministic to Equiv- alent Deterministic Machine As another example, consider problem 11. Problem 11 Design a finite state machine that accepts only binary strings that end in 00. We will make a non-deterministic machine for the above problem. And then we will convert it into a deterministic machine by a very mechanical process that we could write a program to do. This will illustrate that any non-deterministic machine can be converted into a deterministic machine and therefore that a non- deterministic machine gives us convenience but no extra power. This machine will be similar to what did for the previous problem; it will have one state (the start state) A that will chew up as many zeros as it wants8. And whenever the non-deterministic machine sees a sequence of two ending zeros, it will land in the accept state C. Note that it is often easy to construct a non-deterministic machine that will accept all strings it is supposed to and also accept some strings that it is non supposed to. One has to be very careful about this point while constructing non-deterministic machines. In the machine in Figure 2.16, since there are no outgoing arrows from state C, we can be rest assured that the machine cannot accept any string that does not end in 009. We will now present a mechanical process for converting the above non- deterministic machine (NDA) to an equivalent deterministic machine (FSM). 8 This is similar to trying them all at the same time and therefore non-determinism is very often referred to as guessing. 9 Leaving out an out-going arrow from a particular state in a non-deterministic machine means that any symbol observed while at that state would lead to a non-accepting state and implies that the sequence of guesses that lead to the state was bad. The symbol corresponds to something like an ‘interrupt’ or an ‘unknown instruction’. 2.6. CONVERSION FROM NON-DETERMINISTIC TO EQUIVALENT DETERMINISTIC MACHINE25 Figure 2.16: Non-deterministic FSM that will accept only strings that end in 00. Figure 2.17: Deterministic FSM that will accept only strings that end in 00 by keeping track of the set of states the equivalent NDA in Figure 2.16 might be after reading any number of symbols. The way to do this is to first label the states so that you have some way to refer to them. The states in the above machine are already labeled. We will construct an FSM that we keeps track of the set of states the NDA might be in after reading any possible string. Let us say we start at state A in the NDA. Where will the NDA be when it sees a 0? It could be in either A or B. We will remember that in a new state in the FSM, denoted A, B. When the NDA gets a symbol 0 in A, B, it could transition to either A or B or C. We will represent this possibility in a new state A, B, C in the FSM. On observing a symbol 1 on A, B, the NDA could transition only to state A. If in A, B or C, the NDS gets a 0, it will go back to one of A, B or C and therefore, the FSM will have a loop on the state A, B, C for the symbol 0. Similarly, on a 1 while in state A, B, C, the FSM will transition to state A. Figure 2.17 is a deterministic machine for problem 11 by keeping track of where the NDA in Figure 2.16 might be after reading any number of symbols, assuming it does not crash. To make the FSM accept exactly and only those strings that are accepted by the NDA, all those states in the FSM that include at least one accept state from the NDA are designated as accept states. This is because of the semantics of acceptance of a string in a non-deterministic machine: If there is any sequence of state transitions that could lead to an accept state in the non-deterministic machine, the string is accepted by the non-deterministic machine. 26 CHAPTER 2. FINITE STATE MACHINES Note the subtle difference between this machine and the machine in Fig- ure 2.7 for problem 4. While the FSM in Figure 2.7 accepted even empty strings, this FSM does not. Let us run through an example string 0110. Running this string through the NDA, the only states you could land up in are A or B. Running the same string through the deterministic machine, the only state the machine will possibly be when it is done is in A, B. We see a correspondence here. The above process of converting a deterministic finite state machine to an equivalent non-deterministic machine can always be done. When we did this process, the number of states remained the same and therefore there was no trade off. We got the advantage of determinism owing to the conversion. There could be a trade off though and the number of states in the deterministic ma- chine could explode. By how much could the number of states explode? This can be answered by examining, what is represented in the states of the deter- ministic machine; the states in the deterministic machine represent all subsets of states that the non-deterministic machine could be in. If the non-deterministic machine has n states, the number of possible subsets are 2n. In the above ex- ample, we had only three subsets from a collection of 23 = 8 subsets which was very modest. In general, the payoff for going from non-determinism to determinism is an exponential growth in the number of states. We do not really care about this explosion if it is finite. But when we talk about machines in the Turing Machine layer, the exponential trade off in states turns into exponential growth in time, though it is the same idea. That is why, we convert non-deterministic algorithms to deterministic algorithms, we end up getting exponential time deterministic algorithms when we had polynomial non-deterministic algorithms to start with. In this chapter, we saw this trade off in its most trivial form. Let us formally prove the equivalence: Every NFA has an equivalent DFA: PROOF: If a language is recognized by an NFA, then we must show the existence of a DFA that also recognizes it. The idea is to convert the NFA into an equivalent DFA that simulates the NFA. Let N = (Q, E, δ, q0 , F ) be the NFA recognizing some language A. We construct a DFA M = (Q0 , E, δ 0 , q00 , F 0 ) recognizing A. Before doing the full construction, let’s first consider the easier case wherein N has no arrows. Later we take the arrows into account. 1. Q0 = P(Q). Every state of M is a set of states of N. Recall that P(Q) is the set of subsets of Q. 2. For R ∈ Q0 and a ∈ Σ let δ 0 (R, a) = {q ∈ Q|q ∈ δ(r, a) f or some r ∈ R}. If R is a state of M , it is also a set of states of N. When M reads a symbol a in state R, it shows where a takes each state in R. Because each state may go to a set of states, we take the union of all these sets. Another way to write this expression is 2.7. CLOSURE 27 [ δ 0 (R, a) = δ(r, a) r∈R. 3. q00 = {q0 }. M starts in the state corresponding to the collection containing just the start state of N. 4. F 0 = {R ∈ Q0 |R contains an accept state of N }. The machine M accepts if one of the possible states that N could be in at this point is an accept state. 2.7 Closure Closure A family of sets is called closed under an operator if the application of the operator on all the elements of a set will yield another set in the same family. Regular sets are closed under some operations. The more are the number of closed operations10 , the easier it will be to determine whether or not a set of strings is regular. We touched upon closure under complementation in a previous section. We will delve on it in some more details in this section. We will also address closure under string reversal as well as under union and intersection of regular sets. Recall problem 8: Problem 8 Design a finite state machine that accepts only binary strings such that every 1 is followed by at least two 0’s. The reason we want to go over this is because it gives another review of an example of a finite state machine. And then, we will try to make up a machine that accepts the reverse of this language. In English, the reverse of this language can be described as: Reverse Language of Problem 8 Set of binary strings such that every 1 that shows up has at least two 0’s preceding it. Instinctively, if we started with the above reverse problem, it would seem a lot harder than the original problem. But that instinct is actually wrong; it is easy to do the reverse problem though one might not be able to figure 10 Exercise: You will find that for every new exam, the instructor makes up a new operation. Other operations are union, intersection, min, max, prefix, suffix, half, log, square root. At one point in the late sixties or early seventies, there was a paper that basically enumerated a list of operations under which finite state machines are closed under and then tried to characterized them. For your own knowledge, finite state machines are pretty much closed under any operation you can think about except a few. The bottom line is that you are pretty safe with finite state machines. 28 CHAPTER 2. FINITE STATE MACHINES Figure 2.18: Minimized FSM that accepts only strings that have at least two 0’s following every 1. that out at first glance. However, we had found the solution to problem 8 very naturally in Figure 2.10. The fact that knowing the FSM for a set helps you figure out the FSM for its reverse makes the second problem a lot easier and mechanical. This is an advantage of understanding the closure properties. The problem of reversing a machine, is a very general idea. But in doing this, we will come up with a non-deterministic machine again, which is a good motivation for non-determinism. Below (in Figure 2.18), we reproduce the FSM for problem 8 from Fig- ure 2.10. For the following pedagogical reason, we will refer to the non-minimized bigger machine (Figure 2.9) instead of the equivalent minimized machine (Fig- ure 2.10) - the example reversal procedure we will do next becomes a little more involved and general if the machine is bigger. We will now go ahead and try to reverse the above machine, using it as a template to show the general procedure for designing a machine that accepts the reverse of a language. It is not straightforward doing this surgery to get a deterministic machine straight up. However, 1. reversing the arcs of the deterministic machine in Figure 2.18 and 2. swapping the start and accept states is a completely general mechanical process for reversal that yields a non- deterministic machine, as in Figure 2.19. Does the presence of two start states give any extra power over the deter- ministic machine? No, and that is what we need to ensure. We will replace the two start states with a single one and enable transitions to these two states from the new start state on empty transitions. Moreover, the vertex Dead with its accompanying transition arcs is its own strongly connected component, i.e., it has no incoming edges from any of the remaining vertices and there is no need to draw it. We modify the non-deterministic machine in Figure 2.19 along these lines to the non-deterministic machine in Figure 2.20. 2.7. CLOSURE 29 Figure 2.19: Non-deterministic machine that accepts only strings that have at least two 0’s preceding every 1. Figure 2.20: Modified non-deterministic machine that accepts only strings that have at least two 0’s preceding every 1. 30 CHAPTER 2. FINITE STATE MACHINES The reversal of a deterministic machine yielded a non-deterministic machine. Mimicking the method from the previous section, we will try to convert the non-deterministic machine in Figure 2.20 to a deterministic machine that is equivalent. We will then try to interpret the deterministic machine. By the mechanics of the reversal process, we will come up with a deterministic machine that actually accepts the reverse language of problem 8. This idea can get very deep when you can actually come up with solutions to problems that you would not have otherwise thought of. There is a really interesting thing about reversing machines and it relates to the fact that some times when you reverse machines, pieces get cut off. Note that by applying the mechanical procedure for reversal to the minimal finite state machine for this problem (as in Figure 2.10 we could have constructed a non-deterministic machine that has fewer states than in Figure 2.20. We will mention here is that there is no notion of a minimal non-deterministic machine as is there with a deterministic machine (where you have a unique minimum). Because, in the case of non-deterministic machines, you could have several equivalent machines that have the same number of states. Next, we will convert the non-deterministic machine in Figure 2.18 into a deterministic machine using the completely mechanical process described in the previous section. The procedure involves keeping track of exactly where the non- deterministic machine might be after reading any sequence of symbols. Where might the machine end up if the first symbol it sees is a 0? Depending on where it chooses to start from (one of states A or D), it will land up in a possible one of any state from the set {A, D, C}. When it sees a starting symbol of 1, there is none of the states were it might end up. Or in other words, on seeing a starting symbol 1, the machine will end up in one of the states in the empty set . This empty set is what corresponds to the dead state φ. The only place to continue exploration from is {A, C, D}. Thus, keeping track of the possible states of the non-deterministic machine, we can come up with the equivalent deterministic machine in Figure 2.21. Since A is an accept state, the state corresponding to the set {A, D, C} will also be an accept state. This is very important to realize; the deterministic machine should accept a string whenever the non-deterministic machine could be in one or more final states on seeing the same string. Along similar lines of reasoning, the start state should also be an accept state. One could also associate semantics with each of the states in the deterministic machine in Figure 2.21: (a) State {A, C, D} represents that one 0 has been seen, (b) State {A, B, C, D} represents that two 0’s have been seen. Here is an interesting folklore about theoretical computer science that deals with the reversal technique11. If you want to minimize a minimal finite state machine, reverse the machine, convert the non-deterministic machine to a de- terministic machine, reverse it again and finally convert the non-deterministic machine again into a deterministic machine; the final machine is equivalent to the machine we started with, it is deterministic and is always minimal! As far 11 Exercise 2.7. CLOSURE 31 Figure 2.21: Deterministic machine that accepts only strings that have at least two 0’s preceding every 1 and is equivalent to the non-deterministic machine in Figure 2.20. as it is known, this fact is not published anywhere. However, you have to be careful to chop off all the disconnected components at every step of the con- version process. Reversal can introduce the loss of dead states as well as other things. Characterizing exactly what it is that you lose, you end up losing states that are identical. Though this is not a very efficient process for minimization, it always works nevertheless! There are polynomial time algorithms for mini- mization that we will deal with in the remainder of this chapter and that will give us a more efficient recipe for minimizing deterministic machines. 2.7.1 Closure under Union The proof for closure under reversal is a tough proof that lead us into all sorts of alleyways. We will now do some that are easier (closure under complementation was already covered and was quite easy). Suppose we are given two sets of strings Set1 1 Set of strings having even number of 0’s. Set2 Set of strings containing the sequence 101. Both the above sets are regular sets and have finite state machines. You could easily draw them and let us say they are F SM1 and F SM2 for Set1 and Set2 respectively12. F SM1 should have 2 states and that for the latter should have 3 states. What if we want a finite state machine F SM3 for the following set, Set3 ? Set3 Set of strings that either have an even number of 0’s or contain the se- quence 101. 12 Exercise 32 CHAPTER 2. FINITE STATE MACHINES Figure 2.22: Non-deterministic machine that accepts the union (Set3 ) of the set of strings (Set1 and Set2 ) by utilizing the finite state machines F SM1 and F SM2 as black boxes. If you try to do this from scratch, you need to keep track of two things simul- taneously. And you can get confused. But if you raise your level of abstraction a little bit and think of Set3 as a union of Set1 and Set2 , you can conjure up the non-deterministic finite state machine F SM3 mechanically without caring for what F SM1 and F SM2 actually look like as in Figure 2.22. One only needs to know what the initial and final states of F SM1 and F SM2 are to accomplish this. When the machine is done with any of the two individual finite state ma- chines, it jumps to the final state of the combined machne (concentric circles situated outside the yellow boxes). Converting this to deterministic machine might make the resulting machine look a little ugly. If you went through the trouble of converting this non- deterministic machine to a deterministic machine, you will do exactly what is called the product13. All that product does is pair states together from the two machines in every possible way. But if you take the simple mechanical route above, you need not bother about the product of two machines! There is one more power of non-deterministism. We can see that union is a closed operation without dealing with any complicated arguments, such as the fact that non determinism is equivalent to determinism, etc. 2.7.2 Cosure under Concatenation Consider the problem of constructing a finite state machine that accepts only the following set of strings: Set4 Set of strings such that each string is a concatenation of a string from Set1 and a string from Set2. 13 Exercise: Tutorial 2.7. CLOSURE 33 That is, we require an FSM that accepts only those strings that have a prefix with an even number of 0’s and a suffix containing the sequence 101. The FSM can be constructed easily as follows: Construct the FSMs F SM1 and F SM2 for Set1 and Set2 respectively. Connect all the accept states of F SM1 with an move to the start states of F SM2. The move is very powerful; it lets you stay in F SM1 as long as required and then jump from the accept state of F SM1 , looking for a string in Set2. The accept state of the ‘concatenation’ machine will be the accept state of F SM2. If you get rid of the non-determinism, the machine can become really complicated14. 2.7.3 Closure under Complement Complementat is very straightforward, in which you just toggle all the final and non-final states. This was already discussed earlier. 2.7.4 Closure under Intersection For any sets A and B, let us recollect the DeMorgan’s law: A∩B =A∪B The law is all about boolean algebra. The key thing is if we have two machines, one for A and another for B, we can get a machine that does their intersection by a simple application of the DeMorgan’s law. 1. We can complement the FSMs for A and B (since we already learnt how to complement deterministic machines) 2. then union the resulting two FSMs to get a non-deterministic machine (we just did union!) 3. convert the non-determinstic machine into a deterministic machine (which we studied earlier) 4. and finally completement the deterministic machine. Note that step 3 is required because we learnt to complement only deter- minstic machines. There is more work for intersection, but it can definitely be done. Thus, regular sets are closed under intersection. We should know that for any language, we cannot have closure under just two out of the three properties of union, intersection and complement, by virtue of the DeMorgan’s law. Thus, if the union and intersection of two sets in the language belongs to the same family, so will complement and so on. Either the language is closed under all of them or just one of them. And this is going to happen in the levels above FSMs. As mentioned earlier, FSMs are closed under most operations. 14 Exercise: construct the non-deterministic and deterministic machines for this problem 34 CHAPTER 2. FINITE STATE MACHINES Figure 2.23: Deterministic machine that accepts the set Set1 of strings. Figure 2.24: Deterministic machine that accepts the set Set2 of strings. The above method for intersection is not the nicest way to execute it though. Any time we have to move from non-determinism to determinism, there is an exponential potential. We will see a better and direct way to perform intersec- tion15 using the notion of product of FSMs16. This will reinforce the notion of non-determinism one more time. Figures 2.23 and 2.24 show two FSMs. We will keep track of what both these machines are doing as they read symbols. When they start, these machines are in the pair of states, < A, X >. Let us forget about final states for now and simulate the condition of the two machines at any point of time. On encountering a 0, the machines transition to the pair < B, X > and on a 1, they transition to < A, Y >. From < B, X >, on a 1, it will transition back to < A, X > and so on. When you finish this way, you get a deterministic machine. How do we interpret this to do union or intersection in a way that is very different from how we did before. The above procedure keeps track of where both the machines are. If you want a machine that is a union, that is, it accepts any string that is either accepted by machine 1 or 2, then the final state of the resultant machine should be any state that has either a final state from the first machine or a final state from the second machine (Figure 2.25). But if you wanted the intersection of the two, the final state of the resultant machine should be all the states whose ‘representation as a pair’ corresponds to a pair of accept states in the two respective machines (Figure 2.26). How big might this get? Every new state in the new machine will correspond to a pair of states in the old machine. If there are M states in the first (here 2) 15 Tutorialand Exercise 16 Pedagogicaltools for Theoretical Computer Science at Duke University developed with NSF project. You can download the tools and play around. 2.7. CLOSURE 35 Figure 2.25: Deterministic machine that accepts the union of Set1 and Set2 of strings. Figure 2.26: Deterministic machine that accepts the intersection of Set1 and Set2 of strings. and N in the second (4 in this case), there will be M × N (here 8) states in the worst case with the resultant machine. Note that if you build a non-deterministic machine for the union or intersection operators using the technqiues discussed earlier, the number of states will most probably explode exponentially when you convert the resultant (from union or intersection) non-deterministic machine to a deterministic machine. The key is in noticing that while performing intersection or unions, the subsets always pairs. Figure 2.26 shows the resultant machine. 2.7.5 Alternating Finite State Machines A salient feature of non-determinism is that it mixes well with ORs (union) and not with ANDs (intersection). And this going to come all the way up in Turning machines. Recall that in non-determinism, you could have two outgoing arcs from a single state on the same symbol. And we defined the semantics as: you accept if there is some way to get to the final state. In other words, every state is an OR state. This definition makes non-determinism more ameanable to ORs. If we define it in terms of getting to a final state from all the paths (by defining every state as an AND state), the machine will bear more natural relationship with ANDs. People do definte finite state machines with ”AND” states. Some times people mix the two nodes; some states are labeled OR and some are labeled AND. These machines are called alternating finite state machines. And you define whether the machine accept a language in the following terms: for any 36 CHAPTER 2. FINITE STATE MACHINES string that goes through an OR state, there should atleast be one path that leads to a final state and for any string that goes through an AND state, all the paths passing through that state should lead to the accept state. And it turns out that these alternating machines are no more powerful than the deterministic finite state machines. This alternating idea goes up to Turning machines about which we may talk much later in the tutorial. 2.8 Equivalent Representations of Regular Sets To summarise the discussion so far, we discovered that non-deterministic finite state machines can be converted into deterministic finite state machines. Either of them are appropriate ways of looking at the set of finite state machines. We will sketch a bigger picture of where we are heading for the rest of this chapter. In web programming and pattern matching, we see regular expression very often. Regular expression is another way of defining the set that finite state machines can accept. This is not obvious. We will see how any deterministic finite machine can be converted into a regular expression. But this does not complete the equivalences. We will complete the triangle by showing that any regular expression can be converted into a non-deterministic machine. These are three different windows to the same picture. And in computer science you desparately want different views to the same thing. Depending on the requirements, one of the alternatives will seem easier. And more the number of tools, the easier it will be to cater to a requirement. But there is a fourth thing to these three, called regular grammars, which are sometimes also called linear grammars. These are also equivalent to all the three other representations. We will fit regular grammars into the picture by showing their equivalence to deterministic machines. Thus, we will be able to look at finite state machines from all the different possible viewpoints. There is no such thing as an analog for regular expressions as we go up the hierarchy - it simply disappears. But there is such a thing as non-determinism and their grammar as well as machine analogs. Grammar and machine analogs rise all the way up the hierarchy up to the Turning machine. And this grammar and machine parallel all the way up to Turning machines is called the Chomsky hierarchy. The interesting thing is that grammars and machines do not look alike but they always come in pairs! 2.9 Non-regular Sets and Introduction to Pump- ing Lemma What about characterization of the set of strings that cannot be modeled by a finite state machine? Fibonacci numbers in binary cannot be acceptable by a fi- nite state machine. Fibonacci numbers form a sequence defined by the following recurrence relation: 2.9. NON-REGULAR SETS AND INTRODUCTION TO PUMPING LEMMA37 Figure 2.27: Equivalent representations of regular sets. 38 CHAPTER 2. FINITE STATE MACHINES F (n) := 0 if n = 0 1 if n = 1 (2.1) F (n − 1) + F (n − 2) if n > 1 Similarly, strings containing an equal number of 0’s and 1’s cannot be ac- cepted by a finite state machine. Anything that requires counting and arith- matic with more than finite storage cannot be captured using FSMs. Then the question is what can be captured by finite state machines? Definitely, problems for which you can write regular expressions can be modeled using finite state machines. It will also be nice to show an example which cannot be accepted by a finite state machine. Proving you cannot do something is much harder than proving that you can do it. Because potentially we will require an infinite number of trails. Hence, to do this efficiently, you will need a trick. Consider the set17 0n 1n |n ≥ 1. This set includes the empty string , 01, 0011, 000111, etc. Our claim is that there is no finite state machines that accepts this set. We can try to make one but keep in mind that the FSM should not include strings it is not supposed to accept. Remember that in general, the more an FSM includes, the more it is likely to make a mistake. You will find that you will have to make the finite state machine more and more powerful in order to accept this language. And what do you get when you make an FSM more powerful by allowing an infinite number of states (by giving some external data structure to work with)? If the data structure happens to be an array, you get a turing machine. If the data structure is 2 stacks, you again get a turing machine! If you give it a single stack, you get a push down automaton. Can we systematically convince ourselves that there can be no finite state machine that accepts the set 0n 1n |n ≥ 1?18. Let us say some one (Adversary) does come up with a machine for this problem. Then we will refute this claim through a simple dialogue: Question: Tell me the number of states in your machine. Adversary: The machine has 24 (or any other fixed number) of states. Question: Ok do me a favour. Take the string 024 124 and run it through your machine. You would admit that since there are 24 0’s here, somewhere in those 24 symbols, you have to encounter the same state twice. This is because when you start from the initial state and keep going to new states, you have only 23 moves before you encounter a state for the second time. Can you tell me how many zeros got read before the loop (path between the first and second encounters of the state). Adversary: There were 17 zeroes before the loop, 3 zeroes within the loop (note that the loop could be just one 0) and 4 zeroes after the loop. 17 Inthis tutorial, exponentiation will almost always mean repetition. 18 If a student goes home and spends all night trying to build a machine for this problem and informs his friends that he has obtained a machine for this problem, he is probably deluded because of lack of sleep! 2.10. THE PUMPING LEMMA 39 Question: So let us split the string as 017 03 04 124. There are three parts to the computation of the string; the part before the loop, the loop itself and the part after the loop. Now that you have admitted, try this string on your finite state machine: 017 03 03 04 124. What will your machine do? It is going to do the same thing it did before on the first 17 symbols, loop on the next 3 just like it did earlier and then again loop on the next 3 (since it sees the exact string again) and thereafter continue with the earlier path. If the earlier string leads to the final state, the new string should also lead to a final state. What is wrong with this? You were supposed to accept only those strings that have an equal number of 0’s and 1’s. You started out with 024 124. And now I have convinced you without even looking at your machine that the machine has to accept the string 017+3+3+4 124 that is 027 124. Therefore your machine is bogus - not because it does not accept all strings 0n 1n but because it accepts more than it is supposed to. Note that the dialogue is important - the number of states are asked apriori and cannot be changed when faced against the wall. The argument here forces somebody to come up with a hypothetical machine and in a dialogue forces them to admit: Oh! I might have accepted all the strings that you want. But you seem to have convinced me that the machine accepts more than I really want. And this idea is the heart of something called the puming lemma which we will discuss next. It is called so because here you pump up the loop. In the same vein that regular sets are termed so because you want to pump up substrings at regular intervals and obtain other members of the set. Regular sets string out at very linear intervals. Therefore anything that grows faster than linear is 2 never regular. Following this logic, we can guess that 0n is not regular because n2 is not linear. This pumping lemma is also referred to as diagonlization and is a jack ham- mer of a tool that always can spit out a set outside of its collection. We could apply diagonalization on FSMs and discover that there is a set of binary strings that represent FSMs that do not accept themselves. And that set could not possibly be accepted by any FSM19. 2.10 The Pumping Lemma Pumping lemma is used to prove that a set is not accepted by any finite state machine; that the set exists outside the horizon of finite state machines. And it works in the following way. It turns out that if a set is accepted by a finite state machine, it should have the property that we should be able to pump it up somehow. In what follows, we will get more specific about this point. Any set that does not have the pumping property, it could not have come from a finite state machine. This lemma is usually written down in the forward way, that is, regular set⇒pumping property. Pumping Lemma: Let L be a regular set, and let F SM be the corrresponding 19 This idea is akin to the case of a barber who can shave everybody except himself 40 CHAPTER 2. FINITE STATE MACHINES finite state machine with number of states given by |F SM |. The pumping lemma states that ∀z ∈ L where |z| (the number of symbols in z) is greater than or equal to |F SM |, ∃v, w, x such that |vw| ≤ |F SM | and |w| ≥ 1, and ∀i ≥ 0, vwi x is also20 in L. Note that exponentiation here is simply repeated concatenation. It is much harder to understand the lemma this way than through the dialogue form dis- cussed in the previous section. In words, the lemma states that there exists a way to split z into three parts where the first part v is the sequence of symbols that occur before the first loop, w is the substring that corresponds to the first loop (the first loop cannot get past the first |F SM | symbols)21 and x is the rest of the string such that repeating w any number of times yields a string in the regular set L. In practice, the lemma is used in its equivalent contrapositive form: If you present a set, and if it does not have the pumping property, we can conclude based on the pumping lemma that it cannot be a regular set. That is, ¬pumping property⇒¬regular set. The contrapositive can be precisely stated as follows. Recall that pushing a not sign into an existential quantifier yields a universal quantifier. Contrapositive of Pumping Lemma: ∀n which equals the number of states in any machine for a set L, if the following conditions hold, L can not be a regular set. The conditions are: ∃z ∈ L, |z| ≥ n, such that ∀v, w, x with z = vwx, and |vw| ≤ n, |w| ≥ 1, ∃i ≥ 0 such that vwi x ∈ / L. In other words, no matter how you split up z into three pieces, the prover does not get to choose where to split, but the adversary gets to choose the split. In the proof, we have to be able to make our argument based on the fact that the loop might appear anywhere, and the adversary has to tell us where the split is. We need to corner the adversary so that he do not have too many cases to choose from22. The pumping lemma is used in the above form to prove that a set is not regular; any set that cannot be pumped up at regular intervals is not a regular set. Note that there are non-regular sets that have the pumping property. Thus, one cannot always use the pumping lemma to prove that a set is not regular. It is not enough to examine the pumping property of a set to prove that it is regular. An if and only if condition exists only for regular sets that have an alphabet of a single symbol (say 0); there is a linear characterization for all regular sets with a single symbol alphabet23. The characterization is that 20 i = 0 means you avoid the loop altogether. There are occassional examples where the pumping number i you use is 0. 21 There are versions of this lemma statement that do not insist on w corresponding to the first loop 22 It is like a boxing fight where you want to give the opponent the least maneuverability possible. The opponents represent the universal quantifiers while the boxer in first person represents the existential quantifier. 23 Exercise: Extra Credit Problem. 2.10. THE PUMPING LEMMA 41 each string should look like 0mx+b , where x is a variable and m and b are real constants. Example 1: Palindromes We want to show that the set of palindromes, strings that read the same from both left to right and right to left, does not satisfy the pumping property. We will follow the dialogue as illustrated earlier. Question: Tell me the number of states in your machine. Adversary: The machine has k states (for any k). Question: Consider the string 0k 10k. This choice nails the adversary be- cause the only choice of vw is 0p where p ≤ k. For any choice v = 0p−p1 , w = 0p1 , where w is a loop and p1 ≥ 1, vwi x = 0(p−p1 ) 0(i×p1 ) 0(k−p) 10k is not a palindrome24. Thus, by the contrapositive of the pumping lemma, the set of palindromes is not regular. Example 2: Quadratic Strings Consider the set of strings over the single symbol alphabet {0}, given by L = 2 {0k |k ≥ 0}25. We will show that this set is not regular. Question: Tell me the number of states in your machine. Adversary: The machine has n states (for any n ≥ 1). 2 Question: Consider the string 0n , which is in the language L and which is 2 2 longer than n symbols (|0n | ≥ n). Let 0n = vwx, where |vw| ≤ n and |w| = m such that n ≥ m ≥ 1. The value of m is determined by the substring of vw that contains the loop. We need to pick an i and pump it up and our choice will be 2. 2 2 2 vw2 x = 0n +m. Claim: n2 + m is not a square, or in other words, 0n +m 6= 0k for any k. We will prove this by proving that n2 < n2 + m < (n + 1)2. That is n2 < n2 + m < n2 + 2n + 1, or 0 < m < 2n + 1, which is true because 0 < m ≤ n. Example 3: Strings of composite length Consider the langauge of strings L = {0n |n is a composite number}. The language actually obeys the pumping lemma and therefore we will never be able to prove the contrapositive of the pumping lemma, even though the language is actually not regular. So this is an example of a language that is not regular and yet obeys the pumping lemma. You can however use the pumping property to prove that the language L0 = {0p |p is prime}, is not regular26. You can next note that the L and L0 are complements of each other. That is the idea k k 24 Note that 0d 2 e 10d 2 e is a bad choice for trying to prove that the set of palindromes does k k not satisfy the pumping lemma. Because a choice of v = 0d 2 e , w = 1 and x = 0d 2 e , respects k k |vw| ≤ k, |w| ≥ 1 and implies that all strings 0d 2 e 1i 0d 2 e are palindromes (that is they belong to L), ∀i ≥ 0. 25 The equivalent of identity in string concatenation operations is the empty string = 00. 26 Exercise 42 CHAPTER 2. FINITE STATE MACHINES of closure. We know that the complement of a regular set is regular. Since the complement of L is not regular, we can infer that that set L we are dealing with is also not regular. So we have used proof by contradiction using the closure property. This illustrates another use of the closure property. Example 4: Strings with equal number of 0’s and 1’s Let us consider the language L consisting of strings that have an equal number of 0’s and 1’s. We note that {0n 1n |0 ≤ n} = L ∩ {0n 1m |0 ≤ m, 0 ≤ n}. Since regular sets are closed under intersection and since we know that {0n 1m |0 ≤ m, 0 ≤ n} is regular while {0n 1n |0 ≤ n} is not regular (we proved that as our first application of pumping lemma), the set L cannot be regular. For, if the set L were regular, its intersection with another regular set {0n 1m |0 ≤ m, 0 ≤ n} should have been regular, which is not the case. Thus, we used a combination of pumping lemma and closure under intersection along with the technique of proof by contradiction to prove that L is not regular. Example 5: Strings that represent legitimate FSMs We could represent a finite state machine as a binary string as follows: 0n 10(state[1,0]) 10(state[1,1]) 1... 10(state[n,0]) 0(state[n,1]) 110(acceptState) 1... 10(acceptState[k]) where, n is the number of states in the machine, k is the number of ac- cept states, 1 is the default accept state, state[i, g] returns the id of the state that the FSM goes to on a transition from state i on seeing the symbol g and acceptState[u] returns the id of the uth accept state. For instance, the string representation for the FSM in Figure 2.23 is 001001010100110 So the above string represents a legal finite state machine. But does 03 10101 represent a legal finite state machine? No, because it is supposed to have three states, but does not specify what the transitions for the second and third states should be. Nor does it specify the accept states. There are lots of strings that are legal FSMs and many that are not. Let LF SM = {M | M is a legitimate F SM }. Can we figure out using a program whether a given string belongs to LF SM ? Can we have an FSM that does the job27 ? In other words, does there exist an FSM that recognizes one of its own kind (another FSM)? As a mundane example, a human being can recognize which of the entities in a room are actually humans28. 27 This is different from a previous problem which addressed if an FSM recognizes itself or not. Here we are concerned with the problem of determining whether a FSM can recognize another legal FSM. 28 Or a sane can identify who in a class room are sane and who are not. 2.10. THE PUMPING LEMMA 43 The answer is that the set LF SM is not regular. Can we prove this formally? We will prove it again on the strenght of the contraposition of the pumping lemma in the form of a dialogue. Question: Tell me the number of states in your machine that accepts LF SM. Adversary: The machine has n states (for any n ≥ 1). Question: Consider the string 0n 101010 | {z... 10} 11. You should split the n pairs of 01 string into three parts vwx. No matter what you do, the first v will be stuck in the first n symbols, i.e., in 0n. Suppose v = 0p , 0 ≤ p < n, w = 0q , 1 ≤ q ≤ n − p. Then pumping up w by a factor m, we get the string vwm x = 0p 0(mq) 0(n−p−q) 101010 | {z... 10} 11 which is 0 (n+(m−1)q) 101010 | {z... 10} 11. This string n pairs of 01 n pairs of 01 is not a legal FSM because the string is prefixed by (n + (m − 1)q) 0’s indicating n + (m − 1)q states ∀m ≥ 1 and ∃q ≥ 1, whereas, it has only n pairs of ‘10’s following. So this encoding of FSMs is too hard for FSMS to accept. So it is not surprising that it is hard to construct a FSMs that accepts itself. There is however the Turing machine that could do all this. The higher levels can not only understand and recognize FSMs but also simulate them, answer questions about them. The higher levels may not be smart enough to answer questions about their own kind though. Turing machines are powerful enough to recognize their own kind. That is what compilers are. What they are not power enough to do is to recognize any interesting properties about automata of their own kind. They cannot, for example say that ‘Here is a Java program that never infinite loops’ or ‘Here is another Java program that accepts exactly 5 inputs’ or ‘Here is another Java program that does not accept anything’. The compiler could simulate the program but the program might run forever and we might never know the answer. If the answer is ‘yes’, we obtain it after some time. But if the answer is supposed to be a ‘no’, we will wait forever. Example 6: An alternative encoding of legitimate FSMs There is an encoding of FSMs that makes them recognizable by themselves. Every FSM has a binary number associated with it. Let us order all FSMs according to size. The first one is the smallest one, the one with a single state. And we will re-label all FSMs such that the smallest one is called 0, the next will be 1, the next will be 10 and so one. Thus, every single FSM has a binary number associated with it. And every single binary number (0 + 1)∗ is taken care of has an associated FSM. In such an event, it is easy to develop an FSM that accepts them all; a single state that loops back on 0 as well as 1 and is also the accept state. Thus, the discussion in example 5 entirely depends on the encoding scheme. It is normal to construct encodings so that every single encoding means some- thing. As an example, suppose we map every string that does not correspond to an encoding of FSMs discussed on example 5 to an FSM that does not have any accept states (that is it accepts the empty set), then every binary string 44 CHAPTER 2. FINITE STATE MACHINES Figure 2.28: Finite state machine for L(odd 10 s). will again correspond to a legitimate FSM. That is every single binary string will have some semantics as an FSM. 2.11 Linear Grammar This is a self-contained, short topic. It connects ‘regular expressions’, determin- istic FSMs and non-deterministic FSMs to linear grammars. It adds in a new way of looking at sets based on ‘grammars’. As the levels of sets go up, the ‘grammar way’ of looking at a set becomes at least as important, if not more important than the ‘machine way’ of looking at a set, which is very different point of view. Finite state machines are not generally expressed at the level of grammar, whereas, compilers and programming languages are mostly repre- sented using grammars. The grammar point of view is very easy and transparent to make sense out of it. The best way to define a linear grammar will be start with an example. 2.11.1 From Finite State Machine to Linear Grammar Consider the language L(odd 10 s) L(odd 10 s) = {x|x is a binary string consisting of odd no. of 1s} Figure 2.28 gives the FSM for L(odd 10 s). We will create a grammar, which is a formalism that neither accepts nor rejects strings; rather it generates strings. Any string that is accepted by the FSM can be generated by the grammar while any string that is rejected by the machine cannot be generated by the grammar. As in the FSM, we will have a start symbol, A. And A can generate strings that the FSM can accept. Hence, A should be able to generate a 0 followed by a string that A can generate again. This production can be succintly represented as A → 0 A. Secondly, we should also be able to generate a 1 and then continue with B, that is, A → 1 B. Once in B, we should be able to generate a 0 and stay 2.11. LINEAR GRAMMAR 45 in B or generate a 1 and go back to A, which are represented by B → 0 B and B → 1 A respectively. The candidate set of production rules are listed below. A → 0 A A → 1 B (2.2) B → 0 B B → 1 A Let us use these productions to generate some strings using a sequence of substitutions in the grammar. A sequence of substitutions in the grammar to create a string, is called a derivation. We will start and when we find that we get stuck up, we will complete this production table. The start symbol is A (though the usual notation for the start symbol is S). Starting from A, we could use the production A → 0 A. The A on the right hand side can be further expanded using the production A → 1 B. A legitimate next substitution for the B on the right hand side is, B → 0 B. A→0A→01B→010B What about termination? We are not generating any string as long we have some capital symbol left over on the right hand side. The capital symbols are called non-terminals, and they should not be present in a terminal string. The terminal symbols are symbols in the alphabet Σ = {0, 1} and the final string should consist only of terminal symbols. Note that B is an accept state and therefore, it could disappear in the production sequence. Based on this observation, we add a production rule B → (remember that is the empty string. The updated set of production rules is listed below. A → 0 A