Our final topic for regular languages is the minimization of the number of states of a DFA. What we mean is this: suppose I have a DFA M with some number of states. We will provide an algorithm to obtain another DFA M' such that the two DFAs are equivalent, in the sense that both accept the same language. Further, this new DFA M' is such that no other DFA accepting the same language can have fewer states than M'. In that sense we will be able to minimize the number of states a DFA requires to accept a particular language.

That is an algorithm, but what we will first prove is a very interesting theorem called the Myhill-Nerode theorem. This theorem is the theoretical underpinning for the minimization algorithm we are going to describe, but it says something beyond what I just told you. Roughly, it says that for any regular language L there is a unique best DFA to accept L. It is best in the sense that no DFA accepting L can have fewer states than the DFA our theorem is going to suggest. Equally importantly, that DFA is unique. In other words, it is not possible to have two essentially different DFAs with the minimum number of states. Of course you can get different DFAs by changing the names of the states, but if you do not bother yourself with such trivial changes, there is precisely one DFA with the minimum number of states accepting any given regular language.

Let me spend a minute on what this is saying. Take, for example, a problem like sorting. We know that for sorting there are optimal algorithms with O(n log n) time complexity, where n is the size of the input you are interested in sorting.
Now you will remember that we do not have just one optimal algorithm for sorting. There are essentially different algorithms which give the same n log n time complexity and yet are very different. Merge sort is one n log n algorithm; heap sort, where you build a heap and the heap repeatedly tells you the smallest or the largest remaining number, is another. Both algorithms are optimal, yet they are very different: you cannot say that merge sort is the same as heap sort, although they do the same job of sorting.

What are we saying, then, when we say there is a unique DFA? First of all, you can think of a DFA as an algorithm: any DFA N is an algorithm that solves the membership problem for the language accepted by N. Now fix a language L and restrict yourself to DFAs as the class of algorithms. Then there is a unique optimal algorithm, optimal not so much in the sense of time complexity but in the sense of the number of states used.

So this is a very interesting theorem, the Myhill-Nerode theorem, and to understand it, even to state it, I need some terminology about equivalence relations. Let us refresh our understanding. Any relation on a set A is a subset of A × A, in other words a set of pairs; an equivalence relation on A is a relation that satisfies reflexivity, symmetry and transitivity. For example, let A be the set of students in an institution, and say that two students S1 and S2 are related if S1 and S2 both stay in the same hostel.
Clearly this relation satisfies reflexivity: S1 is related to S1, because S1 lives in the same hostel as S1, and that is true of everybody. Then symmetry: if student S1 lives in the same hostel as student S2, then of course S2 also lives in the same hostel as S1, so symmetry is satisfied. Transitivity is satisfied similarly: suppose S1 and S2 live in the same hostel and S2 and S3 also live in the same hostel; then all three live in the same hostel, and therefore S1 and S3 live in the same hostel.

The important aspect of an equivalence relation is that it partitions the set on which it is defined into a number of equivalence classes. By a partition of A one means a collection of nonempty subsets of A such that together these subsets make up the entire set A and no two of them have anything in common. In this example it is easy to see the partition induced by the relation on the set of students of the institution. The classes of this partition are called equivalence classes. What is an equivalence class? It is the set of all elements related to some given element, so all its members are related to each other. Here all the students living in one particular hostel are related to each other, so they form one equivalence class. The students in another hostel, say hostel number two, form another equivalence class, and so on. If you take these equivalence classes together, that is, if you take all the hostels, they cover the set of all students in the institution, assuming of course that the institution is residential and every student lives on campus in one hostel or another.
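As a small illustration of how an equivalence relation induces a partition, here is a minimal Python sketch of the hostel example. The student names and hostel numbers are made up for illustration, and `equivalence_classes` is a hypothetical helper name. Comparing a new element against just one representative of each class is valid only because the relation is assumed to be an equivalence relation.

```python
# Hypothetical data (not from the lecture): student name -> hostel number.
hostel_of = {"asha": 1, "bala": 2, "chitra": 1, "dev": 3, "esha": 2}


def related(s1, s2):
    """s1 R s2 iff both students stay in the same hostel."""
    return hostel_of[s1] == hostel_of[s2]


def equivalence_classes(elements, rel):
    """Group `elements` into the classes of the equivalence relation `rel`.

    Because `rel` is reflexive, symmetric and transitive, it is enough to
    compare each element with one representative of every existing class.
    """
    classes = []
    for e in elements:
        for cls in classes:
            if rel(e, cls[0]):
                cls.append(e)
                break
        else:
            # e is related to no existing class, so it starts a new one.
            classes.append([e])
    return classes


students = sorted(hostel_of)
classes = equivalence_classes(students, related)

for cls in classes:
    print(cls)
```

Each student lands in exactly one class, and the classes together cover all the students, which is what it means for the classes to form a partition.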
So let me write down the important fact about equivalence relations: an equivalence relation on A induces a partition of the set A, and this partition consists of the equivalence classes, as you can see in this example.

Now our theorem, the Myhill-Nerode theorem, will be about certain equivalence relations, but on Σ*. Remember that Σ is your alphabet and Σ* is the set of all finite strings that you can build using the symbols of Σ. In the same way I can think of Σ* as a set, of course an infinite set, and we will consider certain equivalence relations on Σ*.

Let us start with something fairly familiar. Suppose we have a DFA M with alphabet Σ, and define the relation R_M as follows: x is related to y by R_M if both x and y take the machine M from its initial state to the same state. If M = (Q, Σ, δ, q0, F), then what we mean is: x R_M y if and only if δ̂(q0, x) = δ̂(q0, y), where q0 is the initial state and δ̂(q0, x) is the state reached from q0 on the string x. It is fairly simple to see that R_M is also an equivalence relation; we just have to verify that it satisfies reflexivity, symmetry and transitivity. Further, being an equivalence relation on the set Σ*, R_M induces a partition of Σ*. What are the equivalence classes? Clearly, one class is the set of all strings which take the machine from the initial state to one particular state, another is the set of all strings which take it from the initial state to another state, and so on. So, essentially, for each state that some string reaches, all the strings which take the machine to that state constitute one class.

Now there are three other definitions that I need concerning equivalence relations on Σ*.
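Before those definitions, the relation R_M can be made concrete with a small Python sketch. The DFA below is a hypothetical example, not one from the lecture: it reads strings over {0, 1} and tracks the number of 1s modulo 3. The names `DELTA`, `delta_hat`, `r_m` and `strings_up_to` are illustrative. Since Σ* is infinite, the sketch only groups the strings of length at most 3.

```python
from itertools import product

# Hypothetical DFA M over the alphabet {0, 1}: state = (number of 1s) mod 3.
SIGMA = "01"
Q0 = 0          # initial state
FINAL = {0}     # accepting states (not needed for R_M itself)
DELTA = {
    (q, a): (q + 1) % 3 if a == "1" else q
    for q in range(3)
    for a in SIGMA
}


def delta_hat(q, w):
    """Extended transition function: the state reached from q on string w."""
    for a in w:
        q = DELTA[(q, a)]
    return q


def r_m(x, y):
    """x R_M y iff x and y take M from the initial state to the same state."""
    return delta_hat(Q0, x) == delta_hat(Q0, y)


def strings_up_to(n):
    """All strings over SIGMA of length 0..n, shortest first."""
    for k in range(n + 1):
        for letters in product(SIGMA, repeat=k):
            yield "".join(letters)


# Group a finite sample of SIGMA* by the state each string leads to.
classes = {}
for w in strings_up_to(3):
    classes.setdefault(delta_hat(Q0, w), []).append(w)

for state, members in sorted(classes.items()):
    print(state, members)
```

In this sample there is one group per state of the machine, which is the picture behind the classes of R_M described above.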
The first definition: an equivalence relation R on Σ* (from now on we are concerned only with equivalence relations on Σ*) is right invariant if, for all x, y in Σ*, whenever x and y are related, then for all z in Σ*, xz is also related to yz. Is it clear what we are saying? If x and y are related, take any string z and concatenate it to x and to y to get two new strings xz and yz; if the relation is right invariant, xz and yz are also related.

You can check that the relation R_M is right invariant. Why? Suppose x R_M y. That means both x and y take the machine from q0 to the same state, that is, δ̂(q0, x) = δ̂(q0, y); let us call this state p. Now concatenate a string z to both x and y. What is δ̂(q0, xz)? We know it is δ̂(δ̂(q0, x), z): first see which state the machine goes to on x, which is p, and then from that state use the string z to reach some state. But since δ̂(q0, x) is the same as δ̂(q0, y), I can replace the δ̂(q0, x) part by δ̂(q0, y), and so this is the same as δ̂(q0, yz). Therefore both xz and yz take the machine from q0 to the same state, which means xz R_M yz. So the important point is that the relation R_M is right invariant.

There is another definition I need, which is the index of an equivalence relation.
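For reference, the right-invariance argument for R_M given above can be written as one chain of equalities. This is only a restatement of the spoken argument; it uses the standard property δ̂(q, uv) = δ̂(δ̂(q, u), v) of the extended transition function and writes p for the common state δ̂(q0, x) = δ̂(q0, y).

```latex
\begin{align*}
\hat{\delta}(q_0, xz)
  &= \hat{\delta}\bigl(\hat{\delta}(q_0, x),\, z\bigr) \\
  &= \hat{\delta}(p, z) \\
  &= \hat{\delta}\bigl(\hat{\delta}(q_0, y),\, z\bigr) \\
  &= \hat{\delta}(q_0, yz)
\end{align*}
```

Nothing in the chain depends on which string z is, which is why xz R_M yz holds for every z in Σ*.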
This is a very simple definition: an equivalence relation R is of finite index if the number of equivalence classes it induces is finite. Remember that an equivalence relation partitions the set on which it is defined into a number of equivalence classes. This number could be finite or infinite; if it is finite, we say the relation is of finite index. In our example of students in an institution related by being in the same hostel, the number of hostels is finite, so that relation is of finite index.

More relevantly for us, the relation R_M is also of finite index. Why? Each equivalence class induced by R_M is associated with a state of the machine M: the class consists of those strings which take the machine from the initial state to that state. Since the number of states in a DFA is finite, the number of equivalence classes of R_M is also finite, and therefore R_M is of finite index.

One more definition before I can state the Myhill-Nerode theorem; it concerns two equivalence relations on the same set. Let R1 and R2 be equivalence relations on A. We say R1 is a refinement of R2 if the following condition holds for all x, y in A: x R1 y implies x R2 y. It is a very simple definition: R1 is a refinement of R2 if, whenever two elements are related by R1, they are also related by R2.
Let me give you an example using the relation we just mentioned, where students in an institution are related if they live in the same hostel; call it R. Now define another relation R': x R' y if the two students x and y stay in the same block. I am assuming that a hostel, as in my institution, typically has a number of blocks, sometimes labelled A, B, C, D, E, F and so on; say this is hostel 5 and it has block A, block B, up to block F. Two students are related by R' if they are in the same block. To be in the same block they have to be in the same hostel, so R' is a refinement of the relation R of being in the same hostel.

What happens when R1 is a refinement of R2, or in this example when R' is a refinement of R? The new relation breaks each old equivalence class into possibly several finer classes. Under the old relation everybody in the same hostel was clubbed together in one equivalence class, but now we are going down to the level of blocks. That is the reason the term refinement is used: you are refining the equivalence classes into finer ones.

So, for equivalence relations on Σ*, we have seen some terminology: one notion is that of finite index, another is right invariance, and the third is that of one equivalence relation being a refinement of another. We also gave a particular example, R_M, which is a very natural equivalence relation on Σ* associated with a DFA M. I am now ready to state the theorem. The theorem says that the following three statements are equivalent. The three statements are:
(1) The language L over the alphabet Σ is accepted by a finite automaton; in other words, L is regular. (2) L is the union of some of the equivalence classes of a right invariant equivalence relation of finite index. (3) R_L is of finite index, where R_L is defined as follows: x R_L y if and only if, for all z in Σ*, either both xz and yz are in L or neither is in L. Remember that all these are relations on Σ*. We will try to understand this definition more clearly when we come to that part of the proof.

This is the theorem, and as you shall see, it implies that there is a unique minimum-state automaton, unique modulo the names of the states. I do not know whether you have seen results of this form. It says that all three statements are equivalent. What does it mean to say two statements are equivalent? Remember the little bit of logic we learnt in courses like discrete maths: we say statements A and B are equivalent if A implies B and B implies A. Remember that a statement is something which is either true or false. So what we are saying is that if A is true then B must be true, and if B is true then A is true; in other words, in its intent it is the same statement. You know many examples of equivalence. So the theorem says that statement 1, that the language L is regular, is equivalent to statement 2; we will discuss what statement 2 is saying shortly. Look at the way the theorem is phrased: 1 is equivalent to 2, 2 is equivalent to 3, and therefore of course 1 is equivalent to
3, and so on. The typical way people prove this is to show that statement 1 implies statement 2, then that statement 2 implies statement 3, and finally that statement 3 implies statement 1. It is a cycle of implications: 1 implies 2, 2 implies 3, 3 implies 1. Why does this show, for instance, that 1 is equivalent to 2? We have 1 implies 2 directly; and 2 implies 3 and 3 implies 1, so 2 implies 1. This is one way of proving a number of statements equivalent, and this is what we are going to do.

Before we start, look at what the second statement is saying. The set we are interested in is Σ*, and the language L is always a subset of Σ*. The statement says that L is the union of some of the equivalence classes of some particular equivalence relation which has two properties: it is right invariant and it is of finite index. If that equivalence relation is R, it induces, as we have seen, a partition of Σ*. The statement says L is the union of some of the equivalence classes: maybe L comprises this equivalence class and that equivalence class; you take all the strings in this class, take the union with that class, and that gives you the language L. Further, it says that the equivalence relation R satisfies two properties: one, it is right invariant, and two, it is of finite index. Since R is of finite index, the number of equivalence classes induced will be finite: there
will be some particular number of them, but only finitely many. Although the set Σ* is infinite, the number of equivalence classes is finite, and of this finite number of equivalence classes we take the union of some, and that gives the language L. The statement also says that the relation R is right invariant.

Actually, we have already done most of the proof that 1 implies 2. What is that proof? Suppose 1 is true; that means L is accepted by a finite automaton, so let M be a DFA accepting L. I claim that the relation R_M has the required properties. Recall what R_M was: if my DFA M is (Q, Σ, δ, q0, F), we said x R_M y if and only if δ̂(q0, x) is the same as δ̂(q0, y). First, it is easy to see that R_M is of finite index, because its equivalence classes correspond to states of the machine M, and M, being a finite state machine, has only finitely many states. Second, we proved a little earlier that R_M is right invariant. Now, what is L? Each of the equivalence classes of R_M corresponds to a state of the machine M: this one may be q_i, this one q_j, this one q_l, and so on. The strings in the class for q_i are all the strings which take the machine from its initial state q0 to the state q_i; the strings in the class for q_j take the machine from q0 to the state q_j; and so on. The language L, we know, is the set of all those
strings which take the machine to one or another of the final states. Suppose F consists of q_f1, q_f2 and so on. I have an equivalence class of R_M corresponding to q_f1, and all the strings in that class are in the language, because the machine takes all those strings from the initial state to q_f1, and q_f1 is a final state. Therefore it is easy to see that the language L is nothing but the union of the equivalence classes corresponding to those states which are the final states of the machine. Once we understand that, this part of the proof, 1 implies 2, is immediate.

We will continue with the other two parts, 2 implies 3 and 3 implies 1, in the next lecture. Let me end by recalling the concepts we have learnt. Essentially, we are trying to view a regular language in terms of the equivalence classes of certain equivalence relations, and we learnt certain concepts about equivalence relations. First, we recalled that an equivalence relation induces a partition of the set on which it is defined. Then we defined the notion of finite index: an equivalence relation is of finite index if it induces a partition with only finitely many equivalence classes. Another concept was right invariance: an equivalence relation on Σ* is right invariant if, whenever two strings x and y are related, concatenating the same string z to both gives two new strings xz and yz which are also related. Finally, we talked about refinement, which we have not used so far but which we will use in the proof that statement 2 implies statement 3. We said a relation R is a refinement of the relation R' if, whenever
two strings are related by R, they will also be related by