So, we are proving the Myhill-Nerode theorem. The theorem says that the following three statements are equivalent, and we propose to prove it by first showing 1 implies 2, then 2 implies 3, and then 3 implies 1. Last time we completed the proof of 1 implies 2. Just to remind ourselves what the proof was: we start with the assumption that L is regular, which means there is a DFA M that accepts the language L, and then we considered the relation R_M for that machine M. We defined R_M last time, and for M accepting L we found that R_M is an equivalence relation, that it is right invariant, and that it has finitely many equivalence classes, so it is of finite index. And of course the language L is the union of some of its equivalence classes. So 1 implies 2 is proved just by considering the equivalence relation R_M.
Now we would like to prove that 2 implies 3. Here we start with the assumption that L is a language satisfying the conditions of statement 2, and we would like to show that the equivalence relation R_L for this language is of finite index. We defined R_L last time. R_L is an equivalence relation on Σ*, so it relates finite strings in Σ*; the relation is a set of pairs of such strings. We said that x R_L y if and only if a certain condition holds for all z in Σ*; let me write it here. Take any string z in Σ* and form the two strings xz and yz, by concatenating z first to x and then to y.
Now, what are the possibilities? Either xz is in L and yz is in L; or xz is in L but yz is not in L; or xz is not in L and yz is in L; or both xz and yz are not in L. These are the only possibilities: both strings in the language, exactly one of them in the language, or neither of them in the language. For x R_L y to hold, the two middle cases, where exactly one of the strings is in L, must never occur. So we say x is related to y only when, for every z, the two strings xz and yz are either both in the language or both not in the language. How do we actually check whether x is related to y? That is not the point right now; the point is to understand the definition.
First of all, let us quickly see that R_L is an equivalence relation. Reflexivity and symmetry are quite obvious. For reflexivity you need to check the condition for xz and xz; there is only one string, xz, so the condition is trivially true. For symmetry, the definition says that either both xz and yz are in the language or neither is, and that statement is symmetric in x and y. Finally, let us check transitivity. Suppose x R_L y and y R_L z; I would like to show that x R_L z. Take any string w. x R_L z will be true if, for every such w, the strings xw and zw are either both in L or both not in L. Now let us say xw is in the language. What do we get from the fact that x is related to y?
Since x R_L y, xw in the language implies yw is also in the language, and from the fact that y is related to z I get that zw is also in the language. Actually these are if and only if statements: xw is in the language if and only if yw is in the language, because x is related to y, and yw is in the language if and only if zw is in the language, because y is related to z. So xw is in the language if and only if zw is in the language, and therefore x R_L z. So it is clear that R_L is an equivalence relation.
Just to fix our ideas, let me consider a language we have seen before. Let L be the set of finite binary strings x such that x has an even number of zeros and an even number of ones; so Σ is the binary alphabet {0, 1}. Now consider x to be the string 01101 and y to be 10100. Neither of them is in the language: 01101 has an even number of zeros but an odd number of ones, and 10100 has an odd number of zeros and an even number of ones. That is fine, but we would like to know whether x R_L y holds for this language L. Actually x R_L y does not hold. Why? Just take the string z to be 1. Then xz is 011011, which is in the language because it has an even number of ones and an even number of zeros. And yz is 10100 followed by 1, that is 101001, which has an odd number of ones and an odd number of zeros, so it is not in the language. That shows that the two strings 01101 and 10100 are not related by the relation R_L for this language.
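If you want to try this example on a computer, here is a minimal sketch. It is not part of the lecture: the names in_L, separates and distinguishing_suffix are made up for illustration, and the search only tries suffixes up to a bounded length, so when it finds nothing that is evidence, not a proof.

```python
from itertools import product

def in_L(s: str) -> bool:
    """Membership in L: even number of 0s and even number of 1s."""
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 0

def separates(x: str, y: str, z: str) -> bool:
    """True if exactly one of xz, yz is in L, so z witnesses that x R_L y fails."""
    return in_L(x + z) != in_L(y + z)

def distinguishing_suffix(x: str, y: str, max_len: int = 4):
    """Return a shortest separating suffix of length <= max_len, or None."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            z = "".join(letters)
            if separates(x, y, z):
                return z
    return None

x, y = "01101", "10100"
print(in_L(x), in_L(y))             # neither string is in L
print(separates(x, y, "1"))         # the suffix z = 1 used in the lecture
print(distinguishing_suffix(x, y))  # the search may find a different witness
```

Note that the search tries 0 before 1, and z = 0 already separates the two strings (101000 is in L, 011010 is not), so it reports 0 instead of the 1 used above. Any single separating z is enough to show that x R_L y fails.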
On the other hand, take another example for the same language L: the strings 010 and 111. You need to argue this, but you should be able to see it. The point is that 010 has an even number of zeros and an odd number of ones, and 111 also has an even number of zeros (none at all) and an odd number of ones. For 010 to become a string in the language after appending some w, that w must have an even number of zeros and an odd number of ones, so that together they have an even number of zeros and an even number of ones. Now if you append the same w to 111, what happens? 111 already has an odd number of ones, and w adds an odd number of ones; two odd numbers add up to an even number, so the total number of ones is even. And since 111 has no zeros, all the zeros come from w, which has an even number of them. So the entire string again has an even number of zeros and an even number of ones. So whenever appending some w puts 010 in the language, appending the same w to the right of 111 puts that string in the language as well. Similarly, you should be able to argue that if some w does not put 010w in the language, then the same w does not put 111w in the language, and conversely. So what I am trying to say is that for this language L the two strings 010 and 111 are related by R_L.
So you are more or less clear about the relation R_L, and you should be able to see that R_L is defined for any language L whatsoever, because we just follow the definition. And since it is an equivalence relation, it partitions Σ* into equivalence classes. Now, what statement 3 says is that this R_L is of finite index.
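The parity argument for 010 and 111 can also be checked mechanically. The following is only a sketch with made-up names (parity_class, related_up_to); the brute-force check covers suffixes up to a fixed length, so it supports the argument above without replacing it.

```python
from itertools import product

def in_L(s: str) -> bool:
    """Membership in L: even number of 0s and even number of 1s."""
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 0

def parity_class(s: str) -> tuple:
    """(number of 0s mod 2, number of 1s mod 2); the lecture's argument
    says two strings with the same parities are related by R_L."""
    return (s.count("0") % 2, s.count("1") % 2)

def related_up_to(x: str, y: str, max_len: int) -> bool:
    """True if no suffix z with |z| <= max_len separates x and y."""
    return all(
        in_L(x + "".join(z)) == in_L(y + "".join(z))
        for n in range(max_len + 1)
        for z in product("01", repeat=n)
    )

print(parity_class("010"), parity_class("111"))  # same parities
print(related_up_to("010", "111", 6))            # no separating suffix found
print(related_up_to("01101", "10100", 3))        # separated, as in the earlier example
```

Since membership of xw in L depends only on the parities of x together with those of w, this L has at most four classes under R_L, one per parity pair.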
So let us now prove that statement 2 implies statement 3. Statement 2 says that we have a language L, a subset of Σ*, such that L is the union of some of the equivalence classes of an equivalence relation on Σ* satisfying two conditions: the equivalence relation is right invariant and it is of finite index. Let that equivalence relation be R'. Since R' is an equivalence relation, it partitions the set Σ* in some manner; these are the equivalence classes induced on Σ* by R'. The very fact that it induces this partition uses that R' is an equivalence relation. The two other things we know are that R' is right invariant and that R' is of finite index, which means the number of equivalence classes of R' is finite. And we are told that L is the union of some of these equivalence classes. So the language consists solely of some equivalence classes taken in their entirety: say this one, this one, and maybe this one also. This is just a picture, of course.
So statement 2 says that my language consists of the union of these strings and these strings and these strings. From here I would like to show that for this language L, which is the union of some of the equivalence classes of R', where R' is a right-invariant equivalence relation of finite index, the relation R_L is of finite index. This is what I now need to prove, and it will show the implication 2 implies 3. The proof is by showing that any such R' is a refinement of R_L; we defined refinement last time. To prove that R' is a refinement of R_L, I need to show that if x R' y, that is, whenever two strings lie in the same equivalence class of R', then x R_L y as well. Remember the definition: x R_L y holds if and only if, for all z, the two strings xz and yz are either both in L or both not in L. So consider the two strings xz and yz. Given x R' y, for any z I immediately have that xz is related by R' to yz; that follows directly from the definition of right invariance. Now suppose xz is in the language. That means xz is either here, or here, or here, because the language L is the union of some of the equivalence classes of R'. Suppose xz is here. Since xz and yz are related by R', yz is also in the same equivalence class.
So we have assumed xz is in the language, so it falls in one of the equivalence classes that make up L, and since xz is related to yz by R', yz is also in the same equivalence class. This means that if xz is in L then yz is in L; it cannot be the case that xz is in L and yz is not in L. Of course this is true the other way also: if you assume yz is in L, the same argument shows that xz is also in L. Therefore, if xz is not in L, yz cannot be in L either. So whenever x is related to y by R', I can find no z such that exactly one of xz and yz is in the language. Therefore x is related to y by R_L as well, and that proves that R' is a refinement of R_L.
Now remember our notion of refinement. If one equivalence relation is a refinement of another, what does it mean? We discussed that last time: it just means that R' at most breaks up some of the equivalence classes of R_L into smaller equivalence classes. In other words, each equivalence class of R_L is made up by combining some of the equivalence classes of R' in their entirety. That means the number of equivalence classes of R_L is less than or equal to the number of equivalence classes of R'. Now, since R' is of finite index, which is what statement 2 says, R_L also has to be of finite index, and that is indeed statement 3.
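To see refinement on a concrete case, here is a small sketch for the even-zeros, even-ones language. Everything in it is an illustrative assumption: R' is taken to be "reaches the same state" in a hypothetical, deliberately wasteful DFA that counts zeros mod 4 and ones mod 2, the R_L classes are taken from the parity description of this language, and only strings up to length 6 are examined.

```python
from itertools import product

def state_of(s: str) -> tuple:
    """State reached in a hypothetical non-minimal DFA for L:
    (number of 0s mod 4, number of 1s mod 2). Two strings are related
    by R' when they reach the same state."""
    return (s.count("0") % 4, s.count("1") % 2)

def rl_class(s: str) -> tuple:
    """R_L class of s for this particular L, using the parity description."""
    return (s.count("0") % 2, s.count("1") % 2)

strings = ["".join(p) for n in range(7) for p in product("01", repeat=n)]

# For each R' class, collect the R_L classes that its strings fall into.
touched = {}
for s in strings:
    touched.setdefault(state_of(s), set()).add(rl_class(s))

# Refinement: every R' class sits inside exactly one R_L class.
is_refinement = all(len(classes) == 1 for classes in touched.values())
num_r_prime = len(touched)
num_r_l = len({rl_class(s) for s in strings})

print(is_refinement, num_r_prime, num_r_l)  # True 8 4
```

Here each R_L class is the union of two R' classes, which is the counting step of the proof: the index of R_L cannot exceed the index of R'.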
So once more, the idea of this proof: we assumed statement 2 and let R' be such an equivalence relation. Then we showed that R' is a refinement of R_L, so the number of equivalence classes of R_L is no more than the number of equivalence classes of R'. And since R' itself is of finite index, that is, it has finitely many equivalence classes, R_L also has finitely many equivalence classes, and therefore R_L is of finite index.
To complete the proof of the Myhill-Nerode theorem we now need to show that statement 3 implies statement 1. What is statement 3 saying? That I have a language L and the relation R_L is of finite index. From this assumption I would like to show that the language L is regular. The way we will show that L is regular is by actually constructing, or defining, a DFA for L using this relation R_L. So here is the definition of a DFA M using the equivalence relation R_L, and we will show that this M accepts the language L. First of all, my DFA M is going to be, as usual, (Q, Σ, δ, q_0, F), and I need to tell you what Q, δ, q_0 and F are; Σ, of course, is the alphabet over which L is defined. We define Q to be the set of equivalence classes of the equivalence relation R_L. Remember that our assumption is that R_L is of finite index. R_L is an equivalence relation on Σ*, the set of all strings over Σ, and it is of finite index, so the number of equivalence classes that R_L induces on Σ* is finite.
What we are saying is that each equivalence class stands for a state of my DFA. Think of it this way: this is the partition of Σ* induced by R_L, and these are the equivalence classes; it is an example, of course, just a picture. I would like to have one state of the DFA for this class, one for this, one for this, and so on. In this picture there are one, two, three, four, five, six, seven equivalence classes, so in such a case Q will consist of seven states. Now, you know what each of these equivalence classes consists of: they consist of strings. How does one denote an equivalence class? One way is to take any element of it, let us say x, which is a string. Let me write this: for any x in Σ*, the standard notation [x] stands for the equivalence class to which x belongs. My definition of Q for this machine M can therefore be written as Q = { [x] : x in Σ* }. Although it looks as if I am doing something for every x in Σ*, the number of these equivalence classes is finite, because statement 3 says that R_L is of finite index. So the set on the right-hand side is finite, and therefore we can take it to be the set of states.
After that, how do I define δ? Remember that for any DFA the transition function δ is a mapping from Q × Σ to Q. In other words, δ takes a state q and a symbol a, and it tells me which state the DFA goes to from state q on symbol a. So, as I said, δ takes two arguments: one is a state and one is a symbol.
Our states are of this kind: equivalence classes of R_L. So let us say I have the state [x] and the symbol is a; then we define δ([x], a) to be [xa], the state to which the string xa belongs. Now, this definition looks all right; however, we need to ensure one thing. Remember that the equivalence class [x] consists of many strings. Suppose y is also in it; in other words, the equivalence class [x] is the same as the equivalence class [y]. The way we are doing it, we represent the equivalence class by one of its elements, here the string x, and we say that [xa] is the state to which my machine goes if it is in state [x] and the symbol a comes. The natural question therefore is this: the state is the same whether you write it as [x] or as [y], because it is the same equivalence class. For this definition to be what we call well defined, it must be the case that if I had taken some other representative of the class, say y, then δ([y], a), which by this definition is [ya], is the same state; that is, [xa] and [ya] must be the same. Only then is the definition meaningful; otherwise it is meaningless. As it turns out, that will be the case. Why? Remember what we know about x and y: they belong to the same equivalence class of R_L. So I am trying to show that my definition of δ is done properly, that it does not depend on which representative of the equivalence class I take. To convince you that this is a proper definition, I need to show the following, given that there is another string y in the same equivalence class.
That is why we write that the equivalence class of x is the same as the equivalence class of y. Given this, I should show that the equivalence class of xa is the same as the equivalence class of ya. Now, x and y belonging to the same equivalence class means, by definition, that they are related by R_L, and what I want follows if I can show that xa is also related to ya by R_L. To show that, I need to show that for any z, xaz is in the language if and only if yaz is also in the language. Now look at xaz, and consider az as the entire string that you are concatenating to x. Since x is related to y, if x with the string az appended is in L, then yaz is also in the language, and vice versa. So it is indeed the case that x R_L y implies xa R_L ya, and therefore this definition is well defined.
So my definition of δ is complete, and in order to specify the machine I need to provide the definitions of q_0 and F. What is q_0? It is the initial state, where the machine begins. It should not be too difficult to see intuitively that we should take q_0 to be the state, that is, the equivalence class, to which the string ε belongs. Remember ε is the empty string, and ε will be in one of these equivalence classes; that equivalence class, [ε], is the initial state of the machine. Remember, the equivalence classes of R_L constitute the states of the machine M.
Please keep that in mind: at one level we are talking of equivalence classes, which are sets of strings, but at the other level we are thinking of each entire equivalence class as a state. It may take a little while to appreciate this, but really there is no difficulty. Think of the way we have done it here: Q, you can see, has finitely many elements, and δ, the way we have defined it, makes sense. So this also makes sense: q_0 is a state of the machine M, so it has to be an equivalence class of R_L, because the states of M are the equivalence classes of R_L, and we are defining the initial state to be the equivalence class to which the empty string belongs. We will of course show that this makes sense, but before that let me complete the definition. Finally, the set F: F is a subset of Q; remember, these are the final states. I put it this way: F = { [x] : x in L }, all those equivalence classes [x] such that x is in the language. Of course the language may be infinite; that is fine. Think of the strings which are in the language: there are only finitely many equivalence classes, so these strings are distributed over some of these classes, and those classes, thought of as states of the machine M, are my final states.
So this is the definition, but does it make sense? First of all, does this machine as we have defined it accept the language L? We need to show that 3 implies 1: starting from R_L I defined this machine M, and what I ultimately need to show is that the language accepted by M is nothing but the language L. Well, look at it this way.
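The construction of M from R_L can be sketched for the even-zeros, even-ones language. This is an illustration under stated assumptions, not the general construction: each class is stored as one representative string, and the helper rep (a made-up name) picks that representative using the parity description of R_L for this particular language. For an arbitrary L you would need some other way to decide when two strings are related.

```python
def in_L(s: str) -> bool:
    """Membership in L: even number of 0s and even number of 1s."""
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 0

def rep(s: str) -> str:
    """Canonical representative of the class [s] (assumes the parity
    description of R_L for this L): at most one 0 followed by at most one 1."""
    return "0" * (s.count("0") % 2) + "1" * (s.count("1") % 2)

SIGMA = "01"
q0 = rep("")                    # initial state: the class of the empty string

# Every class [x] is reached from [epsilon] by reading x, so exploring
# from q0 collects all of Q.
Q, delta, frontier = {q0}, {}, [q0]
while frontier:
    q = frontier.pop()
    for a in SIGMA:
        target = rep(q + a)     # delta([x], a) = [xa]
        delta[(q, a)] = target
        if target not in Q:
            Q.add(target)
            frontier.append(target)

F = {q for q in Q if in_L(q)}   # final states: classes whose strings are in L

def accepts(w: str) -> bool:
    """Run the constructed DFA on w."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in F

print(sorted(Q), sorted(F))
print(accepts("011011"), accepts("101001"))
```

For this language the sketch produces four states, represented by the empty string, 0, 1 and 01, with the class of the empty string as both the initial state and the only final state.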
First of all, let me show that if I take a string in the language, then that string is accepted by this machine M. Suppose x is in the language, and x consists of the symbols a_1 a_2 ... a_n, so the string is of length n. How does the machine behave on this string? Initially the machine is in the state [ε]. Then the symbol a_1 comes. Look at the definition of δ: if the machine is in state [ε] and a_1 comes, the next state is [a_1], the class to which the string a_1 belongs. Now a_2 comes, and by the definition of δ the machine will be in the state [a_1 a_2]. Finally, after scanning all n symbols, the machine will be in the state [a_1 a_2 ... a_n]. The string a_1 through a_n is accepted provided this state is a final state. But of course that is the case, because a_1 a_2 ... a_n is the string x, which is one of the strings in the language, so by the definition of F the state [a_1 a_2 ... a_n] is a final state. So this proof is very simple: we just followed the definition, we see that we end up in this state, and that state is a final state. So x is accepted. What I have actually shown is that the language L is a subset of L(M), because any string in the language is accepted by M.
Now what about the other way? I need to show that L(M) is a subset of the language L. One way of proving that is to take a string which is not in the language and show that the DFA I have defined goes to a state which is not a final state. So this is the part I am now trying to show.
What I need to show is that, starting from x in L(M), the string x is in the language L. I use the contrapositive of this statement: if x is not in the language L, I should be able to show that x is not in L(M). In other words, if I take a string which is not in the language and manage to show that it does not take the machine to a final state, then I am done. Again the thing is very simple. Suppose the string a_1 a_2 ... a_n is not in the language L. How does the machine behave on this string? In the same way as before: when this string has been scanned, the machine M will be in the state [a_1 a_2 ... a_n]. But this is a string which is not in the language, and therefore this equivalence class is not one of the final states of the machine, because by definition the final states are those equivalence classes which consist of strings in the language.
Now, it may look as if we are just waving hands and proving things, but this proof is rigorous. You might wonder: can I have an equivalence class of R_L containing strings x and y such that x is in the language and y is not in the language? That can never happen. Just consider the string ε: xε would be in the language but yε would not be in the language, and that would mean that x and y could not be in the same equivalence class. So one of the implications of the definition of R_L is that the language L consists entirely of the union of some of these equivalence classes: we take their union and that gives me the language L.
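Both halves of this argument can be spot-checked for the even-zeros, even-ones language. As before, this is a sketch under assumptions: classes are stored as representative strings chosen by a made-up helper rep that relies on the parity description of R_L for this language, and only strings up to length 8 are tested, so it illustrates the proof and does not replace it.

```python
from itertools import product

def in_L(s: str) -> bool:
    """Membership in L: even number of 0s and even number of 1s."""
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 0

def rep(s: str) -> str:
    """Representative of [s] for this L (assumes the parity description of R_L)."""
    return "0" * (s.count("0") % 2) + "1" * (s.count("1") % 2)

def run(w: str) -> str:
    """Start in [epsilon] and apply delta([x], a) = [xa] for each symbol of w."""
    q = rep("")
    for a in w:
        q = rep(q + a)
    return q

words = ["".join(p) for n in range(9) for p in product("01", repeat=n)]

# After reading a_1 ... a_n the machine sits in the class [a_1 ... a_n].
lands_in_own_class = all(run(w) == rep(w) for w in words)

# A state is final exactly when its representative is in L, so acceptance
# agrees with membership in both directions.
agrees = all(in_L(run(w)) == in_L(w) for w in words)

print(lands_in_own_class, agrees)
```

The second check depends on the point made above: a class cannot contain both a string in L and a string outside L, so testing one representative per class is enough to decide whether the state is final.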
So it cannot be that within one class some strings are in the language L and some strings are not; that we can prove, and therefore everything we said makes sense. That completes the proof that statement 3 of the Myhill-Nerode theorem implies statement 1. Just to wrap up: we started with the assumption that for a language L the relation R_L is of finite index, which means it has finitely many equivalence classes. From there I defined a DFA, and then I showed that this DFA accepts that same language L. So statement 3 was that R_L is of finite index, statement 1 was that L is regular, and that is proved because the DFA M that I defined indeed accepts the language L.
One of the very important things this theorem tells me about is the notion of minimization of states. Let us see why. The Myhill-Nerode theorem tells me what the minimized DFA for a regular language L is. Suppose L is regular. Then, by definition, I have a DFA M such that L(M), the language accepted by that machine, is the language L. If you consider R_M, what do I know? R_M is right invariant and of finite index, and L is the union of some of its classes; that was basically the statement 1 implies 2. Then, by the proof of 2 implies 3, R_M is a refinement of R_L. Since M has at least as many states as R_M has equivalence classes, that means that whatever machine you may take, its number of states cannot be less than the number of states of the DFA that we defined through R_L. Once more, recall that the definition of R_L gives me a DFA M' for L, and if you go back to the statements of the Myhill-Nerode theorem, you can see immediately that for any machine M accepting L, the number of states in that machine has to be equal to or more than the number of states of the DFA defined through R_L. Minimization of states then means arriving at the DFA which corresponds to R_L; the DFA that we got from that definition is indeed the best in terms of number of states, and this is what we will follow up in our next lecture.