We continue our discussion of context-free grammars and languages in this class. First, I would like to complete an argument that I did only partially last time. Recall the grammar G, which had three non-terminals S, A and B, two terminal symbols a and b, and these production rules. The language generated by G, written L(G), is by definition the set of all terminal strings that can be derived from the start symbol S. The claim was that L(G) is precisely the set of strings over a and b in which the number of a's equals the number of b's. This is something we would like to prove, because just looking at the grammar it may not be immediately clear what the language is. We also said last time that to do so I need to prove two other claims, one for each non-terminal other than S. The first is that A derives exactly those strings over a and b in which the number of a's is exactly one more than the number of b's. For example, the string abaab has three a's and two b's. Similarly, the set of strings derived from the non-terminal B is precisely the set of terminal strings in which the number of b's is exactly one more than the number of a's.
For example, the string abbab has three b's and two a's, so it should be in the set generated from the non-terminal B. As I said when we discussed regular languages, what we want to prove here is the equality of two sets. One is the set of all strings generated from the start symbol, which by definition is L(G). The other is a set of strings over a and b described by a predicate on w, namely that w has equal numbers of a's and b's. To show that two sets S1 and S2 are equal, we typically show that S1 is a subset of S2 and that S2 is a subset of S1. Let me give names to the sets involved. In the claim about S, call the left-hand side P1 (the strings derived from S) and the right-hand side P2 (the strings with equal numbers of a's and b's). In the claim about A, call the left-hand side Q1 and the right-hand side Q2, and in the claim about B call them R1 and R2. What we showed last time was that P2 is a subset of P1, Q2 is a subset of Q1, and R2 is a subset of R1.
How did we prove it? We proved all three assertions simultaneously, by what is called simultaneous induction. Suppose all three statements are true up to a certain length k: every string of length k or less with equal numbers of a's and b's is generated by S, and similarly for A and B. Taking that as the induction hypothesis, we showed that the assertions also hold for length k + 1. To prove the assertion about S for the next length we needed the other two assertions, and likewise for each of the others; that is what simultaneous induction is all about. The point is that this induction was over the lengths of the strings in these sets. That part is done. Today let me indicate how to do the other part, so that I can prove the equality. We have proved that P1 contains P2 as a subset, so now I need to prove the reverse assertions: P1 is a subset of P2, Q1 is a subset of Q2, and R1 is a subset of R2. If I manage to prove these three as well, then all six together give P1 = P2, Q1 = Q2 and R1 = R2, and in particular what we wanted to assert about the language generated by G.
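
The three set equalities can be sanity-checked by brute force on short strings. The sketch below is only an illustration: the production rules are not reproduced in this text, so the rules in `GRAMMAR` are an assumption (the standard textbook grammar for this language), and the names `generated` and `claims_hold` are made up here. Under these assumed rules the empty string is not derivable, so the check ranges over nonempty strings only.

```python
from itertools import product

# ASSUMPTION: the lecture's production rules are not reproduced in the text.
# These are the standard textbook rules for this language, used for illustration.
GRAMMAR = {
    "S": ["aB", "bA"],
    "A": ["a", "aS", "bAA"],
    "B": ["b", "bS", "aBB"],
}

def generated(max_len):
    """Terminal strings of length <= max_len derivable from each non-terminal."""
    gen = {X: set() for X in GRAMMAR}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for X, bodies in GRAMMAR.items():
            for body in bodies:
                # Expand the right-hand side symbol by symbol.
                partial = {""}
                for sym in body:
                    choices = gen[sym] if sym in GRAMMAR else {sym}
                    partial = {u + v for u in partial for v in choices
                               if len(u) + len(v) <= max_len}
                if not partial <= gen[X]:
                    gen[X] |= partial
                    changed = True
    return gen

def claims_hold(max_len):
    """Check P1 = P2, Q1 = Q2, R1 = R2 on all nonempty strings up to max_len."""
    gen = generated(max_len)
    excess = {"S": 0, "A": 1, "B": -1}  # required (#a - #b) per non-terminal
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            d = w.count("a") - w.count("b")
            if any((w in gen[X]) != (d == excess[X]) for X in GRAMMAR):
                return False
    return True

print(claims_hold(8))
```

A finite check like this proves nothing about all lengths; it only guards against a mistake in the claimed sets before one invests in the induction.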
So what is this saying? For example, "P1 is a subset of P2" says that every string over the terminal alphabet {a, b} that is derived from S has equal numbers of a's and b's. The left-hand side is about terminal strings generated from S; similarly, Q1 is about terminal strings generated from A, and R1 is about terminal strings generated from B. To prove these three assertions we will again prove them simultaneously and again use induction, but now the induction is on the length of the derivation. What do we mean by this? Suppose that starting from S I get alpha_1, from alpha_1 I get alpha_2, then alpha_3, and so on up to alpha_n, and alpha_n is a terminal string. The length of this derivation is the number of times the one-step derivation relation has been used, so here the length is n. Now let us look at the base case, which is fairly obvious. Do you get any terminal string at all from S in one step? You do not: starting from S you can only use one of the two S-productions, and in either case the result still contains a non-terminal. So for S the base case will be, let us say, 2. There are two strings of length 2 with equal numbers of a's and b's, namely ab and ba, and you can verify that both can be derived from S.
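
The notion of derivation length and the base cases can be made concrete with a small enumeration. This is a sketch under assumptions: the rules in `GRAMMAR` are the standard textbook rules for this language (the lecture's own rules are not reproduced in this text), only leftmost derivations are enumerated (every terminal string has a leftmost derivation of the same length as any other derivation with the same parse tree), and the function names are illustrative.

```python
# ASSUMPTION: standard textbook rules, standing in for the lecture's grammar.
GRAMMAR = {
    "S": ["aB", "bA"],
    "A": ["a", "aS", "bAA"],
    "B": ["b", "bS", "aBB"],
}

def one_step(form):
    """All sentential forms obtained from `form` by rewriting its leftmost non-terminal."""
    for i, sym in enumerate(form):
        if sym in GRAMMAR:
            return [form[:i] + body + form[i + 1:] for body in GRAMMAR[sym]]
    return []  # no non-terminal left, so nothing can be rewritten

def terminal_strings(start, n):
    """Terminal strings with a leftmost derivation of exactly n steps from `start`."""
    forms = {start}
    for _ in range(n):
        forms = {nxt for f in forms for nxt in one_step(f)}
    return sorted(f for f in forms if not any(c in GRAMMAR for c in f))

for start in "SAB":
    for n in (1, 2):
        print(start, n, terminal_strings(start, n))
```

With these assumed rules, S yields no terminal string in one step and yields ab and ba in two, while A and B yield a and b in one step and nothing in exactly two.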
For A we can derive the terminal string a in one step, and similarly from B we can derive the terminal string b in one step. Can you get a terminal string from A or B in exactly two steps? You will see that you cannot. Whatever it is, the base case can be handled, and I leave it as an exercise to work out what the proper base case should be. So let us say the base case is done. The induction hypothesis is that for all derivations of length k or less, the three properties stated above hold. How do I go one more step? Consider a derivation of length k + 1 starting with S: S gives alpha_1, alpha_1 gives alpha_2, and so on up to alpha_(k+1), which is a terminal string. The observation that makes the proof clear is very simple: the first step must use one of the two S-productions. Let us say the first step is S → aB, so alpha_1 is aB. Then how did I get alpha_2? Necessarily alpha_2 is of the form a alpha_2', where alpha_2' is a string that can be derived from the non-terminal B.
Now I use the induction hypothesis. The leading a appears in alpha_1, in alpha_2, and all the way up to alpha_(k+1). If I take these a's out, what is left is a derivation of length k starting from the non-terminal B: B gives alpha_2', and so on up to alpha_(k+1)'. Since alpha_(k+1) is a string over a and b, so is alpha_(k+1)', and by the induction hypothesis it has exactly one more b than it has a's. In front of it there is one a, so in alpha_(k+1) = a alpha_(k+1)' the total numbers of a's and b's are equal. The case where the first step uses the other S-production is symmetric, and the assertions for A and B are handled the same way. So in this manner, taking all three assertions at the same time, we can do the induction. The point is that here we induct on the lengths of derivations, whereas in the earlier part we also did a simultaneous induction, but on the lengths of strings. So here is an example of a context-free language, and if you think for a second you will realize that it is not a regular language: for the set of all strings over {a, b} in which the number of a's equals the number of b's, you can very easily show that it is not a regular set.
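
Going back to the inductive step for S, the counting can be written out as a short worked equation. The notation n_a(w) and n_b(w) for the number of a's and b's in w is introduced here only for convenience and is not from the lecture; the first step is taken to be S → aB, the case considered above.

```latex
\begin{align*}
\alpha_{k+1} &= a\,\alpha'_{k+1}, \qquad B \Rightarrow^{k} \alpha'_{k+1} \\
n_b(\alpha'_{k+1}) &= n_a(\alpha'_{k+1}) + 1
    && \text{(induction hypothesis for } B\text{)} \\
n_a(\alpha_{k+1}) &= 1 + n_a(\alpha'_{k+1}) = n_b(\alpha'_{k+1}) = n_b(\alpha_{k+1})
\end{align*}
```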
So L(G) is not regular: I have at least one example of a language that is context-free, because this context-free grammar generates it, but that is not regular. What about regular languages themselves? Can every regular language be generated by a context-free grammar? Here is why the question is important. Picture the class of all context-free languages, and inside it this language, which is not regular. Knowing only that some context-free languages are not regular, there are two possibilities. One is that there are also regular languages which are not context-free, so that the two classes merely overlap. What we are going to show is that this is not the case; in fact, every regular language is a context-free language. In the picture, the class of regular languages sits inside the class of context-free languages, and since there are languages that are context-free but not regular, this containment is proper. That is really the right picture. So the claim I need to show is that every regular language is a context-free language. If I manage to prove this, then together with the example I have of a context-free language that is not regular, the relationship between the two classes is that the class of context-free languages properly contains the class of regular languages.
In that sense we have progressed. We described the regular languages, studied their properties and did many things with them, but we also saw that certain simple languages are not regular; now I have a larger class. Overall, what this course is trying to do is ultimately to capture all languages that are computable in the most general sense. So we would like to prove this claim. Let me give an example of how the proof goes, and then we can formalize it. Let us take a regular language; a regular language can of course be described by a DFA, so why not take a DFA? In particular, consider this DFA with two states. You should be able to look at it and see immediately what language it accepts: the set of all binary strings in which the number of 1s is odd. Notice that it does not care about the 0s at all. I will show you how to get a context-free grammar for that language. The way we do this is to associate a non-terminal with every state of the DFA. Let me call this state A and this state B. From every non-terminal, which corresponds to a state, the rules come from the transitions out of that state. One transition goes from A to A on 0, so I write A → 0A; we will justify this a little later. Starting from A there is another arrow going out, on 1 to B, so I write A → 1B. Similarly for B: one arrow stays at B on 0, giving B → 0B, and the other goes to A on 1, giving B → 1A.
That is how, from the 4 transitions, the 4 arrows that were there, I got these 4 rules. In addition I need one more kind of production rule, corresponding to the arrows which end in a final state. Here there is one final state, B. Starting from A, on 1 I could have gone to B, which is a final state. I used this transition to define the rule A → 1B, but since B happens to be a final state of the DFA, I also define another rule, A → 1, where there is no non-terminal on the right-hand side. Similarly, B can be rewritten as 0B, and B is a final state, so B can also go to 0. This removal of the state, that is, of the non-terminal symbol, I do only on the right-hand side; only that makes sense. Now let me define the grammar: G = (V, {0, 1}, P, A), where V = {A, B}, the terminals are 0 and 1, the start symbol is A because that was the initial state, and P consists of these productions. How can I claim that the language of this grammar is exactly the set of all strings accepted by this finite-state machine? The idea is not difficult. We would like to show that the language accepted by the DFA is equal to the language generated by this grammar. By the way, it is clearly a context-free grammar, because in every production rule the left-hand side is a single non-terminal and the right-hand side is a string over the union of the terminals and the non-terminals.
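
The rule-building step for this two-state example can be sketched in a few lines. The dictionary encoding of the transitions and the function name `productions` are illustrative choices, not something from the lecture; the rules it produces are the six described above.

```python
# The example DFA: states A (initial) and B (final), input alphabet {0, 1}.
DELTA = {
    ("A", "0"): "A",
    ("A", "1"): "B",
    ("B", "0"): "B",
    ("B", "1"): "A",
}
FINAL = {"B"}

def productions(delta, final):
    """One rule per transition, plus a non-terminal-free rule when the target is final."""
    rules = []
    for (state, symbol), target in sorted(delta.items()):
        rules.append((state, symbol + target))  # e.g. A -> 1B
        if target in final:
            rules.append((state, symbol))       # e.g. A -> 1
    return rules

for lhs, rhs in productions(DELTA, FINAL):
    print(lhs, "->", rhs)
```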
Formally, we would like to show that the language accepted by the DFA M is the same as the language generated by the grammar G. As before there are two parts to the proof: (1) L(M) is a subset of L(G), and (2) L(G) is a subset of L(M). Proof sketch for part 1: I need to show that every string accepted by the DFA M can also be generated by the grammar G. Suppose a string a_1 a_2 ... a_n, where each a_i is 0 or 1, is accepted by M. You start in the initial state A; after a_1 you are in some state, either A or B, and so on; finally, since B is the only accepting state, you must have ended in B. Let me show the idea with a simple example, and then you should be able to do it yourself. Take the string 00101010. This string is accepted by M; let us see the state transitions of the machine on it. Let me write the symbols of the string a little apart so that I can write out the states clearly: 0 0 1 0 1 0 1 0. We start at the initial state A, and on 0 the machine stays in the same state, so after each of the first two 0s it is in A. On the 1 it goes from A to B, and in B on a 0 it remains there, so B again. On the next 1 it comes back to A, on the following 0 it remains in A, on the next 1 it goes to B, and on the last 0 it stays in B. Looking at this run and at the set of production rules, I can show you the corresponding derivation.
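
The run just described can be reproduced by simulating the example DFA. The encoding below (a transition dictionary and a `run` function) is an illustrative sketch, not notation from the lecture.

```python
# The example DFA: A is the initial state, B the only final state.
DELTA = {
    ("A", "0"): "A",
    ("A", "1"): "B",
    ("B", "0"): "B",
    ("B", "1"): "A",
}
START, FINAL = "A", {"B"}

def run(w):
    """Return the list of states visited on input w, beginning with START."""
    states = [START]
    for symbol in w:
        states.append(DELTA[(states[-1], symbol)])
    return states

states = run("00101010")
print(" ".join(states))
print("accepted:", states[-1] in FINAL)
```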
This string takes the machine from the initial state A to the accepting state B, so it is accepted by M, and the claim is that it can also be generated by the grammar G. The derivation goes like this. We start with A, because that is the start symbol. The first symbol is 0 and the machine stays in state A, so I use the rule A → 0A and get 0A. Do you see the pattern? The A on the left is the old state, and in 0A the symbol 0 is followed by the new state. The next 0 again keeps the machine in A, so I rewrite A as 0A once more and get 00A. You can see what is happening: as we traverse the string, the derivation keeps track of the state the machine would be in. Having scanned 00, the machine is in state A. Now comes a 1; on 1 the machine goes from A to B, and to capture that I have the rule A → 1B, so I rewrite this A as 1B and get 001B. The same invariant holds at every point: if the machine has read a prefix of the string and is in some state, then the grammar has generated that prefix followed by the non-terminal corresponding to that state. Next, B sees a 0, so I use B → 0B and get 0010B, and it goes on this way. After scanning 0010101 the machine is in state B, and our derivation will have generated 0010101B, with the non-terminal B at the end. On the last 0 the machine goes from B to B, which is fine so far as the machine is concerned.
But now the derivation must end. Since B is a final state of M, for the non-terminal B I have the production B → 0. So I rewrite this B as just 0, and the derivation stops with the terminal string 00101010. Basically, the derivation mimics the way the string is recognized. As you present the string to the machine, it goes from state to state, and after scanning some initial part of the string it is in some state. What we are claiming is that our generative device, the grammar, starting from its start symbol generates that prefix followed by a single non-terminal, the one corresponding to the state the machine would be in after scanning that prefix. We have to end somewhere, and we end when the machine ends in a final state; there we use a production like B → 0, so that no non-terminal is left in the string and the derivation stops with a terminal string. I will not prove this formally; neither part is too difficult. I have sketched one side, that every string accepted by the machine M can also be generated by the grammar G, and with the same intuition it is not difficult to see that every string generated by G is also accepted by M. For this particular example I will not prove that, but let me indicate what to do in general. The statement we would like to prove is that every regular language is context-free, and in this example at least we know how to get, from a regular language, a context-free grammar which generates the same language.
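
The way the derivation mirrors the run can be sketched as follows. The function `derivation` and its encoding are illustrative assumptions; it builds the sentential forms directly from the machine's run, using a rule of the form "old state → symbol new state" at every step except the last, where it uses the rule that drops the non-terminal.

```python
# The example DFA: A is the initial state, B the only final state.
DELTA = {
    ("A", "0"): "A",
    ("A", "1"): "B",
    ("B", "0"): "B",
    ("B", "1"): "A",
}
START, FINAL = "A", {"B"}

def derivation(w):
    """Sentential forms of the derivation mirroring M's run on w, or None if M rejects w."""
    forms, state = [START], START
    for i, symbol in enumerate(w):
        state = DELTA[(state, symbol)]
        if i < len(w) - 1:
            forms.append(w[:i + 1] + state)  # prefix read so far, then the current state
        elif state in FINAL:
            forms.append(w)                  # last step drops the non-terminal
        else:
            return None                      # run ends in a non-final state
    # The empty string has no derivation with these rules.
    return forms if len(forms) > 1 else None

print(" => ".join(derivation("00101010")))
```

Each intermediate form is a prefix of the input followed by exactly one non-terminal, which is the invariant stated in the lecture.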
The way to prove the statement is to take a general regular language. Let L be regular, and since it is regular, let M = (Q, Sigma, delta, q_1, F) be a DFA accepting L. We will give a construction that defines a context-free grammar G_M; the subscript M indicates that the grammar is defined from the definition of the machine M. It will be such that the language accepted by M, which is of course L, is the same as the language generated by G_M. Let me show you the construction; the proof that it works is something we can leave, because the ideas are very simple. The construction is essentially a generalization of the earlier example. Since G_M is a context-free grammar, to define it I need to define four components: V, the set of terminals, P and S. One component I already have: the input alphabet Sigma of M is the set of terminals of the grammar. What is V? Suppose the machine has n states, named q_1 through q_n, with q_1 the start state. Then V = {A_q1, A_q2, ..., A_qn}: for every state I have a non-terminal, and a simple way of stating the correspondence is to subscript the non-terminal name A with the state name. S is A_q1: the start symbol of the grammar is the non-terminal corresponding to the start state of the DFA.
I have said what V, Sigma and S are, so I need to say what P is, and we go by what we did in the example. Let delta be the transition function of the DFA. Whenever delta(q_i, a) = q_j, and there is such a transition for every state and every symbol, we add the production A_qi → a A_qj. In addition, we add the production A_qi → a if q_j is an element of the set F of final states of M. That is exactly how we defined the example grammar. It is easy to see that with these rules what I have is a context-free grammar, and the claim is that it generates precisely the language accepted by M. (One boundary case: if the start state q_1 is itself final, M accepts the empty string, and we would additionally need the production A_q1 → epsilon.) The intuition for the proof is as before. Consider any derivation in this grammar. It starts with A_q1 and keeps generating strings with exactly one non-terminal, always at the end, until finally that non-terminal is replaced by a single terminal symbol. The sequence of non-terminals that appear at the end during the derivation is the same as the sequence of states the machine M goes through on that string, as we saw in the example, and that is precisely why the string is accepted by M.
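
The general construction can be sketched and checked by brute force against the DFA it came from. Everything named below (`dfa_to_cfg`, `dfa_accepts`, `grammar_generates`, `agree`, and the `A_` prefix for non-terminals) is an illustrative assumption. The membership test is specialised to the rule shapes this construction produces and is not a general context-free parser. The comparison covers nonempty strings only, because the rules as stated never produce the empty string.

```python
from itertools import product

def dfa_to_cfg(states, sigma, delta, start, finals):
    """Build (V, terminals, P, S) from a DFA; rules are (lhs, body) with body a tuple."""
    name = {q: "A_" + str(q) for q in states}
    P = []
    for q in states:
        for a in sigma:
            target = delta[(q, a)]
            P.append((name[q], (a, name[target])))  # A_qi -> a A_qj
            if target in finals:
                P.append((name[q], (a,)))           # A_qi -> a
    return set(name.values()), set(sigma), P, name[start]

def dfa_accepts(delta, start, finals, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in finals

def grammar_generates(P, S, w):
    """Membership test for the right-linear rules built by dfa_to_cfg."""
    current = {S}  # non-terminals that can stand at the end after the prefix read so far
    for i, a in enumerate(w):
        last = (i == len(w) - 1)
        nxt = set()
        for lhs, body in P:
            if lhs in current and body[0] == a:
                if len(body) == 1 and last:
                    return True
                if len(body) == 2 and not last:
                    nxt.add(body[1])
        current = nxt
    return False

def agree(states, sigma, delta, start, finals, max_len):
    """Compare DFA and grammar on every nonempty string of length <= max_len."""
    _, _, P, S = dfa_to_cfg(states, sigma, delta, start, finals)
    return all(
        dfa_accepts(delta, start, finals, w) == grammar_generates(P, S, w)
        for n in range(1, max_len + 1)
        for w in product(sigma, repeat=n)
    )

# The odd-number-of-1s machine from the example, with states renamed q1 and q2.
ODD = {("q1", "0"): "q1", ("q1", "1"): "q2", ("q2", "0"): "q2", ("q2", "1"): "q1"}
print(agree(["q1", "q2"], ["0", "1"], ODD, "q1", {"q2"}, 8))
```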
So it is not too difficult to prove that for this construction L(G_M) is precisely the language accepted by M. Once I prove that, I have the following: if you give me any regular language, I can ask you for a DFA which accepts it; once I have that DFA, this construction gives me a grammar, and the language generated by that grammar is precisely the language accepted by the DFA. Therefore that regular language is also a context-free language, and we have managed to prove the claim.