We will start with a very quick recap of what we did last time. First we defined what a context-free grammar is: a context-free grammar G has four components, V, sigma, P and S. V is a finite alphabet, the alphabet of non-terminals; sigma is the alphabet of terminals; and P is the set of productions, where each production has a left-hand side, an arrow and a right-hand side. The left-hand side is a single symbol, and that symbol is an element of V. So essentially you take a non-terminal and read the production as "this non-terminal is to be replaced with", or "can be rewritten as", alpha, where alpha is a string over (V union sigma) star; what this means is that the right-hand side can be any string of terminals and non-terminals. The fourth component S is a special symbol from V, which we call the start symbol, or the start non-terminal if you wish.
Then we defined the notion of a derivation. For that we first introduced a one-step relation: alpha derives beta in one step provided alpha is of the form alpha1 A alpha2, where A is a symbol from V and alpha1, alpha2 are strings over V union sigma. So alpha is a string of terminals and non-terminals in which at least one non-terminal occurs; call that non-terminal A, and suppose A goes to beta1 is a production in P (I write beta1 for the right-hand side so that it is not confused with the string beta). Then beta is the string alpha1 beta1 alpha2: the same string, except that this occurrence of A has been replaced by beta1. So we are saying that the two strings alpha and beta are related in this manner: alpha in one step derives beta provided alpha has this form, you replace the non-terminal A with the right-hand side of one of A's productions, and the string you then obtain is beta. Notice that this is really a binary relation over the set of all strings over V union sigma. Then we said that a second relation, "derives in zero or more steps", holds between two strings: alpha1 derives beta in zero or more steps if I can find a chain alpha1 in one step goes to alpha2, alpha2 in one step goes to alpha3, and so on up to some alpha n, and this alpha n in one step goes to beta. This relation again is over strings over V union sigma; it is a binary relation which relates two strings, and it holds provided the left-hand string, after a series of these one-step derivations, leads to the string on the right-hand side. That is why this relation is called derivation in zero or more steps. Of course, the way I have shown it uses more than one step, but that is only one case; the two other cases say that alpha1 also relates to alpha1 itself, in zero steps, and that a single step by itself also counts.
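Mechanically, the one-step relation is easy to compute. Here is a minimal Python sketch, using the lecture's first example grammar S goes to 0S1 | 01 (restated later in this lecture); the dictionary encoding of P is my own illustrative choice, not anything from the lecture.

```python
# One-step derivation: alpha => beta iff beta is obtained from alpha by
# replacing one occurrence of a non-terminal A with the right-hand side
# of one of A's productions.

# P for the grammar S -> 0S1 | 01 (non-terminal: S; terminals: 0, 1).
PRODUCTIONS = {"S": ["0S1", "01"]}

def one_step(alpha):
    """Return the set of all beta with alpha => beta in one step."""
    result = set()
    for i, sym in enumerate(alpha):
        if sym in PRODUCTIONS:  # sym is a non-terminal
            for rhs in PRODUCTIONS[sym]:
                result.add(alpha[:i] + rhs + alpha[i + 1:])
    return result
```

For example, `one_step("S")` is `{"0S1", "01"}`, while a terminal string such as `"0011"` has no successors, since it contains no non-terminal to rewrite.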
So this relation holds for every alpha with itself, and if alpha in one step goes to beta then we also say that alpha in zero or more steps goes to that same string beta; the general case is that you use any number of steps to reach the right-hand string. Once I have this notion, it is very easy to define the language generated by a grammar G. Let me write it this way: L(G) is the set of all strings x in sigma star such that S derives x. In other words, the language generated by the grammar G consists of all terminal strings (x is in sigma star, and sigma is the set of terminals) which can be derived from the start symbol S. That is the notion of the language generated by a grammar: once I define a grammar, it uniquely defines a language, and that language is over sigma. Last time we gave examples of three grammars, and for the last grammar, you recall, I showed at least one example of a derivation. Here is an example of a derivation using this grammar: you can see that from S I use the first rule to come here, then this B is rewritten with this particular production, and so on, until finally I get this string. Therefore this string is in the language of this grammar. Now, as I told you at the time, when you start from S and reach a string something like this, consider the case where there are two non-terminals in it. Here I first chose to rewrite this particular non-terminal B: you can see that this B is rewritten as small b, and then this other B was rewritten as bS.
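Since L(G) is exactly the set of terminal strings derivable from S, the short members of L(G) can be enumerated by a breadth-first search over sentential forms. A sketch for the same grammar S goes to 0S1 | 01; note that the length cut-off used for pruning is valid only because no production of this grammar shortens a string (there are no epsilon-productions), which I am assuming explicitly.

```python
from collections import deque

PRODUCTIONS = {"S": ["0S1", "01"]}

def language_up_to(max_len):
    """All terminal strings x with S =>* x and len(x) <= max_len."""
    seen, terminal = set(), set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # prune: rewriting never shortens a form in this grammar
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        if all(c not in PRODUCTIONS for c in form):
            terminal.add(form)  # no non-terminal left: form is in L(G)
            continue
        for i, sym in enumerate(form):
            if sym in PRODUCTIONS:
                for rhs in PRODUCTIONS[sym]:
                    queue.append(form[:i] + rhs + form[i + 1:])
    return terminal
```

Here `language_up_to(6)` gives `{"01", "0011", "000111"}`, that is, the strings 0^n 1^n for n = 1, 2, 3, which matches the language we claim for this grammar.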
I can derive the same string by always following one particular convention: whenever the string I have derived so far contains a number of non-terminals, I always choose to rewrite the leftmost non-terminal. Such a derivation is called a leftmost derivation. I can derive this same string using the same productions, but now following the convention that whenever there is more than one non-terminal, I first rewrite the non-terminal which is leftmost in the string I have; so in this case, where I have two non-terminals, I rewrite this particular B and not that one. It is not too difficult to see, if you think about it, that if I can derive a string at all, then that string has a leftmost derivation. Why does this happen? The main reason is the very reason why this grammar is called a context-free grammar. Why do we call it context free? We call it context free because, when I choose to rewrite a particular non-terminal, the production that I use need not consider at all the context in which that non-terminal occurs; every non-terminal is rewritten independently of its context. What is meant by context? For example, this particular B occurs in this context: there is some a in front of it and something after it; on the left I have this string and on the right I have that one. That is the context in which this particular B occurs in this string. Now, clearly, the way we have set up our grammar, every production has this form: a non-terminal on the left, to be rewritten by whatever is on the right-hand side.
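The leftmost convention is easy to state in code: among all occurrences of non-terminals, always rewrite the one with the smallest index. The production set below (S goes to aB | bA, A goes to a | aS | bAA, B goes to b | bS | aBB) is the standard grammar for strings with equal numbers of a's and b's; I am assuming it is the one on the lecturer's slide, which the transcript does not reproduce.

```python
PRODUCTIONS = {
    "S": ["aB", "bA"],
    "A": ["a", "aS", "bAA"],
    "B": ["b", "bS", "aBB"],
}

def leftmost_rewrite(form, rhs):
    """Rewrite the leftmost non-terminal of `form` using right-hand side `rhs`."""
    for i, sym in enumerate(form):
        if sym in PRODUCTIONS:
            assert rhs in PRODUCTIONS[sym], "not a production of " + sym
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")

# A leftmost derivation of the string aababb:
steps = ["S"]
for rhs in ["aB", "aBB", "b", "aBB", "b", "b"]:
    steps.append(leftmost_rewrite(steps[-1], rhs))
# steps: S, aB, aaBB, aabB, aabaBB, aababB, aababb
```

Notice that the convention only matters at a form such as aaBB, where there are two B's and it forces us to expand the left one first; at aabB there is only one non-terminal, so there is no choice to make.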
The production does not constrain me in any manner as to where or how that non-terminal occurs, and that is why such a grammar is called a context-free grammar. When this particular liberty is not there, we get what is called a context-sensitive grammar, but that is a separate topic altogether. And, as I said, it is because of this nature of our productions, namely that the left-hand side is always a single non-terminal, that I can claim the following. Let me write it: if S derives x, where x is in sigma star, then there is a leftmost derivation of x. This is true of every context-free grammar, not just this example, as you can see. So in particular I could even have constrained the definition, saying that the derivation must be a leftmost derivation, and it would not change the language in any manner. One more thing we should notice: I have been using a convention, which is generally followed, that capital letters like A, B, S are used for non-terminals, and small letters from the beginning of the alphabet are usually used for terminal symbols, elements of sigma. So capital letters are elements of V, and small letters from the beginning of the English alphabet are considered elements of sigma, the set of terminals. I am also using lower-case Greek letters, alpha, beta and so on, for strings over V union sigma; this is the kind of convention I am following. So let me write the convention down: first, capital letters are non-terminals, elements of V; second, letters like a, b, c, 0, 1 are elements of sigma; third, lower-case Greek letters like alpha, beta denote strings, and these strings are over V union sigma.
That means a string such as alpha can contain both terminals and non-terminals. And finally, we usually write x, y, z, u, v, w, small letters from the end of the English alphabet, for terminal strings; these are elements of sigma star. These conventions are not really very important, and you need not follow them, but they are standard in the literature and in books, and they save me from having to say, every time I write something, whether it is a terminal or a non-terminal; unless otherwise specified, we use this convention. Now, there are several directions in which we need to go beyond this. We have defined something called a grammar and we have associated a language with such a grammar. One more thing to notice is that a grammar is a finite object: V is a finite set, sigma is a finite set, P (I have not written it explicitly, but again) is a finite set of production rules, and S of course is just a single element. So G itself, taking all four of its components into account, is a finite object. However, the language associated with G need not be finite; in fact, in general, L(G), the language associated with a context-free grammar, is going to be infinite. So we are back to our old concern from when we talked of regular languages: I have an infinite set; how do I represent it finitely? There we used several notions, such as the notion of a machine. A finite-state machine was a finite object, and with such a finite-state machine we associated a language, uniquely, just as here: with a grammar G, the language generated by G is unique once you specify what the grammar G is.
So instead of describing this language in English or in whatever other way, I give you this finite object G and I say: look, the language I am talking about is the language generated by the grammar G. In the same way, for a regular language I could have given you a finite description, which could have been a machine or a regular expression. These are the ways we finitely specify an infinite object like a typical regular language. So in that sense these grammars are again finite objects, but they help us specify languages which in general can be infinite. Now, coming back to derivations: of course I can specify a derivation as above, but there is a more convenient way of seeing how a string is derived in a context-free grammar, and that is through what is known as a parse tree. So let us now see what parse trees are. First of all, a parse tree is a graphical way of seeing how a particular string is derived. Let us take this example: we start with S and look at it as the root of the tree we are going to build. In this derivation S was rewritten as aB; let me complete the tree and then we will discuss it. This B was rewritten as aBB, then this B was rewritten as b, this B was rewritten as bS, this S was rewritten as aB, and then this particular B was rewritten, and what we get is a rooted tree. Notice that each node is labelled with a symbol, and that symbol can be either a non-terminal or a terminal. However, if a node is labelled with a terminal symbol, then it is a leaf node; in this tree, the leaf nodes are these ones. So this is a derivation tree, or parse tree, and in this parse tree the internal nodes are always labelled with non-terminals.
Because a node is internal, its label is a non-terminal, and its children spell out a production for that non-terminal. Here you see this non-terminal B: if you read its children from left to right, they are a, B, B, and notice that B goes to aBB was one of our productions. Another way of saying the same thing: an internal node has as children the symbols of the right-hand side of a production whose left-hand side is the node's own label. That is what is happening here: this internal node is labelled with S, and the labels of its children, read from left to right, must spell out the right-hand side of a production whose left-hand side is S; here S goes to aB, and indeed this is one of the productions that we have. What is more, consider the frontier of the tree. The frontier, as we know, is obtained by reading all the leaf nodes from left to right; when you read the leaf nodes of this tree from left to right, you get a a b a b b. So first of all, remember that in a parse tree every leaf has to be labelled with a terminal, and every internal node has to be labelled with a non-terminal, because its children are determined by the right-hand side of a production whose left-hand side is that node's label. So, as I said, internal nodes are labelled with non-terminals and leaf nodes with terminals; the root is the start symbol S; and I have explained, let me not write it down again, what the relation between an internal node and its children is. Now, this is clearly a graphical description of a derivation. What about the leftmost derivation? You will agree that this was the corresponding leftmost derivation, corresponding in the sense that we used the same productions.
However, whenever there was a choice between expanding two or more non-terminals, we always chose to rewrite the leftmost one, while rewriting it the same way that non-terminal was rewritten in the original derivation. In that sense it is the corresponding leftmost derivation, and as I said, it is not too difficult to see that if a terminal string can be derived at all, then it can be derived through a leftmost derivation. The point I am trying to make is that if you draw the parse tree corresponding to the leftmost derivation, you get the same tree: S goes to aB, that is fine; then this B was rewritten as aBB, which is exactly what you did here; and so on. And when all the non-terminals have been rewritten, you are left with a terminal string: the frontier of the tree is all terminals, and if you read that frontier, that is the terminal string generated. So you can see that there can be many derivations to which there corresponds a unique leftmost derivation, and a leftmost derivation and a parse tree are again in a kind of one-to-one correspondence. If you give me a leftmost derivation, I can give you a parse tree; and if you give me a parse tree, I can of course go through it and tell you what the leftmost derivation was. But given a parse tree, I may not be able to say which of several derivations you actually carried out; they all generate the same string, and from the parse tree I will give you the leftmost derivation. So parse trees and leftmost derivations are in a sense equivalent: you may give either a parse tree or a leftmost derivation.
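This one-to-one correspondence can be sketched by replaying a leftmost derivation while building the tree, and then reading the frontier back off. The (symbol, children) node encoding and the production set are my own assumptions, as before.

```python
PRODUCTIONS = {
    "S": ["aB", "bA"],
    "A": ["a", "aS", "bAA"],
    "B": ["b", "bS", "aBB"],
}

def parse_tree(rhs_choices):
    """Build the parse tree of a leftmost derivation from S.
    Each node is a pair (symbol, list_of_children); leaves have no children."""
    root = ("S", [])
    pending = [root]              # unexpanded non-terminals, leftmost first
    for rhs in rhs_choices:
        node = pending.pop(0)     # the leftmost unexpanded non-terminal
        children = [(sym, []) for sym in rhs]
        node[1].extend(children)
        # its non-terminal children become the new leftmost pending nodes
        pending[:0] = [c for c in children if c[0] in PRODUCTIONS]
    return root

def frontier(node):
    """Read the leaf labels from left to right: the derived string."""
    sym, children = node
    return sym if not children else "".join(frontier(c) for c in children)
```

Replaying the derivation used earlier, `frontier(parse_tree(["aB", "aBB", "b", "aBB", "b", "b"]))` gives back the same string aababb, which is the correspondence in one direction; walking the tree top-down, always expanding leftmost, recovers the derivation in the other.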
We will deal with parse trees most of the time, because, as you will see, the intuition about context-free languages comes out better once you view the derivation of a string as a tree. Now I want to spend a little time on the languages that we have generated so far using these grammars. One example was this one; of the other two examples, the first was the simplest. Recall that G1, if I call it that, had only one non-terminal, and its terminals were 0 and 1, say. P consisted simply of S goes to 0S1 and S goes to 01; these are the only two productions, and of course there is only one non-terminal, S, which is the start symbol. Now, I even claimed that this grammar generates this language: all strings consisting of 0s followed by 1s, where the number of 0s equals the number of 1s, that is, strings of the form 0^n 1^n. Can I prove this? How will I prove it? Remember that this is again an equality of two sets: the left set is the set of all strings generated by the grammar G1, and the right-hand side is this set. So I should show the equality of two sets, and, as I have said before, that is usually done by showing containment in both directions. So let us just show that every string of the form 0^n 1^n can be generated by the grammar G1; that is, I am proving one of the two containments. Such a proof will be by induction; that will be the most straightforward way of doing it. What is the base case? The base case is n equal to 1: then there is a simple one-step derivation in which S is just rewritten as 01; so S derives 01, using the second rule.
That is the base case. What is the induction? Suppose I assume that all strings of the form 0^n 1^n up to some n can be generated; then I should be able to show that 0^(n+1) 1^(n+1) can also be generated. In fact, I can prove something a little stronger: from S I can derive 0^n S 1^n for every n. Again we can do this part by induction, and it is fairly clear. In the induction step, I assume that S derives 0^n S 1^n and I would like to prove that S also derives 0^(n+1) S 1^(n+1); but that is easy, is it not? Once you assume the former, in one more step you can rewrite this S as 0S1, which gives nothing but 0^(n+1) S 1^(n+1), and then this particular S you can finally rewrite as 01. This proof essentially shows that all strings of the kind 0^n 1^n, where n is greater than or equal to 1, can be derived by our grammar. The other containment says that this grammar does not derive anything other than strings of the form 0^n 1^n, but that is again the same argument, is it not: in one step the grammar derives only 01, and assuming that after many steps you have derived 0^n S 1^n, one further step can derive only 0^(n+1) 1^(n+1) or 0^(n+1) S 1^(n+1). So it is easy to see that this grammar generates only strings of this kind. Putting the two containments together, this grammar generates exactly this language. Let us now take the next example, which is a little more interesting. This is the grammar G where the set V of non-terminals, where earlier we had only one, here has three: S, A and B; the terminals are a and b; S of course is the start symbol; and these are the productions.
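The two steps of the argument can be written out compactly; this is just the induction step from above put in symbols, not an additional claim:

```latex
% Stronger claim, proved by induction on n:  S =>* 0^n S 1^n.
% Induction step, growing the middle S with the production S -> 0S1:
S \;\Rightarrow^{*}\; 0^{n} S\, 1^{n}
  \;\Rightarrow\; 0^{n} (0S1)\, 1^{n} \;=\; 0^{n+1} S\, 1^{n+1}

% Terminating at any stage with S -> 01 yields a terminal string:
S \;\Rightarrow^{*}\; 0^{n} S\, 1^{n}
  \;\Rightarrow\; 0^{n} (01)\, 1^{n} \;=\; 0^{n+1} 1^{n+1}
```

The second line also makes the converse containment visible: every sentential form of G1 is either 0^n S 1^n or 0^n 1^n, so nothing outside the claimed language can ever appear.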
Now, what is the language generated by this grammar? Let me write it: it is the set of all strings x over {a, b} star, with length of x greater than or equal to 2, such that x has an equal number of a's and b's. For example, this was the string that we generated earlier, a a b a b b; this particular string has three a's and three b's, so the number of a's is the same as the number of b's. What I want to claim is that this grammar generates all strings with an equal number of a's and b's, with at least one a and therefore one b. So it will generate ab, ba, and so on. Can you see how it will generate ba, for example? Yes: S starts with bA, using that as the first production, and then this A is rewritten as small a. If you play with it, you will convince yourself that you can indeed generate, it looks like, all such strings; but how do I prove that this is indeed the case? Let me make one point; I will not spend too much time on it. I can understand what terminal strings can be generated starting from S, but only provided I also understand what terminal strings are generated by the other non-terminals A and B. So let me make these claims. The claim for S states that if S generates x in sigma star, then x has an equal number of a's and b's. Now consider starting from A and deriving strings: if, starting from A, after some steps you get a terminal string x, then this x has one more a than the number of b's it has. So A generates all strings of a's and b's where the number of a's is one more than the number of b's; and similarly for the capital B case: if B generates a terminal string, then that string has one more b than the number of a's it has. So, for example, from A you should be able to derive a a b a; but this has two more a's.
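The three claims can be checked empirically for short strings by enumerating everything each non-terminal derives, again under the assumed production set S goes to aB | bA, A goes to a | aS | bAA, B goes to b | bS | aBB:

```python
from collections import deque

PRODUCTIONS = {
    "S": ["aB", "bA"],
    "A": ["a", "aS", "bAA"],
    "B": ["b", "bS", "aBB"],
}

def derivable(start, max_len):
    """Terminal strings x with start =>* x and len(x) <= max_len.
    Only leftmost rewrites are explored; that is enough, since every
    derivable terminal string has a leftmost derivation."""
    seen, out = set(), set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form in seen or len(form) > max_len:
            continue              # no production shortens a form here
        seen.add(form)
        nts = [i for i, s in enumerate(form) if s in PRODUCTIONS]
        if not nts:
            out.add(form)
            continue
        i = nts[0]                # leftmost non-terminal
        for rhs in PRODUCTIONS[form[i]]:
            queue.append(form[:i] + rhs + form[i + 1:])
    return out

# Everything A derives (up to length 7) has one more a than b,
# everything B derives has one more b, and S gives equal counts:
assert all(x.count("a") == x.count("b") + 1 for x in derivable("A", 7))
assert all(x.count("b") == x.count("a") + 1 for x in derivable("B", 7))
assert all(x.count("a") == x.count("b") for x in derivable("S", 6))
```

This only checks one direction of each claim (nothing with the wrong counts is generated); the other direction, that every string with the right counts is generated, is what the simultaneous induction below establishes.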
Sorry, a a b a has two more a's, so you need one more b: take a a b a b, which has three a's and two b's, so the difference is one, one more a than the number of b's, and I am claiming that A should be able to derive it. Can we see that? If you believe the claim for S, then this is easy to show: from A I use the production A goes to aS, and now what follows the a must be a string with an equal number of a's and b's; using the claim for S, I can say that this S can be rewritten as a b a b, and therefore this string can be generated. Now, the point is that to prove any one of these three claims, I require the use of the other two; it should not be too difficult to see this. And how do I prove such things? Basically through simultaneous induction: I prove all three statements by induction, but simultaneously. What are the base cases? For the claim about S, the base cases are ab and ba: from S I go to aB and then rewrite this B as b, and of course S also generates ba in the symmetric way; these are the two base-case strings for this claim. Now, here is the basic idea of this proof by simultaneous induction. Consider a string which has n+1 a's and n+1 b's, where n is at least 1; so consider x in {a, b} star, where x has n+1 a's and n+1 b's. I am just trying to sketch the proof very quickly. By the induction hypothesis I may assume that S can generate every string y which has equal numbers of a's and b's, up to n of each, and similarly for the claims about A and B. Now I would like to show that S can also generate this x, which has n+1 a's and n+1 b's.
That string x is either of the form a y1 or of the form b y2; that is, it either starts with a or starts with b. If it starts with a, what is this y1? x has n+1 a's and n+1 b's, and you take the one a out, so this string y1 will have n+1 b's and n a's, one more b than a's. Such a string, by the induction hypothesis, can be generated by B. So what I am going to do is start with S, use S goes to aB, and then, through B, using the induction hypothesis, generate the string y1, because it has one more b than the number of a's it has. Similarly, let us now see how the induction goes through for the claim about A; this case is more interesting, and again I will of course require the help of the other two claims. Let me show that, for a string which has one more a than the number of b's it has. So let us say x has n+1 a's and n b's; such a string can start with small a or with small b. Suppose this x starts with a; then it is of the form a y1, so y1 has an equal number of a's and b's, and therefore it can be generated by S. So the point is that from A I use the production A goes to aS, and then I use the induction hypothesis to say that S generates y1; therefore A will derive x. But x could have started with b as well, so let us take that case; it is more interesting. So x is of the form b y2, where x has n+1 a's and n b's and x starts with b. What can you say about y2? x had n+1 a's and n b's, and you have taken one of the b's out, so y2 will have n+1 a's and n-1 b's. So this y2 is a string which has two more a's than the number of b's it has. Now, interestingly, such a string y2 can clearly be split; split in the sense that there must exist
strings y3 and y4 such that both y3 and y4 have one more a than the number of b's, and y2 is nothing but y3 y4. What I am saying is this: take a string which has exactly two more a's than b's, and start scanning it from the left. It is not difficult to argue that you will find a prefix which has exactly one more a than b's, and then the remaining part will also have one more a than b's. Therefore this first part can be generated by A, and this second part can also be generated by A. So what is happening: I want to claim that such an x can be derived starting from A. I will start from A and use the production A goes to bAA; this first A I use to generate y3, and this second A I use to generate y4. In this manner I can prove that all strings which have one more a than the number of b's can be generated starting from A. I would request you to actually take this intuitive idea, make it formal, and prove that this grammar G indeed generates precisely the strings with equal numbers of a's and b's.
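The "scan from the left" step is constructive, and a small sketch makes it concrete. For a string whose surplus (number of a's minus number of b's) is exactly 2, the surplus of its prefixes starts at 0, changes by exactly 1 at each symbol, and ends at 2, so it must pass through 1; cutting at the first such prefix gives y3 and y4, each with surplus 1. The function name is my own.

```python
def split_surplus_two(y2):
    """Split y2 = y3 + y4 so that each part has one more 'a' than 'b'.
    Assumes y2 has exactly two more a's than b's."""
    surplus = 0
    for i, c in enumerate(y2, start=1):
        surplus += 1 if c == "a" else -1
        if surplus == 1:              # first prefix with surplus one
            return y2[:i], y2[i:]     # the suffix then also has surplus one
    raise ValueError("input does not have surplus two")
```

For example, `split_surplus_two("bbaaaa")` returns `("bbaaa", "a")`; each half has one more a than b, so by the claim each half is derivable from A, and the whole string x = b y3 y4 is then derived via A goes to bAA.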