So far, we have seen a class of languages called regular languages. We studied their properties, which are very interesting, but we also saw that there are many languages which this class does not contain. Today, we start looking at another class of languages, the class of context-free languages. We will see that this is a very natural class of languages, one that we are already familiar with in some context or the other. These languages are defined through context-free grammars. Recall that when we say a class of languages, we mean a set in which each element is itself a language, that is, a set of strings. If you go back to regular languages, the way we defined that class was that for each regular language we had a finite automaton, and the set of strings accepted by that automaton was the language considered. Here, instead of a machine or an automaton, we are going to look at a totally different way of describing a language, and that notion is that of a grammar.

Now, grammar is a notion we are familiar with from our school days; at this point the notion is informal. Consider a sentence like "good students go to school regularly". This is an English sentence, and in school we said it is a grammatical sentence, an acceptable English sentence, because we can parse it according to the grammar of the English language. The way we would parse it in school may be something like this: a sentence has two parts, a noun phrase and a verb phrase.
A noun phrase may in turn have an adjective and a noun, and a verb phrase can have a verb and, let us say, a preposition followed by a noun, as well as an adverb. The details are not important; I am only trying to remind you that we did something like this in school. There are many possible adjectives, and one adjective is "good"; of many nouns, one noun is "students"; the verb is "go", the preposition is "to", followed by the noun "school", and the adverb is "regularly". So, we justified that "good students go to school regularly" is an English sentence by what we called in school parsing it according to the rules of English grammar. And what were the rules we used here? We had rules like this: a sentence can be rewritten as a noun phrase followed by a verb phrase; a noun phrase, if we expand it further, can be an adjective followed by a noun, or possibly just a noun by itself; an adjective can be "good"; and so on.

Again, the details of English grammar are not important; what is important here is the abstract notion of what is happening. How do we see English sentences? They are composed of words, but an arbitrary juxtaposition of words will not be a sentence unless the words follow certain rules. As we saw when we derived that sentence, all these notions carry over naturally to the definition of a context-free grammar. The set of all grammatical English sentences consists of those which can be generated using the rules of the grammar, some of which I have stated here.
So, as an example of a formal language which can be described in the same style by means of a grammar, let us take a very simple example. I am considering a particular language over 0 and 1, and, very similar to what we did there, let me write the rules in this fashion: $S \to \epsilon$ and $S \to 0S1$. These are the rules of this grammar. I will define these things a little more carefully later, but let us first understand this very simple example the way it is. What we are saying is that you can generate strings using these rules. I start with $S$ and apply the rule $S \to 0S1$; let us read the arrow as "$S$ can be rewritten as $0S1$", in the same way that a sentence can be rewritten as two things, a noun phrase and a verb phrase. Why stop after applying this rule once? Right now I have $0S1$, so let me apply the same rule once more: I rewrite this $S$ as $0S1$, the old 0 and 1 remain, and I get $00S11$. Maybe I want to rewrite this $S$ again using the same rule; then I get $000S111$. Now I finally decide to rewrite $S$ using the first rule, $S \to \epsilon$; recall that $\epsilon$ is the empty string. When I apply that, I get $000111$. Can I apply any more of these rules? No, because the only way I could apply a rule is if the string had an occurrence of the symbol $S$.
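The rewriting just described can be replayed mechanically. The following is a minimal Python sketch, assuming that strings are ordinary Python strings and that the empty string stands for epsilon; the names `rewrite`, `WRAP_RULE` and `EPSILON_RULE` are illustrative and do not come from the lecture.

```python
# Right-hand sides of the two rules for S; "" plays the role of epsilon.
EPSILON_RULE = ""   # S -> epsilon
WRAP_RULE = "0S1"   # S -> 0S1


def rewrite(form, rhs):
    """Replace the leftmost occurrence of S in `form` by `rhs`."""
    if "S" not in form:
        raise ValueError("no occurrence of S left to rewrite")
    return form.replace("S", rhs, 1)


# Apply S -> 0S1 three times, then S -> epsilon, as in the lecture.
steps = ["S"]
for rhs in [WRAP_RULE, WRAP_RULE, WRAP_RULE, EPSILON_RULE]:
    steps.append(rewrite(steps[-1], rhs))

print(" => ".join(steps))  # S => 0S1 => 00S11 => 000S111 => 000111
```

Once the last string contains no `S`, a further call to `rewrite` raises an error, which mirrors the remark that no rule can be applied any more.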
So, it is very clear that just by using these two rewriting rules, starting from the symbol $S$, I have come to a string over 0 and 1. Is this the only string I can generate? Of course not. It is easy to see that using these rules I can generate all strings of the form $0^n 1^n$, where $n \ge 0$. So, this set of strings over the binary alphabet $\{0, 1\}$ can be generated simply using these two rules. Now, what have I got? I have got a set of strings over $\{0, 1\}$; in other words, a subset of the set of all finite strings over $\{0, 1\}$, and therefore, in our terminology, a formal language over $\{0, 1\}$. And I can describe this set very simply: an element of this set is a string over $\{0, 1\}$ which can be obtained by repeatedly applying these rules of rewriting, starting from just one occurrence of the symbol $S$.

So, let us now formalize this idea a little more carefully. Starting from this very simple example, we will try to obtain the notion of a context-free grammar. Looking at the example again, we immediately see two kinds of symbols. Ultimately we are generating a language over 0 and 1; remember, the language was $\{0^n 1^n : n \ge 0\}$. So, it is a language over the alphabet $\{0, 1\}$, but other than that we are also using a symbol, which in this case is the capital $S$. So, right away we see there are two classes of symbols. One is of course $\Sigma$, the alphabet of the strings which will constitute the language that we are going to define. So, let us look at it once more: suppose I want to define a language over some alphabet $\Sigma$.
So, of course, I will use that alphabet, but besides that alphabet I am also using here one symbol, which is $S$. So, let me use a set $V$ for the set of non-terminals, and let me call $\Sigma$ the set of terminals. We will see a little later why we use the names terminal and non-terminal. Besides these two sets of symbols, we also have the rules of rewriting. So, let me say $P$ is the set of rewriting rules. In this example I have already taken care of $S$, of the binary alphabet $\{0, 1\}$, and of the rules. Is there anything else in this example which we need to generalize for the notion of a grammar? Yes: I have to say how we start the process of rewriting. We said that we derive strings starting from just the symbol $S$. Now, here I have a set of non-terminals, so I must say which of them will be the start symbol. Let me write $S$ for the start symbol, which is of course an element of $V$. These four things together, $G = (V, \Sigma, P, S)$, let me call a grammar, and in particular, when the rules are like the ones here, we will call it a context-free grammar.

So, in this example, $V$ was the set $\{S\}$, and $\Sigma$ was the alphabet $\{0, 1\}$. The rewriting rules, incidentally, are also called rules of production, or simply productions; here $P$ was just the two rules $S \to \epsilon$ and $S \to 0S1$. $V$ consists of only one non-terminal, and the start symbol has to be an element of $V$; therefore $S$ is by default the start symbol in this example.
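To make the four components concrete, here is one possible way to write the example grammar down as data. This is only a sketch: the class name `Grammar`, its field names, and the choice to represent each production as a pair of strings (with the empty string standing for epsilon) are assumptions made for illustration.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Grammar(NamedTuple):
    """A grammar G = (V, Sigma, P, S) with single-character symbols."""
    nonterminals: FrozenSet[str]              # V
    terminals: FrozenSet[str]                 # Sigma
    productions: Tuple[Tuple[str, str], ...]  # P, as (left side, right side)
    start: str                                # S, must belong to V


# The example from the lecture: V = {S}, Sigma = {0, 1},
# P = {S -> epsilon, S -> 0S1}, start symbol S.
G1 = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"0", "1"}),
    productions=(("S", ""), ("S", "0S1")),
    start="S",
)

# The start symbol has to be one of the non-terminals.
print(G1.start in G1.nonterminals)
```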
Now, we should also say that $V$ and $\Sigma$ are disjoint, in other words their intersection is empty, because they serve two different purposes, as we shall see more elaborately now. Let me say what we are doing in terms of our terminology. To derive terminal strings, we start with the start symbol $S$ and use a production whose left-hand side is $S$; in this case both productions have the same left-hand side. So, I replace this $S$ with the right-hand side of a production whose left-hand side is $S$, and we keep doing this, using the rules as and when we wish. When at some point I have a string in which there is no occurrence of any symbol from $V$, that is, the string consists solely of terminal symbols from our alphabet $\Sigma$, then we say that this is one string that we have generated.

Let me give a couple more examples before we proceed with the formalism. This particular grammar gave me the set of all strings in which the zeros occur first, the ones occur later, and the number of zeros is the same as the number of ones. We will come back to the language generated, but let me tweak this example a little bit. So, that was our example one. Another example, also very simple: again I will have only one non-terminal $S$, let me use a different alphabet $\{a, b\}$ of terminal symbols, and for productions I will have $S \to aSa$, $S \to bSb$, and $S \to \epsilon$. These three rules constitute $P$, the set of production rules. What kind of strings do I generate using these rules? Again we start with $S$, and let me choose to use the rule $S \to aSa$.
So, I will get $aSa$. Then why not use the same rule once more: $aaSaa$. Now, let me use the second rule: $aabSbaa$. And once more let me use the first rule, so this particular $S$ is rewritten as $aSa$, with the rest of the string staying as it is: $aabaSabaa$. Now I use the last rule, $S \to \epsilon$, to get a string made purely of terminal symbols: $aabaabaa$. What can you say about this string? It reads the same from left to right as from right to left. Such a string is called a palindrome, as we have seen in other contexts. Does this grammar generate all palindromes? Not really; it generates the palindromes whose lengths are even. So, this is another example.

Now, in both these cases, what is happening? We have $V$, $\Sigma$, $P$, $S$; we start with $S$ and keep rewriting using $P$. And what is rewriting? I have a string in which presumably there are some terminal symbols and some non-terminal symbols, and a non-terminal symbol in such a string can be replaced by the right-hand side of a rule whose left-hand side is that particular non-terminal. For example, I replaced this particular non-terminal $S$ by $bSb$ because of the rule $S \to bSb$: $S$ is the left-hand side and $bSb$ is the right-hand side of the rule. This is what we have been doing all through. Now, let me describe this process a little more formally. Suppose I have a string of the form $\alpha_1 X \alpha_2$, and $X \to \gamma$ is a production rule.
So, basically, what I am saying is that from the string $\alpha_1 X \alpha_2$, using the rule $X \to \gamma$, we get $\alpha_1 \gamma \alpha_2$. I have just replaced this $X$ by the right-hand side of the production. Now, in general there may be many productions whose left-hand side is $X$; I choose any one of them, and the right-hand side of whichever production I choose is used to replace this particular $X$. If I do this, then I say that $\alpha_1 X \alpha_2$ derives in one step $\alpha_1 \gamma \alpha_2$. By the way, what are $\alpha_1$ and $\alpha_2$? In general they are strings in $(V \cup \Sigma)^*$, that is, strings over the alphabet $V$ and $\Sigma$ together, so such a string can have occurrences of symbols from $V$ as well as from the terminal alphabet. So, I have the string $\alpha_1 X \alpha_2$ and another string $\alpha_1 \gamma \alpha_2$, and I say that these two strings are related by the relation $\alpha_1 X \alpha_2 \Rightarrow \alpha_1 \gamma \alpha_2$, which I read as "$\alpha_1 X \alpha_2$ derives in one step $\alpha_1 \gamma \alpha_2$", provided $X \to \gamma$ is a production in $P$. So, what we are saying is that one string over $V \cup \Sigma$ derives another in one step provided I rewrite one non-terminal in the first string by the right-hand side $\gamma$ of a production whose left-hand side is the non-terminal being replaced. This is very simple: I had an occurrence of $X$, that is what I replaced by $\gamma$, and that is why I say that the second string is obtained from the first in one single step.
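The "derives in one step" relation can be sketched in a few lines of Python. The sketch assumes every symbol is a single character, a production is a pair `(lhs, rhs)` of strings, and the empty string stands for epsilon; the function name `derives_in_one_step` is hypothetical.

```python
def derives_in_one_step(form, productions):
    """Return every string obtainable from `form` by one rewriting step.

    For each position i holding a symbol X and each production X -> gamma,
    the string alpha1 X alpha2 yields alpha1 gamma alpha2, where
    alpha1 = form[:i] and alpha2 = form[i + 1:].
    """
    results = set()
    for i, symbol in enumerate(form):
        for lhs, rhs in productions:
            if symbol == lhs:
                results.add(form[:i] + rhs + form[i + 1:])
    return results


# The palindrome grammar: S -> aSa, S -> bSb, S -> epsilon.
P = [("S", "aSa"), ("S", "bSb"), ("S", "")]

print(sorted(derives_in_one_step("aSa", P)))  # ['aa', 'aaSaa', 'abSba']
```

The three results correspond to the three productions whose left-hand side is `S`; a string with no non-terminal, such as `aabaabaa`, yields the empty set.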
So, a step here is basically one application of a production rule, rewriting one non-terminal by the right-hand side of a production for that non-terminal. If this notion of one string deriving another in one step is clear, then you can see that I may have some string $\gamma_1$ and keep on doing this: from $\gamma_1$ in one step I get $\gamma_2$, from $\gamma_2$ in one step I get $\gamma_3$, and so on, until finally, let us say, I get some $\gamma_n$. In each step I am doing the same thing: in $\gamma_1$ there was some non-terminal which I replaced by the right-hand side of a production whose left-hand side is that non-terminal, from $\gamma_2$ likewise, and so on. Then I will say that $\gamma_1$ derives $\gamma_n$, simply "derives", and let me write it as $\gamma_1 \Rightarrow^* \gamma_n$. Just for the sake of completeness, let me also say that a string derives itself. For those of you who love long words, this relation is the reflexive transitive closure of the "derives in one step" relation. Very simply, what we are saying is that $\gamma_1$ derives $\gamma_n$ if I can go from $\gamma_1$ to $\gamma_n$ through zero or more applications of "derives in one step".
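A chain $\gamma_1, \gamma_2, \ldots, \gamma_n$ of this kind can be checked step by step. The sketch below makes the same assumptions as before (single-character symbols, productions as pairs of strings, empty string for epsilon), and the names `is_one_step` and `is_derivation` are illustrative only.

```python
def is_one_step(alpha, beta, productions):
    """True if `beta` is obtained from `alpha` by rewriting one symbol."""
    for i, symbol in enumerate(alpha):
        for lhs, rhs in productions:
            if symbol == lhs and alpha[:i] + rhs + alpha[i + 1:] == beta:
                return True
    return False


def is_derivation(chain, productions):
    """True if each string in `chain` derives the next one in one step.

    A chain of n strings uses n - 1 steps; a chain with a single string
    uses zero steps, which is the "a string derives itself" case.
    """
    if not chain:
        raise ValueError("a derivation needs at least one string")
    return all(is_one_step(a, b, productions)
               for a, b in zip(chain, chain[1:]))


# The palindrome derivation from the lecture.
P = [("S", "aSa"), ("S", "bSb"), ("S", "")]
chain = ["S", "aSa", "aaSaa", "aabSbaa", "aabaSabaa", "aabaabaa"]

print(is_derivation(chain, P))       # True
print(is_derivation(["S"], P))       # True: zero steps
print(is_derivation(["S", "ab"], P)) # False: no single step gives "ab"
```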
So, here, to go from $\gamma_1$ to $\gamma_n$, I have made use of "derives in one step" $n - 1$ times. Now, using this notion, I should be able to describe the language generated by a grammar. Let $G$ be a grammar; remember, a grammar for us is a set of non-terminals, a set of terminals, a set of production rules, and a start symbol $S$. With this grammar we associate a language $L(G)$, which is a subset of $\Sigma^*$, that is, a set of finite strings over the alphabet $\Sigma$, as follows: $L(G) = \{x \in \Sigma^* : S \Rightarrow^* x\}$, the set of all terminal strings $x$ such that $S$ derives $x$. Remember the notion of "derives" as opposed to "derives in one step": derives in one step means just one application of a production rule, whereas when we say derives, I allow any number of steps. So, the language generated by the grammar $G$, that is how we read $L(G)$, is the set of all terminal strings $x$ which can be derived from the start symbol of the grammar.

Now, do you see that every one of these notions is being used in this definition? To define which strings $S$ can derive, I am using the set of production rules, and I am using the start symbol in a very essential manner, because a string in the language of $G$ not only has to be a terminal string, it also has to be derived from the start symbol. So, when I say that $S$ derives $x$, I am using $P$, I am using $S$, and I am using $\Sigma$; and $P$ is of course defined through $V$, since symbols of $V$ occur in $P$. Therefore all the components are being used. Now, let us be a little more precise, since we have been rather informal so far.
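The definition of $L(G)$ suggests a simple, if naive, way to list the short strings of the language: explore everything the start symbol derives and keep the strings that contain no non-terminal. The following is a sketch under stated assumptions, not a general algorithm: symbols are single characters, the function name `language_up_to` is hypothetical, and the search discards any intermediate string that has more than `max_len` terminals or more than `2 * max_len + 2` symbols in total. That second cap is an arbitrary choice to guarantee termination, so for some grammars the sketch may miss strings.

```python
from collections import deque


def language_up_to(productions, start, nonterminals, max_len):
    """Collect terminal strings of length <= max_len derivable from `start`."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(sym in nonterminals for sym in form):
            words.add(form)  # no non-terminal left: a generated string
            continue
        for i, symbol in enumerate(form):
            for lhs, rhs in productions:
                if symbol != lhs:
                    continue
                new = form[:i] + rhs + form[i + 1:]
                n_terminals = sum(1 for c in new if c not in nonterminals)
                # Pruning (an assumption of this sketch, see the lead-in).
                if n_terminals > max_len or len(new) > 2 * max_len + 2:
                    continue
                if new not in seen:
                    seen.add(new)
                    queue.append(new)
    return words


zeros_ones = [("S", ""), ("S", "0S1")]
palindromes = [("S", "aSa"), ("S", "bSb"), ("S", "")]

print(sorted(language_up_to(zeros_ones, "S", {"S"}, 6), key=len))
# ['', '01', '0011', '000111']
print(sorted(language_up_to(palindromes, "S", {"S"}, 2)))
# ['', 'aa', 'bb']
```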
So, we say that $G = (V, \Sigma, P, S)$, with the usual restrictions, that is, $V$ and $\Sigma$ are disjoint and so on, is a context-free grammar if each element of $P$ is of the form $X \to \alpha$, where $X$ is an element of $V$ and $\alpha$ is a string in $(V \cup \Sigma)^*$. You remember the notation: it means that $\alpha$ is a string using the symbols of $V$ and $\Sigma$. These are the only kinds of rules we have. By the way, $X$ is called the left-hand side and $\alpha$ the right-hand side of the rule; I have been using those terms already. In other kinds of grammars the left-hand side can be more complex; it could be a string, just as the right-hand side is a string. But in a context-free grammar the left-hand side is always just one non-terminal, and notice that what you rewrite in the process of derivation is always a non-terminal.

So, now we know what a context-free grammar is. Let me now give the official definition of a context-free language; we have already seen some examples. A language $L$ is a context-free language if there is a context-free grammar $G$ such that $L = L(G)$, that is, $L$ is the language associated with the grammar $G$. Given $G$, you know the meaning of $L(G)$: it is the set of all terminal strings that can be derived from the start symbol of that grammar. So, a language is context-free if you can find a context-free grammar $G$ such that this language is the language associated with that grammar.
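The conditions in this definition are easy to check mechanically. Here is a minimal sketch, assuming single-character symbols, Python sets for $V$ and $\Sigma$, and productions given as pairs of strings; the function name `is_context_free_grammar` is an illustrative choice.

```python
def is_context_free_grammar(nonterminals, terminals, productions, start):
    """Check the conditions of the definition for G = (V, Sigma, P, S)."""
    if nonterminals & terminals:
        return False  # V and Sigma must be disjoint
    if start not in nonterminals:
        return False  # the start symbol must be a non-terminal
    symbols = nonterminals | terminals
    for lhs, rhs in productions:
        if lhs not in nonterminals:
            return False  # left-hand side must be exactly one non-terminal
        if any(sym not in symbols for sym in rhs):
            return False  # right-hand side must be a string over V and Sigma
    return True


V, SIGMA = {"S"}, {"0", "1"}

print(is_context_free_grammar(V, SIGMA, [("S", ""), ("S", "0S1")], "S"))  # True
# A rule whose left-hand side is a longer string is not of the required form.
print(is_context_free_grammar(V, SIGMA, [("0S", "S0")], "S"))  # False
```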
We also say that $L(G)$ is the language generated by the grammar $G$. We should appreciate one fact here. While talking of regular languages, what we typically did was define a finite automaton and then speak of the language accepted by that automaton; here we are speaking of the language generated by the grammar $G$. These are two different notions. There, we had the notion of acceptance by an automaton: consider the set of all strings, and for each string check whether the machine accepts it or not; if it accepts, the string is in the language, and if not, it is not. Here, again consider the set of all strings over $\Sigma$: if a string is derived from the start symbol by the grammar, then that string is in the language; otherwise it is not. So, when talking of automata we had the notion of a string being accepted or recognized, whereas here we are generating the string. That is why these grammars are called generative devices.

Now, let me take one more example so that these ideas become a little clearer. Consider another grammar; this time we will have not just one non-terminal but a number of non-terminals. Let me first write out the productions. For the start symbol I again use $S$. So, $S \to aB$ and $S \to bA$. Then $A \to aS$, $A \to bAA$, and $A$ can also go to $a$. Next, the productions whose left-hand side is $B$: $B \to bS$, $B \to aBB$, and $B \to b$. While reading these rules you can say "$S$ goes to this" or "$S$ can be rewritten as this", whichever is more natural. So, these are the productions.
Now, you can see what my set of non-terminals is: $V = \{S, A, B\}$. $\Sigma = \{a, b\}$ is the set of terminals, this is the set of productions $P$, and $S$ is the start symbol. So, my grammar is $G = (V, \Sigma, P, S)$. Now, first of all, do you agree that the grammar I have described here is a context-free grammar? Why? Because I have $V$, $\Sigma$, $P$, $S$, and each production has a single non-terminal as its left-hand side and some string over terminals and non-terminals as its right-hand side. So, this is indeed a context-free grammar, and therefore the language generated by it is a context-free language.

Let me derive a few strings of this language. We start with $S$, and maybe I use the first production, $S \to aB$. Now, for $B$ I have a choice; I can use any one of the $B$ productions. So, why not use the more complex-looking one: this $B$ I rewrite as $aBB$. Now I can again apply one of the $B$ rules, and notice that I have a choice: I could rewrite either this occurrence of $B$ or that occurrence of $B$. Nobody says which one you should do. So, let me do the one at the end: the first $B$ I keep as such, and the last $B$ I rewrite as $bS$. So, from $S$ I have derived a string in which I have two non-terminals, $B$ and $S$. It is in my hands; I am doing the derivation, and I could use either a production whose left-hand side is $B$ or one whose left-hand side is $S$. Let us say we rewrite the $S$, and this time let me use $S \to bA$. Now, which one will I replace next? Let me replace the remaining $B$ using $B \to b$. Then, going along the string, at the terminals there is nothing to replace.
This is $b$, this is $b$, and here let me replace this $A$ by $a$. You know, I made one mistake in this derivation. Can you spot where it is? When I replaced the capital $B$ with the right-hand side of the rule $B \to aBB$, I wrote $aBB$, but I forgot to write the first $a$, the one already in front of that $B$. So, I should have had an extra $a$ at the beginning all the time. Let me put it in now, and I have got the string $aabbba$.

That is one derivation from this grammar. The point I tried to illustrate here was that at any given time you may have choices, and of two kinds. First, there may be several non-terminals in the string, and you have a choice of which one to replace. Second, having decided to replace one particular non-terminal, you have a choice of which production to use; for example, when I said I would replace this $S$, I could have used either this production or that one. Nobody is going to tell us, while we are deriving, which one to use. Any production whose left-hand side matches the non-terminal I want to replace is equally welcome. And in the process of doing this, I have finally come to a terminal string. Can you do anything further? No, because the only thing you can do to a string in the process of derivation is to replace one of its non-terminals using one of the production rules, and here there is no non-terminal at all. So, the derivation stops once you have a string which is all terminals.
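The derivation above, with the missing leading $a$ restored, can be replayed step by step, making both kinds of choice explicit: which occurrence to rewrite and which production to use. This is a sketch under assumptions: symbols are single characters, the helper name `apply_production` is hypothetical, and only the productions actually used in this derivation are listed rather than the full grammar.

```python
# Only the productions used in the derivation of aabbba are listed here.
PRODUCTIONS = {
    ("S", "aB"), ("S", "bA"),
    ("A", "a"),
    ("B", "aBB"), ("B", "bS"), ("B", "b"),
}


def apply_production(form, position, lhs, rhs):
    """Rewrite the symbol at `position` in `form` using lhs -> rhs."""
    if (lhs, rhs) not in PRODUCTIONS:
        raise ValueError(f"{lhs} -> {rhs} is not a listed production")
    if form[position] != lhs:
        raise ValueError(f"symbol at position {position} is not {lhs}")
    return form[:position] + rhs + form[position + 1:]


# Each choice: (position of the non-terminal, left side, right side).
choices = [
    (0, "S", "aB"),   # S      => aB
    (1, "B", "aBB"),  # aB     => aaBB   (the leading a is kept)
    (3, "B", "bS"),   # aaBB   => aaBbS  (the B at the end is rewritten)
    (4, "S", "bA"),   # aaBbS  => aaBbbA
    (2, "B", "b"),    # aaBbbA => aabbbA
    (5, "A", "a"),    # aabbbA => aabbba
]

form = "S"
trace = [form]
for position, lhs, rhs in choices:
    form = apply_production(form, position, lhs, rhs)
    trace.append(form)

print(" => ".join(trace))
# S => aB => aaBB => aaBbS => aaBbbA => aabbbA => aabbba
```

Changing the order of the choices, or picking a different production at some step, gives a different derivation and possibly a different terminal string, which is exactly the freedom described in the lecture.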