So far, you have learned the formal definition of a language and how to give a language a finite representation. In that context you learned the notion of a regular expression. You have also understood that some languages can be represented by regular expressions and some cannot. So we now introduce a better tool for representing a language finitely: the grammar, which is what one naturally expects to use to understand a language. The word grammar is familiar to you from natural languages. Here we first formalize the notion and then make you familiar with the formal notion through examples.

What is your understanding of a grammar? A grammar of a language is a set of rules used to construct or validate the sentences of the language. To formalize this notion, we first look at some general features of the grammars we already know, from which the important features can be captured and abstracted.

So let us look at a natural language first. From English, consider the sentence "The boy ate an apple" and ask whether it is grammatically correct. Look at it this way: the article "the" followed by the noun "boy" forms a noun phrase; the article "an" followed by the noun "apple" forms a noun phrase; and "ate" is a verb. Now choose the sentence form SVO, that is, subject verb object.
Now, a subject or an object can be a noun phrase, so you can verify that the sentence is grammatically correct in English. Start with the rule that a sentence is subject verb object. Replace the subject by a noun phrase, and use the rule that a noun phrase can be an article followed by a noun. Replace the article by "the", the noun by "boy", and the verb by "ate". For the object, again choose a noun phrase, which is an article followed by a noun; here choose the article "an" and the noun "apple". So "The boy ate an apple" can be parsed, or verified, through English grammar in this way.

To formulate the notion of a grammar, let us see what types of words we have considered. We encountered words like "a", "an", "the", "boy", "book", and we encountered words like article, noun, verb, noun phrase. What is the difference between the two? The second type (article, noun, verb) has to be addressed further: if you say noun, you still have to say which noun you are talking about. So these do not terminate. For the first type ("an", "the", "boy"), there is nothing further to say in English. So we call the first type terminals and the second type nonterminals.

Thus, informally, a grammar needs a set of nonterminal symbols, a set of terminal symbols, and a set of rules.
For example, in the derivation we used rules such as "a noun phrase can be an article followed by a noun" and "a subject can be a noun phrase". Besides the set of rules, we need a distinguished nonterminal symbol. What is the main target in a grammar? We validate sentences, or derive sentences, through the rules of the grammar. So we distinguish one nonterminal symbol that stands for "sentence", from which every sentence is to be derived.

Formally, we introduce a grammar as a combinatorial system given by a quadruple G = (N, Σ, P, S), where N is a finite set, called the set of nonterminal symbols; Σ is a finite set of terminal symbols; S ∈ N is a nonterminal symbol called the start symbol, representing the phrase "sentence"; and P is a finite subset of N × V*, called the set of production rules, where V = N ∪ Σ.

Look at the set N × V*. Its elements are pairs (A, α), where A is a nonterminal symbol and α is a string in V*, that is, a string over a mix of terminals and nonterminals. For convenience, since we call it a rule, we write the pair (A, α) as A → α and read it as "A can be α".

Before we talk about the sentences that can be validated, derived, or generated using a grammar, we need a few concepts.
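As a concrete illustration, the four components of a grammar can be written down as plain data. The sketch below encodes the toy English example and replays the derivation of "the boy ate an apple". The names `N`, `SIGMA`, `P`, `START`, `is_valid_grammar`, and `apply_rule`, and the exact spellings of the nonterminals, are assumptions made for illustration and are not part of the lecture.

```python
# Toy English grammar as a quadruple (N, SIGMA, P, START).
# All names and spellings here are illustrative assumptions.
N = {"Sentence", "Subject", "Verb", "Object", "NounPhrase", "Article", "Noun"}
SIGMA = {"the", "an", "boy", "apple", "ate"}
START = "Sentence"
P = [
    ("Sentence", ("Subject", "Verb", "Object")),
    ("Subject", ("NounPhrase",)),
    ("Object", ("NounPhrase",)),
    ("NounPhrase", ("Article", "Noun")),
    ("Article", ("the",)),
    ("Article", ("an",)),
    ("Noun", ("boy",)),
    ("Noun", ("apple",)),
    ("Verb", ("ate",)),
]


def is_valid_grammar(n, sigma, p, start):
    """Check: start is a nonterminal, every rule is a pair in N x V*."""
    v = n | sigma
    return start in n and all(
        lhs in n and all(sym in v for sym in rhs) for lhs, rhs in p
    )


def apply_rule(form, lhs, rhs):
    """Replace the leftmost occurrence of lhs in form by rhs."""
    assert (lhs, rhs) in P, "only rules of the grammar may be applied"
    i = form.index(lhs)
    return form[:i] + rhs + form[i + 1:]


# The rule applications from the lecture, in order.
steps = [
    ("Sentence", ("Subject", "Verb", "Object")),
    ("Subject", ("NounPhrase",)),
    ("NounPhrase", ("Article", "Noun")),
    ("Article", ("the",)),
    ("Noun", ("boy",)),
    ("Verb", ("ate",)),
    ("Object", ("NounPhrase",)),
    ("NounPhrase", ("Article", "Noun")),
    ("Article", ("an",)),
    ("Noun", ("apple",)),
]

form = (START,)
for lhs, rhs in steps:
    form = apply_rule(form, lhs, rhs)
    print(" ".join(form))
```

Each printed line is one intermediate string of the derivation; the last line contains only terminal words.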
So let us look at those concepts. Consider a grammar G = (N, Σ, P, S) with V = N ∪ Σ. First we define a binary relation on V*, written with the implies symbol and a subscript G, that is, ⇒_G. Recall that V* is the set of strings over a mix of terminals and nonterminals. For all α, β ∈ V*, we say α ⇒_G β if and only if α is of the form α_1 A α_2 and β is of the form α_1 γ α_2, where A → γ is a production rule of G. In other words, α and β are related when β is obtained from α by replacing one nonterminal symbol A according to a production rule.

This is called the one-step relation on G, because by applying one rule you get β from α in one step. If α ⇒_G β, we say α yields β in one step in G.

Another notation: the same relation written with a superscript star, ⇒*_G, denotes the reflexive transitive closure of the one-step relation. That is, for any two strings α, β ∈ V*, we have α ⇒*_G β if and only if there exist a number n ≥ 0 and strings α_0, α_1, ..., α_n ∈ V* such that α_0 = α, α_n = β, and α_i ⇒_G α_{i+1} for each i from 0 to n − 1. So the reflexive transitive closure relates two strings through finitely many applications of the one-step relation.
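The one-step relation and its closure can be tried out directly. In this minimal sketch every symbol is a single character, a rule is a pair `(lhs, rhs)` of Python strings, and the function names `one_step` and `is_derivation` are hypothetical; the two rules used are just an illustrative grammar.

```python
def one_step(alpha, rules):
    """Return every beta such that alpha yields beta in one step."""
    results = set()
    for i, symbol in enumerate(alpha):
        for lhs, rhs in rules:
            if lhs == symbol:
                # alpha = alpha1 A alpha2  becomes  alpha1 gamma alpha2
                results.add(alpha[:i] + rhs + alpha[i + 1:])
    return results


def is_derivation(chain, rules):
    """True if each string in chain yields the next one in one step.

    A chain with a single string has zero steps, which is allowed
    because the closure is reflexive.
    """
    return all(b in one_step(a, rules) for a, b in zip(chain, chain[1:]))


# Illustrative rules: S -> aSb and S -> (empty string).
rules = [("S", "aSb"), ("S", "")]

print(one_step("aSb", rules))
print(is_derivation(["S", "aSb", "aaSbb", "aabb"], rules))
print(is_derivation(["S", "ab"], rules))
```

The last call is false because `ab` needs two steps from `S`, so the pair is in the closure but not in the one-step relation.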
If α ⇒* β, that is, if you can get β from α in finitely many steps, then we say β is derived from α, or α derives β. The expression α ⇒* β itself is called a derivation in G. This is how the notion of derivation is introduced formally.

Consider a derivation α ⇒ α_1 ⇒ α_2 ⇒ ... ⇒ α_n = β. The number of steps is counted and called the length of the derivation. Here there are n steps, so the length of the derivation is n, and one may denote this by writing n as a superscript on the relation: α ⇒^n β.

Let us adopt a convention. When only one grammar G is being handled in a given context, we simply write ⇒ instead of ⇒_G.

If α ⇒* β is a derivation, that is, α gives β in finitely many steps, we also call β a yield of the derivation. A string α ∈ V* is called a sentential form if it can be derived from the start symbol S, that is, if S ⇒* α in the grammar under consideration. In particular, if α consists only of terminal symbols, that is, if α ∈ Σ*, the sentential form is called a sentence, and we say α is generated by G.

Now we formally define the language generated by a grammar G. The language generated by G, denoted L(G), is the set of all sentences generated by G, that is, L(G) = { x ∈ Σ* | S ⇒* x }: the terminal strings that can be derived from the start symbol.
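To get a feel for L(G), one can search outward from the start symbol and collect the terminal strings that appear. The sketch below is a bounded breadth-first search, not a general decision procedure: it drops any sentential form with more than `max_len` terminals or with total length above `max_len + slack`, so for some grammars it could miss sentences whose derivations pass through longer forms. Symbols are single characters, and the function name, the `slack` parameter, and the sample rules are assumptions for illustration.

```python
from collections import deque


def sentences_up_to(rules, start, nonterminals, max_len, slack=2):
    """Collect sentences of length <= max_len reachable from start."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        form = queue.popleft()
        if not any(s in nonterminals for s in form):
            found.add(form)  # only terminals left: this is a sentence
            continue
        for i, symbol in enumerate(form):
            for lhs, rhs in rules:
                if lhs != symbol:
                    continue
                nxt = form[:i] + rhs + form[i + 1:]
                terminals = sum(1 for s in nxt if s not in nonterminals)
                # Heuristic pruning so that the search terminates.
                if terminals > max_len or len(nxt) > max_len + slack:
                    continue
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return found


# Illustrative grammar: S -> 0S, S -> 1S, S -> (empty string).
binary_rules = [("S", "0S"), ("S", "1S"), ("S", "")]
print(sorted(sentences_up_to(binary_rules, "S", {"S"}, max_len=2)))
```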
An arbitrary rule in a grammar, as we have defined it, is of the form A → α, read "A gives α" or "A goes to α". Consider a sentential form x_1 ... x_k A x_{k+1} ... x_n, that is, a string of this form derived from the start symbol. Since A → α is a rule, applying it to this sentential form gives the next one, x_1 x_2 ... x_k α x_{k+1} ... x_n.

In this step the replacement is independent of the neighboring symbols of A: because the production rule is of the form A → α, wherever A occurs you can substitute α for A to get the resulting sentential form. One may say that A occurs within the context of the x_i's, its neighboring symbols: for example x_k and x_{k+1}, or, going a little further, x_{k−1}, x_k, x_{k+1}, depending on how many neighbors you consider. Since we are not worried about the neighboring symbols of A when we substitute α for it, the rule A → α is said to be of context-free type, and a grammar of the type we have defined is called a context-free grammar, or simply CFG.

The notion of a context-free language is as follows. A language L is said to be context-free if there is a context-free grammar G that generates L, that is, L(G) = L.
Let us look at some examples to understand the notions defined here. Consider the CFG G = (N, Σ, P, S) where N = {S}, Σ = {a, b}, and the production rules are S → ab, S → bb, S → aba, S → aab. There are finitely many rules; on the left side of each there is exactly one nonterminal symbol; on the right side a mix of terminals and nonterminals is allowed, though here we use only terminal symbols: ab, bb, aba, aab.

Now look at some derivations in G. S derives the string ab in one step by the rule S → ab. By the second rule, S → bb, you derive the terminal string bb, again in one step. Similarly, S derives aba in one step by the third rule, and S derives aab by the fourth.

Can we have any other derivations in this grammar? There is only one nonterminal symbol. You start from the nonterminal symbol S, and you have to apply one rule to get the next sentential form. You may apply the first, second, third, or fourth rule. If you apply the first rule, S → ab, then to continue the derivation you need a nonterminal symbol in the resulting string; there is none, so you cannot continue. If you consider the second rule, S → bb, you have the same situation, because the right side is immediately a terminal string, and likewise for the third and fourth rules. Every rule here is of the form S → x, where x ∈ Σ* is a terminal string.
There is no option to continue a derivation further, so any derivation from S in this grammar has at most one step, and in one step you can generate ab, bb, aba, or aab. Hence the language generated by this grammar contains precisely four strings. To see that the set {ab, bb, aba, aab} is equal to L(G): every string in the set is generated by G, as the derivations above show, and these are the only strings that can be generated in this grammar. Thus L(G) = {ab, bb, aba, aab}.

Let me extend this example. There we had four strings. Now take any finite language, L = {x_1, x_2, ..., x_n}. Consider the finite set P containing S → x_1 | x_2 | x_3 | ... | x_n. Here we are introducing a notation, the "or" symbol |. Earlier, S → ab was one production rule and S → bb was another. Instead of writing rules with the same nonterminal on the left side separately, we write S → ab | bb and read it as "S goes to ab or bb". In the present context, the n production rules S → x_1 | x_2 | ... | x_n generate the finite language {x_1, x_2, ..., x_n}.
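The construction for a finite language is mechanical, so it can be sketched in a few lines. The helper names `rules_for_finite_language` and `show` are hypothetical, and the sketch assumes that the start symbol `S` does not occur as a character in any of the given strings.

```python
def rules_for_finite_language(strings, start="S"):
    """Return one rule (start, x) for every string x of the language."""
    return [(start, x) for x in sorted(set(strings))]


def show(rules):
    """Render rules sharing a left side on one line with the | shorthand."""
    grouped = {}
    for lhs, rhs in rules:
        grouped.setdefault(lhs, []).append(rhs or "ε")
    return "\n".join(f"{lhs} → {' | '.join(alts)}" for lhs, alts in grouped.items())


rules = rules_for_finite_language({"ab", "bb", "aba", "aab"})
print(show(rules))  # S → aab | ab | aba | bb
```

Since every right side is a terminal string, each derivation from `S` has one step, and the set of right sides is exactly the language.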
One more convention: we may give just the set of production rules, such as S → x_1 | x_2 | ... | x_n, without specifying the quadruple for the grammar. In that case N consists of the nonterminal symbols occurring in the rules (here only S), Σ is the terminal set under consideration, P is the given set of rules, and the start symbol is S. So from the production rules alone one can always write down the quadruple G. Using this convention, note that the CFG ({S}, Σ, P, S) generates the language L.

Let me give one more example. Consider the CFG G with precisely the following production rules: (1) S → 0S, (2) S → 1S, (3) S → ε. I am not stating the quadruple, only the production rules. If one is interested, it is easy to write: there is again only one nonterminal symbol, S; there are two terminal symbols, 0 and 1; the production rules are as stated; and the start symbol is S. So in the present context G = ({S}, {0, 1}, P, S).

We observe that G generates Σ*, where Σ = {0, 1}; that is, all strings over Σ can be generated. To prove this, we first have to see which strings can be generated using these rules. Can we generate any string at all? Using the third rule, S → ε, one clearly generates the empty string ε in one step. Using the first rule you get 0S in one step, and if you then use the third rule you get 0ε, which is equal to 0.
So the string 0 can be generated by this grammar. Similarly, the second rule gives 1S, and applying the third rule then gives 1. So 0 can be generated and 1 can be generated. If you want 00, apply the first rule to get 0S, apply the first rule once more to get 00S, and then apply the third rule to get 00ε, which is 00. So 00 can be derived.

Through the three rules S → 0S, S → 1S, S → ε you derive terminal strings, and they are strings over the set {0, 1}. So L(G) is a subset of Σ*, because Σ* is the set of all strings over {0, 1}.

Conversely, we have to show that every string of Σ* can be derived in this grammar. Take an arbitrary string x ∈ Σ*. If x is the empty string, the third rule derives it in one step. Otherwise write x = a_1 a_2 ... a_n, a string of some length n, where each a_i is either 0 or 1. We derive it as follows. If a_1 is 0, apply rule 1; if a_1 is 1, apply rule 2; either way S ⇒ a_1 S. Similarly, if a_2 is 0 apply rule 1, and if a_2 is 1 apply rule 2, to get a_1 a_2 S. Continue in this way to get a_1 a_2 ... a_n S. At the end apply rule 3 to eliminate the nonterminal symbol and get a_1 a_2 ... a_n, which is x.
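The construction just described can be replayed for any particular binary string. This is a minimal sketch with a hypothetical function name, `derive_binary`; it returns the list of sentential forms together with the rule number used at each step.

```python
def derive_binary(x):
    """Build the derivation of x using S -> 0S (1), S -> 1S (2), S -> empty (3)."""
    if any(c not in "01" for c in x):
        raise ValueError("x must be a string over {0, 1}")
    forms = ["S"]
    rule_numbers = []
    prefix = ""
    for c in x:
        prefix += c
        forms.append(prefix + "S")          # emit one more symbol, keep S
        rule_numbers.append(1 if c == "0" else 2)
    forms.append(prefix)                     # rule 3 removes the trailing S
    rule_numbers.append(3)
    return forms, rule_numbers


forms, rule_numbers = derive_binary("010")
print(" => ".join(f or "ε" for f in forms))  # S => 0S => 01S => 010S => 010
print(rule_numbers)                          # [1, 2, 1, 3]
```

The number of steps is `len(forms) - 1`, which is one more than the length of the string.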
So any string x = a_1 a_2 ... a_n can be derived: after the first step you have a_1 S, after the second step a_1 a_2 S, and after n steps a_1 a_2 ... a_n S; in step n + 1 you eliminate S by substituting the empty string, and you get a_1 a_2 ... a_n. The length of this derivation is n + 1: a string of length n is derived here in exactly n + 1 steps. Thus any string in Σ* can be derived in the grammar, so x ∈ L(G). We have the reverse inclusion, and therefore L(G) = Σ*.

Let us consider one more example: the empty language ∅, over any alphabet you like. It can be generated by a CFG. How do you give the grammar? Here is one method. Consider the grammar ({S}, Σ, P, S) where P is empty. We did not put any restriction on the set of production rules; we simply said it is a finite subset of N × V*, where V = Σ ∪ N, so it can also be the empty set. If you choose the set of production rules to be empty, you cannot derive any string in this grammar, and thus the language generated by G is empty.

Or, if you want some production rules, here is a second method: consider a grammar in which every production rule has some nonterminal symbol on its right-hand side.
What situation do you get? Say there are finitely many production rules, A_1 → α_1, A_2 → α_2, ..., A_k → α_k, and the assumption is that each α_i contains some nonterminal symbol. Start a derivation from the start symbol S. If S = A_i for some i, you get α_i in one step. Now α_i contains a nonterminal symbol, say A_j, so α_i is of the form α′ A_j α″. Substituting α_j for A_j gives a string β = α′ α_j α″. Clearly β contains at least one nonterminal symbol, because by our assumption α_j contains one; there may be nonterminal symbols in α′ and α″ as well. If you continue like this, you observe that at every step the string contains at least one nonterminal symbol. So a derivation that you start never ends in a terminal string. Thus no terminal string can be generated by this grammar, and a CFG with this property generates only the empty language.
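Both methods can be checked mechanically. The sketch below uses a standard fixpoint computation that is not part of the lecture: a nonterminal is marked as generating once some rule rewrites it to a string whose nonterminals are all already marked, and the language is empty exactly when the start symbol never gets marked. Symbols are single characters, and the function name and sample rules are assumptions for illustration.

```python
def generates_nothing(rules, start, nonterminals):
    """True if no terminal string can be derived from start."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in generating:
                continue
            # rhs is usable if each of its nonterminals is already generating
            if all(s not in nonterminals or s in generating for s in rhs):
                generating.add(lhs)
                changed = True
    return start not in generating


# Method 1: no production rules at all.
print(generates_nothing([], "S", {"S"}))  # True

# Method 2: every right-hand side contains a nonterminal.
method2 = [("S", "aSb"), ("S", "AB"), ("A", "aA"), ("B", "bS")]
print(generates_nothing(method2, "S", {"S", "A", "B"}))  # True

# For contrast: adding S -> (empty string) makes the language nonempty.
print(generates_nothing(method2 + [("S", "")], "S", {"S", "A", "B"}))  # False
```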
So in either case, whether you follow method 1 or method 2, the grammar under consideration generates the empty language.

Let us look at some more examples. So far I have given some grammars and identified the languages they generate, and in the previous case I took the empty language and gave appropriate grammars to generate it. The examples I am starting with are already known to you to be representable by regular expressions: for the finite set with four strings you know the regular expression; a language with finitely many strings is regular; Σ* is regular; and the empty language is regular.

Now one more language under consideration, { a^k | k ≥ 0 }: you know this is also a regular language. And as we give context-free grammars for these languages, we are also observing that these languages are context-free. A typical string in this language consists of k a's; it can be the empty string, a, two a's, three a's, and so on.

For this purpose you need one nonterminal symbol as the start symbol; call it S. I have to derive a, and I should be able to continue further, because I am going to produce only a's. So consider the rule S → aS. Then S ⇒ aS; using the same rule you derive aaS, and applying it again you get aaaS, because S is replaced by aS each time. If you continue, you produce n a's followed by S after n steps. You terminate by applying S → ε, which gives a^n. So consider the two rules S → aS and S → ε.
You can clearly see that a^n is generated in n + 1 steps. I am not going to write the quadruple; as I said, take the nonterminal set {S}, the terminal set {a}, and the production rules S → aS | ε. Just by stating the rules we are stating the grammar. So let G be the grammar with these production rules. The claim is that the language generated by this grammar is { a^k | k ≥ 0 }.

First we look at which strings can be derived. With Σ = {a}, what is Σ*? It is precisely { a^k | k ≥ 0 }. As discussed earlier, the set of strings generated by this grammar is a subset of Σ*. To show that every string of Σ* can be generated, look at the previous example, taking Σ to be the singleton {a}. So L(G) = Σ*. The point is that we are treating this as a special case of the previous example with Σ = {a}. Thus { a^k | k ≥ 0 } is context-free, because with Σ = {a} it is exactly Σ*.

Let me discuss one more example, one that you have not seen to be regular earlier: the language { a^n b^n | n ≥ 0 }. What does a typical string look like in this language? Taking n = 0 you get ε; taking n = 1 you get ab; then aabb, aaabbb, and so on.
Strings of this kind are in the language. Now look at the pattern: corresponding to each a there is a b. Here there are two a's and two b's, there three a's and three b's. Put the correspondence like this: the first a corresponds to the last b, the second a corresponds to the second b from the right, and so on. With this correspondence in mind you can think of production rules that generate the language, because in each step we have to remember that an a and a b are paired. So write the production rule S → aSb. If you keep applying this rule, you generate n a's on the left side and, corresponding to each a, n b's on the right side.

Let us look at a derivation. Apply the rule to get aSb. I have written only one rule so far, so apply the same rule again to get aaSbb: two a's and two b's are generated. Apply it once more and you get aaaSbbb. Now you see how to terminate the derivation: choose the rule S → ε.

These two rules generate only strings of this language, and conversely any string of this language, call it L, can be derived through these two rules. Now consider the grammar G with the production rules S → aSb | ε. You can observe that the language generated by G is L = { a^n b^n | n ≥ 0 }. You can take this as an exercise.
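Before attempting the proof, it may help to watch the two rules at work. This minimal sketch, with a hypothetical function name `derive_anbn`, lists the sentential forms obtained by applying S → aSb n times and then S → ε once; it illustrates the derivation only and is not a substitute for the proof.

```python
def derive_anbn(n):
    """Sentential forms of the derivation of a^n b^n from S."""
    forms = ["S"]
    for k in range(1, n + 1):
        forms.append("a" * k + "S" + "b" * k)   # one application of S -> aSb
    forms.append("a" * n + "b" * n)             # final application of S -> empty
    return forms


print(" => ".join(f or "ε" for f in derive_anbn(3)))
# S => aSb => aaSbb => aaaSbbb => aaabbb
```

As with the earlier examples, the derivation of a^n b^n has n + 1 steps.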
That is, you have to show two things: a string generated by G is of the form a^n b^n, so it is in L; and any string in L, that is, any string of the form a^n b^n, can be generated by G. Prove this as an exercise.

Let me discuss one more example. Over any alphabet Σ, consider the language L = { x ∈ Σ* | x = x^R }, all those strings in Σ* that are equal to their reversal, that is, the set of all palindromes. The idea of the previous example can be used to define a grammar for this language. For simplicity take Σ = {a, b}; from that you will understand how to give a grammar for an arbitrary alphabet Σ.

What is the situation here? Take any string x in L and write x = a_1 a_2 ... a_n. There are two cases: x is of even length or of odd length. If the length is even, n = 2k, write x = a_1 a_2 ... a_k b_1 b_2 ... b_k. If the length is odd, n = 2k + 1, write x = a_1 a_2 ... a_k a_{k+1} b_1 b_2 ... b_k. Since x is a palindrome, a_1 must be equal to b_k, a_2 must be equal to b_{k−1}, and so on, and a_k must be equal to b_1. In the odd case the same holds on either side, and the middle symbol a_{k+1} can be either a or b.

Now you can think of production rules over Σ = {a, b} and give a grammar that generates the palindromes. Try this as an exercise, and then extend it to an arbitrary alphabet Σ.