Welcome to the lectures on syntax analysis. In this sequence of lectures, we will learn about context-free grammars, pushdown automata and parsing. We will first understand what exactly syntax analysis is, and then study context-free grammars, which are the basis for the specification of programming languages. Parsing context-free languages is based on pushdown automata, just as regular language recognition was based on finite state automata. We will study two types of parsing: top-down parsing and bottom-up parsing. In top-down parsing we will study LL(1) and recursive descent parsing techniques, and in bottom-up parsing we will study LR parsing techniques. There is also a tool called YACC which is based on LR parsing, and we will see examples of how to use it for parser construction. So, what exactly are grammars? Every programming language has to be described very precisely, and a grammar is used for describing the syntax of a programming language. For example, if you consider a language such as C or Pascal, a grammar can be written to describe the syntactic structure of well-formed, correct programs. When we say correct, we do not mean correctness at run time, but correctness as far as the syntax is concerned. The rules of such a grammar state how functions are made out of parameter lists, declarations and statements; in turn they also say how statements are made up of expressions, and how expressions are made up of numbers, names, parentheses, and so on. Grammars are very easy to understand, as we will see, and parsers for programming languages can be automatically constructed from grammar specifications of certain types of grammars. Not all grammars can be used for automatic construction of parsers, but certain types can be, and it is important to note that parsers or syntax analyzers are generated for a particular grammar.
In other words, I told you that there is a tool called YACC which can be used for automatic generation of parsers: we input a particular grammar, and out comes a parser which checks sentences based on that particular grammar. For a different grammar we need to generate a parser all over again. If there are a couple of different grammars available for a particular language, then based on the restrictions placed by the tool, one of those grammars may satisfy the restrictions, and that one can be chosen for parser generation. It really does not matter which grammar is chosen, as long as the restrictions of the generator are met. Context-free grammars, as I already said, are usually used for syntax specification of programming languages; context-free grammars are a subclass of the general class of grammars. I told you during lexical analysis that there are different types of languages and grammars: regular languages, context-free languages, context-sensitive languages and type 0 languages. Context-free grammars are used to specify context-free languages, and these are the most useful for programming language purposes. What exactly is parsing or syntax analysis? What does a parser do? You are given a programming language; we wrote a grammar for it, and then let us say we also wrote a parser based on it, or we generated a parser from this grammar. The parser verifies that the string of tokens for a program in that particular programming language can indeed be generated from the grammar that we have provided as the basis for the parser. Tokens are nothing but the entities which a lexical analyzer carves out of the character stream. The tokens form a sentence in a particular language, and the parser checks whether that sentence is indeed from the language of the parser's grammar.
It reports any syntax errors, as in the case of lexical analyzers, and it constructs a parse tree representation of the program. We will see what parse trees are, but it is not always necessary to construct a tree explicitly; sometimes it is possible to do without it. The parser usually calls the lexical analyzer to supply a token whenever it requires one to proceed further. Of course, parsers could be handwritten or automatically generated, and our parsers are all based on context-free grammars. Grammars are generative mechanisms, and in that respect they are similar to regular expressions, whereas machines such as finite state automata or pushdown automata are accepting mechanisms. Pushdown automata are the machines corresponding to context-free languages, just as finite state automata are the machines corresponding to regular languages. So, let us define a context-free grammar. A context-free grammar is denoted as a quadruple G = (N, T, P, S), where N is a finite set of what are known as non-terminals or variables, and T is a finite set of what are known as terminals. The terminals I mentioned in lexical analysis correspond to the tokens: the tokens of a lexical analyzer are the terminals of a context-free grammar. S is a special non-terminal called the start symbol, and P is a finite set of productions. For context-free grammars, all productions are of the form A → α, read as "A going to α", where A is a non-terminal and α is a string made up of both non-terminals and terminals; that is, α is in (N ∪ T)*. Usually, whenever there is no special requirement, we do not mention the N, T, P, S components separately; we just provide P and assume that the first production shows the start symbol on its left-hand side.
So, let us take some examples. The first example is E → E + E, E → E * E, E → ( E ) and E → id. These are the four productions of the grammar, and here there is exactly one non-terminal. In our examples, the boldface symbols are all non-terminals and the lower-case symbols are all terminals; E is the only boldface symbol, so it is the single non-terminal, and the terminals are +, *, the left parenthesis, the right parenthesis and id: five terminal symbols and one non-terminal. These four rules are the productions, and as I said, the first production shows the start symbol on its left-hand side, so E is the start symbol. We will understand later what exactly this particular grammar generates; for the present, let us move on with other examples and understand other possibilities of specification. In the second example also there is exactly one non-terminal, which is S; the terminal symbols are 0 and 1. Epsilon, as usual, is the null string; it is neither a non-terminal nor a terminal. To some extent, informally, we can treat it like a terminal symbol, because it does not generate any more symbols from it, so it certainly cannot be a non-terminal. The third example has two productions, S → aSb and S → epsilon, and again here S is the non-terminal while a and b are the terminal symbols. For the fourth example, the notation is slightly different. The first production starts with S, so that is the start symbol; the non-terminals are S, A and B, which are the only ones mentioned as such, and little a and little b are the terminal symbols. The vertical line or bar which appears between aB and bA indicates that these are two productions, S → aB and S → bA; it is a shorthand for writing productions with the same left-hand side non-terminal.
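To make these conventions concrete, here is a minimal sketch in Python of how the first example grammar could be represented as plain data; the names `grammar` and `start` are my own choices for this illustration, not notation from the lecture:

```python
# Each non-terminal maps to its list of right-hand sides;
# a right-hand side is a list of grammar symbols.
grammar = {
    "E": [["E", "+", "E"],   # E -> E + E
          ["E", "*", "E"],   # E -> E * E
          ["(", "E", ")"],   # E -> ( E )
          ["id"]],           # E -> id
}
start = "E"  # by convention, the LHS of the first production

# The non-terminals are the dictionary keys; every other symbol is a terminal.
nonterminals = set(grammar)
terminals = {sym for rhss in grammar.values()
             for rhs in rhss for sym in rhs} - nonterminals
print(sorted(terminals))  # ['(', ')', '*', '+', 'id']
```

Counting the two sets recovers exactly the five terminals and one non-terminal mentioned above.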
So, A → a | aS | bAA stands for the three productions A → a, A → aS and A → bAA, and similarly B → b | bS | aBB stands for B → b, B → bS and B → aBB. Even in the first example I could have written the four productions in one line as E → E + E | E * E | ( E ) | id, and the same is true for the third example as well; it is just a question of making the grammar convenient to read. Now, let us move on to a very important concept known as a derivation. This is necessary to understand what exactly is a sentence derived using a grammar. Here is an example: we write E, then an arrow with the production E → E + E noted above it, and then the sentential form E + E. Similarly, with the production E → id noted above the next arrow, we get id + E, and finally, with E → id again, we get id + id. This is a derivation. As you can see, the first symbol is the start symbol of this particular grammar; we are now looking at the expression grammar and derivations in it. The last three symbols, id + id, form a terminal string, so this sequence of steps corresponds to a derivation of the terminal string id + id from the start symbol E. The big arrow, ⇒, is read as "derives": E derives E + E, and the production written above the arrow tells the reader that the step uses E → E + E. In the same way, E + E derives id + E using the production E → id, and id + E derives id + id, again using E → id. At each of these steps, the production which is used is noted at the top of the arrow symbol. The derivation begins with the non-terminal E.
So, we know that there are four possibilities for the non-terminal E: there are four productions with E on the left-hand side. To derive a particular string, we could choose any of them; which one is appropriate depends on the terminal string that we want to derive. In this case we want to derive id + id, and the only production which has a plus in it is E → E + E, so let us try that particular production. The symbol E is now replaced by the right-hand side of this production, namely E + E; that is why we wrote E + E here. This E + E is known as a sentential form; it is an intermediate form before we reach id + id. We can choose to replace the left E or the right E, either one of them, and in any particular order; the order really does not matter. Let us assume that we are replacing the left E by an appropriate right-hand side of a production. Again there are four possibilities, and since we are only looking at the particular string id + id, we do not want to derive any stars or parentheses, so we choose the production E → id for this purpose. The left E is now replaced by the right-hand side id, and the rest of the sentential form remains, giving id + E. In the third step, id and plus cannot be refined further; they are already terminal symbols. The remaining E can be replaced by another id, using the production E → id once more, and the sentential form becomes id + id. This entire process I just described gives us the string id + id starting from the start symbol E. We write this process very concisely as E ⇒* id + id, where ⇒* means "derives in zero or more steps". So, this is a derivation of the string id + id from E.
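The replacement process just described is easy to mechanize. Here is a small Python illustration of my own (not code from the lecture), where a sentential form is a list of symbols and each step splices in the right-hand side of a production:

```python
def apply_production(form, index, rhs):
    """Replace the non-terminal at position `index` in the sentential
    form by the list of symbols `rhs` (the production's right-hand side)."""
    return form[:index] + rhs + form[index + 1:]

# Derive id + id from E, mirroring the three steps in the text.
form = ["E"]
form = apply_production(form, 0, ["E", "+", "E"])  # E => E + E
form = apply_production(form, 0, ["id"])           # => id + E
form = apply_production(form, 2, ["id"])           # => id + id
print(form)  # ['id', '+', 'id']
```

Each intermediate value of `form` is one of the sentential forms of the derivation.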
So, once we understand how to derive strings from the start symbol, we are ready to define the language generated by a context-free grammar, and then we will take up more examples of grammars, derivation trees and so on. Context-free languages are specified by context-free grammars, and context-free grammars are said to generate context-free languages. Given a grammar G, the language generated by that grammar is denoted as L(G); just as for a regular expression r we wrote L(r), here we write L(G). This is the set of strings w such that w is in T* — T is the set of terminals and T* is its closure, so w is a string of terminal symbols with no non-terminals in it — and S ⇒* w, where S is the start symbol. In other words, all strings of terminal symbols which can be derived from the start symbol are in the language L(G). This is how we define the language generated by a grammar. For the first example, with those four productions, the language corresponds to arithmetic expressions with +, *, parentheses and id. The second language is the set of palindromes over 0 and 1. The third language is { a^n b^n | n >= 0 }. Let us look at the second one: we have 0 S 0 or 1 S 1 or 0 or 1 or epsilon being generated by S. As you can see, the 0s and 1s are balanced on both sides of S: once a 0 is generated on the left side, there is a 0 on the right side as well, and that is what lets this grammar generate palindromes.
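As a quick check on the palindrome claim, here is a small Python sketch of my own that derives every terminal string of the grammar S → 0S0 | 1S1 | 0 | 1 | ε up to a bounded number of expansions and verifies that each one reads the same forwards and backwards:

```python
def derive(depth):
    """All terminal strings derivable from S in at most `depth`
    expansions of the grammar S -> 0S0 | 1S1 | 0 | 1 | epsilon."""
    if depth == 0:
        return set()
    results = {"0", "1", ""}              # S -> 0 | 1 | epsilon
    for inner in derive(depth - 1):
        results.add("0" + inner + "0")    # S -> 0 S 0
        results.add("1" + inner + "1")    # S -> 1 S 1
    return results

words = derive(4)
assert all(w == w[::-1] for w in words)   # every derived string is a palindrome
print(len(words), "strings derived; all are palindromes")
```

The converse also holds: any binary palindrome can be built by peeling equal symbols off both ends, which is exactly what the recursive productions undo.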
Here, the third grammar generates some number of a's and an equal number of b's on the two sides of S, and therefore it is easy to see that it generates a^n b^n. The fourth one is not so intuitive: it generates the strings with an equal number of a's and b's. You can actually check out a couple of derivations yourself and make sure that you understand the operation of this particular grammar. A string alpha, which in general is a combination of both non-terminals and terminals, is a sentential form if S ⇒* alpha; I mentioned this already. The difference between a sentential form and a sentence is that a sentence has only terminal symbols, whereas a sentential form may have both non-terminals and terminals. And when are two grammars G1 and G2 equivalent? Only if their languages are exactly identical. So, let us move on and understand the concept of derivation trees before we take up more examples of derivations of strings. Let me first show you a picture with an example and then go back to the definition. Here is a simple context-free grammar: S → aAS | a and A → SbA | SS | ba, and let us consider the string a a b b a a. Here is a derivation of the string aabbaa from the start symbol S. First we use the production S → aAS, so we get the sentential form aAS. Now the A is expanded by another production, A → SbA, so we get little a, then S b A, followed by S, that is, aSbAS. Next we expand the S that was recently acquired: we apply S → a and get aabAS. Now it is time to expand the A: A → ba gives us aabbaS, and finally the remaining S gives the last a, yielding aabbaa. So, this is the derivation of the string from the start symbol.
So, the derivation tree really shows which productions were applied at which points and how the string was derived from the start symbol. The first production applied was S → aAS, so a small tree is created with S as the root and little a, capital A and capital S as its three children. Then, as I said, the capital A is expanded further with the production A → SbA, so again the same structure with three children is present, but the symbols are different. The first of those children, S, gives rise to a, so the production S → a was applied there. The inner A gives rise to b a, so the production A → ba was applied there. Finally, the last S gives rise to a, with the production S → a applied once more. The structure of the derivation tree is very simple: the root S, the A nodes and the inner S nodes are all internal nodes, whereas the little a's and b's are the leaves. All the internal nodes are non-terminals, and all the leaves are terminal symbols. Every internal node corresponds to a production which is applied there, and the nodes corresponding to the right-hand side of that production are the children of this particular node. That is true here, and we cannot develop the parse tree further once we reach the terminal symbols. So, derivations can be displayed as trees; I have already shown that. Now there is another important property here. In this derivation sequence we always chose to expand the leftmost non-terminal, but we could have easily expanded any non-terminal in a given sentential form; for example, in the form aSbAS we could have expanded the S, the A or the final S.
And finally, after we exhaust all the non-terminal symbols, the terminal string we obtain would be the same. Unless we use a different production to expand a particular non-terminal to get a particular string, the same productions have to be applied at various places during the derivation; exactly when a production is applied is really immaterial. So, the internal nodes of the tree are all non-terminals and the leaves are all terminal symbols, and, as I already said, corresponding to each internal node A there exists a production in P whose right-hand side, read from left to right, is the list of children of A. The yield of a derivation tree is the list of the labels of all the leaves, read from left to right. Finally, if alpha is the yield of some derivation tree, then S ⇒* alpha, and conversely, if S ⇒* alpha, then alpha is the yield of some derivation tree for the grammar G. In our example, the yield is the leaf labels concatenated from left to right, that is, aabbaa: S derives aabbaa, we have displayed a derivation tree for it, and aabbaa is the yield of that tree. As I said, there are many ways to derive strings; many derivations are possible. If we expand the leftmost non-terminal at each step of a derivation, it is called a leftmost derivation, and if we choose to expand the rightmost non-terminal at every step, it corresponds to a rightmost derivation. If a string w belongs to the language generated by the grammar, then w has at least one parse tree, and corresponding to a parse tree, w has unique leftmost and rightmost derivations. A unique parse tree for every string is guaranteed only if the grammar is unambiguous; as the next bullet says, if a word has two or more parse trees, then the grammar is ambiguous.
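The yield can be computed directly from a tree. Here is a short Python sketch of my own, using nested (label, children) tuples to encode the derivation tree of aabbaa from the grammar S → aAS | a, A → SbA | SS | ba:

```python
def tree_yield(node):
    """Concatenate the leaf labels of a derivation tree from left to right."""
    label, children = node
    if not children:                       # a leaf: a terminal symbol
        return [label]
    return [leaf for child in children for leaf in tree_yield(child)]

# Derivation tree for aabbaa: S -> aAS, A -> SbA, S -> a, A -> ba, S -> a
tree = ("S", [("a", []),
              ("A", [("S", [("a", [])]),
                     ("b", []),
                     ("A", [("b", []), ("a", [])])]),
              ("S", [("a", [])])])
print("".join(tree_yield(tree)))  # aabbaa
```

Reading the leaves left to right recovers exactly the string derived earlier, illustrating the claim that the yield of the tree equals the derived string.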
So, if there is a unique parse tree for every string, the grammar is unambiguous, and if there is more than one parse tree for the same string, the grammar is ambiguous. If there is a unique parse tree, then w has unique leftmost and rightmost derivations for that particular parse tree; of course, if there are many parse trees, then for each of those parse trees there would be a unique leftmost and a unique rightmost derivation as well. A context-free language for which every grammar G is ambiguous is what is known as an inherently ambiguous language, but we are not so particular about this inherently ambiguous variety, because it is not of much use to us. Now, let us understand ambiguity further, starting with leftmost and rightmost derivations: the same grammar, the same derivation tree and the same sentence. We have seen this derivation already; it is the leftmost derivation. For the same string, suppose we always expand the rightmost non-terminal. In aAS we expand the S, so we get aAa; here we expand the A with A → SbA, the production our target string requires, and get aSbAa; next we expand the rightmost non-terminal, the A, with A → ba, giving aSbbaa; and finally the S is expanded to little a, giving us the string aabbaa. So, that was the leftmost and this is the rightmost derivation. Now let us understand ambiguous grammars, because these are very important to us. If a grammar is ambiguous, then the tools for generating parsers will be in trouble, and further, even our parsing techniques will be in trouble, so we must understand ambiguity properly. As I already said, ambiguous means there are two parse trees in the same grammar for the same string. For example, the grammar E → E + E | E * E | ( E ) | id is ambiguous, but we can actually design another grammar which generates the same language: E → E + T | T, T → T * F | F, F → ( E ) | id.
The difference between these two grammars is that in the first grammar + and * are at the same precedence level, whereas in the second, * has higher precedence than +. Let us look at an example to understand this better. We have the grammar E → E + E | E * E | ( E ) | id, and let us look at the string id * id + id. You can read this particular string as the multiplication first and then the addition, or as the addition first and then the multiplication, and each of these interpretations is correct according to the grammar, simply because we have not specified whether + has precedence over * or * has precedence over +. Let us see what happens in the parse tree representation. First, assume that the structure at the highest level is E + E; in other words, we first do the * and then the +. So, E → E + E is the first step in the derivation; then we expand the left E using E → E * E, and the remaining E's all yield id's. That is one parse tree. The second parse tree says: at the highest level read the string as E * E, and then expand the second part as E + E, so the + is done first and then the *; this corresponds to the second derivation sequence. For each of these parse trees or derivation trees we can write down the leftmost derivation and the rightmost derivation; that is unique for a given parse tree, but the two parse trees give different derivations. You can see that very clearly: the leftmost derivation for the first parse tree and the leftmost derivation for the second parse tree are very different, and it will be the same case with the rightmost derivations as well.
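To see concretely why the two parse trees matter, we can evaluate them. This sketch is my own, with made-up values 2, 3 and 4 for the three id's; the two groupings of id * id + id compute different results:

```python
def eval_tree(node):
    """Evaluate a parse tree written as (op, left, right); leaves are numbers."""
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = eval_tree(left), eval_tree(right)
    return a + b if op == "+" else a * b

# The two parse trees for id * id + id, with id values 2, 3, 4:
tree1 = ("+", ("*", 2, 3), 4)   # (id * id) + id: the * done first
tree2 = ("*", 2, ("+", 3, 4))   # id * (id + id): the + done first
print(eval_tree(tree1), eval_tree(tree2))  # 10 14
```

This is why ambiguity in an expression grammar is not merely a formal nuisance: the two trees correspond to genuinely different computations.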
So, let us consider the unambiguous grammar equivalent to the ambiguous expression grammar that we studied so far. This is our unambiguous grammar: E → E + T | T, T → T * F | F, F → ( E ) | id. Let us take the string id + id * id and see what happens in the derivation. First of all, E → E + T is a possibility, because we would read the string as id plus id * id, so that id * id gets done first. The E on the right-hand side then gives rise to T, then F, then id, and the T gives rise to T * F, where F becomes id and T becomes F and then id. That is the derivation sequence. Now suppose we take the same string id + id * id and try to read it as * first and then +. That means we begin with E → T and then T → T * F (I skip one step here just to compress the derivation), and T * F becomes F * F. But once we reach this stage, it is imperative that we use ( E ) as the right-hand side of F in order to generate a +, because without an E we cannot get E + T. And once we use ( E ), we have introduced two new symbols, the left and right parentheses, so the string that we finally generate becomes ( id + id ) * id. This goes to show that we have disambiguated the grammar: whenever you want to do the + between two operands first and then the *, you need to enclose the sum in parentheses; otherwise the string is interpreted as * first and then +. So this grammar is indeed unambiguous, and for every string there is exactly one derivation tree. Another example of ambiguity: before the picture, the text form of the grammar is stmt → if expr stmt, or if expr stmt else stmt, or other-statement. This is a common piece of grammar generating if-then-else statements.
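The unambiguous grammar translates almost directly into a parser with one function per non-terminal. The sketch below is my own illustration, a small preview of the recursive descent technique mentioned at the start of the lecture; the left recursion in E → E + T and T → T * F is handled with loops, a standard transformation:

```python
# E -> E + T | T,  T -> T * F | F,  F -> ( E ) | id
def parse_E(tokens, i):
    node, i = parse_T(tokens, i)
    while i < len(tokens) and tokens[i] == "+":
        right, i = parse_T(tokens, i + 1)
        node = ("+", node, right)
    return node, i

def parse_T(tokens, i):
    node, i = parse_F(tokens, i)
    while i < len(tokens) and tokens[i] == "*":
        right, i = parse_F(tokens, i + 1)
        node = ("*", node, right)
    return node, i

def parse_F(tokens, i):
    if tokens[i] == "(":
        node, i = parse_E(tokens, i + 1)
        return node, i + 1              # skip the ")"
    return "id", i + 1                  # tokens[i] == "id"

tree, _ = parse_E(["id", "+", "id", "*", "id"], 0)
print(tree)  # ('+', 'id', ('*', 'id', 'id')): the * binds tighter
```

Parsing ( id + id ) * id with the same functions yields ('*', ('+', 'id', 'id'), 'id'), matching the point above that the + must be parenthesized if it is to be done first.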
So, it says: if followed by an expression followed by a statement, that is our if-then statement; the other option is if expr stmt else stmt, which generates the else part as well; otherwise we have any other type of statement. This grammar happens to be ambiguous, but the second grammar is unambiguous, as we will see very soon. Let us finish off the last part first: the language is { a^n b^n c^m d^m | n, m >= 1 } union { a^n b^m c^m d^n | n, m >= 1 }. This language, the union of these two sets, is inherently ambiguous: it is impossible to write down a grammar which generates it but is not ambiguous. The problem is that some strings fall in both categories, the grammars for the two parts are necessarily different, and therefore it is impossible to write an unambiguous grammar for this particular language. So, let us proceed with our example. Here is the classic example of the ambiguity in if-then-else; the statement is if E1 if E2 S1 else S2, very similar to our C syntax. This can be seen as an if-then statement whose inner statement owns the else: that is, if E2 S1 else S2 is one unit, and the whole thing, together with if E1, is another unit. So we have a statement if E1 stmt, and this inner stmt is if E2 S1 else S2; that is one interpretation. We could also say the else belongs to the outer if: then if E2 S1 is the inner if-then statement, and the else corresponds to the outer if. The derivation tree for that would be as follows: there is a statement if E1 stmt else S2, so the outer one is the if-then-else statement and the inner one becomes the if-then statement.
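The two readings really do behave differently at run time. Here is a tiny interpreter of my own for these statement trees; with E1 true and E2 false, binding the else to the inner if executes S2, while binding it to the outer if executes nothing:

```python
def run(stmt, env):
    """Execute a statement tree; return the label of the plain statement
    reached, or None if nothing is executed."""
    if isinstance(stmt, str):
        return stmt                      # a plain statement like "S1"
    if stmt[0] == "if":
        _, cond, body = stmt
        return run(body, env) if env[cond] else None
    _, cond, then_s, else_s = stmt       # stmt[0] == "if-else"
    return run(then_s if env[cond] else else_s, env)

# The two parse trees of: if E1 if E2 S1 else S2
inner = ("if", "E1", ("if-else", "E2", "S1", "S2"))   # else with the inner if
outer = ("if-else", "E1", ("if", "E2", "S1"), "S2")   # else with the outer if
env = {"E1": True, "E2": False}
print(run(inner, env), run(outer, env))  # S2 None
```

C resolves this by fiat in favour of the first tree (the else binds to the nearest if), which is exactly what the unambiguous grammar below enforces syntactically.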
So, both are correct as far as the language is concerned, but the grammar generates two parse trees for the same sentence, and therefore it is ambiguous. Now look at the same sentence and let us see how it is handled by the unambiguous version of the same grammar. By the way, let me add that it is impossible to convert ambiguous grammars into unambiguous grammars in any automatic fashion. In certain cases we will be able to rewrite, or write another grammar, such as the ones we have shown, in which the second grammar is unambiguous, but this cannot be automated; it is an undecidable problem. So, here is our unambiguous grammar; I have written S for statement and E for expression just to write a little less. The rules are S → if E S, otherwise S → if E MS else S, otherwise other statements. What is a matched statement? MS always generates a matched if-then-else: MS → if E MS else MS, otherwise other statements. The beauty of this grammar is that a matched statement always generates an if-then-else with matched statements inside, so the question of whether the else belongs to the outer if or the inner if is eliminated here. If you want to generate an else, it must be generated from the rule S → if E MS else S, which means the else corresponds to that if directly. If we are not generating any else, then we can use S → if E S, but once we have generated an MS, the MS inside cannot generate a plain if-then statement, because that is the one which gives rise to ambiguity here; it always generates only matched if-then-else statements. So, in our case we have a statement here, and it generates if E1 stmt, and this inner statement in turn generates the if-then-else statement.
So, there is no other possibility for this particular sentence. It is not possible to use the alternative S → if E MS else S at the outer level, because then the else would have belonged to the outer if, but the inner if-then without an else could not have been generated, since MS always generates matched if-then-else statements. So, this is the only derivation tree possible for this particular sentence. Let us now take up a fragment of a C grammar. This is a fairly large example, but it is still only a fragment; the entire C grammar is much bigger. I have chosen very important parts of the language to show you in this grammar. A main program is denoted by the non-terminal program, and as you know, it starts with the words void main, then the two parentheses, followed by a body which is called a compound statement. A compound statement in general could be empty, that is, just two curly braces; or it could be a curly brace followed by a list of statements followed by the closing curly brace; or it could have a curly brace followed by some local declarations and then a list of statements. All three possibilities exist. The declarations are our local variables, which hold only within this particular block. What is a statement list? It is either a single statement or a list of statements followed by a statement. This mechanism is used to generate as many statements as we want: the statement list can go on generating individual statements if we apply the recursive production again and again. Say the first time we apply statement-list → statement-list statement, we have generated one statement; the second time we apply it, we generate a second statement; then we could apply the same production to generate a third, and so on; and once we have generated the required number of statements, we stop with the terminating production statement-list → statement. And what is a statement?
It is either a compound statement, an expression statement, an if statement or a while statement. There are many other statements in C, but we will confine our attention to just these to show how grammars are written; the others can be added very easily. We have seen compound statements already, so let us see what exactly an expression statement is. An expression statement could be just a semicolon, that is, a null body, or it could be an expression followed by a semicolon; any expression in C is a statement as well, and that is the reason why this is expression followed by semicolon. The non-terminal expression is expanded later. For the if statement, obviously, we have seen the grammar already: if expr stmt or if expr stmt else stmt. This is ambiguous, but for our purposes it does not matter, because we already know how to write an unambiguous grammar for this particular syntax. A while statement is while expr stmt, and now we come to expression. An expression is an assignment expression, or an expression comma an assignment expression; again, this can be used to generate as many assignment expressions as necessary. The assignment expression shows how to break the ambiguity. At the highest level we have logical-OR expressions, or otherwise a unary expression, an assignment operator and an assignment expression. Let us take the logical-OR expression: the logical-OR expression goes to a logical-AND expression, the logical-AND expression goes to an equality expression, the equality expression goes to a relational expression — let me go further and then come back — the relational expression goes to an additive expression, and the additive expression goes to a multiplicative expression, with *, /, +, -, et cetera. What I wanted to show you is that if you write some expression, say a || b and so on, there is a unique way of parsing it, because this expression grammar happens to be unambiguous.
However, in place of || we could always have used + and operators of that kind; that is something we really cannot avoid so easily, because it is a semantic matter. So, let us go further. The assign-op is either plain assignment (=) or one of the compound assignments *=, /=, +=, -=, &=, |=; these are all assignment operators. Unary expressions are primary expressions, or a unary operator followed by a unary expression, where the unary operator is one of +, - and !. A primary expression is some kind of terminator: an id, a num, or a parenthesized expression.

The production for the logical-OR expression generates logical-OR expression, OR-op, logical-AND expression. This shows that || sits at the higher level and && at the lower level, so && gets higher precedence than ||. The logical-AND expression gives you logical-AND expression, AND-op, equality expression, so equality gets higher precedence than &&. The equality expression has the EQ-op and NEQ-op, that is == and !=, and these get lower precedence than the relational operators, which are here <, >, <= and >=. Then comes the additive expression: + and - get higher priority than all of these, and the multiplicative expression has * and /, which get higher priority than + and -. I just wanted to show that writing a grammar for a fairly large language is a very, very non-trivial exercise, and one has to pay a lot of attention to the precedence of operators and many other details.

Then we have declarations, which are generated by a declaration list followed by a declaration, with a single declaration as the terminating part of the declaration list. Each declaration has a type followed by a number of ids; the id list is used to generate the ids, and the type can be integer or float or character. Many other possibilities exist, but I just wanted to show you a few samples here.
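The same layering gives the familiar arithmetic precedence. A tiny evaluator (again my own sketch, with names not taken from the slides) over just + , - , * and / shows the multiplicative level binding tighter:

```python
def eval_add(toks):
    """add_expr -> mul_expr (('+'|'-') mul_expr)*"""
    val = eval_mul(toks)
    while toks and toks[0] in '+-':
        op = toks.pop(0)
        rhs = eval_mul(toks)               # the whole mul_expr binds first
        val = val + rhs if op == '+' else val - rhs
    return val

def eval_mul(toks):
    """mul_expr -> num (('*'|'/') num)*"""
    val = int(toks.pop(0))
    while toks and toks[0] in '*/':
        op = toks.pop(0)
        rhs = int(toks.pop(0))
        val = val * rhs if op == '*' else val / rhs
    return val

print(eval_add(['2', '+', '3', '*', '4']))  # 14, not 20
```

Because eval_add calls down into eval_mul for each operand, 3 * 4 is grouped before the +, purely as a consequence of which non-terminal sits at which level.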
So, that is about grammars, with a large example as well. Next, let us look at some of the machines which can be used to implement parsers. The push-down automaton is a machine which can be automatically derived from a given grammar, and it can be used to parse context-free languages, the languages specified by context-free grammars.

Let us understand how it works. This is a stack-based system, and it is very similar to a finite state machine. In a finite state machine we had Q, Sigma, delta, q0 and F, and the meanings of those are exactly the same here: Q is a finite set of states, Sigma is the input alphabet, q0 is the start state, F is the set of final states and delta is the transition function. Only the meaning of the transition function is different; we will come back to that very soon. In addition, there is Gamma, the stack alphabet: I told you that there is a stack, and the symbols which can be stored on the stack form the stack alphabet, again a finite alphabet. And Z0 is the symbol with which the stack is initialized, the start symbol on the stack.

And what is delta? A typical entry has a state as the first parameter, the input symbol as the second parameter, and the top-of-stack symbol as the third parameter. Given this combination, the machine can go to any of a number of states and rewrite the stack using gamma1, gamma2, etcetera: if it goes to the state p1 it writes gamma1 onto the stack, if it goes to p2 it writes gamma2, and so on. This is the non-deterministic version of a push-down automaton. A finite state machine of the non-deterministic variety could likewise go into any of the states mentioned on the right-hand side of delta; here, along with the state, we also say what stack rewriting is to be carried out.
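In code, the seven-tuple can be written down directly. This is just an illustrative container (the field names are mine), with delta mapping a (state, input-symbol-or-epsilon, stack-top) triple to a set of (state, push-string) pairs:

```python
from dataclasses import dataclass

@dataclass
class PDA:
    states: set        # Q, a finite set of states
    input_alpha: set   # Sigma, the input alphabet
    stack_alpha: set   # Gamma, the stack alphabet
    delta: dict        # (q, a or '' for epsilon, Z) -> {(p, gamma), ...}
    start: str         # q0, the start state
    start_stack: str   # Z0, the initial stack symbol
    finals: set        # F, the set of final states

# One non-deterministic entry would look like:
#   delta[('q', 'a', 'Z')] = {('p1', 'g1'), ('p2', 'g2')}
```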
The important thing is that the top stack symbol is replaced by gamma1 or gamma2, etcetera, and the input is advanced by one symbol. We are going to see examples of how this works. The leftmost symbol of gamma_i, which replaces the top symbol of the stack, becomes the new top of stack: if gamma_i is abc, then a will be the new top-of-stack symbol. The input symbol a in the function delta could also be epsilon; that is why the definition says delta maps Q x (Sigma union {epsilon}) x Gamma to finite subsets of Q x Gamma*, that is, to pairs of a state and a string of stack symbols. If a is epsilon, the input symbol is not used and the input head is not advanced; otherwise, whenever there is a non-epsilon symbol, the input is read and the input head is advanced.

We define the language accepted by the machine M by final state, and there is another variety, the language accepted by M by empty stack. The first is very straightforward: it is the set of all terminal strings w such that, starting from the initial state q0 with the entire input unread and the stack start symbol Z0 on the stack, a series of moves takes us to a configuration with state p, empty input, and some string gamma on the stack. The gamma is not very relevant; the important thing is that p is a final state and the input is empty. If this happens, the string is in the language. If acceptance is by empty stack, then starting from the same initial configuration (q0, w, Z0), we reach the configuration (p, epsilon, epsilon), where p is not necessarily a final state, but the input is empty and the stack is empty. In such a case we say the automaton accepts by empty stack, and since the final state is not relevant, we can set F to be empty.
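The two acceptance criteria can be written compactly, with the turnstile denoting a sequence of moves between configurations:

```latex
L(M) = \{\, w \in \Sigma^* \mid (q_0, w, Z_0) \vdash^* (p, \varepsilon, \gamma),\; p \in F \,\} \quad \text{(acceptance by final state)}
```
```latex
N(M) = \{\, w \in \Sigma^* \mid (q_0, w, Z_0) \vdash^* (p, \varepsilon, \varepsilon) \,\} \quad \text{(acceptance by empty stack)}
```

In the first definition the leftover stack contents gamma play no role; in the second, the state p plays no role.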
Let us take a simple example: L = { 0^n 1^n | n >= 0 }. The machine corresponding to it has states q0, q1, q2, q3; the input alphabet is {0, 1}; the stack symbols are Z and 0; delta is defined below; q0 is the start state, Z is the start stack symbol, and q0 is also the final state. Recognizing this language is very easy: push all the 0's onto the stack, and once a 1 appears in the input, start cancelling, popping one 0 from the stack for each 1; if we reach a point where the input is empty and the start-of-stack symbol has been exposed, we accept.

So, delta(q0, 0, Z) = (q1, 0Z): on input symbol 0 with the start symbol on the stack, we go to the state q1 and push 0 onto the stack; remember this 0 is the new top of stack. Next, delta(q1, 0, 0) = (q1, 00): in state q1, on a 0 in the input and a 0 on the stack, we stay in q1 and push the new 0, with the old 0 still present underneath. In state q1 we go on pushing the 0's. Once we reach a 1, the middle part, we start popping: delta(q1, 1, 0) = (q2, epsilon), and then delta(q2, 1, 0) = (q2, epsilon), so in state q2 we go on popping the stack, cancelling the input against the stack symbols. And in state q2, if we exhaust the input and see the start stack symbol Z, then delta(q2, epsilon, Z) = (q0, epsilon): we enter the state q0 with epsilon written to the stack, that is, with an empty stack. That means we have accepted the input.

Here is a possible sequence of moves: (q0, 0011, Z) to (q1, 011, 0Z) to (q1, 11, 00Z) to (q2, 1, 0Z) to (q2, epsilon, Z) to (q0, epsilon, epsilon). We have pushed a 0, pushed a second 0, popped the first 0 against the first 1, popped the second 0 against the second 1, and now we have reached a configuration with nothing on the stack and nothing in the input.
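This machine is small enough to simulate directly. Here is a sketch (mine, not from the lecture) that stores the moves above as a dictionary and accepts by empty stack; the extra entry for (q0, epsilon, Z) is my own addition so that the empty string, the n = 0 case, is also accepted:

```python
# Transitions of the 0^n 1^n machine; '' stands for an epsilon input.
delta = {
    ('q0', '0', 'Z'): ('q1', '0Z'),   # push the first 0
    ('q1', '0', '0'): ('q1', '00'),   # keep pushing 0's
    ('q1', '1', '0'): ('q2', ''),     # first 1: start popping
    ('q2', '1', '0'): ('q2', ''),     # pop one 0 per 1
    ('q2', '', 'Z'):  ('q0', ''),     # input exhausted: empty the stack
    ('q0', '', 'Z'):  ('q0', ''),     # assumed extra move: accept n = 0
}

def accepts(w, delta):
    """Accept by empty stack: True iff input and stack empty out together."""
    state, stack, i = 'q0', ['Z'], 0   # top of stack = last list element
    while stack:
        top = stack.pop()
        if i < len(w) and (state, w[i], top) in delta:
            state, push = delta[(state, w[i], top)]
            i += 1                     # input head advances on a real symbol
        elif (state, '', top) in delta:
            state, push = delta[(state, '', top)]   # epsilon move
        else:
            return False               # no move defined: reject
        stack.extend(reversed(push))   # leftmost pushed symbol = new top
    return i == len(w)

print(accepts('0011', delta))  # True
print(accepts('001', delta))   # False: a 0 is left on the stack
print(accepts('010', delta))   # False: input is left over
```

This simulator is deterministic (it tries a real input move before an epsilon move), which is enough here because the machine above never has two competing moves for the same input.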
Therefore, this is acceptance by empty stack. Here is an illegal input, 001: we push the first 0, push the second 0, and pop one 0 against the 1, but then there is nothing more to do; the configuration is (q2, epsilon, 0Z), and q2 has no move defined on epsilon with 0 on top of the stack, so we end up in an error. Here is another illegal input, 010: we push a 0, then pop it against the 1, reaching (q2, 0, Z); we are left with non-empty input and only Z on the stack, and q2 has no move on the input 0, so this is an error situation. So, we will stop here and continue with push-down automata in the next lecture. Thank you.