Welcome to lecture number 7, parsing part 7. So far we have seen various topics in syntax analysis, including context-free grammars, pushdown automata, LL parsing, recursive descent parsing and parts of LR parsing. Today we will continue with LR parsing and also look at the yacc parser generator. To do a bit of recap, we discussed LR(1) items and the construction of LR(1) item sets in the last lecture. We defined two operations on item sets. Closure includes all items B → ·γ, b, for every production B → γ and every b in FIRST(βa), whenever there is an item of the form A → α·Bβ, a in the item set. If you take this example, S' → S and S → aSb | ε, we begin with the initial item S' → ·S, $, and then because of this S we add S → ·aSb, $ and then S → ·, $, and so on. We also defined another operation, goto(I, X), where I is a set of LR(1) items and X is a grammar symbol, either a terminal or a non-terminal. If there is an item of the form A → α·Xβ, a, we advance the dot beyond the symbol X; that gives us the item A → αX·β, a, and after these items are obtained we also take the closure. For example, from the state containing S → ·aSb, $, on a we get the state containing S → a·Sb, $; taking the closure adds the two items S → ·aSb, b and S → ·, b, and similarly in state 4 as well. Repeated application of advancing the dot gives us several goto states, and taking closures adds more items into those states. To put these two operations together, we define the procedure items(G'), which computes the collection of sets of LR(1) items for the augmented grammar G'.
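The closure operation just described can be sketched in a few lines of Python. This is a toy rendering, not the lecture's code: the item representation (head, body, dot position, lookahead) and the names GRAMMAR, first_seq and closure are my own.

```python
GRAMMAR = {"S'": [("S",)], "S": [("a", "S", "b"), ()]}   # () is the epsilon body

def first_seq(seq):
    """FIRST of a sequence of grammar symbols (assumes no left recursion)."""
    out = set()
    for sym in seq:
        if sym not in GRAMMAR:              # a terminal begins every derivation here
            out.add(sym)
            return out
        for body in GRAMMAR[sym]:
            if body:
                out |= first_seq(body)
        if () not in GRAMMAR[sym]:
            return out                      # sym cannot vanish, so stop here
    return out

def closure(items):
    """Add [B -> . gamma, b] for each [A -> alpha . B beta, a], b in FIRST(beta a)."""
    items, work = set(items), list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for b in first_seq(body[dot + 1:] + (la,)):
                for prod in GRAMMAR[body[dot]]:
                    if (body[dot], prod, 0, b) not in items:
                        items.add((body[dot], prod, 0, b))
                        work.append((body[dot], prod, 0, b))
    return frozenset(items)

I0 = closure({("S'", ("S",), 0, "$")})      # the initial item [S' -> . S, $]
for item in sorted(I0, key=str):
    print(item)
```

For this grammar the initial closure contains exactly the three items the lecture lists: S' → ·S, $; S → ·aSb, $; and S → ·, $.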
We begin with the closure of the initial item S' → ·S, $, then apply the goto operation repeatedly and collect the resulting sets into the collection C. Each set in C corresponds to a state of the LR(1) DFA, and this is the DFA that recognizes the viable prefixes; this is what we learnt in the last lecture. Construction of the LR(1) parsing table is now quite straightforward. The table has two parts, action and goto. For the action part: whenever A → α·aβ, b is an item in state i, for a terminal a, and state j contains A → αa·β, b, we set action[i, a] to shift j. Similarly for reduce: whenever there is an item of the form A → α·, a in state i, we make action[i, a] a reduction by A → α. Then we add the accept entry, the goto entries, and error. S-R and R-R conflicts in the table imply the grammar is not LR(1). For the goto table, whenever we have a non-terminal A after the dot in state i and state j contains the item with the dot advanced past A, we set goto[i, A] = j. All other entries are error. Here is another example, with its grammar. This grammar is not SLR(1), but it is definitely LR(1). Let me demonstrate on this particular state how exactly it is constructed; all the others are similar. S' → ·S, $ is the initial item. Because of this S, the closure adds all the productions of S as initial items: S → ·L = R, $ and S → ·R, $. Because of the L and R, we add the items of L and the item of R as well; R gives another item, R → ·L, $, whereas here we had S → ·R, $.
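Putting closure and goto together, the items(G') procedure can be sketched as follows. This is a self-contained toy with invented names; for the grammar S' → S, S → aSb | ε it yields exactly eight states, matching the states 0 through 7 the lecture refers to below.

```python
GRAMMAR = {"S'": [("S",)], "S": [("a", "S", "b"), ()]}
SYMBOLS = {"S", "a", "b"}

def first_seq(seq):
    out = set()
    for sym in seq:
        if sym not in GRAMMAR:              # terminal
            out.add(sym)
            return out
        for body in GRAMMAR[sym]:
            if body:
                out |= first_seq(body)
        if () not in GRAMMAR[sym]:
            return out
    return out

def closure(items):
    items, work = set(items), list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for b in first_seq(body[dot + 1:] + (la,)):
                for prod in GRAMMAR[body[dot]]:
                    if (body[dot], prod, 0, b) not in items:
                        items.add((body[dot], prod, 0, b))
                        work.append((body[dot], prod, 0, b))
    return frozenset(items)

def goto(I, X):
    """Advance the dot past X wherever possible, then take the closure."""
    moved = {(h, b, d + 1, la) for (h, b, d, la) in I if d < len(b) and b[d] == X}
    return closure(moved) if moved else None

# items(G'): start from closure of [S' -> . S, $], apply goto until nothing new.
C = {closure({("S'", ("S",), 0, "$")})}
work = list(C)
while work:
    I = work.pop()
    for X in SYMBOLS:
        J = goto(I, X)
        if J and J not in C:
            C.add(J)
            work.append(J)

print(len(C), "LR(1) states")
```

Each frozenset in C is one state of the LR(1) DFA that recognizes viable prefixes.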
Here there are two sets of items corresponding to the productions of L. These two were added because of this item, and their lookahead is =, because what follows L in this item is =; whereas here what follows L is ε, so only $ comes into the picture, and we have L → ·*R, $ and L → ·id, $. Observe that there are two items with L → ·id but with different lookaheads. This is quite possible; it is not necessary to have a single item with a single lookahead corresponding to a production in any particular state. State number 2 was giving us a conflict in the SLR(1) case. Here it does not give a conflict, because the reduction using the production R → L happens only on $, whereas the shift happens on =. This is possible because we have been carrying the local follow, the = that follows L, in the items as well. This makes sure that there is no possibility of a conflict in this state. You should also observe that there are many states with items that have more than one lookahead, for example R → ·L with both = and $. These two items could also have been written in a similar manner, as L → ·id, =/$, and so on. This grammar does not have any conflicts in any state, and it is therefore LR(1). Now we have seen two examples of LR(1) grammars, so it is time to go ahead and see if there are grammars which are not LR(1). There is a very famous theorem which says that the deterministic context-free languages have LR(1) grammars, and all LR(1) grammars generate only DCFLs. But here we have the grammar S' → S, S → aSb, S → ab, S → ε.
Unfortunately, this grammar is ambiguous. The reason for the ambiguity is that we can go on producing as many a's and b's as we want and then stop using either S → ab or S → ε; both are equally possible. For example, for the simple string ab we can derive S' → S and then S → ab, or we can derive S' → S, S → aSb, S → ε. So two parse trees are possible. All ambiguous grammars fail the LR(1) test, and so does this grammar. The construction of the sets of items is left as homework. When we try to fill the parsing table, one entry gets both a shift s3 and a reduction by S → ε, and another entry gets both a shift s9 and the same reduction by S → ε. So clearly there is a conflict between shift and reduce here, and a conflict there as well, and this grammar is therefore not LR(1). The next class of grammars is the LALR(1) grammars and the corresponding LALR(1) parsers. What has been observed? Let me show you a picture before we discuss further. What has been observed when we construct the LR(1) sets of items is that many states have the same set of items but different lookaheads. In this case, state 0 of course does not have any other state with the same set of items and different lookaheads, and state 1 also does not have any, but states 2 and 4, if you observe, have the same core items with different lookaheads. This part without the lookahead is called the kernel. So the kernel of states 2 and 4 is the same and the lookaheads are different; similarly, states 3 and 6 have the same kernel, S → aS·b, but different lookaheads, and exactly ditto for states 5 and 7: they have the same kernel, S → aSb·, but different lookaheads.
Why are we looking at all this? LR(1) parsers have a very large number of states. For a C grammar, for example, there are many thousands of states, whereas an SLR(1) or LR(0) parser for C will have just a few hundred states, but with many conflicts, so that parser is not usable. There is a class of grammars between SLR(1) and LR(1) called lookahead-LR(1), or LALR(1). The advantage of these parsers is that they have the same number of states as SLR(1) parsers for the same grammar, and they are derived from LR(1) parsers. They will not have as many conflicts as the SLR(1) parsers: the SLR(1) parser may have many conflicts, but the LALR(1) parser will have very few, and if the LR(1) parser had no shift-reduce conflict, the corresponding derived LALR(1) parser will also have none. For R-R conflicts this is not true; we will see more of this later. LALR(1) parsers are very compact, as compact as SLR(1) parsers, and are almost as powerful as LR(1) parsers. Theoretically they are not the same as LR(1); they sit strictly between SLR(1) and LR(1), and fortunately most programming language grammars happen to be LALR(1) if they are indeed LR(1). How exactly is the construction of LALR(1) parsers done? Let me again demonstrate it using the same picture. What we really do is take the states with the same kernel and merge them into a single state: 2 and 4 become a new state called 24, 3 and 6 become a new state called 36, and 5 and 7 become a new state called 57. From the LR(1) DFA and the LR(1) sets of items we get the new LALR(1) sets of items and the set of states for the DFA. What happens to the lookaheads? We are going to merge the lookaheads as well.
In the new state 24, the items are going to be S → a·Sb, b/$, S → ·aSb, b and S → ·, b. The lookaheads are merged: we do not repeat items that have the same lookahead, but for this particular item there is a $ here and a b there, so we add both items into the set. The lookahead is either b or $, two items are present, and that is denoted S → a·Sb, b/$. Similarly, states 5 and 7 become the new state 57 with the item S → aSb·, $/b, and 3 and 6 merge into the new state 36 with the item S → aS·b, $/b. This is the new DFA, and from it we construct the LALR(1) table. Intuitively, the construction can be demonstrated by merging rows. Suppose we have the LR(1) parser table, and we have merged the states into 24, 36 and 57. In the LR(1) parsing table, the rows corresponding to 2 and 4 are now combined. Both of them have s4, so the combined row has a shift action, and since state 4 has become part of 24, the entry becomes s24; there is also a reduce action, and that reduce action is the same in both 2 and 4, so we keep that reduction. Then 3 and 6 obviously merge to 36, and the same thing happens with their rows: s5 and s7 become the new action s57, because 5 and 7 have merged, and the rest remains the same. So 3 and 6 have been merged, 5 and 7 have been merged, and the number of rows reduces.
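The merging step can be sketched end to end: build the LR(1) collection for the toy grammar, group the states by core, and union the lookaheads per item. This is a self-contained sketch with invented names, not the lecture's code.

```python
GRAMMAR = {"S'": [("S",)], "S": [("a", "S", "b"), ()]}
SYMBOLS = {"S", "a", "b"}

def first_seq(seq):
    out = set()
    for sym in seq:
        if sym not in GRAMMAR:
            out.add(sym)
            return out
        for body in GRAMMAR[sym]:
            if body:
                out |= first_seq(body)
        if () not in GRAMMAR[sym]:
            return out
    return out

def closure(items):
    items, work = set(items), list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for b in first_seq(body[dot + 1:] + (la,)):
                for prod in GRAMMAR[body[dot]]:
                    if (body[dot], prod, 0, b) not in items:
                        items.add((body[dot], prod, 0, b))
                        work.append((body[dot], prod, 0, b))
    return frozenset(items)

def goto(I, X):
    moved = {(h, b, d + 1, la) for (h, b, d, la) in I if d < len(b) and b[d] == X}
    return closure(moved) if moved else None

C = {closure({("S'", ("S",), 0, "$")})}     # the canonical LR(1) collection
work = list(C)
while work:
    I = work.pop()
    for X in SYMBOLS:
        J = goto(I, X)
        if J and J not in C:
            C.add(J)
            work.append(J)

# LALR(1) merge: group LR(1) states by core, then union the lookaheads per item.
groups = {}
for I in C:
    groups.setdefault(frozenset(it[:3] for it in I), []).append(I)

lalr_states = []
for same_core in groups.values():
    las = {}                                # core item -> merged lookahead set
    for I in same_core:
        for h, b, d, la in I:
            las.setdefault((h, b, d), set()).add(la)
    lalr_states.append(las)

print(len(C), "LR(1) states merge into", len(lalr_states), "LALR(1) states")
```

For this grammar, eight LR(1) states collapse into five LALR(1) states, and the merged item S → a·Sb carries the lookahead set {b, $}, exactly the b/$ notation used above.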
What really happens is this: we merge the states with the same core, along with the lookahead symbols, and rename them. The construction of the parser action and goto tables, as I just demonstrated, takes place like this: merge the rows of the parser table corresponding to the merged states, replacing the old names of the states by the corresponding new names for the merged states. The advantage of this is that the LALR(1) parser has fewer states. You can observe that we merge these states irrespective of the lookahead, so we really get the same set of states as the SLR(1) or LR(0) parser: whenever the kernel is the same, we merge all those states and their lookaheads as well. So we really have states with unique kernels, which is nothing but the SLR(1) or LR(0) parser, but the lookahead is also present. Unfortunately, the lookahead is not as strong as in the LR(1) case, but fortunately it is also not as weak as in the SLR(1) case. So this parser is stronger than SLR(1) but weaker than LR(1). What is the effect of this? We will see now: the effect is on error detection. Let us go through these parser steps carefully. The action of the LALR(1) parser is identical to that of the LR(1) parser whenever the string is accepted. For example, on input ab$, the parser does a shift in the LR(1) case and a shift in the LALR(1) case as well, with the new state being 2 here and the corresponding state 24 there. Then there is a reduction in both of them, then again a shift in both, then a reduction in both, and finally an accept here and an accept there as well.
In this example, for the wrong string aa$, there is a shift, a shift and an error, and the same is true for the other parser as well. But in certain cases, for erroneous inputs, for example aab$, the LR(1) parser does a shift, then a shift, then a reduce, then a shift, and then it finds out there is an error on $. The LALR(1) parser does a shift, another shift exactly as in the LR(1) case, then a reduction exactly as in the LR(1) case, then another shift, again exactly as in the LR(1) case. Now the LR(1) parser declares an error, whereas the LALR(1) parser says there is a reduction possible here, by S → aSb; so it reduces and then declares an error. This is the difference between the LR(1) parser and the LALR(1) parser. In the case of correct inputs, the number of steps is identical to that of the LR(1) parser, but in the case of erroneous inputs it is possible that the LALR(1) parser performs more reductions than the LR(1) parser before the error is declared. But it is certain that no extra shift is carried out by the LALR(1) parser. In other words, the wrong symbol which is causing the error will never be shifted onto the stack; there may be a few more erroneous reductions, but never a shift. That means the error point remains the same in both parsers; it never moves to the next symbol erroneously. Now let us look at the characteristics of LALR(1) parsers. The most important property is: if an LR(1) parser has no shift-reduce conflicts, then the derived LALR(1) parser will also have none. This is actually quite easy to prove, so let us go through the simple proof. Both the LR(1) and LALR(1) parser states have the same core items, or kernel; the lookaheads of course may not be the same in each of the LR(1) states, but the kernel will be the same, and we merge them.
If an LALR(1) parser state s1 has a shift-reduce conflict, then it has a reduce item, say A → α·, a, and a shift item, say B → β·aγ, c: the reduction happens on a and the shift also happens on a. If these two items are in the state s1, they came into s1 from, let us say, some LR(1) state s1'. Now s1' has the same core, that is, A → α· and B → β·aγ, but with possibly different lookaheads. If state s1' had A → α·, a, it also must have the item B → β·aγ, possibly with a different lookahead c. If this happens, the state s1' had exactly the same shift-reduce conflict in the LR(1) parser, because the shift on a does not depend on the lookahead c. So if the LALR(1) parser has a shift-reduce conflict, the LR(1) parser would also have had the same shift-reduce conflict. Merging states to produce a compact LALR(1) parser will not increase the number of shift-reduce conflicts. But this is not so for reduce-reduce conflicts: the LALR(1) parser may have reduce-reduce conflicts even if the original LR(1) parser had none. This is the point. If this were also not true, then the LALR(1) and LR(1) parsers would have had almost the same power; but because LALR(1) is weaker, it is possible that the LR(1) parser is clean while the LALR(1) parser has reduce-reduce conflicts. Fortunately, such grammars are very rare, and it is very difficult to find practical grammars whose LALR(1) parsers have reduce-reduce conflicts while the LR(1) parsers have none. Let me take an example from the dragon book. Constructing the LR(1) sets of items for this grammar is left as an exercise. When you actually construct them, you get two states, one containing A → c·, d and B → c·, e, the other containing A → c·, e and B → c·, d.
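These two states are small enough to merge by hand in code. A sketch using an invented (head, body, dot, lookahead) tuple representation; the two states are written down directly from the lecture's dragon-book example.

```python
# The two LR(1) states with the same core but crossed lookaheads.
s1 = {("A", ("c",), 1, "d"), ("B", ("c",), 1, "e")}
s2 = {("A", ("c",), 1, "e"), ("B", ("c",), 1, "d")}

core = lambda state: frozenset(item[:3] for item in state)
assert core(s1) == core(s2)               # same core, so LALR(1) merges them

merged = {}                               # core item -> union of lookaheads
for item in s1 | s2:
    merged.setdefault(item[:3], set()).add(item[3])

# Two distinct completed productions now share lookaheads: reduce-reduce conflict.
reduce_items = [(c, las) for c, las in merged.items() if c[2] == len(c[1])]
conflicts = {a for (c1, l1) in reduce_items for (c2, l2) in reduce_items
             if c1 != c2 for a in l1 & l2}
print(sorted(conflicts))                  # lookaheads on which both reductions apply
```

The merged state reduces by both A → c and B → c on both d and e, even though neither original LR(1) state had any conflict.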
So the kernel, the core, is the same, and when you merge you get A → c·, d/e and B → c·, d/e. This merged state has a reduce-reduce conflict: this item says reduce, this item also says reduce, and the lookaheads are identical. So this is an example of the LALR(1) construction introducing a reduce-reduce conflict when the original LR(1) parser really had none. Let us then move on to error recovery in LR parsers. Very sophisticated error recovery is difficult in LR parsers; what follows is a practical approach. The compiler writer identifies what are known as major non-terminals, corresponding to the important constructs in a programming language, such as the non-terminals program, statement, block and expression, which generate the important program constructs in the language. Then, for each of these major non-terminals, error productions are added to the grammar. Each error production is of the form A → error α, where A is a major non-terminal and α is a suitable string of grammar symbols, usually terminal symbols. For each of these error productions there is also an error message routine, which prints out a suitable error message. After this modification of the grammar, we build the LALR(1) parser and go ahead with the parser operation. When the parser encounters an error, it cannot proceed further by shifting or reducing; it can do so only up to a particular point. So it stops, goes down the stack starting from the top, and tries to find a state with an item, called an error item, of the form A → ·error α; that is the first step. Then the parser shifts the token error as though it occurred in the input: error is a special token, and since there is a ·error α item, the next input that is expected is the token error.
So even though the error token does not occur in the real input, the parser assumes that the token error has occurred and shifts it onto the stack. If α is ε, then obviously the item is of the form A → ·error, so a reduction by A → error is called for; the parser does that and then prints out the error message associated with it. If α is not ε, then there must be a string here; assume that the symbols in α are all terminals. The parser goes on discarding input symbols until it finds a symbol with which it can proceed, and then it matches all the symbols in α. For example, if the item is A → ·error ;, it skips input symbols until it finds the semicolon, then shifts the semicolon onto the stack, reduces by the production A → error ;, and then proceeds. This is not perfect, and the parser may also abort if it just reaches the end of input. Let me show you an example. Here is a simple grammar: S → rhyme; rhyme → sound place, and we have also added the error production rhyme → error del; sound → ding dong; place → del, plus another error production, place → error del. We construct the LR(1) sets of items and go through the error recovery procedure in this particular parser. The input is ding del, which is obviously erroneous; the correct input is not shown here, because it works well anyway. First the parser reads ding, which takes it to state number 5, and then instead of dong there is del, so there is an error. It pops state number 5, because state number 5 does not have an error item, and goes back to state number 0, which has an error item. Now it shifts error; once it shifts error, the next input symbol is del, which it reads as part of this production. Once it shifts error it goes to state 3, then to state 4 after reading del, and then it reduces by rhyme → error del.
Now it is time for the reduction by S → rhyme, and it accepts. Next, consider ding dong ding del. The parser goes up to the second ding and then there is an error. It has reached state 2, and fortunately 2 has an error item; you can see it here: place → ·error del. So it shifts error onto the stack; then ding is illegal, so it ignores ding, reads del and enters state number 10, then reduces by place → error del and enters state number 7. Now it reduces by rhyme → sound place, and finally it reduces by S → rhyme and accepts. So here, instead of rejecting, it has recovered and accepted the string; what it has really done is skip ding, and here it has assumed the presence of dong. Whereas in the case of just one ding and nothing else, it tries to pop: it pops 5 and goes to 0, which contains an error item, but now the input is $, so the parser aborts; nothing can be done. It is possible to write the error production as rhyme → error instead of rhyme → error del, and then this particular error will go away, but that does not mean the error recovery in other cases will be good. This is the problem with error recovery: it can do a little, but not everything. So let us move on to a practical LR parser generator called yacc. Let me give you a simple example of how yacc works. The specification of yacc is very similar to that of lex. We describe all the tokens that appear in the grammar with a %token directive, we give the start non-terminal with %start rhyme, and then the productions of the grammar are written next: sound place, ding dong, and del. Then there are some support routines which are necessary. This is the basic structure of a yacc specification.
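The pop, shift-error and skip-input steps of the recovery procedure used in these examples can be sketched as follows. The ERROR_ITEMS table and the state numbers are toy stand-ins for the parser's real tables, chosen to mimic the ding/del example above.

```python
# States that carry an error item A -> . error alpha, mapped to their alpha
# (only the first symbol of alpha matters for skipping input in this sketch).
ERROR_ITEMS = {0: ("del",), 2: ("del",)}

def recover(state_stack, tokens, pos):
    """Panic-mode recovery: pop states without an error item, shift the error
    token, then discard input until the first symbol of alpha appears.
    Returns (new_stack, new_pos), or None when the parser must abort."""
    while state_stack and state_stack[-1] not in ERROR_ITEMS:
        state_stack.pop()                  # go down the stack from the top
    if not state_stack:
        return None
    alpha = ERROR_ITEMS[state_stack[-1]]
    state_stack.append("error")            # pretend the token error was shifted
    while pos < len(tokens) and tokens[pos] != alpha[0]:
        pos += 1                           # skip illegal symbols
    if pos == len(tokens):
        return None                        # end of input reached: abort
    return state_stack, pos

# Input ding del: after ding the parser is in state 5 and sees del -> recovers.
print(recover([0, 5], ["ding", "del"], 1))
# Input ding alone: only $ remains after the skip, so recovery fails.
print(recover([0, 5], ["ding"], 1))
```

In the first call the sketch pops state 5, shifts error in state 0 and resumes at del; in the second it runs out of input and gives up, just as the lecture's parser aborts.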
The corresponding lex specification for this yacc example says that ding, dong and del are the three patterns we define, and for each of these it returns the tokens DING, DONG and DEL. Blanks are all skipped; for newline and any other character we just return yytext[0], so that the parser can actually stop: on sound place newline it reduces and stops. Compiling and running the parser is quite straightforward. You run lex, which we know very well, on dingdong.l, run yacc on dingdong.y, and then compile the file which is produced; in the case of lex the file produced was lex.yy.c, and here it is y.tab.c. Then we run the parser. Here are the sample inputs: ding dong del is correct, so it prints that the string is valid; the other two sample inputs are both illegal, so a syntax error is printed out. Let us now go into the details of the parser generator. Yacc has a language for describing context-free grammars, just as lex has a language for describing regular expressions. It generates an LALR(1) parser for the context-free grammar which is described inside the file. The form of a yacc specification, or yacc program, is this: between the markers %{ and %} there are declarations; these are C declarations which are just copied into the output. Then there is a marker %% and another marker %%; between these are the context-free grammar rules, and this is the compulsory part of the yacc specification. After that there are several programs, that is, supporting functions, which are all optional, but usually we require at least one or two to make the parser work. Yacc does not define any regular expressions; it defines only context-free grammar rules, and it uses the lexical analyzer generated by lex to match the terminal symbols of the context-free grammar.
So we need to provide a lex specification and a yacc specification, as I showed you just now, in order to make a parser, and the parser generator yacc generates y.tab.c. Let us look at the details of the specification. Tokens are defined as %token name1 name2 name3 and so on; the start symbol is defined with %start and the name of the non-terminal. Names in rules have the form of a letter followed by letters, digits, dots and underscores, where a letter is either lower case or upper case. In a rule, instead of a right arrow we have a colon: a colon b is part of a rule, then we have an action, then another non-terminal c and another action, followed by the terminator of the production, a semicolon. The production itself is a → b c, and there are some actions which are interspersed and mixed with the right-hand side. There are also values attached to the symbols. The left-hand-side non-terminal has a value which is denoted by $$, whereas the right-hand-side symbols have values $1, $2, $3 and so on, based on position. b is in the first position, so its value will be available in $1; I will explain what these values are a little later. The first action is in the second position, so its value is available in $2; c is the third symbol, so its value is available in $3; and the last action is the fourth, so theoretically its value is available in $4, but there is nothing after it to use it. This is how the specification of a rule looks; we are going to see more examples very soon. In this case there were actions in between, and of course a context-free grammar does not allow actions like this directly. So yacc converts these intermediate actions into dummy productions: it introduces a new non-terminal, $act1 → ε, and attaches this action to it.
Otherwise, actions can be added only at the end of a production. So now, for example, the production becomes a : b $act1 c, where $act1 is the dummy non-terminal produced by yacc, and its action is at the end of its own production. Now a is $$, b is $1, $act1 is $2 and c is $3. What is the value that can be produced by $act1? That is defined here: this is its $$, and on its right-hand side there are no symbols, so no values come in. If you compute a value and assign it to $$, it will be available through the non-terminal $act1 when it is on the right-hand side of some other production. This will become clear as we go on. The intermediate actions cannot refer to the values of symbols to the left of the action: for example, this action cannot use the values of b, and so on. This last action comes after the production is over. Suppose you had another non-terminal d; then the intermediate action could not have used the values of b and c, but since this is the last action, it can definitely use the values which are available to its left in the production. So $2, $3 and so on can be used only when this is the last action in the whole production. Actions are translated into C code which is executed just before a reduction is performed by the parser. This is extremely important: the actions are performed just before the reduction is performed by the parser. Now, the connection between lexical analysis and the parser. The lexical analyzer returns integers as token numbers; that is the normal case, and the token numbers are assigned automatically by yacc, starting from 257, for all the tokens declared using the %token declaration. So if we provide a %token declaration, yacc assumes the token numbers are 257 onwards.
So in the lexical analyzer we never define the token values as such. The parser defines them, and lex.yy.c will assume the values which are defined by the parser. That is how the two interface. Remember, in this simple specification we had said #include "lex.yy.c". We defined the tokens DING, DONG and DEL, but we never mentioned any value; these become 257, 258 and 259, and lex.yy.c also uses DING, DONG and DEL, as you can see here, without defining their values. Since the file lex.yy.c is included in the parser, the definitions of DING, DONG and DEL are automatically available to lex.yy.c. The lexical analyzer can return not only token numbers but also other information, for example the value of a number, the character string of a name, a pointer into a symbol table, and so on; obviously these are all required. The extra values are returned in the global variable yylval, which is known to the yacc-generated parsers. We will see how this works very soon. Then, how are ambiguity, conflicts and disambiguation handled? If you consider the grammar E → E + E | E - E | E * E | E / E | ( E ) | id, this is ambiguous. Apart from the problem between * and +, there is also a problem within + and within - themselves. For example, if you consider E - E - E, or even E + E + E, is this a left-associative operator or a right-associative operator? Should we shift or reduce on minus? That is the question. Even if you take E → E - E, whether we expand this E or that E, we get the same E - E - E; there is no indication of whether we should reduce or shift on the minus terminal. So yacc gives you disambiguating rules.
The default is a shift action if there is a shift-reduce conflict in the parser, and reduce by the earlier rule in the case of reduce-reduce conflicts. We can also specify associativity and precedence explicitly. For example, if you say %right '=', then the = token becomes right associative. Similarly, + and - become left associative, * and / become left associative, and exponentiation becomes right associative. The precedence increases as we go downwards: + and - have the same precedence, but * and / have higher precedence than + and -; exponentiation has the highest precedence and = has the lowest. So precedence increases as we go down the declarations. Then, symbol values: tokens and non-terminals are both stack symbols, and stack symbols can be associated with types, which we declare in what is known as a %union declaration. Let me show you what this is. For example, this is a specification for a desk calculator. We have some of the program declarations here. The symbol table has 20 possible entries; the structure of a symbol table entry is a struct with name, which is a character pointer, and value, which is a double; and the symbol table itself is an array of NSYMS entries. symlook is a function which returns a pointer into the symbol table, a symtab *, and there are of course some library inclusions as well. Here is the %union declaration: double dval and struct symtab *symp. The token NAME takes the value symp, which means it is a pointer into the symbol table, whereas the token NUMBER takes the value dval, which is a double. So this is the way we mention the values taken by the tokens: in the %union declaration, and then we refer to the members in the %token declarations.
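Going back to the precedence declarations for a moment: the way they resolve a shift-reduce conflict can be sketched as a comparison of levels. The table contents mirror the declarations described above, but resolve and its rule (compare the lookahead's precedence with that of the rule's last terminal) are a simplified rendering of yacc's behaviour, not its actual code.

```python
# Precedence levels increase downwards in the declarations, as described above.
PREC  = {"=": 1, "+": 2, "-": 2, "*": 3, "/": 3, "^": 4}
ASSOC = {"=": "right", "+": "left", "-": "left",
         "*": "left", "/": "left", "^": "right"}

def resolve(rule_op, lookahead):
    """Decide a shift-reduce conflict between a rule whose last terminal is
    rule_op (on the stack) and the incoming lookahead token."""
    if PREC[lookahead] > PREC[rule_op]:
        return "shift"                     # incoming operator binds tighter
    if PREC[lookahead] < PREC[rule_op]:
        return "reduce"                    # the handle on the stack binds tighter
    return "reduce" if ASSOC[rule_op] == "left" else "shift"

print(resolve("-", "-"))   # E - E . - ... : left associative, so reduce
print(resolve("+", "*"))   # E + E . * ... : * binds tighter, so shift
print(resolve("^", "^"))   # right associative, so shift
```

This is exactly why E - E - E groups to the left while exponentiation groups to the right.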
These are not the only symbols which can carry values. Then we have the token declarations: POSTPLUS, POSTMINUS, '=', '+', '-', '*', '/', and %right UMINUS for unary minus. The non-terminal expression is given a type with a %type declaration using the union member dval; that is the indication that expressions have values of type double. Here is the context-free grammar along with some of the actions. Let us look at the most important part of it first: here we have the production expression → NAME '=' expression. This desk calculator allows plus, minus, star, slash, and parenthesized expressions, of course, and the negation of an expression. It allows post-plus and post-minus operations on expressions. It allows numbers. It also allows expression values to be placed in the symbol table and then used: in each of these expressions we can use a name as well, so expression → NAME is possible. If the name is already defined in the symbol table, then its value is extracted and used here. If the name is not defined in the symbol table, then of course one could give an error; I am not giving an error here, it just assumes the value 0 and places the name in the symbol table on its first occurrence. So, what about the computations? Look at the four arithmetic productions first, say expression → expression '+' expression. Intuitively it is clear that the value of the left-hand expression should be the sum of the two right-hand expressions: $$ = $1 + $3, where the first expression is $1, '+' is $2, and the second expression is $3. Similarly for minus, star, and slash; we compute them using the usual arithmetic operators. Parenthesization does nothing, so the value is just copied. Negation, of course, negates the value of the expression. Post-plus adds 1 and post-minus subtracts 1. A number by itself does nothing.
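The rules section with the actions just described might be sketched as follows (a sketch modeled on the classic lex & yacc desk calculator; the POSTPLUS/POSTMINUS token names are assumptions from the declarations above):

```yacc
expression
    : NAME '=' expression         { $1->value = $3; $$ = $1->value; }
    | expression '+' expression   { $$ = $1 + $3; }
    | expression '-' expression   { $$ = $1 - $3; }
    | expression '*' expression   { $$ = $1 * $3; }
    | expression '/' expression   { $$ = $1 / $3; }
    | '(' expression ')'          { $$ = $2; }        /* copy the value */
    | '-' expression %prec UMINUS { $$ = -$2; }       /* negation */
    | expression POSTPLUS         { $$ = $1 + 1; }
    | expression POSTMINUS        { $$ = $1 - 1; }
    | NUMBER                      { $$ = $1; }
    | NAME                        { $$ = $1->value; } /* use stored value */
    ;
```

Because NAME carries the symp member and NUMBER carries dval, $1 in the first rule is a struct symtab * while $3 is a double; the %union machinery makes both accesses type-correct.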
The value of the number is available in yylval, and that is copied to $$ automatically. When we actually get to expression → NAME, the symbol-table search that retrieves the value is not shown here; it is actually done by the lexical analyzer itself, as I will show you in a minute. The production NAME '=' expression says that whatever is the value of the expression should be assigned to the name. Again, the introduction of the name into the symbol table is not shown here. It simply says $1->value = $3, where $1 is the pointer into the symbol table for the name; then $$ = $1->value, so $3, the value of the expression, is passed on to the left-hand non-terminal. Now you can see in the lexical analyzer that whenever we find a number, matched by the pattern described here, we read the token text available in yytext using sscanf and place the value of that number in yylval.dval. Note that yylval is of the union type, so it can take either dval or symp; in this case it takes yylval.dval, behaving as if it is a double, and the token returned is NUMBER. In the case of a name, the symbol-table routine does a lookup with this particular string, which is nothing but the name of the identifier: if it finds it, it returns a pointer to it; otherwise it inserts it and then returns a pointer to it. So yylval.symp is set to the pointer into the symbol table, and the token returned is NAME. The rest is quite straightforward: here is the initialization, name = NULL, and here is the support routine which inserts into the symbol table.
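The corresponding lex rules might look like this sketch (patterns follow the classic lex & yacc calculator; treat the exact regular expressions as assumptions):

```lex
%{
#include "y.tab.h"                    /* token numbers and yylval type from yacc */
extern struct symtab *symlook();
%}

%%
([0-9]+|[0-9]*\.[0-9]+)  {
        sscanf(yytext, "%lf", &yylval.dval);   /* number: store its value */
        return NUMBER;
    }
[A-Za-z][A-Za-z0-9]*     {
        yylval.symp = symlook(yytext);         /* name: store symbol-table pointer */
        return NAME;
    }
[ \t]    ;                                     /* ignore white space */
\n|.     return yytext[0];                     /* single-character tokens */
%%
```

The key point is that the same global yylval is written here as either .dval or .symp, matching the %union member declared for the token being returned.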
So, in symlook, while we are within the NSYMS entries of the table: if the entry's name is not null, meaning a name is present at that slot, and the name happens to be exactly the one we want, the routine returns a pointer to that symbol-table entry; if it reaches a free slot, it inserts the name there and returns that particular pointer. Otherwise it falls out of the loop because the number of names that can be put into the symbol table has been exceeded, in which case it issues an error. So, this is how the expression specification works. Now let us look at the error recovery procedure in yacc. In order to prevent a cascade of error messages, the parser remains in the error state until three tokens have been successfully shifted onto the stack. The point is that this is an LALR(1) parser: it goes on shifting and reducing until it enters an error state. It will then obviously try to do some error recovery, but it is known to have completely recovered only when tokens are again shifted successfully onto the stack. Remember, in an error state we never shift symbols unnecessarily onto the stack; we must get out of the error state, and only then will we be able to shift something onto the stack successfully. So, while in the error state, the parser goes on skipping symbols, and until it handles three tokens successfully and shifts them onto the stack it will not issue further error messages, because otherwise the number of error messages would be just too many. If an error happens before that point, no further error message is given and the offending input symbol is quietly deleted. The user again uses the major-non-terminal method: identify the major non-terminals, such as program, statement, and block, and add error productions for them.
So, we add productions such as statement → error and statement → error ';'. Here is an example of how this is done. We declare %token DING DONG DELL, and the start non-terminal is the first one listed. For the start production there is no error message associated with it and no error production either, but alongside the production rhyme → sound place there is an error production rhyme → error DELL, and here an error message is given as well, via yyerror: "message 1: token skipped". For the production sound → DING DONG there is no error production, but alongside place → DELL there is an error production place → error DELL, where the message "message 2: token skipped" is issued. This is the same example as we saw in the case of LR(1) error recovery, and exactly what we had described as error productions there. So, we can write a grammar with error productions in this form, compile it with yacc, and then of course run the parser associated with it; that gives you an error-recovering parser as well. With that we come to the end of the lecture. We will continue with semantic analysis in the next lecture. Thank you.
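Following the description above, the rhyme grammar with its error productions might be sketched as follows (the exact message strings are assumptions):

```yacc
%token DING DONG DELL

%%
start : rhyme ;
rhyme : sound place
      | error DELL  { yyerror("message 1: token skipped"); }
      ;
sound : DING DONG ;
place : DELL
      | error DELL  { yyerror("message 2: token skipped"); }
      ;
%%
```

On a bad input the parser pops states until a rule with the special error token applies, skips input until it can shift DELL, and then continues parsing normally.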