So let me start by looking at the situation where the parser will not make even one wrong shift. We found that the canonical LR parser clearly takes care of the situation that troubled SLR, where $ is in the follow of C: if I give it c*d, which is not in the language, the parser detects the error immediately, and it does not make even a single wrong shift or wrong reduction. On an error, the canonical LR parser never makes a wrong shift move; it catches the error immediately, and therefore, as far as error detection is concerned, this is the most powerful parsing method we have. The problem was that the parse table is simply too large; it can be an order of magnitude larger.

So then we started looking at the LALR parser. LALR comes from "lookahead LR". The name does not say much, because everything here is based on lookahead, but this is the name that was given to it historically and it is not really explained in the literature, so that is what we follow. What is the difference? What we do here is look at similar looking states, and by similar looking states we mean states which have the same kernel, that is, the same core, but different lookahead sets, and then we try to merge them. For example, we had I4 and I7, where the kernel was the same, namely C → d· ; the lookahead in one case was {c, d} and in the other case was {$}. Instead of keeping both, we can merge them, and if we merge them we replace both 4 and 7 by, let us say, a new state I47. What does it consist of? It consists of the same kernel with the lookahead {c, d, $}. Similarly we saw that 3 and 6, and 8 and 9, also form pairs, and if I take all the sets of LR(1) items and merge the ones with the same core, I get sets of items in which the kernels are the same and the lookaheads are the unions of the original lookaheads.

Then we started the discussion: if I now construct an LALR parse table, what will be the size of this table? We came to the conclusion that the size is going to be the same as SLR. The question that came up was whether it will have the same power as SLR or more. It turns out that we can construct grammars which are not SLR but which are LALR; LALR is more powerful than SLR, but less powerful than canonical LR, because we lose some information when we start compacting the tables. So the construction of the LALR parse table is a two step process: first we construct all the sets of LR(1) items, which is the step we went through, and then, for each core that appears among the LR(1) sets, we find all the sets having that same core and replace them by their union. When I do that, I get a collection of sets of LR(1) items, say J0 to Jm, obtained after merging many of the states I0 to In which have the same core. Once you have this, you construct the parse table exactly as we did earlier; there is no difference. The only change is that earlier I had multiple rows for the common kernels with different lookaheads, and now I have a single row. So what is the implication? Each of these J's is actually a union of many item sets, so I have compact parse tables, but at the same time a little less power than I would get with canonical LR. A small sketch of this merge step, for the I4 and I7 of our example, follows this paragraph.
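To make the merge concrete, here is a minimal, hypothetical C sketch of the step just described, applied to the kernels of I4 and I7 from the example; the single-item states and the character-per-lookahead representation are simplifications invented purely for illustration.

```c
/* Sketch: I4 = { [C -> d. , c/d] } and I7 = { [C -> d. , $] } have the same
   core, so the LALR construction merges them into I47 = { [C -> d. , c/d/$] }. */
#include <stdio.h>
#include <string.h>

struct item {
    const char *prod;    /* dotted production, i.e. the core of the item */
    char lookahead[8];   /* lookahead symbols, one character each        */
};

/* merge b into a if the cores (the dotted productions) are identical */
static int merge_if_same_core(struct item *a, const struct item *b)
{
    if (strcmp(a->prod, b->prod) != 0)
        return 0;                          /* different core: keep the states apart */
    for (const char *p = b->lookahead; *p; p++)
        if (strchr(a->lookahead, *p) == NULL)
            strncat(a->lookahead, p, 1);   /* union of the two lookahead sets */
    return 1;
}

int main(void)
{
    struct item i4 = { "C -> d .", "cd" };   /* kernel of state I4 */
    struct item i7 = { "C -> d .", "$"  };   /* kernel of state I7 */

    if (merge_if_same_core(&i4, &i7))
        printf("I47 = [%s , {%s}]\n", i4.prod, i4.lookahead);   /* prints {cd$} */
    return 0;
}
```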
In fact, we can again find examples of grammars which are canonical LR but which are not LALR. Now, when I say something is canonical LR and not LALR, what does that mean in terms of the parse tables, and what kind of problems arise? When we looked at the SLR table, we said a grammar is going to be SLR if — what was the condition — its parse table does not have multiply defined entries; and then we found an example where, when we tried to construct the SLR parser, at least one cell had multiple entries, a shift and a reduce in it. Then we went to canonical LR, and we say the grammar is canonical LR if its table has no multiple entries. But if a grammar is canonical LR and is not LALR, what does that mean in terms of conflicts after the merger? Its LALR table can have multiple entries, but what kind of entries will they be: shift-reduce or reduce-reduce? Can I get a shift-reduce conflict in LALR if the grammar was canonical LR? No, and why is that? Because we merge only states which have the same kernel; the cores are identical and only the lookaheads differ. So the only thing that can happen when I merge these states is a new reduce-reduce conflict; the merge cannot give rise to a shift-reduce conflict. The argument is very simple: if there is a shift-reduce conflict in LALR, then that conflict must already have existed in canonical LR, which would mean the grammar itself was not canonical LR. So if the grammar is canonical LR, then after the merger there can be no new shift-reduce conflicts, only reduce-reduce conflicts, and there definitely are grammars where this happens. Also, since the merged states have the same core, GOTO(J, X), for any grammar symbol X, also has the same core, so the goto function remains well defined; this follows directly from the construction.

So this is how we construct it, and this is how the LALR parse table looks: I have replaced states 3 and 6 by a new state 36, 4 and 7 by a new state 47, and 8 and 9 by a new state 89. It has three fewer states than canonical LR as far as this example is concerned, and exactly the same number of states as the SLR parse table; but the table contents may differ, so let us not even try to compare them unless we actually construct the SLR parser. The only thing we can assert at this point is that the number of states is exactly the same, and nothing more.

Again, looking at some properties of the LALR parser: what happens on an erroneous input? My language is c*dc*d, and suppose I give only c*d; where will the LALR parse table catch that? The c's and the d will get reduced to C, and then the parser will see $; but since this is only the first C it still cannot accept, because the rule says S → CC. So it will not accept; it will perform some additional reductions, but before making any further shift it will catch the error. So this is what happens here: in general, a core is a set of LR(0) items, an LR(1) grammar may produce more than one set of items with the same core, and merging these never produces a shift-reduce conflict but may produce reduce-reduce conflicts. A small driver sketch tracing the error behaviour on c*d follows this paragraph.
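Here is a small, self-contained C sketch of an LALR(1) driver for the grammar S → CC, C → cC | d, with the action and goto tables filled in by hand; the state numbering 0, 1, 2, 36, 47, 5, 89 follows the merged states above, and the hand-built tables are my own reconstruction for illustration. On the erroneous input cd it shifts the two legitimate symbols, performs two reductions, and then reports the error without ever making a wrong shift.

```c
/* Hypothetical LALR(1) driver sketch for  S -> C C ,  C -> c C | d  */
#include <stdio.h>

enum { S0, S1, S2, S36, S47, S5, S89, NST };   /* merged LALR states */
enum { Tc, Td, Tend, NTOK };                   /* terminals c, d, $  */

typedef struct { char kind; int arg; } Act;    /* 's'hift, 'r'educe, 'a'ccept, 'e'rror */

static const Act action[NST][NTOK] = {
    /*             c           d           $       */
    /* 0  */ { {'s',S36}, {'s',S47}, {'e',0} },
    /* 1  */ { {'e',0},   {'e',0},   {'a',0} },
    /* 2  */ { {'s',S36}, {'s',S47}, {'e',0} },
    /* 36 */ { {'s',S36}, {'s',S47}, {'e',0} },
    /* 47 */ { {'r',3},   {'r',3},   {'r',3} },
    /* 5  */ { {'e',0},   {'e',0},   {'r',1} },
    /* 89 */ { {'r',2},   {'r',2},   {'r',2} },
};

static const int goto_S[NST] = { S1, -1, -1,  -1, -1, -1, -1 };
static const int goto_C[NST] = { S2, -1, S5, S89, -1, -1, -1 };

static const struct { char lhs; int len; const char *text; } prod[] = {
    {0,0,0}, {'S',2,"S -> C C"}, {'C',2,"C -> c C"}, {'C',1,"C -> d"},
};

static int tok_of(char ch) { return ch == 'c' ? Tc : ch == 'd' ? Td : Tend; }

int main(void)
{
    const char *input = "cd";          /* erroneous: the language is c*dc*d */
    int stack[100], top = 0, ip = 0;
    stack[0] = S0;
    for (;;) {
        Act a = action[stack[top]][tok_of(input[ip] ? input[ip] : '$')];
        if (a.kind == 's') {
            stack[++top] = a.arg; ip++;
            printf("shift %c\n", input[ip - 1]);
        } else if (a.kind == 'r') {
            top -= prod[a.arg].len;
            stack[top + 1] = (prod[a.arg].lhs == 'S') ? goto_S[stack[top]]
                                                      : goto_C[stack[top]];
            top++;
            printf("reduce by %s\n", prod[a.arg].text);
        } else if (a.kind == 'a') {
            printf("accept\n"); return 0;
        } else {
            printf("error detected at input position %d\n", ip); return 1;
        }
    }
}
```

With the canonical LR(1) tables the same input would be rejected before either reduction, which is exactly the difference discussed above.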
SLR and LALR parsers have the same number of states; both parse tables have exactly the same number of rows. Continuing this discussion: merging can result in conflicts in LALR, and these conflicts are reduce-reduce conflicts. Here is a small argument for why a shift-reduce conflict cannot appear. A shift-reduce conflict would require two LR(1) items of this form in one state: X → α· with lookahead a, which says I am ready to reduce on a, and Y → γ·aβ with lookahead b, which says I can shift a. Now, if such a pair sits in a merged LALR state, remember that the item X → α·, a came from some canonical LR(1) state, and merging only combines states with the same core; so that canonical state also contained the item Y → γ·aβ with some lookahead, which means the same shift-reduce conflict already existed before the merger, and the grammar was not canonical LR to begin with. Therefore merging cannot introduce a shift-reduce conflict. But now look at a situation where I had two states like this: one containing X → α· with lookahead a and Y → α· with lookahead b, and another containing X → α· with lookahead b and Y → α· with lookahead a. After the merger I get a state containing X → α· and Y → α·, each with lookaheads a and b, and this state clearly has a reduce-reduce conflict; earlier this conflict did not exist. So this is what can happen when I construct LALR parse tables; a small code sketch of this situation appears a little further below.

Another important thing to remember is that this is not how LALR parse tables are actually constructed in practice. It is not that we first build the canonical LR(1) sets and then merge them down; there are direct methods for constructing LALR parse tables. These methods are a little more complex and not as conceptually clean, but they are direct, and there are efficient algorithms for them. So a tool which implements construction of LALR parse tables is not going to take you through the canonical LR(1) sets and then merge them; but for us this route is much better, because at least conceptually we understand what the difference between canonical LR and LALR is.

If I look at relative powers, then SLR(1) is less powerful than LALR(1), which in turn is less powerful than canonical LR(1). If I use a lookahead of k, the same thing applies: for a fixed k, SLR(k) is less powerful than LALR(k), which is less powerful than canonical LR(k); there are grammars which can be handled by the stronger parser but not by the weaker one. Similarly, LL(k) is less powerful than LR(k). Why is LL less powerful than LR in general? Somebody says left recursion, but we know how to remove left recursion, so that is not the reason. Think about it: the general argument is that in LL parsing, that is, top-down parsing, what information do I have? I have a non-terminal which I want to expand and k symbols of lookahead, and based on that I must take a decision. What do I have in LR parsing? I have the full stack: my state captures everything I have seen so far, which in LL is not available. In LL all you are saying is "this is the non-terminal I want to expand"; in LR you are also saying "this is everything I have seen", and that extra information is what makes LR parsing more powerful. In general, most programming languages actually fall in the class LALR, so a tool like yacc is not a canonical LR parser; of course there are parser generators available for canonical LR, but for all practical purposes LALR is sufficient as far as programming languages are concerned.
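The reduce-reduce situation described above can be made concrete with a tiny, hypothetical C sketch: the productions, lookaheads and the flat item list below are invented for illustration, and the check simply looks for two different completed items sharing a lookahead in the merged state.

```c
/* The union of the two canonical states {X -> alpha. , a ; Y -> alpha. , b} and
   {X -> alpha. , b ; Y -> alpha. , a} asks for two different reductions on the
   same lookahead symbol: a reduce-reduce conflict created by the merge.        */
#include <stdio.h>
#include <string.h>

struct item { const char *prod; char la; };   /* completed item plus one lookahead */

int main(void)
{
    struct item merged[] = {                  /* the merged LALR state */
        { "X -> alpha", 'a' }, { "X -> alpha", 'b' },
        { "Y -> alpha", 'a' }, { "Y -> alpha", 'b' },
    };
    int n = sizeof merged / sizeof merged[0];

    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (merged[i].la == merged[j].la &&
                strcmp(merged[i].prod, merged[j].prod) != 0)
                printf("reduce-reduce conflict on '%c': %s  vs  %s\n",
                       merged[i].la, merged[i].prod, merged[j].prod);
    return 0;
}
```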
Now, how do I do error recovery in general in LR parsers? Think about it. Let me draw the picture: suppose this is my stack, with a state s on top, and the lookahead symbol is, say, a, and when I consult my parse table I find that the entry for (s, a) is an error entry. How will I recover from this? How did we come up with solutions in the case of a top-down parser, and how can I continue parsing if I reach this configuration?

Suppose I skip some tokens from the input until I reach, say, a symbol b on which I can find an action. What happens in that case? Think about it conceptually; what you are saying is valid, because the parse table tells me that from that point onwards I can continue parsing. Let me give you an example; let me write a piece of program: a = b + c ; p = q r ; x = y + z ; What is missing here is the operator, say a plus, between q and r. So I reach some state where I have seen this part of the input; I am expecting to see an operator, but what I see is an identifier. I could have continued parsing on an operator, so I say: skip this token, skip this, skip this, and so on, until I reach a point from which I can go on, having skipped all of this. Does that make sense? Very good. But how do I do that in terms of the parser? The solution that has been proposed is: if I skip everything up to, say, the semicolon and then continue parsing from there, I will be fine. How do I express that in terms of the parser? Let me try to articulate it. Imagine there were no error here; then this piece would have reduced to, say, a statement, so there would be some state on the stack followed by a statement, and then the whole thing would have been reduced further. What we try to do is create exactly that situation. I identify certain markers; for example, statement is a marker, and I say that if I am parsing a statement and I find an error inside it, then I skip everything up to the end of the statement. That basically means I ignore all these errors, as if the erroneous statement never existed. So first I go down the stack and find the state corresponding to the beginning of the statement; then, once I have discarded the symbols above it, in that state I push the non-terminal statement, and that gives me a valid goto which I can find in my parse table. Once I have done this, I skip everything up to the end of the statement, that is, I start scanning my input and skip everything up to the semicolon, and once I reach it, I continue parsing. A small code sketch of this recovery procedure follows this paragraph.
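Here is a minimal, hypothetical C sketch of that recovery at the level of a statement; the states, the goto table, the one-character tokens and the input string are all invented for illustration, and a real LR parser would of course drive this from its own action and goto tables.

```c
/* Panic-mode recovery sketch: pop to a state with a goto on <statement>,
   push that goto state as if the bad statement had been reduced, then
   discard input up to and including the ';' that ends it, and resume.   */
#include <stdio.h>

#define NSTATES 8

/* goto_on_stmt[s] >= 0 iff state s has a valid goto on <statement> */
static const int goto_on_stmt[NSTATES] = { 5, -1, -1, -1, -1, 6, -1, -1 };

static int stack[100];
static int top = 0;

static const char *input = "q r ; x = y + z ;";   /* rest of the erroneous program */
static int pos = 0;

static int next_char(void)        /* toy scanner: one character per token */
{
    while (input[pos] == ' ') pos++;
    return input[pos] ? input[pos++] : EOF;
}

static void panic_recover(void)
{
    /* 1. pop states until one with a goto on <statement> is exposed */
    while (top > 0 && goto_on_stmt[stack[top]] < 0)
        top--;

    /* 2. push GOTO(s, <statement>), as if the statement had been reduced */
    stack[top + 1] = goto_on_stmt[stack[top]];
    top++;

    /* 3. discard input up to and including the ';' ending the bad statement */
    int c;
    while ((c = next_char()) != ';' && c != EOF)
        ;                          /* discard */

    printf("resuming in state %d at input position %d\n", stack[top], pos);
}

int main(void)
{
    /* pretend the parser got stuck in the middle of a statement, in state 3 */
    stack[0] = 0; stack[1] = 5; stack[2] = 3; top = 2;
    panic_recover();
    return 0;
}
```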
In the parser, therefore, we define the symbols on which we want to synchronize, that is, on which we do error recovery. You can say that the end of a statement is one such point; you can also define the end of a block, or the end of a function; the choice is up to the parser builder. Whoever is building the parser decides what the granularity of the error recovery is going to be. If you want to be very ambitious and say that I will recover at the level of a sub-expression, that can become very complex, so normally what is done is that we pick certain constructs and say that if an error occurs inside one of them, then I skip up to its end. Skipping up to the end of that construct, in terms of the parser, means: on the stack, find a state which has a valid goto on that particular non-terminal, pop everything above that state, push the symbol, take the goto, skip the input until you have reached the end of that statement, and continue parsing from there. So this is what will happen: if there is an error, I skip this whole thing up to the semicolon, that is, I start scanning my input and discard all these symbols; on the stack I pop states until I reach the one with a valid goto, push the non-terminal, take the goto, and then continue parsing from there. Makes sense? I could have done the same thing for a block, or for a function: I can say that whenever there is an error inside a function, I discard everything up to the end of the function. Normally, though, this is kept at the level of a statement.

So, error recovery: if you detect an error, that is, the entry in the action table is found to be empty, then this is what is called panic mode recovery: scan down the stack until you find a state with a goto on some particular non-terminal A, then discard zero or more input symbols until you find a symbol which is in the follow of A, then stack the goto state, basically by pushing the symbol A, and resume parsing. And what is the choice of A? Normally these are non-terminals which represent a major program piece: it could be a statement, it could be a block, it could be a function. So primarily you, as the parser developer, have to decide what error recovery you are going to do and at what level you want to do it, and most parsers will say that if there is an error in a statement, then discard everything up to the end of the statement. Is that clear, both in terms of the concept and in terms of what happens on the stack and the input?

Now, we have parser generators; we do not really write parsers by hand. We understand how parsers work internally, and what we use are parser generators: yacc is one, and bison is another very common one. The source specifications for these are really LALR grammars; if you write your grammar as an LALR grammar, they will generate a parser for it. The format, just to refresh your memory: we have the declarations, followed by the special symbol %%, followed by all the translation rules, followed by %% again and the supporting C routines; this is how the structure of a specification looks. And this is the block diagram: you can see that I am using lex and yacc together; yacc generates a file called y.tab.c, in which there is a #include of lex.yy.c, the scanner that lex generated from the lexical specification. A skeletal specification in this format is shown after this paragraph.
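For concreteness, here is a minimal, hypothetical yacc specification in that three-part format; the grammar and names (parser.y, NUM, expr) are invented for illustration, and the tiny hand-written yylex at the bottom stands in for the #include "lex.yy.c" that the lecture's setup would put there.

```c
/* parser.y -- declarations, %%, translation rules, %%, supporting C routines */
%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}
%token NUM

%%
/* ---- translation rules ---- */
line : expr '\n'        { printf("value = %d\n", $1); }
     ;
expr : expr '+' NUM     { $$ = $1 + $3; }
     | NUM              { $$ = $1; }
     ;
%%
/* ---- supporting routines ----
   In the lecture's setup this section would simply contain
       #include "lex.yy.c"
   so that the lex-generated scanner is compiled into y.tab.c.  */
int yylex(void)
{
    int c = getchar();
    while (c == ' ') c = getchar();
    if (isdigit(c)) { yylval = c - '0'; return NUM; }
    return c == EOF ? 0 : c;
}
int main(void) { return yyparse(); }
```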
When I then use the native C compiler to compile this file, it gives me an executable which can take an input program and parse it. To get a syntax tree you will have to write suitable actions in your specification which build the tree as part of the process of parsing, but even without that the parser will be able to say whether the string belongs to the language or not. One more important thing to remember, from the implementation point of view: there is a variable called YYDEBUG in y.tab.c; it is a #define'd variable, and if you edit this file and change the #define of YYDEBUG to 1, you will find that when you run the parser and something goes wrong it gives you a lot of debugging information. So this is a way to turn on debugging and get all the stack information: every time a symbol is seen, what is being pushed, what is being popped, what is being reduced, all of that comes out as debugging output. That will help you understand what your parser is doing, and in case something is going wrong you will be able to figure it out.

So this is where we will close our discussion on parsing, and here is a reading assignment; if you finish it before your midsem exam it will be helpful. Look at the book by Aho, Lam, Sethi and Ullman, chapters 1 to 4, but you can skip sections 3.6 to 3.9, which basically describe how to go from regular expressions to a finite state machine: how to generate a non-deterministic finite automaton, how to convert it to a deterministic finite automaton, how to minimize it, and so on. I am sure you can go back and look at those on your own, but it will be helpful if you read the rest before the midsem exam. So this is where I close the discussion on parsing, and we will move on to type checking. If you have any questions at this point you can raise them; otherwise I will move on. Any questions about parsing, whether top-down or bottom-up, any comments you may have at this point? Shall we move on to type checking?

A question: if a larger lookahead is more powerful than LR(1), why are we stopping; can we not go to any number of lookahead symbols? There are two reasons we do not do it: one, we do not have the tools; second, most languages do not require a lookahead of more than one, and therefore LR(1), in fact LALR(1), is sufficient to parse most programming languages, which is also why the tools are not there. Even so, there will still be languages which cannot be parsed by this: when you go into domains beyond programming languages, you cannot parse them using LALR, and those who do natural language processing look at other techniques. As far as programming language processing is concerned, we stop at LALR; we do not require anything more powerful.

So let us move on to type checking. Here is what we are trying to do in semantic analysis; this foil is again coming from what we had in the introduction. We want to check the meaning of the program, and we want to report all the errors; we also want to disambiguate all the overloaded operators.
In this example I do not have an instance of overloading, but we discussed enough examples of overloading earlier, where we found that there may be an operator which, depending upon the context, has a different meaning; we want to find out exactly which meaning is intended, for the overloaded operators and also for the overloaded functions, since some languages permit overloaded functions as well. Then we want to do type coercion: in case your expressions permit variables of different types, we want to insert the type conversions there. The static type checking part is that we want to do type checking; we want to do control flow checking, which says, for example, that I cannot jump into the middle of a construct; we want to do uniqueness checking, saying that all the variables in the same scope have unique names; and we want to do name checks, since some languages require that a block carries a name at its beginning and at its end and that the two names are the same; and so on. This list is in no way exhaustive, and it is actually language dependent: depending on the language, you have to work out the list of checks you want to perform.

So this is something which is beyond syntax analysis, and obviously what we are trying to do here was not possible with the formalism we had, because syntax analysis was entirely context free, and we now want to look at errors which are deeper than what syntax analysis can catch. When I look at programs and programming languages and talk about correctness issues, I am not going into the logic of the program; I am still sticking to the language definition. There are issues which are deeper than the syntax: I may permit something in the syntax, and in fact the syntax analyzer always accepts a superset of the language; what we are trying to do now is narrow that down and check whether the program sticks to the semantics of the language as it was defined. Some language features simply cannot be modelled using the context-free grammar formalism. One very common feature is this: if I want to check whether an identifier has been declared before use, I cannot model that with a context-free grammar; the abstraction of this check is a language of the form {wcw | w in Σ*}, and that language is not context free, as you can check.

Going beyond syntax, here are concrete examples. Suppose I declare string x and int y, and then I write an expression which says x = y = x + p. Clearly there is a type error here if I do not permit this addition of a string and an int; but when I parse this, it is not an error, because as far as the parsing rules are concerned all they say is that the right-hand side is an expression and the left-hand side is a variable, and that is fine. A small sketch of this kind of check follows this paragraph.
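As a small illustration, here is a hypothetical C sketch of the kind of check the semantic analyser performs and the parser cannot; it is a simplified variant of the example above (the addition checked is x + y), and the two-entry symbol table, the type names and the rule that '+' needs two ints are all invented for illustration.

```c
/* Semantic checks beyond syntax: declared-before-use and type checking. */
#include <stdio.h>
#include <string.h>

typedef enum { T_INT, T_STRING, T_ERROR } Type;

static const struct { const char *name; Type type; } symtab[] = {
    { "x", T_STRING },      /* string x; */
    { "y", T_INT    },      /* int    y; */
};

static Type type_of(const char *name)
{
    for (unsigned i = 0; i < sizeof symtab / sizeof symtab[0]; i++)
        if (strcmp(symtab[i].name, name) == 0) return symtab[i].type;
    return T_ERROR;                 /* also catches use before declaration */
}

/* in this toy language '+' is defined only on two ints */
static Type check_add(Type a, Type b)
{
    return (a == T_INT && b == T_INT) ? T_INT : T_ERROR;
}

int main(void)
{
    /* the parser accepts "x = x + y" simply as variable = expression;   */
    /* the semantic analyser computes the type of the right-hand side:   */
    Type rhs = check_add(type_of("x"), type_of("y"));
    if (rhs == T_ERROR)
        printf("type error: '+' applied to a string and an int\n");
    return 0;
}
```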