 Welcome to part 2 of syntax analysis. In this part we will continue our discussion on context-free grammars and pushdown automata, and then move on to top-down parsing and bottom-up parsing. In the last lecture we covered context-free grammars and began our discussion of pushdown automata. To do a quick recap: a pushdown automaton M has a finite set of states Q, an input alphabet Σ, a stack alphabet Γ, a start state q0, a start stack symbol Z0, and a set of final states F which is a subset of the set of states Q; and of course there is a transition function δ which describes how the automaton behaves. δ is a mapping from Q × (Σ ∪ {ε}) × Γ to the finite subsets of Q × Γ*. Q is the set of states and Σ is the input alphabet; since the automaton can make moves on ε, that is, silent moves, ε is permitted, and the symbol on top of the stack is also examined before a move is made. A typical move is δ(q, a, Z) = {(p1, γ1), (p2, γ2), …, (pm, γm)}: from state q, on input symbol a with Z on top of the stack, the automaton can move to any one of the states p1, p2, …, pm; in the process it removes the top-of-stack symbol Z and replaces it with the corresponding γi, depending on which state it moves to, and it advances the input by one symbol. The important point is that the leftmost symbol of γi becomes the new top-of-stack symbol. We also defined acceptance of a language by a pushdown automaton. 
 One mode is acceptance by final state, the other is acceptance by empty stack. For acceptance by final state, the machine must start from the start state, move to some final state, and consume the entire input; the stack contents do not matter. For acceptance by empty stack, it starts from the start state and moves to some state, but in the process it empties not only the input but also the stack. The state it ends in is not important, and therefore we sometimes set F = φ (the empty set) for this type of automaton. Here is an example of an automaton accepting the language {0^n 1^n | n ≥ 1}. This is a well-known context-free language which is not regular: the number of 0s equals the number of 1s, and the 1s follow the 0s. The machine starts in q0; on a 0 with the stack containing only the start-of-stack symbol Z, it moves to state q1. In q1 it accepts and pushes all the remaining 0s; when it meets the first 1 it moves to state q2 and removes one 0 from the stack. From then on, in state q2, it keeps popping 0s against 1s until the input is exhausted and only the start-of-stack symbol remains on the stack. In that configuration it moves to q0 and empties the stack. This automaton accepts both by empty stack and by final state, because q0 happens to be a final state as well. Here is a trace on input 0011: from q0, with stack symbol Z, it moves to q1 and one 0 is pushed onto the stack; it stays in q1 and the second 0 is pushed. Now there are no more 0s to push, but there are 1s to be matched, so it moves to q2 and pops one 0; it remains in q2 and pops the other 0 as well. Now the input is exhausted and the start-of-stack symbol is on top, so it moves to q0 and empties the stack: the stack is empty and the input is already empty. 
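The trace above can be checked mechanically. Below is a minimal sketch in Python (my own encoding, not from the lecture) that simulates this PDA move by move; the state names q0, q1, q2 and the stack symbols Z and 0 follow the lecture's description.

```python
# A direct simulation of the PDA for { 0^n 1^n | n >= 1 } described above.
# States q0, q1, q2 and stack symbols Z (start) and 0 follow the lecture;
# the encoding of the moves is an illustrative sketch.

def accepts_0n1n(w: str) -> bool:
    state, stack = "q0", ["Z"]          # start state, start-of-stack symbol
    for ch in w:
        top = stack[-1] if stack else None
        if state == "q0" and ch == "0" and top == "Z":
            state = "q1"; stack.append("0")   # push the first 0
        elif state == "q1" and ch == "0" and top == "0":
            stack.append("0")                 # keep pushing 0s
        elif state == "q1" and ch == "1" and top == "0":
            state = "q2"; stack.pop()         # first 1: start popping
        elif state == "q2" and ch == "1" and top == "0":
            stack.pop()                       # pop one 0 per 1
        else:
            return False                      # stuck: error configuration
    # epsilon move: in q2 with only Z left, go to q0 and empty the stack
    return state == "q2" and stack == ["Z"]   # accept (final state and empty stack)
```

As in the lecture's trace, `accepts_0n1n("0011")` succeeds, while `"001"` and `"010"` get stuck.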
 So it accepts. Whereas if the input is 001 or 010, the machine eventually gets stuck in state q2 — for 001, with a 0 still on top of the stack. There is no way it can move further: the input has been consumed, but q2 is not a final state and the stack is not empty either. The same is true for 010: it ends up in an error configuration. Let us take another example, a much more important one than the previous example, because it also shows how nondeterminism arises in a pushdown automaton. The language is {w w^R}, over the alphabet {a, b}: all strings consisting of a part w followed by the reverse of w. For example, in abba, ab is the w part and ba is the w^R part; whereas a string with an odd number of symbols, say aab, is not of the form w w^R, because w w^R requires an even number of symbols. On the other hand, aaaa is of the form w w^R. This automaton requires three states: the start state is q0 as usual and the final state is q2. The way the automaton works is nondeterministic. From state q0 on input a, with the stack containing just the start-of-stack symbol, it pushes the a onto the stack; and from q0 on input a with a on top of the stack, it can either remain in q0 and push the a, or move to q1 and pop the a. What it is really trying to do is recognize the middle of w w^R: until it has seen all of w it pushes symbols onto the stack, and once it reaches the middle of the input it starts popping stack symbols against input symbols. In that sense it is a very "intelligent" machine. 
 It can guess whether or not it has reached the middle. It does the same thing with b: from q0 on b (with Z on top) it pushes b, and from q0 on b with b on top of the stack it either pushes the b or pops the stack. Meanwhile, for mismatched symbols, δ(q0, a, b) pushes the a and δ(q0, b, a) pushes the b. The reason is that, since the string is of the form w w^R, the last symbol of w and the first symbol of w^R must be the same: if w ends with a, then w^R must obviously start with a. That is why nondeterminism is provided only for δ(q0, a, a) and δ(q0, b, b); the other combinations are necessarily somewhere in the middle of w, so those symbols are all pushed onto the stack. Once it starts popping, it remains in q1: δ(q1, a, a) pops the stack and consumes the input, and δ(q1, b, b) makes exactly the same kind of move. Once the input is empty and the start-of-stack symbol is reached, it goes to state q2 and empties the stack as well. Here is a simple example with input abba: the a is pushed onto the stack with bba remaining in the input; then the b is pushed with ba remaining. Now ba is in the input and ba is on the stack with b on top, so it is time to start popping: it goes to state q1, pops the b, then pops the a, and enters the configuration (q1, ε, Z); it pops the stack, the stack is empty, and the input is also empty, so the string is accepted. Whereas for the string aaa, it ends up in a configuration (q1, ε, aZ) from which it really cannot make any more moves: the stack is not empty and the state q1 is not a final state. 
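Since a real machine cannot "guess", a simulation of this nondeterministic PDA must try both choices. The sketch below (my own encoding, not the lecturer's) handles the nondeterministic δ(q0, a, a) and δ(q0, b, b) moves by backtracking: it explores the "not yet at the middle" choice first and falls back to the "at the middle" choice.

```python
# Brute-force simulation of the nondeterministic PDA for { w w^R : w in {a,b}+ }.
# Nondeterminism (guessing the middle) is handled by recursion/backtracking;
# states q0/q1/q2 follow the lecture's description. The stack string here
# excludes the start symbol Z, so an empty string means "only Z remains".

def accepts_wwr(s: str) -> bool:
    def run(state, i, stack):
        if state == "q0":
            if i < len(s):
                ch = s[i]
                # guess "not yet at the middle": push the symbol, stay in q0
                if run("q0", i + 1, stack + ch):
                    return True
                # guess "at the middle": pop a matching symbol, go to q1
                if stack and stack[-1] == ch and run("q1", i + 1, stack[:-1]):
                    return True
            return False
        # state q1: keep popping stack symbols against input symbols
        if i < len(s) and stack and stack[-1] == s[i]:
            return run("q1", i + 1, stack[:-1])
        # epsilon move to q2: accept iff input and stack are both exhausted
        return i == len(s) and stack == ""

    return len(s) > 0 and run("q0", 0, "")
```

On abba the search finds the accepting guess, while aaa and ab fail in every branch, matching the lecture's error configurations.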
 So that is an error configuration, and similarly the other branch, (q0, a, aZ), also ends in error. Let us move on to a description of what exactly nondeterministic and deterministic pushdown automata are. Just as we have nondeterministic finite automata (NFA), we have nondeterministic pushdown automata (NPDA), and similar to the DFA we have the DPDA. However, in the finite-automaton case NFA and DFA were shown to be equivalent — the language accepted by an NFA is also accepted by an equivalent DFA, so every NFA can be converted to a DFA — whereas in the pushdown case the NPDA class is strictly more powerful than the DPDA class: there are NPDAs for which no DPDA can be designed. A simple example is the one we just saw: {w w^R} can be recognized only by a nondeterministic automaton, not by any deterministic one. But once we introduce a marker c between w and w^R, the language becomes deterministic. The reason is simple: w does not contain c, so as soon as we see the symbol c we know we are at the middle of the string and can start popping, whereas in w w^R the middle was not known and a guess was required. So {w c w^R} is a deterministic language, whereas {w w^R} is a nondeterministic language. In practice what we require are DPDAs, because then we do not have to guess anything: we know exactly which move is to be made at each point in time. So all our parsers are deterministic pushdown automata. And what is parsing? Parsing is the process of constructing a parse tree for a sentence generated by a given grammar. A grammar generates a string — we saw that already — and a pushdown automaton accepts a string. Parsing constructs a parse tree for the sentence, so we use a pushdown automaton with some extra actions to construct the parse tree as well. 
 Basically, a parsing machine is nothing but a deterministic pushdown automaton. If there are no restrictions on the language and on the form of the grammar, parsers for context-free languages are quite expensive: they require O(n^3) time. There are two very well-known algorithms of this kind, the Cocke–Younger–Kasami (CYK) algorithm and Earley's algorithm. We are not going to study these algorithms in detail; it suffices to say that they are based on dynamic programming and place no restriction on the grammar — any grammar is acceptable, including ambiguous ones. We are interested in subsets of the context-free languages which can be parsed in O(n) time, whereas the general context-free languages require O(n^3) time. Two varieties of parsing are of interest to us: predictive parsing and shift-reduce parsing. Predictive parsing is based on a class of grammars called LL(1) grammars and uses a strategy known as top-down parsing. Shift-reduce parsing requires the grammars to be in LR(1) form and is based on the bottom-up parsing strategy. Let us study top-down parsing in more detail and then move on to bottom-up parsing and LR parsing. The basic idea of top-down parsing is to trace the leftmost derivation of a string while constructing the parse tree. Let me give an example and then get back to the text. Here is a very simple grammar: S → aAS | c, A → ba | SB, B → bA | S (uppercase letters are nonterminals). Consider the string acbbac and its leftmost derivation. In blue we show the nonterminal which is going to be expanded next. S is the start symbol; to begin with we apply the production S → aAS, and then we apply the production A → SB, giving the sentential form aSBS. Now the leftmost nonterminal is the S shown in blue. 
 At this point we apply the production S → c, so we get ac, and the two symbols which remain are B and S: the form is acBS. B happens to be the leftmost nonterminal, so we expand B by B → bA, giving acbAS. Now expand A by A → ba, which gives acbbaS, and finally S is expanded to c, so we get our string acbbac. That is the leftmost derivation of the string, and when we do LL parsing — top-down parsing using the LL strategy — the parse-tree construction happens in the same order. We start with just S, the start symbol, and there are two productions for S: S → aAS and S → c. The reason it is called predictive parsing is that we need to predict which production is applicable at this particular point. There is extra information available for that, which we will look at a little later; for now, the choice made here is S → aAS. So the expansion happens and the parse tree grows in this order. Once S has been expanded, the sentential form we have is aAS, which is visible in the tree as well. The leftmost nonterminal is A, so it is expanded by the production A → SB, and the next leftmost nonterminal is this S, which is expanded by S → c; that expansion is completed in that step itself. Then we have two more nonterminals, B and S: B → bA is the next expansion which happens. Once that is complete we have this A and this S. A is the leftmost, so it is expanded by A → ba, and finally the leftover nonterminal is expanded by S → c. As you can see, the parse-tree construction and the leftmost derivation are in synchronization — they are synchronized, so we know which nonterminal is expanded in the parse tree at each step. 
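The leftmost derivation above is easy to replay mechanically. The short sketch below (an illustration, not part of the lecture) applies the six productions in the order just described and checks that the final sentential form is the input string; uppercase letters stand for nonterminals.

```python
# Replaying the leftmost derivation of "acbbac" with the grammar
# S -> aAS | c,  A -> ba | SB,  B -> bA | S  (uppercase = nonterminals).

def expand_leftmost(form: str, lhs: str, rhs: str) -> str:
    """Replace the leftmost nonterminal of `form`, which must be `lhs`, by `rhs`."""
    i = next(k for k, ch in enumerate(form) if ch.isupper())
    assert form[i] == lhs, f"leftmost nonterminal is {form[i]}, not {lhs}"
    return form[:i] + rhs + form[i + 1:]

steps = [("S", "aAS"), ("A", "SB"), ("S", "c"),
         ("B", "bA"), ("A", "ba"), ("S", "c")]
form = "S"
for lhs, rhs in steps:
    form = expand_leftmost(form, lhs, rhs)
print(form)   # -> acbbac
```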
 So top-down parsing using predictive parsing traces the leftmost derivation of the string while constructing the parse tree. We start from the start symbol and predict the production to be used in the derivation. Such a prediction, as I said, needs extra information, and that is known as the parsing table, which is constructed offline and stored; we are going to study how the parsing table is constructed. The next production to be used in the derivation is determined by looking at the next input symbol together with the parsing table — this combination tells you exactly which production is to be used — and the next symbol that we see is called the look-ahead symbol. By placing restrictions on the grammar we make sure that there is never more than one production in any slot of the parsing table; we will see that if a slot contains more than one production, then we cannot decide which production to use next. So at parsing-table construction time, if two productions are eligible to be placed in the same slot, the grammar is declared unfit for predictive parsing. Let us now see how exactly the parsing algorithm works; the example I showed is actually a trace of this algorithm. The parser has the parsing table; it is a deterministic pushdown automaton, so it uses a stack, and of course it has to look at the input as well. In the initial configuration the stack contains the start symbol S and the input is w$, where $ is the end-of-file marker. There is a repeat–until loop which runs until the stack has been emptied — the parser really cannot work after the stack is empty. If it stops somewhere in the middle, error messages are given; if no error messages are given, the input has been accepted. So: repeat, and let X be the top stack symbol. 
 At some point in time — to begin with, X is S, and later it becomes some other symbol — let a be the next input symbol, which could be the end-of-file symbol $. If the top of stack X is a terminal symbol or the end-of-file symbol $ and it is equal to the input symbol a, then obviously it is time to pop the stack; the pushdown automaton does the same thing: whenever the input symbol matches the stack symbol, it pops. The symbol a is removed from the input as well, that is, the input pointer is moved one step ahead. If this is not so — if X is a terminal but not equal to a — then the stack and the input do not match, and an error has to be reported. The next possibility is that the top of stack is a nonterminal symbol rather than a terminal. Now the question we need to answer is which production must be used to expand this nonterminal. For that the parser uses the parsing table M inside it. If the entry M[X, a] — where X is the nonterminal to be expanded and a is the next input symbol — gives a single unique production X → Y1 Y2 … Yk, then we know it is time to pop the stack and expand X by its right-hand side: the right-hand side is pushed onto the stack in reverse order, Yk, Yk−1, …, Y1, so that Y1 ends up on top of the stack. Then we go back to the repeat–until loop, take the next top stack symbol, match it against the input, and so on. This loop continues until the stack has been emptied; at that point, if there have been no errors and the input has also been emptied, the machine has accepted the input; otherwise it has rejected the input. Let us now trace the same parsing algorithm using the stack. 
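The loop just described can be sketched as follows. The parsing table M here is for the lecture's grammar S' → S$, S → aAS | c, A → ba | SB, B → bA | S, with the entries filled in by hand — table construction is only covered later, so treat these entries as given rather than derived.

```python
# A sketch of the table-driven LL(1) parsing loop described above.
# Uppercase letters (and S') are nonterminals; lowercase letters are terminals.

M = {
    ("S'", "a"): "S$", ("S'", "c"): "S$",
    ("S",  "a"): "aAS", ("S", "c"): "c",
    ("A",  "a"): "SB",  ("A", "b"): "ba", ("A", "c"): "SB",
    ("B",  "a"): "S",   ("B", "b"): "bA", ("B", "c"): "S",
}

def ll1_parse(w: str) -> bool:
    input_ = w + "$"                     # $ is the end-of-file marker
    stack, i = ["S'"], 0                 # stack starts with the start symbol
    while stack:
        x, a = stack[-1], input_[i]
        if x == a:                       # terminal (or $) matches look-ahead
            stack.pop(); i += 1
        elif (x, a) in M:                # nonterminal: expand via the table
            stack.pop()
            stack.extend(reversed(M[(x, a)]))  # push RHS, first symbol on top
        else:
            return False                 # empty slot: report an error
    return i == len(input_)              # stack empty and input consumed
```

Running `ll1_parse("acbbac")` reproduces the push/pop/expand trace discussed next.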
 Here is the same grammar with an augmented start symbol: S' → S$ ($ is the end-of-file marker), S → aAS | c, A → ba | SB, B → bA | S; the string is acbbac, the same string as before. Here is the LL(1) parsing table. We have still not discussed how to construct it, but let us understand the parsing process first and then move on to parsing-table construction. The rows are indexed by the nonterminals and the columns by the terminal symbols and the end-of-file symbol. For each combination — (S', a), (S', b), (S', c), (S', $), and so on — there can be at most one entry. Some of the entries can be empty: for example, (S', b) is empty, (S', $) is empty, (S, b) is empty, etc. We also see that no slot has more than one entry; if there were more than one entry in a slot, the LL(1) parsing algorithm could not be applied. Let us begin. The stack contains S' to begin with, and the next input symbol is a; the entry for (S', a) is S' → S$, so S' is removed and S$ is pushed onto the stack — we have still not consumed any input. Now the top of stack is the nonterminal S and the input symbol is still a, so we expand using the production S → aAS: we remove S and push aAS onto the stack with a on top. Now this a and the input a match, so we remove it and the input moves to the next symbol, c. The combination (A, c) has to be looked up in the parsing table: the entry is A → SB, so A is removed and SB is pushed onto the stack. S is again a nonterminal, so (S, c) is looked up in the table: S → c is the production to be applied, so pop S and push c onto the stack. This c and the input c now match, so we just pop, and the input is advanced to the next symbol, b. 
 Now the nonterminal B is on top and the input symbol b is looked up: the entry for (B, b) is B → bA, so we remove B and push bA with b on top of the stack. This b and the input b match, so the stack is popped, and we are left with A on top and b in the input. The entry for (A, b) gives A → ba, so ba is pushed onto the stack; b and b are matched and popped, and then a and a are matched and popped. We again reach (S, c): the table says pop S and push c onto the stack. We do that, c and c are matched, and then we reach $ and $: they are matched, the stack becomes empty, there have been no errors, and the input is also empty. So the string has been accepted successfully. This is how the LL(1) parsing algorithm repeatedly goes through the push, pop, and expand stages. Now it is time to understand how exactly the parsing table is constructed. As I said, the LL(1) grammars are a subclass of the context-free grammars; when we say subclass, we must state the restrictions placed on the context-free grammar and give a method to check whether the restrictions are satisfied. So let us define a class of grammars called strong LL(k) grammars and then see how they can be used for our LL(1) parsing strategy. Let the given grammar be G. The input is extended with k end-of-file symbols, $^k, where k is the look-ahead of the grammar — the k is the number of input symbols we are allowed to see at a time. If the grammar is LL(1) we are allowed to see exactly 1 input symbol at a time, if it is LL(2) we are allowed to see 2 symbols at a time, and if it is LL(k) we are allowed to see k symbols at a time. The grammar has a start symbol S, but it is traditional to introduce a new nonterminal S' and a production S' → S$^k. 
 Now S is the old start symbol, and S' is the new start symbol of the grammar. Let us consider leftmost derivations, and let us also assume that the grammar has no useless symbols. At this point I will not give an algorithm for removing useless symbols — we will do that later — but let me explain what useless symbols are. Suppose there are terminal and nonterminal symbols which are never used in the grammar; in other words, they are not part of any production at all. Such symbols are useless symbols. Another possibility is that there are productions, but the left-hand side of a production can never be reached from the start symbol of the grammar, so we never get to apply it. Such productions are also useless, and all the symbols associated with such a production are useless as well. We will see how to remove such symbols and productions later; for the present, let us assume that everything in the grammar is useful and that there are no useless symbols. A production A → α in G is called a strong LL(k) production if the following holds. Suppose in the grammar G we have two derivations; let us read them carefully. First, S' derives in zero or more steps the sentential form wAγ, and now the production A → α is applied: the next step takes wAγ to wαγ. Then α and γ together may give rise to the terminal string zy. It is not necessary that α gives rise to z and γ gives rise to y: some part of the string produced by α may be in y, and similarly some part of the string produced by γ may be in z. Either way is possible; what matters is that the length of the string z is exactly the look-ahead k. 
 So k symbols are present in the string z, and S' derives the entire string wzy with the production A → α applied somewhere in the middle. Similarly, consider another possibility: S' derives in zero or more steps w'Aδ. Here w' and w may be different, and γ and δ may also be different, but the nonterminal A is the same in the two derivations. Now we apply an alternative production, A → β (where before we applied A → α), and from w'βδ we derive w'zx. All the comments I made about zy apply to zx as well: part of what β derives could be in x, and part of what δ derives could be in z, and so on. Given these two derivations, the question is: can we look at the string z at some point and determine whether A → α was applied or A → β was applied? The strong LL(k) condition says that whenever such a pair of derivations exists, with w and w' in Σ*, α must be equal to β. That is, if the look-aheads are identical, then even though the derivations may have different w and w' — they are two different derivations — the production applied at that point has to be the same: α and β are equal, so whether we say A → α or A → β, it is the identical production. The strong LL(k) condition is a really strong condition: it says that if the look-ahead is the same at some point, then we know exactly which production was applied at that point. If all the productions of a particular nonterminal are strong LL(k), then the nonterminal is strong LL(k), and if all the nonterminals are strong LL(k), then the grammar is strong LL(k). Let us take an example. 
 Strong LL(k) grammars do not allow different productions of the same nonterminal to be used, even in two different derivations, if the first k symbols of the strings produced by αγ and βδ are the same. In other words, if αγ produces zy and βδ produces zx — meaning the k-symbol look-ahead z is the same — then the condition forces α and β to be the same. Let us check this grammar and see whether it is strong LL(1): S → Abc | aAcb, A → ε | b | c. Here is the argument that S is a strong LL(1) nonterminal. There are two productions, S → Abc and S → aAcb. First take the production S → Abc: S' derives S$, and S$ derives Abc$. Now if we apply A → ε we get bc$, if we apply A → b we get bbc$, and if we apply A → c we get cbc$. So z will be b, b, or c respectively in these three strings. Here w is empty — w is the null string — A is at the front, and Abc$ together form the string which is derived. We are looking at a one-symbol look-ahead, so after deriving the string bc$, bbc$, or cbc$, the first symbol — b, b, or c — happens to be our look-ahead. Now take the other production: S' derives S$, and we apply S → aAcb to get aAcb$. Now A has to be expanded again: if we apply A → ε we get acb$, if we apply A → b we get abcb$, and if we apply A → c we get accb$. Here too the w' part is empty, so in S$ ⇒ aAcb$, whatever string this entire sentential form derives, its first symbol happens to be our look-ahead. 
 In all these three cases z is a, because the terminal a has already been produced in the first step itself; what is produced later is immaterial. So z differs between the two derivations in all the strings: where we applied S → Abc, z is b, b, or c in the various possibilities, whereas where we applied S → aAcb, z is a. Because the z parts are never the same, there is no ground to declare this not strong LL(1); if the z had been the same, S would not have been strong LL(1), but since the z is always different we can assert that S is strong LL(1). However, the nonterminal A is not strong LL(1). Look at the grammar: A → ε | b | c. Again let us derive some strings from S': S' derives S$ and then Abc$ — we are considering this derivation, and here also w is ε. If we apply A → ε we get bc$, and if we apply A → b we get bbc$. So there are two possibilities for A, A → ε and A → b, and in each of these cases the first terminal symbol derived — the z part — is b, but the two productions applied are different. For the same look-ahead we have two choices of production, and therefore the nonterminal A is not strong LL(1). We can also check whether it is strong LL(2). Take the same derivation Abc$ and apply A → ε: we get bc$, with look-ahead bc. Now take a different derivation, S ⇒ aAcb, where the w' part is the single terminal a, and apply A → b: we get abcb$, and after the a the look-ahead is again bc. So here the look-ahead z is bc and there the look-ahead is also bc, yet the production applied is A → ε in one case and A → b in the other. 
 For the same look-ahead we have two possibilities for the nonterminal A, and therefore this grammar is not strong LL(2). It is trivially strong LL(3), and I leave this for homework: this grammar produces only six strings, and they can all be distinguished using a three-symbol look-ahead. The six strings are bc, bbc, cbc, acb, abcb, and accb, and their three-symbol look-aheads — bc$, bbc, cbc, acb, abc, acc — are all different. So it happens to be a strong LL(3) grammar, and you can try it out as homework. Now, we have defined strong LL(k), but from now on we will limit ourselves to look-ahead 1, that is, to strong LL(1) grammars. There is also a "weak" or ordinary LL(k) grammar definition available in the classical literature, but it so happens that for look-ahead k = 1, strong LL(1) and LL(1) are identical, whereas for k = 2, 3, etc. it is possible to find grammars which are LL(2) but not strong LL(2), and so on and so forth. So LL(2) is a weaker property than strong LL(2), whereas for look-ahead 1, strong LL(1) and LL(1) are identical; we will simply call it LL(1) from now on. The classical condition to test the LL(1) property requires two definitions, FIRST and FOLLOW. Let us define them, give algorithms to compute them, and then state the condition for LL(1). The FIRST of a string α collects the first symbols of all the terminal strings which are derivable from α: if α is any string of grammar symbols, α ∈ (N ∪ T)*, then FIRST(α) = {a | a is a terminal symbol, α derives ax in zero or more steps, and x is also a terminal string}. You take the first symbol of each such derived string ax and put it into the set; that is FIRST(α). 
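The homework claim can be checked by brute force. The sketch below (an illustration, not from the lecture) enumerates the six strings of the grammar S → Abc | aAcb, A → ε | b | c, together with their three-symbol look-aheads padded with $, and confirms that all six look-aheads are distinct.

```python
# The homework check: the grammar S -> Abc | aAcb, A -> epsilon | b | c
# generates exactly six strings; with a 3-symbol look-ahead (padded with $)
# they are all distinguishable, which is why the grammar is strong LL(3).

A_choices = ["", "b", "c"]                     # A -> epsilon | b | c
strings = [a + "bc" for a in A_choices] + ["a" + a + "cb" for a in A_choices]
lookaheads = [(s + "$$$")[:3] for s in strings]

print(strings)      # ['bc', 'bbc', 'cbc', 'acb', 'abcb', 'accb']
print(lookaheads)   # ['bc$', 'bbc', 'cbc', 'acb', 'abc', 'acc']
assert len(set(lookaheads)) == 6               # all six look-aheads differ
```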
 So collect all the derivable terminal strings, take the first symbol of each, and that gives you the FIRST set; by definition, FIRST(ε) = {ε}. Now for FOLLOW. Remember that FIRST is determined by α alone, whereas FOLLOW is different: we require some more context. FOLLOW is defined only for a nonterminal, whereas FIRST is defined for anything — it could be a terminal, a nonterminal, or a string. What is the FOLLOW of A? It is the set of all terminal symbols a such that, starting from the start symbol, we can derive some sentential form in which the nonterminal A appears; the terminal symbol which follows this nonterminal A in the sentential form is in the FOLLOW set of A. So we have to consider all the derivations in which the nonterminal A appears, look at all the possible terminal symbols which can appear immediately after A, and those symbols form the FOLLOW set of A. It looks like a difficult set to compute, but it so happens that it can be computed by a simple algorithm. There are two things which are very important with respect to FOLLOW. First, it is defined only for nonterminals. Second, we require context: there must be a derivation starting from the start symbol which actually yields a sentential form with the nonterminal A in it, and then the symbols which follow it are included in FOLLOW(A). For example, if we had productions which can never be reached or used from the start symbol, then it would not matter whether they contain A or not: we would not include any of the symbols derived by these useless productions in the FOLLOW set of A. That is why it is very important that all the useless terminals, nonterminals, and productions be removed from the grammar, and only then are the FIRST and FOLLOW sets computed. 
So let us see how to apply these definitions to compute FIRST and FOLLOW; after that we will provide algorithms for computing them. Here is our good old grammar: S' → S$, S → aAS | c, A → ba | SB, B → bA | S. First of all, consider FIRST(S'). When we try to find the strings derived from S', the only applicable production is S' → S$, so all the symbols with which a string derived from S can begin have to be considered. Now, which symbols are these? When we apply the production S → aAS we derive the terminal a, and when we apply S → c we derive the terminal c. These two symbols will definitely be in FIRST(S), and that is why FIRST(S') = FIRST(S) = {a, c}. For example, S' derives c$, and S' derives abac$ as well. It is important that we go to the end of the derivation, because the definition of FIRST requires a complete terminal string derived from alpha; we cannot stop somewhere in the middle. Here we applied S → aAS first, but then we took the derivation to completion and obtained abac$. Now we have computed FIRST(S') and FIRST(S); what about FIRST(A)? Informally, the production A → ba begins with the terminal b, so b has to be in the set, and the other possibility is A → SB, so all the symbols in FIRST(S) also happen to be in FIRST(A). The second production gives {a, c} and the first gives b, so FIRST(A) = {a, b, c}. That is about the FIRST computation.
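As a sanity check on the hand computation above, here is a brute-force Python sketch that applies the definition of FIRST literally to the lecture's grammar: it enumerates leftmost derivations up to a bounded sentential-form length, keeps only the completed terminal strings, and collects their first symbols. The function name and the length bound are my own choices, not from the lecture, and this is an illustration of the definition rather than a practical algorithm.

```python
# The example grammar: S' -> S$, S -> aAS | c, A -> ba | SB, B -> bA | S
GRAMMAR = {
    "S'": [["S", "$"]],
    "S":  [["a", "A", "S"], ["c"]],
    "A":  [["b", "a"], ["S", "B"]],
    "B":  [["b", "A"], ["S"]],
}
NONTERMS = set(GRAMMAR)

def first_by_definition(symbol, max_len=8):
    """FIRST(symbol) by the definition: first symbols of all terminal
    strings derivable from `symbol` (search bounded by max_len)."""
    result = set()
    frontier = [[symbol]]
    while frontier:
        form = frontier.pop()
        if len(form) > max_len:          # bound the search space
            continue
        # locate the leftmost non-terminal, if any
        i = next((k for k, s in enumerate(form) if s in NONTERMS), None)
        if i is None:                    # a complete terminal string
            if form:
                result.add(form[0])      # keep only its first symbol
            continue
        for rhs in GRAMMAR[form[i]]:     # expand the leftmost non-terminal
            frontier.append(form[:i] + rhs + form[i + 1:])
    return result

print(sorted(first_by_definition("S")))   # ['a', 'c']
print(sorted(first_by_definition("A")))   # ['a', 'b', 'c']
```

Note how the search insists on completed derivations, exactly as the definition demands: a sentential form contributes nothing until every non-terminal in it has been expanded away.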
Now let us look at the FOLLOW computation. FOLLOW(S) contains all the symbols a, b, c, and $. Let us see why. First apply the production S' → S$; S is the non-terminal whose FOLLOW we want to compute, and in this sentential form, derived from the start symbol, $ follows S, so $ must be in FOLLOW(S). Let us go one step further and look at another derivation. Starting from S$, for S we apply the production S → aAS and get aAS$; now for this A we apply the production A → SB, giving aSBS$. This S now has the non-terminal B following it, so let us see which symbols B can produce first and include all of them in FOLLOW(S). B gives us bA or S. When we apply B → bA, we have the terminal b following S, so b gets into FOLLOW(S). For the third one, in aSBS$ apply B → S to get aSSS$; the second S can now derive c, so following the first S we have c, and therefore c is also included in FOLLOW(S). Similarly, applying B → S and then S → aAS puts the terminal a right after S, so a is in FOLLOW(S) as well. These demonstration derivations show how a, b, and c are included, and the first one shows how $ is included, in FOLLOW(S). FOLLOW(A) is {a, c}, and the reason is this: start with S', apply S' → S$, then apply S → aAS to get aAS$. Now the non-terminal S follows A, and therefore all the symbols in FIRST(S) will also be in FOLLOW(A). One of those symbols is a, because you can apply the production S → aAS again (and of course take the derivation to completion), so a is included in FOLLOW(A); and applying S → c puts c right after A, so c is in FOLLOW(A) as well.
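The FOLLOW walkthrough above can be checked the same brute-force way: enumerate sentential forms reachable from the start symbol S' up to a bounded length, and record every terminal that appears immediately after the non-terminal of interest. Again, the function name and the length bound are my own; this just mirrors the definition FOLLOW(A) = { a | S' derives a sentential form ...Aa... }.

```python
# The example grammar: S' -> S$, S -> aAS | c, A -> ba | SB, B -> bA | S
GRAMMAR = {
    "S'": [["S", "$"]],
    "S":  [["a", "A", "S"], ["c"]],
    "A":  [["b", "a"], ["S", "B"]],
    "B":  [["b", "A"], ["S"]],
}
NONTERMS = set(GRAMMAR)

def follow_by_definition(target, max_len=8):
    """Terminals appearing right after `target` in any sentential form
    reachable from S' (search bounded by max_len)."""
    result = set()
    seen = set()
    frontier = [("S'",)]
    while frontier:
        form = frontier.pop()
        if len(form) > max_len or form in seen:
            continue
        seen.add(form)
        # record any terminal standing immediately after the target
        for k in range(len(form) - 1):
            if form[k] == target and form[k + 1] not in NONTERMS:
                result.add(form[k + 1])
        # expand every non-terminal occurrence in every possible way
        for i, sym in enumerate(form):
            if sym in NONTERMS:
                for rhs in GRAMMAR[sym]:
                    frontier.append(form[:i] + tuple(rhs) + form[i + 1:])
    return result

print(sorted(follow_by_definition("S")))   # ['$', 'a', 'b', 'c']
print(sorted(follow_by_definition("A")))   # ['a', 'c']
```

The bounded enumeration happens to find every member of these FOLLOW sets for this small grammar; in general such a brute-force search only yields a subset, which is why the fixed-point algorithm of the coming lectures is the real tool.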
So that is how you compute FIRST and FOLLOW using the definitions. Now let us see the algorithm for computing FIRST; we will divide it into two parts. In the first part we compute FIRST for terminals and non-terminals, and in the second part we compute FIRST for strings of grammar symbols. For each terminal symbol a, FIRST(a) is initialized to {a} itself, and there is no more computation as far as terminal symbols are concerned; by definition, FIRST(ε) is {ε} as well. For each non-terminal A we initialize FIRST(A) to the empty set, and then comes what is known as a fixed-point computation: while the FIRST sets are still changing, we keep iterating. We are computing the FIRST sets of all the non-terminals in the grammar, so even if just one of the sets changes, we must iterate once more. Here is the body. For each production P, say P is A → x1 x2 ... xn, we update FIRST(A). Because we are iterating, we cannot simply assign FIRST(A) = something; we must accumulate into FIRST(A), adding extra symbols over the iterations. So we do FIRST(A) = FIRST(A) ∪ (FIRST(x1) − {ε}): FIRST(A) already had something in it because of other productions of A, and now, for this production, we include whatever x1 can begin with, that is FIRST(x1), removing ε for the present. Then we start a loop with i = 1; it is a while loop which continues as follows.
If ε is in FIRST(xi), that means the symbols of FIRST(x2) and beyond are also visible here: if x1, say, derives ε, then FIRST(A) must include the symbols of FIRST(x2) as well. The loop condition is that ε is in FIRST(xi) and i ≤ n − 1, so we must not yet have reached the end of the production. In the loop body, FIRST(A) becomes FIRST(A) ∪ (FIRST(x_{i+1}) − {ε}): if x1 derives ε then we take in x2, if x1 and x2 both derive ε then we consider x3, and so on, removing the ε part each time, and then i is incremented. After the loop ends we would have reached i = n, because the loop runs till n − 1 inclusive. So if i = n and ε is contained in FIRST(xn), then the last symbol also derives ε; x1, x2, ..., xn all have ε in their FIRST sets, so the whole right-hand side can derive ε, and we include ε in FIRST(A). So we will stop here; we will consider examples and the FOLLOW set computation in the next lecture. Thank you.
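As a supplement, the two-part fixed-point FIRST algorithm described above can be sketched in Python as follows. The names (`compute_first`, `first_of_string`, the `"eps"` marker) are my own, not from the lecture; a grammar maps each non-terminal to a list of right-hand sides, an empty right-hand side stands for an ε-production, and any symbol that is not a key of the grammar is treated as a terminal.

```python
EPS = "eps"   # marker for the empty string epsilon

def first_of_string(xs, first_of_symbol):
    """Part 2: FIRST(x1 x2 ... xn). Keep adding FIRST(x_{i+1}) as long
    as x1 ... xi can all derive epsilon."""
    result = set()
    for x in xs:
        fx = first_of_symbol(x)
        result |= fx - {EPS}    # include FIRST(xi), epsilon removed
        if EPS not in fx:       # xi cannot vanish: stop looking further
            break
    else:                       # every xi derives eps (or xs is empty)
        result.add(EPS)
    return result

def compute_first(grammar):
    """Part 1: fixed-point computation of FIRST for all non-terminals."""
    nonterms = set(grammar)
    first = {A: set() for A in nonterms}            # FIRST(A) starts empty

    def first_of_symbol(x):
        return first[x] if x in nonterms else {x}   # FIRST(a) = {a}

    changed = True
    while changed:              # iterate while any FIRST set still grows
        changed = False
        for A, rhss in grammar.items():
            for rhs in rhss:    # accumulate, never overwrite
                new = first_of_string(rhs, first_of_symbol)
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first

# The lecture's example: S' -> S$, S -> aAS | c, A -> ba | SB, B -> bA | S
GRAMMAR = {
    "S'": [["S", "$"]],
    "S":  [["a", "A", "S"], ["c"]],
    "A":  [["b", "a"], ["S", "B"]],
    "B":  [["b", "A"], ["S"]],
}
first = compute_first(GRAMMAR)
print(sorted(first["S"]))   # ['a', 'c']
print(sorted(first["A"]))   # ['a', 'b', 'c']
```

The example grammar has no ε-productions, so the ε-handling never fires here, but on a grammar such as S → Ab, A → ε the same code correctly reports FIRST(A) = {ε} and FIRST(S) = {b}.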