and we then picked up a small language for which we were trying to construct a set of diagrams, which I started calling transition diagrams. Please do not confuse them with finite state automata, because in a transition diagram we are looking at implementation details. This was the language we had, and we said that we are going to construct an analyzer which finally returns a token and an attribute: the token could be, say, a relational operator and so on, and the attribute could be which relational operation it is, or the lexeme, or the value of a number. Then we made a transition diagram for greater-than-or-equal and greater-than: from the start state, if I see the greater-than symbol I move to another state, and then I can see either an equal sign or something else, and in the latter case I return a character back into the input stream. Once we did this, we also constructed a transition diagram capturing the whole set of relational operators, all six of them, and that diagram has six final states, one for each of the six lexemes, and in some of them we retract a character. So a rule of thumb to remember: whenever you reach a final state through an edge labelled "other", that is, on something which is not part of the token, you return that character back into the input stream. This is where we stopped in the previous class. Any questions or comments or thoughts before I move on? Is this clear to everyone? Then let's take this a little further and look at identifiers; I am not sure whether we discussed this, so let's look at it.
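The greater-than portion of that diagram can be sketched in a few lines of C. This is only an illustration of the retraction idea, not the lecture's actual code: the token names and the string-plus-forward-pointer representation of the input are assumptions.

```c
#include <stdio.h>

/* Token codes for the relational operators; these names are
   illustrative assumptions, not taken from the lecture. */
enum token { LT, LE, EQ, NE, GT, GE, NONE };

/* A minimal sketch of the ">" / ">=" part of the transition
   diagram, working on a string with a forward pointer.  After
   seeing '>', look at one more character; if it is not '=',
   retract (do not advance past it), which is the action on the
   edge labelled "other". */
enum token get_relop_gt(const char *input, int *forward)
{
    if (input[*forward] != '>')
        return NONE;               /* this diagram does not apply */
    (*forward)++;                  /* consume '>' */
    if (input[*forward] == '=') {
        (*forward)++;              /* consume '=' */
        return GE;                 /* final state for ">=" */
    }
    /* Final state reached on "other": the character stays in the
       input because the forward pointer never moved past it. */
    return GT;
}
```

Note that for `">a"` the function returns `GT` and leaves the forward pointer on `'a'`, so the next call to the analyzer sees the retracted character.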
So in the start state you first look at a letter, and in the next state you are only going to look at either a letter or a digit, because our specification said that an identifier consists only of those. Let's look at the specifications first. We have a set of relational operators. Then we have an identifier, which consists of a letter followed by zero or more occurrences of a letter or a digit. Then we have a number, which consists of at least one digit, then a fraction part, then an exponent part. The fraction part may be completely missing, which is what the question mark denotes, but if it is not missing it must have a dot and at least one digit, so I rule out numbers like "1."; I must have at least one digit coming after the dot. Then the exponent: it may also be completely missing, but if it is there I have the letter E, then an optional sign, a plus or a minus, which could itself be missing, followed by at least one digit. These are my specifications. My delimiters are blank, tab, or newline, and white space is nothing but one or more delimiters. We have seen how to construct a transition diagram for the relational operators; now we have to construct transition diagrams for the rest of the specifications, and then implement these transition diagrams. That is the task in front of us. So this is the transition diagram for relational operators, and this is the transition diagram for identifiers: I first see a letter and reach another state; in this state I can see any number of letters or digits; and if I see something else I exit to a final state which gives me recognition of an identifier, but which also says that I must return the last character back into the input stream. Then we have the transition diagram for white space, which is really simple: I must see at least one delimiter, then I can see more delimiters, because I can have any number of blanks here, and when I see something else I reach a final state and return the last character back into the input stream. Now let's look at numbers. What I am going to do is construct multiple transition diagrams for numbers, and this is how they look. In the start state, no matter what kind of number I am looking at, I must see a digit, so that is what we do first. Now, depending on whether this is an integer or a real number with a fraction part, I will either see a dot or I will not, but before I reach that point it is possible that I see more than one digit, so in this state I can loop on digit and see any number of digits. How do I exit? If it is a real number, I exit on a dot; but there is another way out, because the fraction part may be missing, it is optional, and I may directly see the exponent part. Let's leave that aside for the time being. If I have seen a dot, then I must see at least one digit; that takes me to a state in which I can see more digits, so I can loop on digit in that state. For example, with an input like 123.456, the start state consumes the 1, the next state consumes the 2 and the 3 by looping, then I see the dot, the 4 takes me through the next state, the 5 and 6 loop there, and then, in this state,
I can see an E; or I can see an E directly from the earlier state, because the fraction part was completely optional. So from either of these states an E takes me onward. And once I have seen the E, what do my specifications say? I can see an optional plus or minus, or I can see a digit: I may see a plus or minus followed by a digit, or a digit immediately after the E. I am looking at inputs like 1E+10, 1E-10, or 1E10, which by default is interpreted the same as the plus. Having reached this state, I am not going to count digits, so I keep looking at more and more digits, and I come out when I see something else; on something else I reach the final state, and in this final state I return the last character. What I have not written here, and what is going to be part of the code, is the construction of the number's value. If you recall the C code I wrote: when you read a digit, you keep reading more digits, multiplying what you have seen so far by ten and adding the new digit. So if I see 1, I take the ASCII value of '1' and subtract the ASCII value of '0', which gives me one; when I see 2, I multiply the one by ten and add the difference of the character '2' and the ASCII value of '0'; and so on. That way I can construct the number, and you can write similar logic for the fraction part. So this is one transition diagram, for unsigned numbers with an exponent. Now let's look at another kind, where I consider numbers that do not have an exponent part. This is only going to be a sub-diagram, so I will not go through it in detail, but you can see that here I have the integer part on the left-hand side and here I have the fraction part, and I exit when I see something else. So this diagram captures real numbers which do not have an exponent part, while the previous one captures real numbers which do. And then I can also look at just the integers, which have neither a fraction part nor an exponent part; again, when I see "other" I reach the final state and return the last character back into the input stream. So this is a possible set of transition diagrams for capturing all possible numbers. But now I have a little problem: I have three transition diagrams, each with its own start state. Suppose I want to tokenize a number. Which start state should I begin with? These diagrams all correspond to numbers, and just by looking at the first digit I cannot make out what is going to follow, whether it is a real number without an exponent, an integer, or a real number with an exponent. I have various combinations, and the first digit alone cannot tell me which should be my start state. So what should I do? One strategy that has been suggested is to prioritize the rules: give priority to some of them, and if one does not succeed, go on to the next. Right, is that what you suggested? Now, what should be the order in which I give priority? How do I figure that out? Suppose I am trying to tokenize this particular string and I give priority to the integer diagram: it consumes 1, 2 and 3, then I see the dot, and I say I have reached the final state. So, is that correct? No, because
if I have tokenized only this part, then the dot becomes the beginning of another token: it will be returned into the input stream, and the analyzer will try to construct another token starting with it, which obviously gives me an error, and I end up saying, sorry, your input is incorrect. Right? So the priority order in which the integer diagram comes first is not correct. Now remember that we were using the maximal munch principle. By that principle, which transition diagram is going to consume the maximum input? The first one, obviously. So that should be my order: I should prioritize these rules, but I should not declare failure unless all of the rules have failed. Let's look at that scenario with three inputs; this is one input, this is another, and this is the third. What happens in the three cases? When I try to tokenize the first one, it fits the first diagram, reaches the final state, and I say this is the token. Now suppose I try to tokenize the second one, starting with the first diagram because that is my highest priority: it consumes the 1, then the 2 and 3, then the dot, then the 4, then the 5, and now I am expecting an E, but there is no E, so I say this is an error. But is this really an error? No: I have my second transition diagram. Let's try that, and with it I successfully reach a final state saying this can be tokenized. What about 123? 123 takes me through the states for 1, 2 and 3, and then the first diagram says "I am expecting a dot or an E" and cannot go further, so this is obviously not the correct transition diagram. Then I come to the second one: I look at 1, 2 and 3 and I am expecting a dot; again I am stuck. Then I try the third: 1, 2 and 3, and I reach the final state. So that seems to work. If I prioritize my rules, I will be able to decide; it does not matter which input I get, I just go through the first diagram, then the second, then the third, and only if all the transition diagrams reach the fail state one after another, in this priority order, do I say that I am not able to tokenize this particular input. Now one of you suggests: instead of having transition diagrams with only one final state each, why don't we make a single transition diagram having multiple final states, so that we don't have to check again and again? How would we do that? After reading 1, 2, 3, on anything other than a dot or an E, take it to a final state and make it an integer; if there is a dot, take it further; if there is an E, likewise. So you are saying, somehow merge all these into one transition diagram. The observation is that I have unnecessarily got multiple transition diagrams; I should have one transition diagram in which everything in the specification is covered, and when I have consumed 1, then 2 and 3, and I see neither a dot nor an E but something else, that should immediately jump to a final state. Is that your observation? Does anyone have an answer to why I didn't do that? Because floats and integers are handled differently: when we convert the lexeme into a number, a floating-point number is handled differently from an integer. Sure, but what is the problem with that? That we would have to start by assuming the number is a floating-point number? No, we don't start with any assumption; that is decided at the final state. If, as was suggested, this consumes 1, this consumes 2 and 3, and then a dot or an E takes me further, then even the integer is captured by the same transition diagram without an error; we just need three different final states, one of them saying this is an integer. Fair enough. So if I see an E here, that will take me to a final
state here, and if I see "other", then instead of going to this side or that side I come here, which amounts to the same thing; all I am saying is that I now return an integer. So why didn't I do that, given these specifications? You say it would be very difficult, that you have to figure out what the common parts between the specifications are and merge them. Let me delay the answer; think about it, and we will come back to it in five minutes. First let's see how to implement this, because we are talking about implementation and what we have seen so far are specifications. This slide captures what we have already discussed: the lexeme for a given token must be the longest possible. Assume the input 12.34E56 and that we start from the third diagram: we accept, reaching a final state after 12, which is clearly wrong. Therefore matching should always start with the first transition diagram, which in this case in fact matches the longest input. If a failure occurs in one transition diagram, then retract, move the forward pointer back to the start, and activate the next transition diagram; that is what we discussed. And if a failure occurs in all the diagrams, declare an error; otherwise, move on. This is exactly what I just went through with those examples. So how do I implement these transition diagrams? With a switch statement; a very simple implementation. All we are saying is: in a function nextToken of type token, I go into a loop, I keep reading characters, and all I need to do is look at the current state, look at the input character, and decide my next state; in between I do the bookkeeping to convert the lexeme into, say, a number. So suppose I am in some state 10 (I have assumed certain state numbers); what do I do? I read a character; if the character is a letter, the
diagram into code, the more complex the transition diagram, the more complex your code is going to be, because at every state you will have to make multiple decisions. So there is a tradeoff: either a complex transition diagram, and therefore complex code, and therefore a higher probability of introducing errors; or a set of simple transition diagrams with simple code, hoping you will not introduce an error. Remember that here we are talking about systematically writing C code by hand; we do not have a tool which guarantees that I just write a transition diagram, or take some regular expressions, and they get converted into correct C code. So the tradeoff really is how complex a transition diagram you can afford to have. If I start merging everything into a single transition diagram, perhaps it becomes very complex, but there is no measure by which I can say "this is too complex; this is the boundary beyond which you should not go". It is really a judgment the programmer has to make. This particular merged diagram does not look a whole lot more complex than what we had earlier, but some transition diagrams, when we start merging, can get very complex, and then we do not want to increase the complexity. That is the tradeoff. Both approaches are fine; you have to make a judgment about which is less error-prone, and convert that one into code. Now, a question: you mentioned the switch statement, which of course is one way of writing it, but if we have to use multiple transition diagrams, where do we get back the characters we have already read? If you have multiple transition diagrams, you will have to remember the pointer. Say that, corresponding to numbers, I have these transition diagrams, and their start states are 10, 11 and 12; if the diagram starting at 10 really reaches the fail state, I jump to the start state of the next diagram, then to the next, and only when the last one fails do I fail; so there is a little more nesting in the code. The input, we don't get it back? No, the input is stored. The question is not clear, sorry: the characters that we are reading one by one, don't they get lost? They don't get lost; you only have to move your backward pointer in the input. It is additional bookkeeping; nothing is getting lost anywhere. When you have multiple transition diagrams, you can have one more pointer into the input and say that this pointer will remain at the beginning of the lexeme. The only catch, and this is the point I wanted to bring out about why I did not show the merged transition diagram earlier, is that I can have both kinds of specifications, and both can be implemented; you have to make a choice about which one is more complex and which one is easier to implement. It is like this: if I ask you to solve a problem, you can write hundreds of programs which give me the same solution. Which one is right? All of them may be right; but which one is easy to debug, which one is readable, and so on; that is a subjective thing. Is this point clear to everyone? Then let's move on and look at the third approach. We said: specifications first; then, from the specifications, transition diagrams; then those transition diagrams converted directly into C code. But suppose I do not want to go through all that; what do I do? I then use a lexical analyzer generator, and you already know one such generator, which is lex. You have used it in CS251, and everyone remembers it, I hope; in fact your TAs have already sent you the user manuals for both lex and yacc. The input to the generator is a list of regular expressions in a certain priority order, and the priority order is again going to be very important; and then we have an action associated with each of the regular expressions, which says
what kind of token has to be generated and what bookkeeping has to be done. And once that is done, what is my output? The output is a C program which you again compile: if you are using lex, it is lex.yy.c; you write a small main function which calls the generated analyzer, compile lex.yy.c, and the result breaks your input into a set of tokens and reports them. If I look at it diagrammatically, this is how it works: lex takes as input the token specification and gives me lex.yy.c, which is the code of the lexical analyzer; the native C compiler then gives me object code, and that really is the lexical analyzer, which takes my input and gives me a set of tokens. You have already done this, and you are going to do the same thing for your programming language. For more details, just look at the lex manual; I am not going to discuss lex once again here, so if you do not remember something, go back to the manual. The key thing to remember about lex is that it has a lot of function calls and gives you a lot of functionality, but if you remember just a few things, most of the time you can write functionally correct programs; they may not be very efficient, but they will at least be functionally correct. First, at some point we need to capture the lexeme; how do you capture the lexeme in lex? There is an array called yytext, which holds the string that has been matched for a token, and yyleng gives you its length. One additional feature which is going to be useful is the lookahead: if you are looking at context, recall that a/b says match a only if it is followed by b. If you remember these two or three things, plus yyin, which helps you with input, you will at least be able to write a functionally correct program; and I am assuming that you at least know your regular expressions, so that you can write the specifications correctly. You do not need to know a lot more about lex than this. Another thing you should always do is open the file lex.yy.c in whatever editor you use and search for the string "debug": most likely you will find a debug flag which is set to zero by a #define. Change that to one, and suddenly you will find that when you run the program it gives you a lot of debugging information. Both in lex and in yacc, if you set that debug variable to one, you get a lot of debugging output, which will help you in debugging your analyzer. Just go through this once; it is not very difficult; look at these few variables and you will become quite comfortable with the lex code. But my advice is: do not change the generated code to manipulate tokens. You should always manipulate your specifications, not the code, unless you are debugging something. In production it is a bad idea to generate some code and then go and modify it, because the next time you regenerate, your modifications will not be there at all; we want to focus on the specifications, not on modifications to generated code. Is the flag true or false? Well, we do not have ternary logic here: if you set it to any number other than zero it is still true; zero is false, any other number is not. So how does lex work? This is just to recap, for clarity, what we did in theory of computation when we constructed finite state machines from regular expressions. Basically, lex takes regular expressions which describe the language and which
can be recognized by a finite state machine. The steps are really standard: we first construct a non-deterministic finite automaton, convert it to a deterministic one, and then minimize the DFA to reduce the number of states. You must have done this as part of your theory of computation course, and if you have forgotten it, it is good to revise, because it helps to know the internals of the tool you are using. These are the steps which are logically involved; the code may not look like clean functions for each step, it may all be merged together, but logically this is what it does, and the heart of the code is the DFA table. If you look at lex.yy.c you will find a large table there which is being used; that table is nothing but the finite state machine: it tells you, if this is my state and this is my input, what the next state should be, and that is it. So this is all I want to discuss about the lexical analyzer; if you have questions we can discuss them here, otherwise we will move on to the next phase. Is the lexical analyzer clear to everyone? Yes? No? So, 2nd February is the deadline your TAs have given for writing the lexical analyzer with whatever tool you choose. Today is the 22nd, so that is 10 days; we have comfortable time, and in fact we should be able to finish much earlier. My suggestion is: don't say "I have 10 days, therefore I will start working on the night of the 30th, finish everything by the night of the 31st, and as usual everything will work, and somehow on the 2nd I will be able to submit", because most of the queries the TAs and I are going to get will arrive around the 1st, saying "I don't understand this"; and obviously, if you don't work for the next 10 days, you will not understand. My suggestion is to start now: you know everything about the lexical analyzer, it takes only a couple of days, and once you have finished, just forget about it and review it one day before the submission; that will work fine. But if you start working on the 30th or 31st, most likely you will not be able to finish your lexical analyzer by the 2nd. It is the last-lap-of-a-marathon kind of situation: hoping that you will run very fast in the last 100 metres and win the marathon usually does not work; you have to run at a constant speed if you really want to finish successfully. How much do you need to learn about the language? That is a very loaded question: what do you mean by "learn about the language"? I can interpret the question in several ways. If I am trying to learn English, I may want to learn how to write a good essay in English, or I may say I do not care about that, I just need a dictionary and a vocabulary. If you are given some language called ABC, or XYZ, or whatever, you do not have to start coding in that language to write a compiler for it. All you need to do is look at the specifications of that language and see how to tokenize strings in that particular language, and nothing else. If you cannot write a program in that language, that is fine, as long as you understand its specification. In fact, think of it this way: which was the first language you learned, C or Java? C. And what was the book you used, and how many parts and chapters does it have? Go back and open Kernighan and Ritchie once again: the first part says programming in C, but the second part is the C reference manual. The reference manual is the part you need to learn, which tells you how to tokenize, what the grammar is, what the semantics are; you can forget the programming part, and that is true of all languages. Look at the reference manual, not the user's manual, because you are only looking things up as a reference and nothing else. So if you say "I am writing, say, a Fortran compiler", then even if I cannot code in Fortran, that is okay, as long as you know what
the specifications are. The only language you really need to learn is the language in which you are coding your compiler; that you need to know. Any questions? Is everyone okay now? Another question, since I think we did not meet after that announcement: is everyone now comfortable with their groups, have you understood each other, or are there still fights going on? All those who have fights and disputes with their team partners, please meet me tomorrow, so that I can resolve them and you can become productive as fast as possible; that is my job. You have to come and tell me why you are not able to work together, because by and large, not always, but by and large, I have gone by your choices. I am sure some people did not get their choices, but at least you have one member in the team whom you chose, and in most cases the other members were also among your choices; maybe not the first choice, but you chose them. So if you still have problems working as a team, you have to come and discuss them now, rather than on the 16th of April, standing on the dais and saying "this guy didn't work, I worked, he never came"; that would be a really bad situation for all of us. I have seen these kinds of things happen in the past, so I am just talking from experience: let's try to resolve those problems now rather than at the final stage. Okay, so shall we move on to the next topic, if you have no other questions? Come on. All right, what was the next phase of the compiler? What was the next phase? Okay, good, so let's move on. This is what we want to learn now, and in this first point I am just capturing the functional specification of what a parser does. The input to my parser has already been tokenized, so I no longer have characters; what I have is a string of tokens, or words, if you want to call them that. So for an input like "if ( b == 0 ) a = b", I have the token if, a left parenthesis, the identifier b, the comparison of b with 0, a right parenthesis, and then the assignment of b to a. The output of my parser is going to be an abstract syntax tree. I also want to do error reporting: if the string of tokens does not conform to the grammar I have for the language, I want to flag an error; and I do not want to just exit after flagging an error, I want to recover from the situation and flag as many errors as possible in the process of parsing. So we will talk about strategies for recovery and for error reporting. I am going to model all of this with context-free grammars, and you know how context-free languages are recognized: by pushdown automata, which are going to be implemented using table-driven parsers. This is really what we are going to talk about in the next five lectures or so: the theory of parsing. This point captures the complete functionality of what my parser is going to do, and what I am going to show in the lectures is how it does it. Now, before I get into the details of parsing, let us also understand what the parser cannot do. If you look here, the parser sits somewhere in the middle of my process of compilation: here is the lexical analyzer, this is where my input comes in, this is where my parser sits, and this is where my type checker sits, followed by the later phases. Two things you must understand: first, this parser can do certain things, but it cannot do what the type checker is supposed to do; and second, the parser can do things which the earlier phase was not able to do, and that is why we are doing them here. It not only needs to check that my input is correct, but also needs to generate information for the rest of the phases. Just saying "I have a context-free grammar and specifications, and this input does not conform to them, so this is wrong
input, and then stopping, or saying that this is right input and then stopping, is not acceptable. It must generate enough information so that the phases which come later can do something more meaningful with it.

First let's see what the syntax analyzer cannot do. These are the normal kinds of questions: you have to check whether a variable is of a certain type on which certain operations are allowed. For example, if I say a plus b, and you ask whether this operation is allowed on a and b, the parser cannot answer something like that, because that would require knowing in which context this plus operator is applied; it will not be able to do that here. It also cannot check, for example, whether a variable has been declared before I use it; it will not be able to check whether a has been declared before I am using it. That is really not the job of this phase; it will be done subsequently. Nor is it able to check, for example, whether a variable is initialized or not, so it cannot say that this is an undefined value and therefore should not be used. Again, the parser will not be able to do these things.

Basically, if you look at all three points, in some way they are all talking about context information: this one is asking in what context this particular variable occurs, this one is asking whether the variable is undefined or uninitialized, and this one is talking about operators. So basically, whenever you see contextual information, the parser cannot handle it, and that is why we say that we are going to model this phase with context-free grammars. And because we are saying that we want to do things here, obviously there were things you were not able to do in the earlier phase; otherwise I would have finished in the lexical analysis. So what are the kinds of things we want to do here which we were not able to do there? Now you know, in the language hierarchy, why context-free grammars are more powerful than regular
expressions: they can do a few things which regular expressions cannot. For example, they can count. I have no way of counting in a finite state machine or in a regular expression, so if I want to ask whether I have a balanced set of brackets, regular expressions cannot answer that, while a pushdown automaton will be able to do it very easily. If I try to capture this as a grammar, I get a grammar which is not regular but which is obviously context-free. A finite automaton may repeat states, but it cannot remember the number of times it has repeated them; that is the limitation. So we have to move to a more powerful framework, and that more powerful framework is really the context-free grammars.

You must know very clearly that I can do a few things here and a few things there: what I am doing here cannot be done there, and some of what I am doing there cannot be done here and has to be moved to the right phase. This brings me back to the overall model of compilation, where we said that each phase does a well-defined logical activity, so that when I start debugging my compiler and I catch a certain kind of error, I know whether I should go and debug this phase or that phase. This is a clean separation of jobs. So at least in the context of where the parser sits in the compiler, you understand the two boundaries: what it cannot do and what it should not be doing.

Now you may also ask the question: regular grammars are a proper subset of context-free grammars, so why do I have a lexical analyzer at all? Can I not just merge the lexical analyzer with my parser? The reason is very simple: for tokenization, a context-free grammar is a more powerful machine than I really need. If I can do something with less complexity, let me do it there. The regular machinery available to me is less powerful but suffices for tokenization, so I don't want to use this heavier machinery
for tokenization. Similarly, if I go to an even more powerful mechanism and say that these are all subsets, so let me capture everything there, then that phase will become very complex. So we are just trying to do one logical activity in each of the phases.

Now, syntax definition. Here I am just recapturing what you perhaps already know from your theory of computation; that is why we do TOC before we do compilers. As far as we are concerned, a context-free grammar consists of a set of tokens, which are the terminal symbols, a set of non-terminal symbols, and a set of productions. I am not going to go through all the formal definitions and results about context-free grammars; we will go straight into specifying languages with them and then into implementation. The productions in a context-free grammar are of the form where the left-hand side is a single non-terminal and the right-hand side is a string of terminals and non-terminals, and then you designate one of the non-terminals as the start symbol. That is what your grammar is: a set of terminals, a set of non-terminals, a set of productions, and a start symbol. This is how you capture your context-free grammar.

Once I have the grammar, we say that this grammar derives strings. You can see that right away I am jumping into some kind of operational view: if some string is in the language specified by this grammar, then it can always be derived from the start symbol. I am not worried at this point about efficiency. So I can say: take the start symbol, take all the productions which have the start symbol on the left, choose one of those productions, and keep replacing a non-terminal in the current string by the right-hand side of one of its productions; then what I get is the set of strings which can be
derived from this particular start symbol, and all such strings are going to be part of the language specified by this grammar. That is the definition, and in fact it tells you how I am going to do recognition. Although it may look like a very nondescript kind of sentence, it very neatly captures how we are going to recognize something: I will take a string and see if I can derive it from the start symbol. If I can derive it from the start symbol, I will say that this is a valid string of the language, and if I cannot derive it, I will say it is invalid. So the strings which can be derived from the start symbol of the grammar G form the language L(G) defined by the grammar.

Let's look at just one example before we break for today. Here is an example: the set of balanced parentheses. How many rules do I have? There are two rules here: one says that S derives a parenthesized S followed by another S, and the other says that S can also derive epsilon. And here is another grammar, where I say that list derives list plus digit, or list derives list minus digit, or list just derives a digit. What is the language captured by this set of specifications? The operands are the single digits, so the language consists of expressions which have only two operators, plus and minus, over single-digit operands; digit I can define as one of the single digits. For instance, this is a language of lists of digits separated by plus or minus, and the first one is really the set of balanced parentheses.

So what we will do in the next class is start from this, and the question now is very important: I will say I have these specifications and I have some input, and I will try to see whether this input belongs to the language specified by this
particular grammar. Okay, so let's break here today and we will move on from there. Please go down, everyone; I know there is an event, and they have got some time, one and a half hours, to catch up after the class.
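The membership question the lecture ends on, whether an input string belongs to the language of a grammar, can be tried out on the two example grammars above. The sketch below is an illustration and not code from the lecture: a recursive-descent recognizer for the balanced-parentheses grammar S -> ( S ) S | epsilon, and a recognizer for list -> list + digit | list - digit | digit, where the left recursion is rewritten as a simple loop. Both work directly on characters instead of tokens, for simplicity.

```python
def balanced(s: str) -> bool:
    """True if s is derivable from S -> ( S ) S | epsilon."""
    pos = 0

    def S() -> bool:
        nonlocal pos
        # Choose the production ( S ) S only when the next symbol is '('.
        if pos < len(s) and s[pos] == '(':
            pos += 1                             # consume '('
            if not S():                          # inner S
                return False
            if pos >= len(s) or s[pos] != ')':   # must see ')'
                return False
            pos += 1                             # consume ')'
            return S()                           # trailing S
        return True                              # S -> epsilon

    # The whole input, not just a prefix, must be derived.
    return S() and pos == len(s)


def digit_list(s: str) -> bool:
    """True if s matches list -> list + digit | list - digit | digit.
    The left recursion is handled iteratively: digit ((+|-) digit)*."""
    if not s or not s[0].isdigit():
        return False
    i = 1
    while i < len(s):
        if s[i] in '+-' and i + 1 < len(s) and s[i + 1].isdigit():
            i += 2
        else:
            return False
    return True


print(balanced("(()())"))   # True
print(balanced("(()"))      # False
print(digit_list("9-5+2"))  # True
print(digit_list("9-"))     # False
```

Note that the first recognizer is deterministic even though the grammar has two productions for S: it picks ( S ) S exactly when the next character is '(' and epsilon otherwise, a lookahead idea that the coming lectures on table-driven parsing develop systematically.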