Let's take a seat, get settled, and start class. All right, I'm going to take about three minutes for questions, maybe three to five of them. Any questions? Do you all sit in more or less the same spots? I see some familiar faces in some familiar places, just fewer of you.

So can we do a regular expression example? No, because I don't think that would be valuable right now. I think recitation sections or office hours would be an awesome place to do that, because we need to cover new stuff today. So yeah, it's on the syntax analysis stuff, like first sets, follow sets, and that kind of thing. We're getting to follow sets now, I don't want to backtrack too much, and I don't know if I can do a good regular expression example in the one minute I have left.

Do you have any examples on YouTube, maybe from previous classes? Yeah, maybe. I can look through my past videos; I'm sure we've done it at some point. Also, I don't think there's a class in here after ours, so bring up your question again afterwards and maybe we can do a quick example after class. Sure. Cool.

Okay, any other questions? Yeah. Are we still doing parsing techniques, or are we done with that? We are still doing them, and we're going to close that up. We need to cover follow sets so that we can do parsing, so that we can talk about how to write a predictive recursive descent parser: how to write the parsing code we saw, for any grammar, once we have calculated its first and follow sets. Once we do that, we'll be done with syntax analysis. Then we're going to move on to semantics, which is really interesting. Cool. All right, let's get back to it. Okay. So we kind of talked about this on Wednesday.
We talked about follow sets and developed the intuition behind them. We saw that, just using first sets, I can't always tell how to parse a string, that is, how to create a parse tree from a string or a sequence of tokens, just by peeking ahead by one token. Right? And we saw that we need a new function called follow that represents what comes after a non-terminal: what kinds of tokens can come afterwards?

We also talked about the function signature for follow. Unlike first, which takes a sequence of terminals and non-terminals, here we only care about the follow of non-terminals. And it returns a set of terminals and end of file. Intuitively, we started with the starting non-terminal. (This is why it's good to be here.) The starting non-terminal will always have end of file following it. Right? We have S, it generates some tree, whatever rules get applied, and the result is some string or sequence of tokens. We know that after that comes end of file. So end of file is always going to follow our starting non-terminal.

Okay. We're not going to go into the rules just yet, but let's use this example to think about what follow means, based on our understanding of the semantics of the follow function and what this grammar looks like. We have S goes to A or B or C; A goes to little a; B goes to big B little b, or little b; C goes to big C little c, or epsilon. The first thing we want to ask ourselves is: what do we already know about some of the follow sets just from looking at this grammar? A will be followed by end of file. Why?
So which rule are we looking at here? S goes to A, that's the rule we're looking at. So what do you know about S first? Yeah, end of file has to follow S, right? We kind of just showed that in that terrible tree diagram. So why do you now think that end of file would be in A's follow set because of this rule S goes to A? Yeah, we go back to our tree. We have S, we know there's a rule S goes to A, and when that rule is chosen, the string that S generates is exactly the string that A generates, so whatever follows S here is also going to follow A. So we would think that end of file should be in A's follow set.

Does that apply to any of these other rules? S goes to B, right? It's the exact same situation. So by the same reasoning, end of file should be in B's follow set. Is the same true for C? Yeah, there's no difference there.

One of the key things I want you to think about: when we do first sets, if I want the first set of S, I look at all the places where S is on the left-hand side. But when I'm talking about follow, I care about where that non-terminal is used in the grammar, that is, where it appears on the right-hand side of rules.

So okay, we got some pretty good information there. What else can we know? Little b is in the follow set of big B. Why? It literally follows it, right? It's directly after. We have a rule, and we don't care what's on the left-hand side. We know what happens when this rule is chosen. If we go back to my example, it doesn't matter what happened up top.
It just happens that this is a B; how we got here doesn't really matter. But we know that when this rule is chosen, it generates a big B and then always a little b. So whatever big B generates, what follows it is definitely going to be this little b. We don't care when it happens. We just know that this rule exists; therefore, there's some possibility that a little b can follow a big B. Cool. So B's follow set now has end of file and little b.

What else? Any other information we can get from our reasoning? Little c follows big C. Yeah, by the same reasoning, little c also follows big C. So let's see. Oh, look at that. So smart. I just did all this.

Okay, but this is just eyeballing it and trying to develop some intuition. So let's think about what we were doing when we came up with those facts, to try to understand the rules themselves. I'm getting a little better at this. Or maybe you don't think so. I think so. Cool. Okay. Sweet.

So I have S goes to A. What was the first thing we knew? What was that one bit of information we started with? End of file is in the follow of S, the starting non-terminal. We know that; we basically proved it with a terrible drawing. That makes sense, right? S generates the entire string. Can there be other things that follow S? Not in this grammar, but maybe in a different grammar. There are no restrictions on where S can appear on the right-hand side of a grammar; we could have S appearing all over the place. But we know that S, the starting non-terminal, always has end of file in its follow set.

Okay. For our second observation, we had S goes to A. What did we do there? What did this tell us about A's follow set?
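The eyeballing above can be written down as a rough sketch. This is not code from the lecture: the dictionary encoding, the `$` end-of-file marker, and the name `eyeball_follow` are all assumptions made for illustration, and the grammar is one reading of the spoken rules. It only captures the observations made so far (end of file follows the start symbol, a rightmost non-terminal inherits what follows the left-hand side, and a terminal directly after a non-terminal follows it), so it is not the full set of rules.

```python
EOF = "$"  # assumed marker for end of file

# Assumed encoding of the small grammar as read from the lecture:
#   S -> A | B | C    A -> a    B -> B b | b    C -> C c | epsilon
GRAMMAR = {
    "S": [["A"], ["B"], ["C"]],
    "A": [["a"]],
    "B": [["B", "b"], ["b"]],
    "C": [["C", "c"], []],  # [] stands for the epsilon rule
}


def eyeball_follow(grammar, start):
    """Apply only the observations made so far, until nothing changes."""
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)  # end of file always follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # only non-terminals have follow sets
                    before = len(follow[sym])
                    if i + 1 == len(rhs):
                        # sym is rightmost: what follows lhs also follows sym
                        follow[sym] |= follow[lhs]
                    elif rhs[i + 1] not in grammar:
                        # a terminal sits directly after sym
                        follow[sym].add(rhs[i + 1])
                    changed |= len(follow[sym]) != before
    return follow


for nt, terminals in eyeball_follow(GRAMMAR, "S").items():
    print(nt, sorted(terminals))
```

Under these assumptions the result agrees with the sets read off the board: end of file for S and A, end of file plus little b for B, and end of file plus little c for C.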
Yeah, it should be the same as S's? Exactly the same? We're talking about sets, right? Yeah, it needs to contain it. And why? Because S can generate an A, so anything that could come after S can also come after A.

But if we're trying to derive a general rule, we want to think about the different cases. Is this only for rules of the form S goes to A? If we said A goes to B, does it still apply? What if we threw an alpha before the B, where alpha, as we've been using it, is a sequence of zero or more symbols? What if something comes after the B? Or one more case: alpha, B, alpha. I guess technically it shouldn't be the same symbol on both sides, so we'll go with alpha and delta, just some sequences. Does this apply to any non-terminal? So I have four choices here.

The rule we derived was: if we have something of the form A goes to B, we add the first of A, sorry, not first, the follow of A to the follow of B. That's what we came up with. Here, S goes to A is a rule of that form, so we add the follow of S to the follow of A. Can we think about extending this rule to any of these other circumstances? Does it make sense, or is it just rules of this form? Do you want me to draw trees? I can see it in your eyes. Cool.

So we saw this case: A went to B, which went to whatever. And whatever comes after A is the same as what comes after B, definitely in this case. Now let's go to this case: A goes to alpha, which could be any number of things, then a B, and then a delta, which is any number of things. Each of these could generate anything. So does this tell me anything? How do I relate the follow of A to the follow of B here?
What was that? The follow of B is going to contain the follow of A? Really? What if this delta is just a lowercase a? Delta can be a sequence of terminals and non-terminals. If it's just a lowercase a, then by the other reasoning we talked about, we know that little a should be in the follow of B here. That's the only information we can gather. The follow of A tells us what happens after this whole thing. But does it tell us anything about what follows B? Not necessarily.

Okay, so that's this case. Try this case: A goes to B followed by a sequence of terminals and non-terminals. B generates anything, delta can go to anything, and the follow of A tells us what comes after the whole thing. So again we have the same situation. Does the follow of A tell us anything about the follow of B based on this rule? No.

All right, our final case. (It's fun, I've actually never done it this way before.) The prefix generates anything, B goes here, and A is this whole thing. So the follow of A is what comes after here. Now does the follow of A tell us anything about the follow of B? Why? B is the rightmost symbol of the rule, right? Whatever A expands to, B is the rightmost piece, so whatever follows A must also follow B.

So now we have to extend our rule a little bit. The rule is: if we have A goes to any sequence of terminals or non-terminals and then B, add the follow of A to the follow of B.

Let's revisit this diagram. No, sorry, I'm going to revisit this one. Why did we rule this one out? Why did it not tell us anything? What condition caused us to rule it out? Yeah, maybe there's a non-terminal after B. But what if we got rid of that condition? What could we know about delta that would help us? That it can go to epsilon. Epsilon.
And how do we know if delta can go to epsilon? Yes, if epsilon is in its first set. If the first of delta contains epsilon, then what? Add the follow of A. Yeah, then we should add it. Just like in our diagram, if this can go to epsilon, there's a case where it goes to nothing, and that means what follows A also follows B. Well, the rightmost case was the second rule, so this one is basically our third rule. And maybe you notice some parallels with first sets.

Let's write it a little differently. Say A goes to alpha, then B0, B1, all the way up to Bi. It's a little different from the notation we were using before. You remember how the first set rules talked about i's and zeros and epsilons in first sets. So what would my equivalent be here? I have a rule in this form, but I need another condition, because I can't add the follow of A to the follow of B0 all the time. So what's my condition? Yeah, epsilon exists in the first of B1, and in the first of B2, all the way up to epsilon existing in the first of Bi. Then I add the first of, was it the first of A to the first of B? No, the follow of A to the follow of B0. This is the problem when you start mixing them. Maybe that's easier to follow in lowercase.

So just like with first sets, these rules essentially give us an algorithm. What does the second rule say, in English? Yeah, if B is the rightmost symbol, then always add the first set, sorry, the follow set of the left-hand side non-terminal to the rightmost non-terminal's follow set. We'll go over this in a second. It'll be a lot easier.
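One way to see that the second and third rules are really one check is a small sketch like the following. The helper name `inherits_lhs_follow`, the `ε` marker, and the sample first sets are assumptions for illustration only; the first sets are taken as already computed.

```python
EPS = "ε"  # assumed marker for epsilon inside first sets


def first_of_symbol(sym, first):
    # A terminal's first set is just the terminal itself;
    # non-terminals are looked up in the precomputed table.
    return first.get(sym, {sym})


def inherits_lhs_follow(rhs, i, first):
    """True when FOLLOW(lhs) should be added to FOLLOW(rhs[i]).

    Second rule: rhs[i] is the rightmost symbol, so the tail is empty.
    Third rule: every symbol after rhs[i] has epsilon in its first set.
    """
    return all(EPS in first_of_symbol(s, first) for s in rhs[i + 1:])


# Hypothetical first sets, only for this demonstration.
FIRST = {"B": {"b"}, "C": {"c", EPS}, "D": {"d", EPS}}

print(inherits_lhs_follow(["x", "B"], 1, FIRST))       # rightmost: True
print(inherits_lhs_follow(["B", "C", "D"], 0, FIRST))  # C and D can vanish: True
print(inherits_lhs_follow(["B", "a"], 0, FIRST))       # little a cannot vanish: False
```

Because `all` over an empty tail is true, the rightmost case falls out of the same test as the epsilon case.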
And this rule that we just came up with basically says: if there's an epsilon in the first set of the rightmost symbol, then we can add the follow of A to the follow of the next one over, and this keeps going. Although it looks slightly different when we apply it, because we really only care about B0: we're going to calculate the follow of S, the follow of A, the follow of B, the follow of C. So if I'm interested in the follow of B0, well, if it's at the very end, I know I can always add the follow of A by the second rule. And if there are epsilons in the first sets of everything after me all the way to the end of the rule, then I know I can also add the follow of A to the follow of B0. We're going to have five rules.

Okay, let's think about our second example. We had B goes to big B little b. What would this be as a rule? Let's change it slightly: if I change the left-hand side to A, does that change what you were going to do? Remember our goal for this part: what do we add to big B's follow set? Any ideas? Yeah. What did we do when we did it for this one? What did this rule tell us about big B's follow set? It has little b. And we know that's because little b comes directly after big B. So does it matter what symbol is on the left-hand side? It could be A, S, B, D. It doesn't matter at all. What matters is the right-hand side of this rule. So what are we actually adding here? The terminal. Now let's think about how this situation would change. Let's add this. Not again. Why does that happen?
We're going back to C's. It makes no sense. Okay. So now this is very similar, but what do I know about the follow of B? The follow of B is D? The follow of B is what? The set containing D? Right, so here's the way to think about it: before, I added this terminal to the follow of B. What was I really doing? In this case big D is a non-terminal, so it seems different. So what are you adding in this case? The first set of big D to the follow of B. That's just like what you were doing before: you took the first of little b, which is the set containing little b, and added that to the follow of B.

So say we have a rule of the form A goes to something. Does it matter what comes before B? It doesn't, because I only care what happens next. What follows B? Literally, follow means we don't care at all about what comes before. So I have B followed by, oh, we'll go back to C, followed by delta, which could be anything. For this case, we don't really care about what comes after C. If we have a rule in this form, then we do what? Add the first of C to the follow of B. This makes sense drawing-wise: we have A, we have whatever junk is happening here, we have a B, we have a C, and then we have delta. B is going to generate something and C is going to generate something. The first thing that comes after B is the first thing that C starts with. Because C follows B in this rule, the first of C must be in the follow of B.

So now, just like when we were doing rule three, what case should I consider? The case where C can go to epsilon, right? The idea is that when C goes to epsilon, this thing goes to nothing. And so what follows B?
The first of delta, in this case. Exactly. And since we're super good at this, we can write it in general form. The fifth rule basically says: if we have a rule of the form A goes to alpha, B0, B1, all the way to Bi, then B(i+1), all the way up to Bk at the end, and I want to add the first of B(i+1) to the follow of B0, what would that require? Yeah, B1 through Bi must have epsilon in their first sets. So if the rule is in this form and epsilon exists in the first of B1 (I'm going to abbreviate because I want to write this faster) all the way up to the first of Bi, then just like before I add the first of which one? B(i+1), to the follow of B0. That's it. Five rules.

Yeah? Why do we only add the first of C to the follow of B? Ah, because we don't know the shape, essentially. We don't know the shape of whatever tree C is going to generate. The first set tells us exactly what tokens the substring that C generates can start with, but it doesn't tell us anything about how long that substring is or what its third character is. And if C, let's say, always generates a terminal, then what delta starts with does not matter to the follow of B, because whatever delta starts with can never directly follow B. Think about it like this: say we have A goes to B, little c, D. If this is my only rule, it gives me a parse tree like A with children B, little c, D, right?
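The fourth and fifth rules can be sketched as a single left-to-right walk over whatever comes after the non-terminal. This is only an illustration: the name `follow_from_tail`, the `ε` marker, and the first sets below are assumptions, with the first sets taken as already computed.

```python
EPS = "ε"  # assumed marker for epsilon inside first sets


def follow_from_tail(rhs, i, first):
    """Terminals that the fourth and fifth rules add to FOLLOW(rhs[i])."""
    added = set()
    for sym in rhs[i + 1:]:
        sym_first = first.get(sym, {sym})  # a terminal starts with itself
        added |= sym_first - {EPS}         # epsilon never enters a follow set
        if EPS not in sym_first:
            break                          # this symbol cannot vanish, so stop
    return added


# The rule A -> B c D: the little c blocks D from ever directly following B.
print(sorted(follow_from_tail(["B", "c", "D"], 0, {"B": {"b"}, "D": {"d"}})))

# A hypothetical rule A -> B C D where C can go to epsilon: D's first set counts too.
print(sorted(follow_from_tail(["B", "C", "D"], 0,
                              {"B": {"b"}, "C": {"c", EPS}, "D": {"d"}})))
```

The first call yields only little c, and the second yields little c and little d, which is the difference between a symbol that always produces a token and one that can go to nothing.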
So B can do anything, and it has a first set and a follow set. The first set tells me what token B starts with, and the follow set tells me what comes directly after B. So does D's first or follow set affect B's here? No, because there's a little c in between.

I'm missing the point. If D comes after B, why doesn't D follow B as well? When we talk about follow, we don't want all the tokens that can come somewhere afterwards. We don't care about the whole sequence, just like with first we only care about what exactly that first token could be. So every follow set will contain exactly one element? No, it's only talking about the token that directly follows this non-terminal in the resulting string. Think about it like this for first sets: if I have A goes to little b, little c, little d, then in the tree I have A with little b, little c, little d, but for the first set I only care about b. I know that all possible strings that A generates will only ever start with b, and I don't care what the rest of the string is. Follow sets are similar.

So, to answer my own question: the follow set can have multiple elements when the production rules say that a single non-terminal can be followed by several different terminals and non-terminals? That's when the follow set would have multiple elements. For instance, here you could have A goes to big A little e, or big A little f. Across all possible strings from any combination of these rules, the only two terminals that can ever follow the A non-terminal in this grammar, and when I say follow I mean directly afterwards, are e or f. That's it. Yes, the first of little e and the first of little f in this case are e and f. We only care about what comes afterwards, and just that first terminal. So for here, that's why: because this little c
essentially occupies the next position after B, and it can never go to epsilon; it can't go to nothing. So it doesn't matter what D is at all.

What's the formal definition of follow sets? It was formally defined on the first slide, which I'm skipping. I think we just came up with the intuition that the follow set is the set of things that can directly follow? Exactly, yes. It's the set of terminals and/or end of file that can appear immediately after the non-terminal A. Also, to clarify, end of file does not appear in every non-terminal's follow set? Correct, it may not. But it will always appear in at least one of them, the starting non-terminal's, because of the first rule.

It may seem like these rules are a lot more difficult than first sets. Maybe they look a lot scarier, but actually they're a lot simpler. The first rule is super easy: always apply it, add end of file to the starting non-terminal's follow set. You do that once and you're done. The second rule is the case where you are the last element, and the third is the case where there are only epsilons between you and the end of the rule. The fourth and fifth are about adding first sets, so they're a similar type of rule to each other.

So let's go through the example again. We need first sets. How do we know that we need first sets? We need them for rules four and five, and for rule three, right? So we need to calculate first sets first. What does this mean when you're doing it on a test? You have to actually get the first sets right, because otherwise it's going to mess up everything else. Same with programming: is it worth starting on follow sets if you're not passing all the test cases for your first sets?
Probably not. It depends on how close you are to a deadline; you might have to do some game theory about how much you can get done. But you really want to make your first sets rock solid so that you know there's not a problem there, because once you start coding follow sets you have to start worrying: is my logic for calculating follow sets wrong, or did I mess up first sets? It's a lot easier to think about these tasks as separate stages. Stage zero is all about reading the data into some data structures that you can iterate over and manipulate. Once you've read in all the data correctly, you're good to go to start calculating first sets over that data. Anyway, that's how I'd approach the program.

Okay, same thing as before. With follow sets we don't want to get into any crazy recursive situations, so we do it the same way: start with empty follow sets for all non-terminals in the grammar, and then, just like with first sets, keep applying all of these rules to all the follow sets in turn until we get to a point where the follow sets do not change. Then we know we can't get any more information and we're done calculating follow sets.

The rules are pretty much the way we wrote them by hand. If S is the starting symbol of the grammar, then add end of file to the follow of S. Super easy; you only have to do this once.

Are the slides available, not just the video? Because for the last four lectures I'm finding that my notes are basically a long series of rules. Yeah, all of these are online.

Okay. Here I switched around the B's and A's a little bit, because here I'm thinking that I want to calculate the follow of A. So if I have a
rule of the form B goes to whatever, and A is the very last, rightmost symbol in that rule, then I know I can add the follow of B to the follow of A.

Then the next one I wrote a little differently. I care about A again, so I'm asking what I do with the follow of A. Well, if epsilon exists in the first sets of all of the symbols after A to the end of the rule, they can all go to nothing, and so we can add the follow of B to the follow of A. This is the rule that we came up with; there we used B0 through Bi, so it's a little inconsistent. And you can think about these C's as symbols, terminals or non-terminals, because your program is probably going to treat them as symbols in your grammar. If C0 is a terminal, does epsilon exist in its first set? Does epsilon exist in the first of little a? No, it can't. The first of little a is the set containing a; you already calculated the first sets.

So these two rules basically say how to propagate follow sets from the left-hand side symbol to something on the right. That's essentially what they're saying. The next rules ask: what about where A is actually used? A rule in this form is telling me that C0 follows A. And here's something we didn't talk about before: why do we subtract out epsilon here? Yes, epsilon cannot exist in follow sets. A follow set is terminals and end of file, that's it. You can think of rule 4 as another base case. If you're calculating the follow of A, always, always, always add the first set of whatever comes directly after it, minus epsilon, to A's follow set. Do that 100% of the time. And the fifth rule basically says: okay, but if there exists an epsilon in the first of C0, then you can add the first of C1 minus epsilon to the follow of
A, and if there exists an epsilon in the first of C1, you can add the first of C2 minus epsilon to the follow of A, and if there exists an epsilon in the first of C2 (I don't even know if I'm saying it right), then you can add the first of C3 minus epsilon to the follow of A. Just like when adding first sets, you keep going until one of these symbols does not have an epsilon in its first set.

So let's think about C0, C1, C2, or maybe even C3. By rule 4 we will always add the first of C0 to the follow of A. Makes sense; it's literally what comes next. But there exists a case where C0 can go to epsilon, to nothing, which means that the first of C1 is going to follow A. And there is a case where C1 could go to nothing as well, and in that case the first of C2 will follow A. So what happens if there is an epsilon all the way from C0 to Ck? What do we do? It's rule 3, right? You don't have to worry about it here; it's kind of already done for you.

Yes, just like when we calculated the first sets, we are only adding to these sets, so the sets only grow. And we don't have epsilon in follow sets; it doesn't make sense given how we defined them. We only care about terminals and end of file.

All right, let's bring back our handy-dandy example that we used last time: S goes to A B C D; A goes to C D, or little a big A; B goes to little b; C goes to little c big C, or epsilon; D goes to little d big D, or epsilon. I'm sure you all remember this. These are the first sets that we calculated; it's in your notes, right? Just kidding. We need these, so we'll keep them here as a reference, and then we follow our algorithm. First we initialize all the follow sets to the empty set, except for, well, I guess, all right, we can do this. Cool. Okay, so we have all the rules at the top, hold on, we have all the rules of the grammar at the top and we have all the first sets. That's all
the information you need. Okay, S. Which rule applies? Rule 1, right? S is the starting symbol of the grammar, so we add end of file to the follow of S.

Then how do we apply rules 2, 3, 4 and 5? Where do we apply them? What do we care about? The right-hand side. And the important thing here is that we want to find all the instances where S appears on the right-hand side of rules and then apply all these rules to each of those instances. Think about a case like A goes to big B little b big B little c. What would be in the follow of B here? It should be b and c, right? Because you care about all the possible terminals that could follow B. Even though this is one rule, B appears in it twice, so we have to apply rules 2 through 5 to this B and then apply rules 2 through 5 to this B.

So this is where the left-hand side appears on the right-hand side? You said S because we were concentrating on S, but for anything that's ever on a left-hand side, do you look at all of its instances on the right-hand side and apply the rules to each instance? Yes. You can think of the things on the left-hand side as all of the non-terminals. But here's why I make this distinction: when calculating first sets, if we were calculating the first of S, we look at all rules where S is on the left-hand side; those are the only rules we care about. When calculating follow sets, we care where S appears on the right-hand side of rules, so we look at every single rule to see if S appears anywhere. And here we see S does not appear anywhere, so none of rules 2 through 5 apply.

So if our production is A goes to B B, then the first of B is in the follow set of B? Yes. The question is: if we have a rule A goes to big B big B, then by rule 4, when we're looking at this first instance of B, we would add the follow of B, sorry, the first of B minus epsilon to the
follow of B. And then when we look at the second instance, we would see that rule 2 applies, and we would add the follow of A to the follow of B. That's important.

So the follow of S is end of file. Okay, now we're talking about A. In how many rules does the non-terminal A appear on the right-hand side? We have S goes to A B C D, and A goes to little a big A. So there are two rules, and we need to look at each instance in turn. We look at the first one and ask: is A the starting symbol of the grammar? No, so rule 1 is not going to apply. (Instead of treating this as a rule you check every time, you could also just initialize the starting non-terminal's follow set with end of file. That's a different way to do it.) Then we ask: does rule 2 apply? Is A the rightmost symbol in the rule that we're looking at right now? No, so rule 2 doesn't apply. If rule 2 doesn't apply, will rule 3 apply? Maybe. Rule 2 asks, am I the last one? If not, rule 3 asks, is it possible for me to effectively be the last one, because everything after me all the way to the end of the rule can go to epsilon, to nothing? Then it's the same as if I were the last symbol. So are there epsilons in the first sets of B, C and D? Nope, you can stop at B. So rule 3 doesn't apply. Rule 4: we add the first of what to the follow of A? The first of B minus epsilon, so little b goes into the follow of A. And then does rule 5 apply? Is there an epsilon in the first of B? No, so rule 5 does not apply. Cool.

Then I have to look at this other rule; remember, I have to look at every instance of A in the rules. Are we over time? Nope. Shoot. All right, we'll go over this example again. I'll go through it in slow motion. When you're watching this video, pretend I'm saying what he's saying.
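For anyone working through the example on their own, here is a minimal sketch of the whole procedure: compute first sets, then apply the five follow rules until nothing changes. It is not the course's reference implementation. The grammar encoding is an assumed reading of the spoken example (in particular D goes to little d big D, or epsilon), and the names `first_of_sequence`, `compute_first`, `compute_follow`, and the `ε` and `$` markers are made up for illustration.

```python
EPS, EOF = "ε", "$"  # assumed markers for epsilon and end of file

# Assumed reading of the example grammar; [] stands for an epsilon rule.
GRAMMAR = {
    "S": [["A", "B", "C", "D"]],
    "A": [["C", "D"], ["a", "A"]],
    "B": [["b"]],
    "C": [["c", "C"], []],
    "D": [["d", "D"], []],
}


def first_of_sequence(seq, first):
    """First set of a sequence of symbols, given the non-terminals' first sets."""
    result = set()
    for sym in seq:
        sym_first = first.get(sym, {sym})  # a terminal starts with itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result                  # this symbol cannot vanish
    result.add(EPS)                        # everything (or nothing) can vanish
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                         # repeat until no set grows
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                new = first_of_sequence(rhs, first)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)                 # rule 1, applied once
    changed = True
    while changed:                         # repeat until no set grows
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue           # only non-terminals have follow sets
                    tail_first = first_of_sequence(rhs[i + 1:], first)
                    new = tail_first - {EPS}   # rules 4 and 5
                    if EPS in tail_first:      # rules 2 and 3
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, "S", first)
for nt in GRAMMAR:
    print(nt, sorted(follow[nt]))
# S ['$']
# A ['b']
# B ['$', 'c', 'd']
# C ['$', 'b', 'd']
# D ['$', 'b']
```

Under this reading of the grammar, the follow of A comes out as just little b, matching the step worked in class. The sketch folds rules 2 and 3 into one test and rules 4 and 5 into one walk over the tail, which is a design choice for brevity; checking the five rules separately, as on the slides, would give the same sets.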