Okay. Welcome everybody. Hope you had a good week-long break. I missed everyone. Fork 1 is due today on Blackboard. Homework 2 should be posted right now. It's due in two weeks. And Project 3 will be up sometime after class. So I'll send an email out to the class mailing list when Project 3 is released. You'll have a lot of time on that. Questions on class stuff before we get started? Yeah? When you do the first set iteration or whatever, do you stop once the last one no longer changes? When all of them don't change. So you keep iterating over the first sets until no first set changes. That's how you know you're done. And we're going to go over, so I know there's been questions. People have come to my office hours and today's office hours. So the plan for today is we're going to go over follow sets. We're going to go through an example for that. And then we're going to do a whole end-to-end example using a grammar with first sets, follow sets, and proving that there is a recursive descent parser for this grammar, and then actually writing that recursive descent parser. So hopefully after this class all the first set problems and questions should go away. Any other questions before we get started? Alright. So to refresh everyone's memory, including mine, we just went over first sets. You have hopefully been looking at first sets in your first homework assignment. And the purpose of the first set, right, is that the first set of a non-terminal is the set of all tokens that can start the strings that non-terminal produces. Whereas a follow set is kind of a little bit different. So what we want from the follow set, we want to know for a given non-terminal, in this case A, what is the set of terminals that can appear after this non-terminal. So we're thinking like what comes after, in this case, A, as far as the resulting string. And so here the type, if you will, that's returned by follow of A is a set of terminals.
And here we're going to represent the end of file by the dollar sign character, right? So if you think about what follows S, the starting symbol, well, S produces the entire string, right? So what follows S is going to be the end of file, the end of the string. So here that would be represented by the dollar sign. Questions about follow in general? What it is, at least the definitions? So we're going to go back to the example, the kind of simple example that we've used before. And so we're going to use this. We're going to look at this. We're going to talk about it, think about why some of these follow sets are what they are, what they mean, how that makes sense kind of intuitively. Then we're going to go over the rules and then we're going to go over how to apply those rules and kind of the mechanical algorithm to compute follow sets. Okay, so here we have our grammar. We have S goes to big A, or big B, or big C. A goes to little a. B goes to big B little b, or little b. C goes to big C little c, or epsilon. So what's in the follow set of S? What could follow S? Anybody brave enough to raise their hand? Yeah, end of file. Yeah, so I kind of already said it. So it could be braver if you're using my words, right? So yeah, anything else? Can anything else follow S? So how would you answer that question, I guess, would be better? What was it, empty string? So there's no epsilon here in the follow set, because we're specifically talking about tokens. It's either going to be a token or it's going to be the end of the string. So in that case you kind of think of epsilon as the end of file in this case, meaning there's no more tokens after we parse S. So the way to really think about this is, okay, where in first sets we were concerned about what's on the left-hand side of the production rules, here what we care about is where does this non-terminal appear on the right-hand side of the rules, because that defines what appears after it.
So for the follow set of S: S doesn't appear in any of the right-hand sides of any of these rules. So the follow set of S is just going to be the end of file. So we know that if we parse S, there should be nothing after it, because S generates the entire string. Does that make sense? The other way we can use this is if we're writing a parser and we parse S and then our parser returns, and then we say, okay, well the next token better be an end of file, otherwise we didn't parse it, like there's some kind of syntax error, something's gone horribly wrong. So what about A? What follows A? Anybody? Anybody? Yeah. Little a and end of file. Little a and end of file? How does a, so you're talking little a, right? Right, little a. Why does little a follow big A? So you gotta think about follow, right? In the, like, afterwards. Just end of file. Exactly. So why end of file? Because there's no hierarchy. Yeah, that's a good way to think about it right now. So, yeah, so this first rule, right? So we look for big A's on the right-hand side and then we say, okay, well, S goes to big A, so that means whatever follows S must follow A, right? So then we can add the dollar sign to A, and we don't see A anywhere else on any of the right-hand sides so we don't do anything there. We say, okay, well, whatever is after A has to be the end of the string. This makes sense, right? Because if we generate A, the only string we're gonna generate is a little a. So there's nothing after that string. There's no character after that; we've reached the end of the input string. What about B? We wanna check. Why small b? Can you speak up? All I can hear is my booming microphone voice, so we gotta be loud. Right, so the first production is big B goes to big B, little b. So we really don't care about the left-hand side here, we care about the right-hand side, we care about big B, little b. So we say, what's gonna follow a big B? Well, in this case, clearly it's gotta be a little b after a big B, right? So we add little b in there.
So what was the other one you said? Did you say something else? What was it? End of file. The second production of the first rule, right? So S goes to A or B or C. The rule S goes to big B, well, by the same reason that we put the end of file into the follow set of A, the end of file goes into the follow set of B. All right, what about C? So we wanna, yeah, in the back. Question. Yes. Is there anyway a string to be generated that a little A follows a capital, the production of a capital A? So think about parsing, right? So we generate a string in this language. If we generate an A, what are the possible, so we're gonna generate an A, that generates some string. What's the next token gonna be? Can that ever be a small A after that? No, because, right, so we have S is going to produce a big A, and big A is gonna produce one single A, and there's never gonna be anything after that. So yeah, so that's why little A is not in the follow set of big A. But by that same logic, this is why little B is in the follow set of big B, because when we have a big B, what could follow it is a little B, because of this rule B goes to big B, little B. So whenever we choose this rule, some string where we choose this rule, there's going to be a little B after the big B. But the end of file is there, because if we chose the rule S goes to B, and then B goes to little B, well there's nothing else. We've reached the end of the string. We generate a string of length one. So the end of file is at the next token. So we're gonna do C. We're having other questions? Remember, we're not being super precise right here. We're just trying to kind of look at this, see if it kind of makes sense. So in the future example, are you gonna show words like another C after it, or like an F? Yes, I believe so. I don't have it memorized, but yeah. It'll be more complicated, so we'll get to see exactly how it works. So what's the follow set of C? Let's go to the thing. 
So no end of strings, remember, or no empty strings. So yeah, so little C end of file. So why little C? Yeah, it's the first element after, it's the first symbol after a big C, right? So we know that if this rule gets followed, there will be a C following, little C following the big C. What about the end of file? Back to you. Yeah, so the same reasoning why it's an A and B, because we also have the rule S goes to big C, and we know that S, what's gonna follow S is always gonna follow C, and what follows S is the end of file. Cool. Okay, questions kind of on the intuitive here, or why certain things are or are not here, or yeah, will it always be there? No, it just happens to be how this grammar worked out. Any other questions? Yeah. Is there some rule for removing the end of file? Because I would imagine that A would actually have a follow set within the file. Why not? Because B doesn't change. What else can a big B ever come after a big A? Yeah, so is there ever a way to remove them? No. The rules here can be very similar to first sets, but different. Yeah, it's a good question, because it seems like there's too much, but the reason is because of the way this grammar is, we have S goes to either A, B, or C, and because of that, nothing's ever gonna follow A, except for the end of file. More questions before we continue? Okay. So now we go over the fancy formalism of the follow rules. So I want you to bear with me, and we'll see hopefully in detail how they work, so we can get an understanding of how to apply these rules. So the first thing we do in calculating follow sets is we first calculate first sets, and we'll see why we're gonna actually use those in these rules. So you can't do follow sets without first sets. It kind of hopefully makes sense a little bit intuition when we're talking about what's the next character that comes after a given non-terminal. 
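One way to sanity-check the intuition above is to brute-force it: start from `S $`, expand non-terminals in every possible way, and record which terminal shows up immediately after each non-terminal. The sketch below assumes the grammar reads S → A | B | C, A → a, B → B b | b, C → C c | ε, as discussed in the lecture. The names `observed_follow` and `max_len` are hypothetical, and because the exploration is cut off at a fixed length it is only a check for small grammars like this one, not a general algorithm.

```python
from collections import deque

# Uppercase keys are non-terminals; an empty tuple is the epsilon production.
GRAMMAR = {
    "S": [("A",), ("B",), ("C",)],
    "A": [("a",)],
    "B": [("B", "b"), ("b",)],
    "C": [("C", "c"), ()],
}
EOF = "$"


def observed_follow(grammar, start="S", max_len=6):
    """Collect terminals seen directly after each non-terminal in derived forms."""
    seen_after = {nt: set() for nt in grammar}
    first_form = (start, EOF)
    queue = deque([first_form])
    visited = {first_form}
    while queue:
        form = queue.popleft()
        for i, sym in enumerate(form):
            if sym not in grammar:
                continue  # terminals and the EOF marker are never expanded
            nxt = form[i + 1]  # always exists: EOF stays at the end of the form
            if nxt not in grammar:
                seen_after[sym].add(nxt)
            # Replace this occurrence with each alternative, within the bound.
            for rhs in grammar[sym]:
                new_form = form[:i] + rhs + form[i + 1:]
                if len(new_form) <= max_len and new_form not in visited:
                    visited.add(new_form)
                    queue.append(new_form)
    return seen_after


result = observed_follow(GRAMMAR)
```

For this grammar the result agrees with the sets worked out by hand: only the end of file after S and A, little b or end of file after B, and little c or end of file after C.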
Well, that's probably gonna be the first character of another either non-terminal or terminal, so we may want those first sets. Okay, then we're gonna do the same general structure. We're going to initialize all the follow sets of the non-terminals to the empty set, and then we're gonna apply the rules over and over until the follow sets do not change, and to go off the question earlier, until all of the follow sets do not change. So we're gonna keep applying these until we've applied all the rules everywhere, and the follow sets for every non-terminal and the grammar do not change. Okay, so the first rule should be, it's pretty straightforward, it's what we talked about the very first thing I said, right? So if the starting non-terminal, the starting symbol of your grammar, the end of file is always in its follow set. It makes sense. S, the starting non-terminal, it's gonna generate the entire input string, so the only thing that better be after it is the end of file. Does that make sense? Okay. Then we have a rule, so the important thing here is where in first set, when calculating the first sets, we're looking for non-terminals, we're looking at the left-hand side of the production rules, right? So if we wanna calculate, let's say the first set of A, we're gonna look for where A is on the left-hand side and use those rules to generate the first sets. But here, because we care about what comes after, we're gonna look at where the non-terminals appear on the right-hand side of the rules. So I'm gonna show you the rules, and they're gonna be a little backwards from what we described them, but hopefully that makes sense. So here we're concerned with calculating the follow set of some non-terminal A. So if we have a rule of the form, B goes to alpha, where alpha is some sequence of non-terminals and terminals, it could be an empty sequence, followed by a non-terminal A, then we're gonna add the follow set of B to the follow set of A. Does this make sense? 
So this is kind of the same intuition behind why we add in first sets, why we add the left-most symbols first set to the first set on the left-hand side, because in this case, whatever follows B, whatever comes after B, is gonna have to come after whatever the right-most symbol is. Yeah, question right there. Alpha is any sequence of terminals and non-terminals, zero or more. So it just represents basically whatever. So what this means is the last, so if A is the last symbol on the right-hand side of the production rule, we're gonna add the follow set of B to the follow set of A. Make sense? Maybe that makes sense, but why is that way? You'll get the why hopefully later, but does the rule as it stands understand how to implement it? Okay, so the next one, so just like on first sets, on first sets we started at the very left and we said whatever is the first set of the left-most symbol, we went to the left-hand side. And then we would go further down that string depending on if there were epsilons here. So we're gonna apply the same principle. So we're gonna say, basically we started at the right-most side and add B, the left-hand side rule, add the follow set of B to the follow set of A. And then we're gonna say, can we go to the left? Can we move one more symbol over? And when can we do that? A is as an epsilon in its first set, exactly. So if A can go to nothing, well then we can move one over and say, okay, whatever follows, let's look at this example. So here we have a rule in the form, B goes to once again alpha, any sequence, zero or more terminals, non-terminals of A followed by C zero through K where epsilon is in the first set of all of the zero through Ks. Then we're gonna add the follow set of B to the follow set of A. So it basically means, hey, if, so we always add the right-most, we always add the left-hand side non-terminal, we're gonna add its follow sets to the right-most rule by rule number two, right? 
And then if that right-most rule has an epsilon in the first set, well then we can move one more over and add the left-hand side's follow set to that one and then we can keep doing that for however many of the right-hand sides have epsilon. Questions? Yeah. You, I think you have a, because the rule is S goes to B, so by rule two here you're adding the follow set of S to the follow set of B, not the other way around. So that's why I wrote it backwards with the A's and the B's flipped, right? Here we're talking about A's, that's what we care about. So here we care about things on the right-hand side. More questions? Okay. And then we get into, so this is just, this is propagating the follow sets, right? But we did something else. So we had a case where we had big C and little C and we said, well it makes sense that little C is gonna be in the follow set of big C because of how that production rule is. So that's what we're gonna do here. We're gonna say, okay, we care about, remember again, we care about non-terminal A, big A. And what we're gonna say is, okay, whatever the next symbol is, in this case it's C zero. So here we just have a rule in the form B goes to alpha, remember once again, alpha is a sequence of non-terminals and terminals followed by A, the non-terminal that we care about. And so this says, what's the next symbol, C zero? Let's add the first set of C zero minus epsilon to A, to the follow set of A, right? So this makes sense. So whatever follows my rule, so whenever I choose this production rule I know that whatever follows A has gotta be whatever starts with C zero because otherwise it's not a valid string. And we take away epsilon because we don't care about epsilon in follow sets. So no epsilon's in follow sets. This makes sense? Okay, then the last rule is, well what happens if C zero has an epsilon? Well then C one, we can add the first of C one minus epsilon to the follow set of A. And what if C one has an epsilon? 
Well then we can add the first set of C two minus epsilon to the follow set of A. And what if C two, and so on? So that's what this next rule says: starting from the terminal, the non-terminal that we're interested in, A in this case, we can keep adding the first sets of the following symbols as long as the ones before them contain an epsilon in their first set. Which makes sense, right? So here if C zero can produce an epsilon, well that means there's two possible outputs here. We can either have A, C zero, C one, or A, C one. And so that means whatever is in the first set of C one can follow A. Questions while we read and absorb and meditate on these five beautiful rules. Yeah? It looks like you're only adding the first set of C i plus one, but wouldn't you add the first set of C zero and C one and C two? Exactly, yes you would. So this rule, so rule five is general, right? So, and this was kind of a question that came up in office hours and on the mailing list. And this goes for first sets as well. So yes, this rule applies for, so let's say all of C zero through C i here. You would also add C one because you would have applied this when i is equal to zero. And then you would also add C two because you'd apply it when i is equal to one. So you'd go out one here. And then you would have added it for all of those values. So yeah, that's why it's a nice way to think about it, because it's abstract and it defines all those cases, but mechanically how you operate it is you start at the symbol following the non-terminal you're interested in, and you say, okay, add that first set to A's follow set. Does that symbol have a non-terminal? Or sorry, does that symbol have an epsilon in its first set? If so, then we add the following symbol's first set to A's follow set. Do you start on the far right, or do you start immediately to the right of whichever one you're interested in? Immediately to the right of whatever one you're interested in.
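The five rules can be sketched as one fixed-point loop. This is a minimal sketch rather than the course's reference implementation: the function name `compute_follow`, the grammar encoding (a dict of tuples, with an empty tuple for epsilon), and the `"eps"` and `"$"` markers are all assumptions made for illustration. It is run on the simple grammar from earlier (read as S → A | B | C, A → a, B → B b | b, C → C c | ε), with first sets written out by hand.

```python
EPS, EOF = "eps", "$"

GRAMMAR = {
    "S": [("A",), ("B",), ("C",)],
    "A": [("a",)],
    "B": [("B", "b"), ("b",)],
    "C": [("C", "c"), ()],
}
# First sets for the non-terminals, worked out by hand for this sketch.
FIRST = {
    "S": {"a", "b", "c", EPS},
    "A": {"a"},
    "B": {"b"},
    "C": {"c", EPS},
}


def compute_follow(grammar, first, start):
    def first_of(sym):
        # The first set of a terminal is just the terminal itself.
        return first[sym] if sym in grammar else {sym}

    follow = {nt: set() for nt in grammar}  # initialize everything to empty
    follow[start].add(EOF)                  # rule 1
    changed = True
    while changed:                          # repeat until no follow set changes
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue            # only non-terminals get follow sets
                    new = set()
                    reached_end = True
                    for nxt in rhs[i + 1:]:
                        f = first_of(nxt)
                        new |= f - {EPS}    # rule 4, then rule 5 on later symbols
                        if EPS not in f:
                            reached_end = False
                            break
                    if reached_end:
                        new |= follow[lhs]  # rule 2 (nothing after) or rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


follow = compute_follow(GRAMMAR, FIRST, "S")
```

The inner loop starts immediately to the right of the non-terminal of interest and only moves on while the symbol it just looked at has epsilon in its first set, which is the mechanical reading of rules four and five. On this grammar it reproduces the sets found intuitively.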
Yeah, so I like to apply these two separately. So whenever I say, okay, I mean, we're going to get into it, but when I say, okay, I care about A, so I look, the first thing I say is, is A the last, the far right-most symbol in this production rule that I care about? If the answer is yes, then I apply rule two. If the answer is no, then I say, do all of the symbols to its right, to the end of the rule, have epsilon in their first sets? If so, then rule three applies and I'm going to add the follow set of B to the follow set of A. And then I worry about what's after. Then I say, okay, let's look at A. What's directly after it in this specific rule? Add that first set to my follow set. Okay, is there an epsilon there in that first set? Then we can keep going, so, yeah. So for rule two, we're adding follow B to follow A. Is that to define follow B? Does this define follow B? Yeah, number two. It defines follow A. Defines follow A. So we're adding follow B to follow A because here, remember, we care about A right now. And remember, because we first set all of the follow sets to the empty set, we can do that, right? So we know the values to start, and if it's empty set, it's empty set. And we'll get to B on its turn and we'll look at where B is located on all of the right-hand sides of the rules. We'll apply all of these rules to B and then when we come back around, we'll get it eventually. So that's how the propagation happens. More questions, mistakes, three spots, typos. Let's look at an example. I think this will help clarify things. Okay, so we're going to use the example that we used for first sets already. So we're going to use the grammar S goes to big A, big B, big C, big D. We have the rule A goes to big C big D, or little a big A. B goes to little b. C goes to little c big C, or epsilon. D goes to little d big D, or epsilon.
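Since the follow-set rules lean on the first sets of this grammar, here is a minimal sketch of recomputing them with the same "iterate until nothing changes" idea. It assumes the grammar reads S → A B C D, A → C D | a A, B → b, C → c C | ε, D → d D | ε; the name `compute_first` and the `"eps"` marker are hypothetical choices for this example.

```python
EPS = "eps"

GRAMMAR = {
    "S": [("A", "B", "C", "D")],
    "A": [("C", "D"), ("a", "A")],
    "B": [("b",)],
    "C": [("c", "C"), ()],   # empty tuple = epsilon production
    "D": [("d", "D"), ()],
}


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # stop when no first set changes
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = set()
                all_nullable = True
                for sym in rhs:
                    # A terminal's first set is just itself.
                    f = first[sym] if sym in grammar else {sym}
                    new |= f - {EPS}
                    if EPS not in f:
                        all_nullable = False
                        break           # cannot see past this symbol
                if all_nullable:
                    new.add(EPS)        # every symbol (or none) can vanish
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


first = compute_first(GRAMMAR)
```

Under this reading, B is the only non-terminal besides S without epsilon in its first set, which is what decides whether rules three and five fire in the walkthrough.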
And so we're not going to do it again, but we have the first sets of all of those because we calculated them a week ago. So what's the first thing? What do we initialize all of our follow sets to? Empty set. Empty set. Yes. Perfect. Okay. Oh, and then I'm going to move this a little bit so we have the rules here. Okay. So what we're going to do, we're going to step through this step-by-step and show exactly which rule we're talking about at each time and what production rule we're talking about. Any questions so far before we start this process? So you should get the feeling, as you're sitting there, that man, this is a very mechanical process. And the answer is yes, it is. This is an algorithm that we're following, a mechanical process to do this. And so in the next homework assignment, you're actually going to create a program to do first sets and follow sets. So you'll see that they are very algorithmic and mechanical. But you need to know how to do it by hand first, right? So that way you can check the computer. Okay. So the first thing we care about, we're going to start with the follow set of S, right? Where we always start. So we're going to go through all the rules, right? One through five. So does rule one apply? Can you read rule one, people in the back? Kind of? Yeah. Squinting? Okay. Yeah, sorry. Sometimes you have to fit a lot of information on these screens. Okay, so yes. S here is the starting symbol of the grammar. That means we add the end of file to the follow set of S. Right? So that's really easy. Rule one applies. Okay. Is S in any of the right-hand sides of any of the rules here? No. No. So then the other four rules don't apply. Because they only apply to non-terminals on the right-hand side. So we added the end of file to the follow set of S. And so we're done with S for right now. Okay. Then we look at A. So how many rules does A appear in?
So I guess the first question is, is A the starting non-terminal of the grammar? No. No. So rule one does not apply. How many rules does A appear in on the right-hand side? Two. Two. Right? So we have here S goes to big A, big B, big C, big D. And then A goes to little a, big A. So this is why follow sets are definitely a little bit more tricky, because you have to identify all of the places that the non-terminal occurs on the right-hand side. Okay. So we take each of the rules one by one where A appears on the right-hand side. So let's first look at the first one. S goes to big A, big B, big C, big D. Okay. So we first ask, well does rule one apply? We already said no. Doesn't apply. Right? A is not the starting non-terminal. Okay. Then rule two. Does rule two apply? So the question is how do you know if rule two applies? Somebody want to... Right? Yeah. Right. So one thing to make sure of is we're going through this non-terminal by non-terminal. So if you look at this rule, yes, the calculation of the follow rule applies to this production because there's obviously a symbol on the right-most side. But we don't care about D right now. We're only concerned with non-terminal big A. So the question is, is big A the right-most symbol in this rule? No. No. So this rule doesn't apply. What about three? So for all the symbols after A to the right-most end of the rule, is epsilon in the first set of all of those? No. No. Which one fails? B. B. I don't know if you say these things. You don't know the answers. That's why I have you. Okay. So the third rule here doesn't apply because, remember, all of those symbols following A have to have epsilon in their first set. And we know we have the first sets here. We can look. First set of B? No. There's no epsilon in the first set of B. So rule three doesn't apply. Okay. Now rule four. So rule four says we take the next symbol in the rule. So the symbol following big A.
And we're going to add the first set of that symbol minus epsilon to the follow set of A. So what is... So in this example, what's C zero in our rule? B. B. Big B. Exactly. So we just follow this rule. We add the first set of B minus epsilon. So the first set of B minus epsilon is the set containing little b. And we're going to add that to the follow set of A. So that rule's done. So this rule doesn't apply anymore, right? We've done it once. Now, what rule five says is if epsilon is in the first set of B, then add C's first set minus epsilon to the follow set of A. And if C has an epsilon in it, add the first set of D to the follow set of A. So we can look at the rule here. We can see... Well, no, this doesn't apply because C zero, in this case big B, doesn't have an epsilon in its first set. So this rule doesn't apply. So then what's in the follow set of A that we just calculated? B. B. Oh, B. Oh, no. Okay, yeah. So that's what we've calculated so far. But we're not done yet because we have to do this for each of the places where A appears on the right-hand side. So we looked at one place and now we have to look at the second place. So the second rule is A goes to little a big A. So we ask, does the first rule apply? No. It doesn't, no. The first rule doesn't apply. It's not the starting non-terminal. Then we ask, does the second rule apply? Yes. Is A the very last element on the right-hand side here? Yes, it is. So what do we do? We add the follow set of, in this case, A to the follow set of A. Okay, great. So if this recursion kind of throws you, remember we've already calculated the follow set of A. The follow set of A is the empty set. So we add the empty set to the set that we're creating now. No problems. Okay. Are there any symbols to the right of the A that have epsilons in them? No. So rule three doesn't apply. Is there anything after the big A on the right? No. So this rule doesn't apply.
And is there anything after us to the right that has epsilons in it? Also no, because they don't exist. So this rule doesn't apply. And then so we've calculated here that the follow set of big A is little b. And so once again ask yourself, why is this? Well, look at the very first rule. S goes to big A, big B, big C, big D. So whenever there's an A, whatever A generates, the next thing that's going to occur in the string is going to be whatever big B produces. And big B is always going to produce only a little b. It's going to start with a little b. So after we parse an A, the next character is always going to be a little b. Or is always going to be at least a little b. We're not done yet. So there could be more characters. But we know it's at least going to have a little b. Questions on how we did A? Yeah. Yeah. So use rule three. That one? First A. Yes. So in this case, yes. So here we're looking, well, OK, this three, we're looking at this rule, so you probably want that one. Yeah. So in this case, C zero here is big B. So we look, is epsilon in the first set of big B? Is it? No. So rule three doesn't apply here. Oh, we, we, ah, ah, ah, sorry, sorry, sorry. OK. I didn't say. OK. Rule three is about, sorry, you got me. So rule three is about, if there are epsilons in the first sets of everything from after us to the end of the string, then we'll add the first set of, the follow set of S to the follow set of A. So rules two and three are about propagating follow sets from the non-terminal on the left-hand side to the non-terminal that we're interested in. So if there was an epsilon in B, C and D's first sets, then we would add the end of file to the follow set of A. Does that make sense? So if B, C and D all go to the empty string and don't produce anything, well, that means the next token after A can be the end of file. It can be the last thing that we read. It's not the case here, exactly. So here rule three does not apply.
So rule four here says that now we look at what's following, and add the first set of the following symbol to our follow set. And that's where the little b comes from. Yeah. There's a lot of rules, but as you can see, once you kind of get the hang of it, you just keep applying these. Yeah. So the order here doesn't matter at all. So remember, the "or" here is just, you can think of it like syntactic sugar that we have. So this could be written as two different rules. So one rule, A goes to big C big D, and another rule after that, A goes to little a big A. So when we talk about looking at a rule, we're looking at one of those OR cases. So we may have to look at each of them if the non-terminal appears in both cases. Any questions about A? Yeah. That would be true if we had follow B because it's technically I would say you had an end file. So, in this case, yes. So if epsilon was in the first set of big B here, we would say rule three applies because epsilon is in the first set of B, epsilon is in the first set of C, epsilon's in the first set of D. That's all of the symbols after me. That means I add the follow set of S to the follow set of A. And the follow set of S we've already calculated is the end of file. No, no. So you only add the follow sets of the left-hand sides. So we'd only ever add the follow set of S to the follow set of A. Because right now we only care about A's. So rule two says if we're the last one on the far right, we're going to add the follow set of S to our follow set. But this rule doesn't apply because A is not the right-most symbol, right? And this says, hey, if between you and the end of the string it's all epsilon, well then that means add the follow set of S to your follow set. And then the question is, okay, I think where you're going is, so if let's say epsilon was in the first set of B, we would add the end of file character here because of this rule. And we'd add little b because of this rule.
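The remark above that the "or" is just shorthand for separate rules can be made concrete with a small sketch. The textual notation (`->`, `|`, and the word `eps` for epsilon) and the helper name `split_alternatives` are assumptions chosen for this example, not notation from the course.

```python
def split_alternatives(line):
    """Expand 'A -> C D | a A' into one (lhs, rhs) pair per alternative."""
    lhs, rhs = line.split("->")
    productions = []
    for alt in rhs.split("|"):
        # Drop the 'eps' marker so an epsilon alternative becomes an empty tuple.
        symbols = tuple(s for s in alt.split() if s != "eps")
        productions.append((lhs.strip(), symbols))
    return productions


a_rules = split_alternatives("A -> C D | a A")
c_rules = split_alternatives("C -> c C | eps")
```

Each resulting pair is then checked on its own when looking for occurrences of a non-terminal on the right-hand side, so the order in which the alternatives were written has no effect on the follow sets.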
And then we'd look at five and say, okay, five applies because epsilon is in the first set of big B. So then we go and add the first set of big C minus epsilon to the follow set of A. And then we say, is there an epsilon in the first set of C? Yes, there is. So then we go and add the first set of D to the follow set of A minus, the first set of D minus epsilon to the follow set of A. Make sense? More or less? Okay. I think, I mean, that example I think will come up when we get to B. So it should be fairly clear. Okay. So what we calculated this time is that the follow set of A is little b. We applied all the rules on the first grammar. We got something like follow set of A. On which first grammar? Sorry. On this rule? The first one is S. That rule? Yeah. So we got follow of A as b by rule three. So when we go to the second grammar? No, no, no, no. Rule three doesn't apply here. Because rule three only applies if there's an epsilon in the first set of everything after A. There's not an epsilon in the first set of B. So rule three doesn't apply. Rule four, I mean. Okay. Rule four. Yes. So you get the set containing b. It's little b for the follow of A. Yes. Then you move to the second grammar. So that second grammar. Yes. So now you will consider what are already the follow one we got on the first grammar. For all the rules you will... So in this case it doesn't really matter because you're adding it to itself. Right? You're adding A to A. So this is the only case that will ever come up. So I just prefer to say, think like you're creating a new follow set. You're not done yet until you apply all the rules everywhere. So I would use the empty set here. Exactly. Yeah. So until we've gone through and calculated it, now we know. So we went through, now we know the follow set of A. So now we can move on and calculate the follow set of B. So how many rules does the non-terminal B occur in? One. One. Right? The very first rule.
S goes to big A, big B, big C, big D. So we go through all the rules again. Is B the starting non-terminal? No. No. Okay, good. Is B the last element on the right-hand side of this rule? No. Yes? No. No. No. Right? So big B is not on the right-hand-most side. So then rule three says, is it the case that after B until the end of this rule there are epsilons in all of those symbols? Epsilons in the first sets of all those symbols. Yes. Yes. Right? Yeah. So epsilon is in the first set of C. Epsilon is in the first set of D. And so we can say, oh, we can apply this rule. So we can add the follow set of S to the follow set of B, which is the end of file. Yeah, the end of file. Good. Okay. But we're not done, right? So we applied rule three. Now we have to apply rule four. So rule four is just add the next symbol after us. Add the first set of the symbol after us to our follow set. So here we're going to add the first set of C minus epsilon to the follow set of B. Right? So the first set of C, we can see, is little c and epsilon. We take out epsilon. We add little c to the follow set of B. Right? So this rule... Oh, sorry. Did I hit one more? Sorry. Okay. So rule four just says the next one, right? So the one directly after the symbol we care about. So here we have big C after us. We can calculate the first set of big C. We already have it. We take out epsilon. Then the question we ask is, okay, for the fifth rule to apply, is there an epsilon in the first set of C? Yes. Yes. So that means we can add the first set of the one after C to the follow set of B. And so we're going to add the first set of D minus epsilon, which is the set containing little d. We're going to add that to the follow set of B. Is there an epsilon in D? Yes. The first set of D? Yes. But there's no more symbols to move to, so it doesn't matter, we're done, right?
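The per-occurrence checks just walked through for B can be sketched as a small tracing helper that reports which of rules two through five fire for one occurrence and what each contributes. The helper name `trace_occurrence`, the `"eps"` and `"$"` markers, and the hard-coded first sets (worked out for the grammar read as S → A B C D, A → C D | a A, B → b, C → c C | ε, D → d D | ε) are assumptions for this illustration.

```python
EPS, EOF = "eps", "$"

# First sets for the example grammar, assumed from the earlier calculation.
FIRST = {
    "S": {"a", "b", "c", "d"},
    "A": {"a", "c", "d", EPS},
    "B": {"b"},
    "C": {"c", EPS},
    "D": {"d", EPS},
}


def first_of(sym):
    # The first set of a terminal is just the terminal itself.
    return FIRST.get(sym, {sym})


def trace_occurrence(lhs, rhs, idx, follow):
    """Return [(rule_number, contributed_set), ...] for the symbol at rhs[idx]."""
    rest = rhs[idx + 1:]
    steps = []
    if not rest:
        steps.append((2, set(follow[lhs])))      # right-most symbol
    elif all(EPS in first_of(s) for s in rest):
        steps.append((3, set(follow[lhs])))      # everything after can vanish
    if rest:
        steps.append((4, first_of(rest[0]) - {EPS}))
        for prev, nxt in zip(rest, rest[1:]):
            if EPS not in first_of(prev):
                break                            # cannot look past prev
            steps.append((5, first_of(nxt) - {EPS}))
    return steps


# Follow sets as they stand when B's turn comes in the first pass.
follow_so_far = {"S": {EOF}, "A": {"b"}, "B": set(), "C": set(), "D": set()}
b_steps = trace_occurrence("S", ("A", "B", "C", "D"), 1, follow_so_far)
a_steps = trace_occurrence("S", ("A", "B", "C", "D"), 0, follow_so_far)
```

For B the trace lists rule three contributing the end of file, rule four contributing little c, and rule five contributing little d. For A in the same production only rule four fires, contributing little b.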
And so now there are no more rules that have big B in them, so we've calculated, for this iteration, the follow set of B. Yeah. So the question is, could you apply rule five if there was no epsilon in the first set of D? Yes. You could still apply it, because the condition doesn't look at the next symbol. So to apply it once you'd say, is there an epsilon in the one right after me, in C? If the answer is yes, then you add the next one. You always add it. You don't care what's in it. And then you go to the next one and say, okay, is there an epsilon in the first set of D? If the answer is no, then you don't continue and add the one after. Yeah. If there was, you'd go on to E and you'd just add E. It doesn't matter if its first set is just epsilon; you still do it. You'd take the set containing epsilon, you'd remove epsilon, and you'd have the empty set, and then you'd add that. It doesn't change anything, but you'd still do it. And then you'd ask, does that rule apply again? Then you'd add F. You'd always add F and then look: is epsilon in the first set of F? And if so, you'd move to G, and so on. Questions on B? Yeah. If there's no epsilon in the first set of G? Yes. We would still add G regardless. Exactly. So the key thing here, in the formalism, is that we're adding the first set of C_{i+1}. Right? But our condition says C_0 through C_i each have an epsilon in their first set. So you can see this says nothing about what's in C_{i+1}. We don't care what's in the next one. We're going to apply this rule anyway. It's the same thing with first sets. Not in the exact same way, but the same idea. Okay. Now we're going to look at C. How many places is big C used on the right-hand side? How many rules? The first rule, the second rule, and the, I don't know how to count it. Third, fourth rule, something like that. Okay. So we just do it one by one.
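The "always add the next one, then check whether to keep going" behavior of rules four and five can be sketched as a small loop. This is only an illustration, not code from the lecture: the function name `tokens_after`, the `EPS` marker, and the first sets chosen for E, F, and G are all assumptions made up for the example.

```python
EPS = "ε"  # hypothetical marker for the empty string


def tokens_after(rest, first):
    """Apply rules four and five to the symbols that follow a non-terminal.

    rest  -- the symbols to the right of the non-terminal in one rule
    first -- dict mapping each symbol to its first set
    Returns the tokens to add to that non-terminal's follow set.
    """
    added = set()
    for symbol in rest:
        # Always add the first set minus epsilon, whatever it contains.
        added |= first[symbol] - {EPS}
        # Only move on to the next symbol if this one can disappear.
        if EPS not in first[symbol]:
            break
    return added


# First sets for C and D come from the walkthrough; E, F, G are assumed.
first = {"C": {"c", EPS}, "D": {"d", EPS}, "E": {EPS}, "F": {"f"}, "G": {"g"}}

print(sorted(tokens_after(["C", "D"], first)))                 # ['c', 'd']
print(sorted(tokens_after(["C", "D", "E", "F", "G"], first)))  # ['c', 'd', 'f']
```

In the second call, E contributes the empty set, F is added without looking inside it first, and G is never reached because F's first set has no epsilon.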
We go through all the rules, all the places that big C exists. Is C a starting non-terminal? It's not. None of them are except for S. Okay. In this rule, is C at the very end, on the right-hand-most side? No. So rule two does not apply. Okay. Rule three: from C to the end of the right-hand side, is there an epsilon in the first sets of all those symbols? Yes. Right. Epsilon is in the first set of big D. So we can apply this rule. So how do we apply this rule? Well, we take the follow set of S and add it to the follow set of C. We've calculated that the follow set of S is the set containing the end of file, so we add end of file to C's follow set. Okay. Then, and this is another thing, we look at the next symbol after C in this rule. Is there a symbol after C? Yes. It's big D. So we follow this rule: just take the first set of big D, remove epsilon from it, and add it to the follow set of C. So in this case we add the set containing little d. So now we have the end of file and little d. Yeah. It's a good question. You can probably replace that with alpha and it would be the same. I don't know. Exactly. Yeah. In this one. Yeah. So rule four, exactly, simply says the next one, which is kind of simple. So there are some parallels between these two sets of rules, right? This one says, is it on the very right-hand-most side? And if yes, then you can apply this rule. So it's very binary in that sense. The same with rule four: apply it to the next one if there is one. And then, exactly. Yeah. So we only care about the next one for rule four. Exactly. Okay. So we've added little d to the follow set of C. Now we're going to see if rule five applies. So, is there an epsilon in the first set of D? Yes, there is.
But there's no symbol afterwards, so it doesn't matter. There's nothing to add after D. So rule five, I think you can say, does not apply, because even though the condition may be true, there's no i+1 to add. Nothing to add. So, in that case, wouldn't we also be getting the same result for A implies big C, big D? Say that again, I almost knocked off the mic. I got distracted. Wouldn't A implies big C, big D give us no new information? Because that's what we just did in S implies big A, big B, big C, big D. Yeah, yeah. Let me make sure I understand the question. So you mean the production rule? The rules that applied to what we just did, are they going to apply to A implies big C, big D? We don't know yet. We haven't looked at it. I mean, okay, I guess you're kind of trying to cheat ahead and say, can I use the information I know here, because I just calculated something for C and D for rules four and five, and because that only depended on those specific symbols, can I reuse that later when I do A goes to C, D, right? I mean, yeah. So would that work, or is that something you'd always have to be really confident in? So we're going to go over this as a very rigid mechanical process you can follow. Okay. That's what I'll say about that. Yeah. On the exam, will we be given these rules? On what? On the exam, will we be given these rules? Yes, the rules will be on the exam, but if you're flipping back and forth to them, you're going to have problems. If it's a refresher, just to look and say, oh yes, okay, I'm not missing something obvious, then that's how I would use them. Can I memorize ten rules? Okay. Okay, so we've looked at C in the first rule. Now we're going to go to the next place C is used, which is in this rule, A goes to big C, big D. And we go through the same thing. We say, is C at the very end of this rule, on the right-hand side? No, so we can't apply this.
Is there an epsilon, though, in all the symbols after C to the end of the right-hand side? Yes, there's an epsilon in the first set of D. So we're going to apply this rule. So we're going to add, sorry, the follow of A to the follow of C. And the follow of A is little b. So in this case, right, we did follow these rules, because here we care about the left-hand side of the rule, which is A. Okay, so what's the next symbol directly to the right of C in this rule? D, right? So we add the first set of D minus epsilon to the follow set of C. So we add little d, and then we say, okay, is there an epsilon in the first set of big D? The answer is yes, but there's no symbol after that, so this rule doesn't apply. There's no i+1 here. Okay, but this is not the only place, so then we go to the third rule where C is used and we go through these things again. Is C the last element of this rule? Yes, it's on the right-most side of this rule. So we add the follow set of C to the follow set of C. Okay, cool. It didn't change anything. Okay, is there an epsilon in everything after us, after C? No. Is there anything after us? No. There's nothing after us, so there's nothing after us with an epsilon. Okay, so we've looked at all the places where C exists in the rules, and we've calculated that the follow set of C is end of file, little d, and little b. Is that okay? Sure. Okay. Cool, yeah. If we were deriving a sentence in this grammar, when would a little b follow a big C? Yeah. It would be, ah, so it's this rule here. So we go S goes to A, B, C, D, and then we have A goes to C, D, and then if D goes to epsilon, D basically disappears, and B goes to little b, which means there's got to be a little b after us.
So remember what it means: the next token after C has to be one of these tokens, otherwise it's not a valid string in our grammar. Good question. What about for D? Okay, how many places is D used? How many rules? Three. Three times. We're going to do the same thing. So we say, is it the last element here? Yes. So we apply this rule directly. We add the follow set of S, which is the end of file, to the follow set of D. And then we say, okay, is there anything after it that has epsilon? No, there's nothing after it. Okay, so we don't follow that. Then we say, let's add the next one. Is there anything after D? No, so this rule doesn't apply. And this rule doesn't apply either. So from this rule we added the end of file. And now we look at this rule and we say, okay, is D the last, right-most symbol here? Yes. So we add the follow set of A to the follow set of D. So this is going to add little b. And then we say, okay, there's nothing after us, so it doesn't matter if they all have epsilon. And then we say, okay, do we add the next one? Is there something after us? No. So we don't care. And we don't care if anything after us has an epsilon, because there's nothing after us. Okay. Then we look at the third rule and we say, all right, is D the right-most element here? Yes. So we're going to take the follow set of D and add it to the follow set of D. It's the empty set, so that doesn't change anything. Is there anything else on the right-hand side? Nope, this rule doesn't apply. Is there anything else after D? No, this doesn't apply, and there's nothing after D, so this doesn't apply. Okay. So we calculated that the follow set of D is dollar sign and little b. Yeah. What if there was a little c after D? Like here? Yeah. Then we would add the first set of little c to the follow set of big D.
The first set of little c is little c, and so we'd add that, exactly. And so, yeah, it's not in this example, but in the last example that was why we had one of the symbols in the follow sets. I can't remember which one it was. Okay. So the question is, are we done? No. Why? Because stuff changed. Because stuff changed, right? We added stuff to the follow sets, so we have to do this whole thing again. So are we really going to? Okay, we'll go through it quickly, but it's here so you can go through it and verify on your own that you can apply these rules. Okay. So we look at S. We say, okay, rule one applies: S is the starting non-terminal. And then we see, is S in any of the right-hand sides? No. So none of the other rules apply. So we get just the end of the string. We look at A, we look at this rule, and we say, okay, is A at the very end? No, two doesn't apply. For the third rule, is there an epsilon in the first set of everything after A, of B, C, and D? Nope. So that doesn't change. We add the first set of the next thing after A to the follow set of A. So we add the first set of big B, which is little b, to the follow set of A, and then we go to the next rule. We apply everything; it doesn't change. Then we look at B. We look at B in this rule. Does rule two apply here? Don't we want to say yes? You want to explain? But yes? No, you just want to say it? Okay. Good. You've always got to think positive, right? Yes. Okay, no, no, no. Rule two does not apply because B, we're talking about B, is not the right-most symbol in this rule, right? But rule three applies because epsilon is in the first sets of C and D after it. So we can add the first set of S, I'm sorry, the follow set of S to the follow set of B. And then B doesn't appear anywhere else, so we're done with B. We know that it's end of file, little c, and little d. Oh, we didn't add c and d. Okay. Well, you should go through and do that. All right, let's do that. So then we apply rule four.
So we look at rule four and we'd say, okay, we add the first set, minus epsilon, of the symbol after B, which in this case is big C. So we add little c to the follow set of B. And then I would say, is there an epsilon in the first set of the thing right after B? Yes. That means rule five applies. It means we can add the one after that. So we add the first set of D minus epsilon to the follow set of B. And we say, is there an epsilon in that? Yes, there is. But there are no symbols after that, so it doesn't matter; we stop. So we get this. Should there be a b there? Where would the b come from? Remember, follow sets only get propagated by rules two and three. So we only ever add the follow set from the left-hand symbol of a rule to one of the non-terminals on the right. So this kind of goes to the question: can a little b ever follow a big B? Well, what can follow a big B? A C or a D, right? And Cs can only produce little c's or nothing. Ds can only produce little d's or nothing. So a lowercase b can never follow a capital B. Because if you think about it, capital B itself is the only thing that can generate a little b. Okay. Then we look at C. We apply the rules. Is it the last one? No. So rule two doesn't apply. Rule three does apply, because epsilon is in the first set of D. So we add the follow set of S to the follow set of C. It's kind of hard when you're trying to talk through these fast. But if you go slowly, it's very simple. And then we use rule four to add the first set of D minus epsilon to the follow set of C. And then we look at this rule. We would apply it and see that nothing changes. Then we look at D. So we look at this. Yeah. No. Well, in general, no, that does not matter. It does matter on your homework. So just make sure that that is clear. Okay.
I mean, the order here is actually, if you think about it, the order in which we derive them from the rules. That's how I tried to keep it, but there's no guarantee; it's kind of tricky. Okay. So we apply all five rules to the D here. We'd apply all five rules to the capital D here. We'd apply all five rules to the D here. And then we'd see that the follow set of D has not changed. So nothing changed. So we're done with follow sets here. These are the follow sets. Yeah. For follow of C and follow of D in the second column, we added the follow of A, which is small b, from the same column. So is that the reason why nothing changed? Yes. Yeah. So, I mean, that happened when we first went through it. It just happened to be that way. There's no guarantee that you only have to do it three times. So yeah, you just have to keep cranking through this, because follow sets can propagate through rules based on rules one and two, sorry, rules two and three. So you just have to make sure you keep going through this until you haven't changed any of the follow sets. Questions on follow sets? Yeah. For the final step, wouldn't it be sufficient if we just check rules two and three? Only those two rules are going to make changes, because the first sets are not going to change. Yes. That's a good question. I'm teaching you how to do it so you'll always get the right answer every single time. So you can think of it as a parity check, right? If you do rules four and five again and you get something different, then something is very wrong. Yeah, they're based on the first sets, and the first sets are fixed. Yeah, it's a good optimization. More questions on follow sets? Are they easier than first sets? It's not a hypothetical question. I don't know. No? Maybe? Yes? I don't know.
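The whole "keep applying the rules until nothing changes" process can be sketched as a short program. Treat this as a rough sketch rather than the course's reference implementation. The grammar is reconstructed from the walkthrough (S goes to A B C D, A goes to C D, B goes to little b, C goes to little c C or epsilon, D goes to little d D or epsilon); the second rule for A, written here as A goes to little a A, is a guess, since the transcript only says that A appears at the end of one of its own rules. The sketch also folds rules two and three together, and rules four and five together, by computing the first set of everything after the symbol.

```python
EPS, EOF = "ε", "$"

# Grammar reconstructed from the walkthrough; ("A", ["a", "A"]) is an assumption.
GRAMMAR = [
    ("S", ["A", "B", "C", "D"]),
    ("A", ["C", "D"]),
    ("A", ["a", "A"]),
    ("B", ["b"]),
    ("C", ["c", "C"]),
    ("C", []),            # C goes to epsilon
    ("D", ["d", "D"]),
    ("D", []),            # D goes to epsilon
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of_sequence(symbols, first):
    """First set of a sequence of symbols; contains EPS if all can vanish."""
    out = set()
    for sym in symbols:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol (or none at all) could produce nothing
    return out


def first_sets():
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:  # iterate until no first set changes
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of_sequence(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(first, start="S"):
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add(EOF)  # rule one
    changed = True
    while changed:  # iterate until no follow set changes
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                after = first_of_sequence(rhs[i + 1:], first)
                new = after - {EPS}        # rules four and five
                if EPS in after:           # rules two and three
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


follow = follow_sets(first_sets())
for nt in "SABCD":
    print(nt, sorted(follow[nt]))
```

Under the assumed grammar this gives end of file for S, little b for A, end of file with c and d for B, end of file with b and d for C, and end of file with b for D, which matches the sets worked out by hand.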
You haven't done them, so maybe, maybe not. They're a little bit trickier, but it's the same general idea, so long as you can keep these rules straight and apply them. Now we get back to why we spent all this time going over, in excruciating detail, the calculation of the follow sets and the first sets. Is it because I'm a very mean person? Maybe, but mostly it's because we're going to use them to prove that a grammar has a predictive recursive descent parser. So that's the whole point here. If you break these words up, predictive is the thing that we care about. The question there is, for each rule, can we know which of the production rules this string followed? This will become very clear, but you have rules where A goes to something or something else, and when we're parsing and we have the string, we want to know which one of these we can follow. So that's where predictive comes in. Recursive, we'll see why, and descent means we go from the top down. Specifically, for predictive, what we want is that at each parsing step there's only one grammar rule that we can choose. That way we can be very efficient; we're not going to do any backtracking. You can write a parser that tries every single combination of parsing rules until it finds something that fits, but that wouldn't be predictive, because you're trying everything. Here we want to be predictive. We want to be precise. Okay, so these two conditions have to hold for the grammar to support a predictive parser. The first one makes sense when you think about it and about what we've been talking about. The first rule is that if we have two rules with the same left-hand side symbol, so A produces something, in this case alpha, and A produces beta, then the first set of alpha intersected with the first set of beta had better be the empty set. So what does this mean? Yeah, there's no ambiguity here. I know exactly, just
by looking at the next token, and that's the important thing here, because the first set is the set of possible tokens. So by looking at the next token I know exactly which one of these rules applies, A goes to alpha or A goes to beta. And note that this applies for all rules here. The other trick is to remember that alpha and beta are sequences of terminals and non-terminals, so you don't just take the first symbol; you have to consider the entire right-hand side, and we'll get into that. Okay, but this isn't enough. This tells us what to do if we have a choice of which rule we want to parse with. The second rule covers what happens if epsilon is in the first set of A. Then how do we know whether A goes to nothing or whether we choose one of our other rules? That's where this comes in: if epsilon is in the first set of A, then the intersection between the first set of A and the follow set of A had better be the empty set. Right? So this kind of makes sense. If A goes to epsilon, that means A is nothing, and the follow set tells us all the tokens that can come after A. But if one of those tokens is also in the first set of A, then we don't know if we should choose one of those rules or if we should choose the epsilon rule. Does that make sense? Exactly, yeah. So it's just a good way to formalize it. So if I ask you, hey, here's a grammar, can you write a predictive parser, you follow these rules. The way you do that is you show, for every non-terminal where there are two rules, that the intersection of those first sets is empty, and then, for anything that has an epsilon in its first set, you show that the intersection between the first set of A and the follow set of A is the empty set. Okay, so what are the steps? Say you want to create a predictive parser; maybe it's because you find out on your job it's the best thing to solve the
problem that you have. So what do you do? You first create a context-free grammar, right? We've seen context-free grammars; you create one, or somebody gives it to you. You calculate first and follow sets. Then you prove that the CFG allows a predictive recursive descent parser, based on those two rules we just talked about. And then you create a predictive recursive descent parser using the first and follow sets that you calculated. So it's pretty clear. We're not going to get to it today, but we're going to set it up for Monday. A good example that I found in the real world is email addresses. I think I may have hinted at it earlier in the class. It seems like a trivial problem if you think about it, right? I mean, what would you try? Yeah, something at something dot something, right? Kind of name at domain dot TLD or whatever. So it turns out that in the real world it's not really this simple. A lot of things that are actually valid email addresses will fail this check, and some things that are definitely not valid email addresses will pass it. For one thing, you can include in the local part a string in double quotes, so you have the string cse340 in double quotes at example dot com. Yeah, this is a valid email address. I mean, it doesn't go anywhere; the whole reason I use example dot com is that that domain is guaranteed to not exist or host anything. It's actually in one of the RFCs. So if you read the email RFC, this is legitimate. You can actually have at signs in the left-hand side of the email address as long as you include them in double quotes. You can include a slash in the double quotes, because, you know, when you're writing this RFC back in the, I don't know, 80s or whatever, you're thinking, well, if you're using double quotes to quote something, you have to be able to quote the double quotes. So you can use double quotes in your email address, because why wouldn't you want to? So this is also
a valid email address. And it's not only this: you can actually put email addresses inside email addresses. Some people have seen where you have your name followed by an email address in brackets, right? That can also be a valid email address, depending on how you're using it, especially if you want to send mail. Gmail needs to know how to parse those, so to Gmail this should be a valid email address. But it's crazy: I have an at sign within the double quotes, and I have a name, so my name here would be test, then space, then example, space, at, hello, with an email address of test at example.com. So these are all valid email addresses, and this is madness, right? Email addresses in general are so crazy that there's a company called Mailgun. Does anybody use them or know what they are? Get spam from them? They're a company that provides an API for developers to send out emails, so a core part of their business is how to deal with email addresses. They realized that this would be super valuable, so they open-sourced their tool for validating email addresses. I'll have a link to the code in a second, but you can go through the code, and they implemented a parser for email addresses. They implemented a recursive descent parser, which is really, really cool. Here's the address; you can check it out. I just wanted to talk about the CFG. What I did is I found their tool and I ripped out their context-free grammar. So these are all the tokens in their grammar: quoted strings, which are the strings with quotes around them that we looked at; atom, which is pretty much like an identifier, just any kind of bits; dot-atom; and whitespace. And this is their crazy context-free grammar. This is their entire context-free grammar. So this is doing two
things: one, showing you that, hopefully, what I'm teaching you is not going to be completely useless, that this is stuff that actually exists in the real world, and programmers do implement these things; and two, on Monday we're going to go through an example with this simplified email address CFG. We're going to calculate first sets and follow sets, prove that it has a predictive recursive descent parser, and then write that predictive recursive descent parser. Thank you. Quick question? Yes, you should. That's about it.
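As a preview of the proof step, the two conditions for a predictive parser described in this lecture can be checked mechanically once the first and follow sets are known. The sketch below is only an illustration: the function name `predictive_violations` is made up, and the sets are hard-coded for the small grammar reconstructed from the follow-set walkthrough (including the guessed rule A goes to little a A), not for the email grammar.

```python
from itertools import combinations

EPS = "ε"

# First set of each right-hand side, grouped by left-hand side.
# Values are for the reconstructed example grammar (an assumption).
ALTERNATIVE_FIRST = {
    "S": [{"a", "b", "c", "d"}],
    "A": [{"c", "d", EPS}, {"a"}],
    "B": [{"b"}],
    "C": [{"c"}, {EPS}],
    "D": [{"d"}, {EPS}],
}
FOLLOW = {
    "S": {"$"},
    "A": {"b"},
    "B": {"$", "c", "d"},
    "C": {"$", "b", "d"},
    "D": {"$", "b"},
}


def predictive_violations(alt_first, follow):
    """Return a list of (non-terminal, condition, clashing tokens)."""
    problems = []
    for nt, alts in alt_first.items():
        # Condition 1: the first sets of any two alternatives are disjoint.
        for x, y in combinations(alts, 2):
            if x & y:
                problems.append((nt, "condition 1", x & y))
        # Condition 2: if epsilon is in FIRST(nt), then FIRST(nt) and
        # FOLLOW(nt) are disjoint.
        first_nt = set().union(*alts)
        if EPS in first_nt and first_nt & follow[nt]:
            problems.append((nt, "condition 2", first_nt & follow[nt]))
    return problems


print(predictive_violations(ALTERNATIVE_FIRST, FOLLOW))  # []
```

An empty list means neither condition is violated for these sets. A non-terminal with two alternatives that both start with the same token would show up as a condition 1 entry.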