 Good morning, everyone. Thanks for coming here for our unofficial American holiday yesterday. So we have to cover all the sets today, right? I think everybody realizes that. Oh, there you go. I'm sorry, you just put that in front of me. OK, so but first, I want to let you know Project 3 is posted on the website. You should be looking through and, oh no, you should be looking through and reading it now. You should be thinking about it. Really, you should get started on it now. You have lots of time to work on it. Lots of time to work on it. It's due March 4, but you will need all of this time. So if you're starting this project in March or even the end of February, there's going to be nothing I can do to help you. It's a complicated assignment, but it's really cool because we're putting to practice the things that we've talked about in class. So you're going to be reading in the description of a context free grammar. And to do that, we've actually given you the description of the grammar that you're going to be reading in the context free grammar. So you have to write the code to interpret and that context free grammar. Then once you have that, then there are. So this would be an instance of a context free grammar. I'm not going to go into the details right now. We're more in depth on Wednesday or today if we have time. Then there's three different tasks that you need to do. So one, tasks case 0 is a simple just output some information about the grammar to make sure that you've read it in properly. Case 1 is calculate the first sets of the grammar and then output those first sets. Case 2 is calculate first and follow sets of the input grammar and output the follow sets. And so kind of skipping down to the points. So this is kind of the point right now. There'll be test cases on the servers for each of these things. On the submission server, you'll be given in that zip file those test cases. We'll be a subset of the actual test cases that are on the server. 
So you won't have all of them. So you have to be responsible for creating your own test cases. Quick questions on this at a high level. I expect you to spend some time reading this. You'll be getting very, very, very familiar with it. You can do C or C++. One important thing I will note now I'll try to note further on, you're going to be reading in an input language, right? So you can for specifically allowing you, if you want to, you can use the lexer.c and the lexer.h from project 2. But this is why we have this important warning here, right? That's for possibly a different input language with different tokens and symbols, right? So you can use that, but it's up to you to then modify it to work on this assignment. It's not just a drop-in replacement. If you try to do that, you will encounter a problem. Yes. But it's a good starting base. So you can rip any code out of there, it's fine. You can rip functions out of there, whatever that skip spaces function is kind of useful, whatever you want to do. Or you can just write something, a new lexer, or whatever completely from scratch. OK, and to help motivate the start early, start now, start reading the assignment now, start thinking about how you're going to program this, right? Because this will help make sure that you actually understand how to do first and follow set calculations for the midterm on Friday. So it's good practice in that aspect. Also, so this is last semester's project 3 grades, reverse sorted. So there's, I mean, if you don't start, these are people who started late, didn't submit anything, submitted stuff that didn't work, didn't compile, didn't pass any of the test cases. So just like project 2, it has to match exactly the output, being kind of close doesn't count. But you can see that it gets up in the 70s, gets into the 80s. And then we had a lot of people who got 100. So there's a significant amount of people that got 90 or above. But it's hard, and it's supposed to be hard. 
So it's meant to challenge you, challenge you and your skills and your design skills. Because you're doing this basically all from scratch. And we will plug me. OK. So now let's talk about first sets. So why were we studying first sets? Why did we even care about first sets? You guys said so? Part of the reason, yes, but the deeper reason. I think it will eventually make it easier to obtain the parse tree. So our end goal is, we want to write a predictive recursive descent parser. So we want to be able to tell, just by looking at one token from the input, which rule to apply. So we said that, hey, if we calculate the first sets, then that can help us distinguish between which rule to follow in doing parsing. So for instance, if I have a grammar like this, S goes to big A, big B, little C. I have A goes to little A or epsilon. And I have B goes to little B, big B, or little A. So then let's calculate the first sets here. I'm not going to go through all the steps. So let's first start with the easy case. What's the first set of A? Little A or epsilon. Little A or epsilon. What's the first set of B here? Little B or little A. Little B or little A? And so then what's the first set of S? So A, what was it? A, B, or epsilon. A or B, A, B, C, or epsilon. Oh, yeah, little A, little B. Little A, little B. So we get the small A from the fact that there's a big A here. So we add the first set of A minus epsilon to the first of S, right? Now, because there's an epsilon in the first set of big A, then we add the first set of the next one minus epsilon to S. So the first set of B minus epsilon is the set containing little B and little A. So we add little B. Do we add then the first of little C to the first of S? No, because there's not a first. Exactly, there's not an epsilon in the first set of B. If there was no epsilon in big A, would the first set of S just be little A? So if there was no epsilon here, like this, just like that? Yeah. So then what would be the first of big A? Little A. Just little A, yeah. 
And then the first of S would be just for one. We add the first of big A, right, minus epsilon. And then we say we add the next symbol if there is an epsilon in the first set of big A, right, which there's not. So we don't go further down the string. We don't look at anything else. Yeah, so because of this, we know that all strings that S possibly generates. Yeah. Do you have to start with A here, or would we start with S first? Do not. You could do it in any order, but to do it like this without doing a table, I would start with the base cases that it's clear. It's just going to be based on that. But if you do it, it doesn't matter what order you do it in percent. No. But essentially. Exactly, when you're programming it, or when you're doing it in a table, it doesn't matter the order. As long as you keep going until you reach, until nothing has changed by finding the rules. Yeah, so based on that question, but I'm guessing you already answered it. So for our assignment, if we were to get a longer S, I don't know, like a 20-digit or something like that, like a 20-letter, and if we wanted to calculate the first of S, we could just do that without counting the sub, not the substance, but first of S, first, right? OK, so if I just asked you to calculate the first of S, then yeah, you only really need to see first of A, right? Then you can look at first of A, but for parsing, we care about all the non-terminals. So that's why we do the whole table with all of them, right? Because at some point, we're going to be parsing A, right? We want to decide between these two rules. And so then we need to know what's the first of A or what's the first of this other type. So yeah, there's definitely cases where we'll need to know. And there you go. Well, we'll see in a second. OK, so let's go back to parsing for a second, right? Let's say I have a string. So is this string in this language? Does this grammar generate this string? No, not C. Oh, no, C is up there. 
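The first-set calculation for this small grammar, repeated "until nothing has changed", can be sketched as a fixed-point loop. This is only an illustrative sketch, not the project's required design: the list-of-pairs grammar encoding, the `EPS` marker, and the name `first_sets` are assumptions made here for the example.

```python
EPS = "ε"  # marker for the empty string (a name chosen for this sketch)

# Grammar from the lecture: S -> A B c, A -> a | ε, B -> b B | a.
# Upper-case names are non-terminals, lower-case names are terminals.
GRAMMAR = [
    ("S", ["A", "B", "c"]),
    ("A", ["a"]),
    ("A", []),            # an empty right-hand side stands for epsilon
    ("B", ["b", "B"]),
    ("B", ["a"]),
]

def first_sets(grammar):
    nonterminals = {lhs for lhs, _ in grammar}
    first = {nt: set() for nt in nonterminals}
    changed = True
    while changed:                      # keep going until nothing changes
        changed = False
        for lhs, rhs in grammar:
            before = len(first[lhs])
            all_nullable = True
            for sym in rhs:
                sym_first = first[sym] if sym in nonterminals else {sym}
                first[lhs] |= sym_first - {EPS}
                if EPS not in sym_first:
                    all_nullable = False
                    break               # do not look further down the string
            if all_nullable:
                first[lhs].add(EPS)
            if len(first[lhs]) != before:
                changed = True
    return first

FIRST = first_sets(GRAMMAR)
```

The order in which the rules are visited does not affect the result, only how many passes the loop needs.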
That's the language. Right, so how do we tell? How do we tell? How do we prove that this string came from this grammar? OK, so first set. Yeah, that's the first check, right? We can see, OK, S, does it start with either an A or a B, right? If it doesn't, we can throw one right away, right? But the way we prove this is we show either a parse tree or a derivation that says, hey, this is exactly how you can get to this string starting from S, right? So let's try to draw the tree from here, right? So this is what parsing is going to actually do. But let's draw the tree here, right? So we have S, which rule of S production rule do I choose? A, A, this, right? Yeah, this one. There's only one choice. I don't have a choice. So it's got to be S goes to big A, big B, little C, right? Now we need to decide what does this A go to, right? So looking one character ahead, which of these does this go to? You can't tell when one character gets a look at the next. I can't tell. I have the first of this A is A. The first of this is that one. So why can't I tell? Can't we say it goes to all three because there's two A's. So it could go to A and the first A and then A and B. What do you mean all three? There's only two rules here. We're trying to decide between these rules, right? So we're trying to draw the tree and say? The second, the middle part is B and A, which would come from the first B in the straight. So can we say the first one comes from terminal A? The first terminal A comes from A, and then what was that like? The B and A would come from terminal B. But how do we know that just from looking at one character, right? We're only looking at this one character. Because it's alone and by itself it's in terminal A. And it's the first one in the sitting line order. So we can select that one. Isn't it A or epsilon? One of those. Yeah, so that's the choice right now, right here. So we have to make a choice here. We're trying to parse this A. 
We need to decide just looking at one character ahead, does this big A go to little A or does it go to epsilon? So why can't we tell? We have first sets, right? Can we be able to tell? Both the first sets of A and B include lower case A. Why do we care about B though? Because S is A, B, and little C. A, B, and little C. So B essentially comes after A. If it was like, let's say it's like this, if it was like this, would we be able to tell? Yes. Yeah. We know it has to, we look at one character ahead, it has to go to A. We can choose this rule, right? But exactly what everyone is pointing out, right? The fact that we have an epsilon here means that, well, if this goes to epsilon, then this A that we just read could actually come from the B after it, right? It doesn't necessarily have to come from the A. So now we get to this problem of even though, for each of these rules that we just look at this B, so we can look at the first character and say it's either this rule, right, which does little B, or it's a little A. We can distinguish between those rules. If I take A, I have a little A and an epsilon, well, theoretically I could distinguish between these two rules, except that it depends on what comes after A, right? So here, if this goes to epsilon, well then this starting character that I'm looking at could have come from whatever came after A, right? Which is the purpose of epsilon. So the problem is right now, first sets don't help us. They don't help us to distinguish this case, right? Typically when we have epsilon, first sets become, aren't actually enough because really then the question is, okay, what does B start with? If B starts with an A, right, then I know that A should go to epsilon, but if B can't possibly start with an A, let's say it can only start with a little B, right, then I know I need to choose this rule A goes to little A and A won't go to epsilon. So as it is, this grammar is actually... The problem is that second A in the language of B. 
So the problem is that second A in the language of B. Exactly, but it's only a problem because B follows A here in this rule. So the language as it is right now, we can't actually do a predictive parser to determine which rule to take here between little A and epsilon. But we need more information. We can't just use first sets. If we had, we could change this grammar around and have it be B, A, right? So here I can say, okay, this definitely goes to a B. I can say, okay, we have B, A, a little C, B, right? Looking one character ahead, do I know which one of these two rules to choose? Yeah, it's got to be this one, right, little A. Yeah, so that consumes that input. So now I go back to A and I'm saying, okay, just looking at this B, can I decide which one? I have to choose epsilon, right? Yeah, it's definitely got to be epsilon, but is it correct, right? We know that it can only be little C, right? So yeah, so this part here, so now I actually know here that it's a parsing error. If I were to change this, get rid of that. Okay, if I were to change it here, where I have C here, right, so I get this C. Now I can accurately determine between these two, right? So I know from looking at this rule, where A is used on the right-hand side, I can tell that, hey, if A goes to epsilon, then C must follow big A, right? So if when distinguishing this rule here, I can say, okay, it's either little A or it's little C, and if it's little C, I choose epsilon, otherwise I give up, right? I'm done. So this gets us to the concept of the follow sets. So first set describes the all strings that that subtree can generate, right? So the first set of B describes the starting terminal for all strings that B can generate. The follow set says, hey, what could start after B, right? So all possible combinations of terminals that come after B. What's that character that can come after B is generated? Let's look at an example. So let's get into some types. 
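The problem with A going to little A or epsilon can be checked mechanically: a lookahead token is undecidable if it can start A itself and can also start whatever comes right after A. The sketch below assumes the first sets from the lecture are already known and hard-codes them; the helper name `ambiguous_lookaheads` is made up for this illustration.

```python
EPS = "ε"

# First sets from the lecture for S -> A B c, A -> a | ε, B -> b B | a.
FIRST = {"A": {"a", EPS}, "B": {"a", "b"}}

def ambiguous_lookaheads(first_of_nullable, first_of_next):
    """Tokens that could start the nullable non-terminal or the symbol after it."""
    return (first_of_nullable - {EPS}) & (first_of_next - {EPS})

# In S -> A B c, big B follows big A, and both can start with little a.
conflict_original = ambiguous_lookaheads(FIRST["A"], FIRST["B"])

# In the rearranged rule S -> B A c, the terminal c follows big A.
conflict_rearranged = ambiguous_lookaheads(FIRST["A"], {"c"})
```

With the original rule the overlap is the single token `a`; with the rearranged rule the overlap is empty, which is why one token of lookahead is enough there.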
So when we think about the follow set, it's going to return a set of terminals, right? So it's the thing that can follow this non-terminal, right? But can epsilon ever really follow it? Yeah, not that we care about, right? We're going to be looking at the string, right? There's going to be no epsilon in the string, so we want to know what that actual character is. Now, look at here. What follows S? So if you draw here, right? It's going to generate some big tree. What's that first thing that's going to be after S? End of file. Yeah, end of the file or the end of the string, right? Because S can generate the whole thing. So we'll use the symbol actually from regular expressions of the dollar sign. The dollar sign represents end of file or end of input. So we want that in our follow sets, right? That kind of makes sense. So our follow sets are going to return a set containing terminals or the end of file, which we're representing with the dollar sign. And the input is going to be a non-terminal. So we have a grammar, right? We're looking at grammars. S goes to big A, or big B, or big C. A goes to little A. B goes to big B, little B, or little B. C goes to big C, little C, or epsilon. We want to ask, just kind of looking at this, right? Let's try to develop some intuition by looking at this. So what's the follow of S? End of file, right? And we know because it's the starting non-terminal, right? So S is going to be the root of all possible parse trees. In every context-free grammar, is the follow of S going to be exactly the end of file? No? Why no? Is the end of file always going to be in S's follow set? Yeah, so that's actually our first rule that we'll look at, right? So it's the starting non-terminal. It's always the root, right? There's nothing about a context-free grammar that says S can't actually be in a rule here on the right-hand side somewhere, right? 
So S will always be the root, but that doesn't mean it doesn't also appear as a non-terminal in the grammar. But that's fine. We know how to deal with that, right? We just expanded that one. So what follows A in this example? Big A. And where do we look, right? We talked about where to look for first sets. Where do we look for follow sets? Which one of these rules, these production rules? Whatever would come directly after the first A? Big A? Can B come after A here? No. Bars are OR, right? So this means there's three different rules here. Or a B or a C. So big B can't actually follow A, right? Do you look at this rule? Does this rule tell us anything about what can come after this rule here? It's a terminal, so after would be nothing. We're talking about big A, right? Isn't it anything? So think about it in terms of the tree, right? So you can kind of write with this. So we have our tree. Well, this is cool. You guys see that? So we have our tree A, right? This rule describes everything that comes... This is really hard to actually write like this. That makes way more sense than the other one. So we have our A here. This rule describes everything that can come here. So this is why for first sets we care about that because we care about what's here. What's the first terminal that this tree could possibly generate? But for follow sets, what do we care about? Whatever comes after. Yeah, right? So this rule tells us absolutely nothing about what comes after it. It's really terrible. That's why I'm not an artist. So this rule doesn't tell us anything, right? So when we calculate follow sets, we only care about looking where the non-terminal, in this case A, is used in the rule itself, on the right-hand side. Not where it's on the left-hand side. The left-hand side only tells us what it actually generates, but its usage tells us what could possibly come afterwards. So from looking at this, what follows A? 
The same thing that follows S, right? S will generate A, right? So we know that from this rule we have S... That's terrible. S goes to A, right? And then there's nothing after A here. So it's got to be whatever S, whatever follows S also follows A, right? So A is going to also be the end of file. So what about for B? Which of these four production rules do we look at to try to calculate the follow-up of B? Well we have to look at B first, because we have to figure out what the first in B is. No. We don't care about the first in B. So we may not care about it. That's not 100 percent true. S and B? Yeah. So we care about rules one and three, right? Why rule three? Because we just made a whole terrible drawing about why we don't care about looking at that. Because it references itself. B is a recursive rule. It's on the right-hand side, right? That's what's important. So we look at all the rules where B is on the right-hand side, and that occurs in rule one and rule three. Right? So for rule one, what do we say is the follow-set of B? Dollar sign, right? End of file? What about from here? What follows B here? B. Lowercase B. Lowercase B, yeah. Right? So we know whenever we see a B, the next thing after a B, better be either the end of file or a little B. So what about for C? What follows C? Which rules do we look at? One through four? One and four. One and four? Right? So what do we get from the follow-set of C for rule one? Dollar sign. Same thing, right? Dollar sign. What about here? What do we get for the follow-set of C? Lowercase C. What about the epsilon? Doesn't matter. It just leads back to dollar sign. No, the epsilon doesn't matter, right? We don't care because we're only looking at the usage in the rules. Right? We're only looking at where is big C here and what follows right after it? Rules for a second. So we kind of already have our first rule, right? So we know that the end of file should be in the follow-set of S, right? 
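The "only look at where the non-terminal is used on the right-hand side" idea can be sketched as a small scan over the rules. The grammar below is a reconstruction of the board example from the spoken description (S goes to A, B, or C; A goes to little a; B goes to B b or b; C goes to C c or epsilon), so treat it as an assumption; the helper name `usages` is also made up for this illustration.

```python
# Reconstructed example grammar (an assumption based on the transcript).
# An empty right-hand side stands for epsilon.
GRAMMAR = [
    ("S", ["A"]), ("S", ["B"]), ("S", ["C"]),
    ("A", ["a"]),
    ("B", ["B", "b"]), ("B", ["b"]),
    ("C", ["C", "c"]), ("C", []),
]

def usages(grammar, target):
    """Return (lhs, next_symbol) for every right-hand-side occurrence of target.

    next_symbol is None when target is the rightmost symbol of the rule,
    which is the case where whatever follows lhs also follows target.
    """
    found = []
    for lhs, rhs in grammar:
        for i, sym in enumerate(rhs):
            if sym == target:
                nxt = rhs[i + 1] if i + 1 < len(rhs) else None
                found.append((lhs, nxt))
    return found

USES_OF_B = usages(GRAMMAR, "B")
USES_OF_C = usages(GRAMMAR, "C")
```

For big B this finds one use at the end of an S rule (so end of file follows B) and one use directly before little b, matching the follow set worked out above. The epsilon alternative of C never shows up, because it is not a usage of C.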
So if we have a rule like this, how do we calculate the follow-set of D? We can find where D is on the right-hand side. So we just have this rule? Just that rule? This is where it's used, yeah. Dollar sign? Yeah. Why? Because there's nothing after it. Because there's nothing after it? So then what do we do? Dollar sign. Is it because there's an epsilon there? Is there an epsilon there? The follow-of D. Why? Think about the tree, right? So S, right? We have S. You know, S is going to generate some tree. It's a little better tree. So we know the first rule here that we're looking at, S goes to A, B, I didn't believe myself in a room here, C and D, right? So now we think about this tree that D generates. What do we know about its relation to S? When the D tree ends, the S tree ends? Yes, because it's the rightmost non-terminal, right? So whatever tree that D generates, right? What we care about here is what possible terminals or end of files could be after S, right? So if I say, okay, I already know that. If I've already calculated it for S, right? And I have a rule that says, well, the rightmost symbol here, D, generates the rightmost tree. So this means that the end part here is going to be the same. It's either going to come from D or it's going to come from S, right? So then we should add whatever was in the follow of S to the follow of D, right? And we know this because it's the rightmost symbol. I think I may be going in the wrong order, but that's fine. So then, kind of by the same logic, let's say I have the same rule, right? But now I have, I know that, let's see, first of D is, let's say it's D epsilon, right? So does this change this rule? No, right? We should still do that all the time because it's always on the right-hand side here. So how does this knowledge that there's an epsilon here in the first set of D, does that change? Do we do anything different here? It's picking D, like little D. 
So you pick a little D and then after that you do follow of S, which is just the empty set. And it's the same thing for epsilon. Okay, let's think about it in a little different way. So just using this kind of logic, where do I move the follow set of S, right? So here I can basically say I add the follow set of S to the follow set of D because D is the rightmost symbol, right? Can I use that logic here with, I don't want to erase this, I don't want to draw it again. Let's just move the screen up so it's gone. Can I use that logic here to add the follow set of S to the follow set of C? I want to do follow of C. I want to see where follow of S goes. So I always add the follow of S to the follow of D. So for any, and it doesn't have to be S, right? So this kind of says for any production rule you can add the left-hand side's follow set to the right-hand side's rightmost symbol's follow set, right? Because they're going to be the last part of the tree. Well, if the rightmost follow set is empty string. We don't have enough knowledge for that. We don't know that or we can't assume that at this point. So then do we add the follow set of S to the follow set of C? Yeah, we can't, because when we look at the tree we see, okay, C is going to generate some part of this tree, right? But what follows S doesn't necessarily follow C, right? Because there's a D in the way, we can see that, right? Unless we know that D could possibly go to epsilon, right? So we know there's some possibility that we have, right? A, B, C, D. So we have some possibility that this goes to epsilon. So what does that tell us about C's tree here? That it could be the rightmost in some instances. In that case it would be the rightmost subtree here, right? Which means that whatever follows S also must follow C. So what if there's an epsilon in C's first set? Then B. Then B, right? And then what about the follow set? The epsilon in the first set of D. Exactly. 
That's why we need first sets, right? Because the first sets we've calculated tell us that, hey, this could go to epsilon. This is one way to get it to go. So basically I'll kind of write, this could be rule two. These aren't the actual numbering that I use, I think. But this rule is kind of like the base case. These two rules are very related. They tell us how to propagate the left-hand side to the right-hand side. So follow set. So basically this would be like add the follow of S to the follow of C if epsilon is in, sorry, if epsilon is in the first of D. And right, so you can probably imagine the way to write this mathematically. So it works for any kind of I in here. You can always add the other one if there's epsilons in the first sets here. So we'll see that in a second. But the logic here is very simple. So we can always add the left-hand side to the rightmost symbol. Follow set. And then we can add the left side to the second one from the right if there's an epsilon in that first set. There's a little C. Just confusing. Let's get to a little E. A bit. That's a little weird. Go back to C. So what do I know from this rule? So I can apply those other two to add the follow of S into the follow of D. But this rule, what do I know from this rule for the follow of B? It's going to be little C. Why? Because C is a terminal. And what do we know about C, what do we know about C from this rule? Follow of B. Yeah. No. Follow of B. So let's change it slightly. Let's go back to our original one. Now we have big C. Let's say I know, because I calculated this, I did it correctly, that the first of big C is the set containing little C. Now what do I know about the follow of B? The first of big C. The first of big C? Why? Because the first, it can be the first of big C, as long as the first of big C is not epsilon. Yeah. So it's the first of big C minus epsilon. And we also know that because epsilon is never in our follow sets. 
So that's part of why thinking about the types helps: it should only be terminals or the end of file, the dollar sign. If you ever have epsilon in a follow set, you've done something wrong. So just because they're next to each other, I know that, it goes back to the trees again. I know at some point this rule is going to be chosen, and I really don't want to draw the tree again, so I'm just going to go back here. So I know that B is going to generate some tree, and so if I, ah, I do have to draw it because I think it's important to draw it bigger. So C generates something, D generates something, B is going to generate something, and A is going to generate something. So in this picture, what is the first of C? It's a little C, but what is it in this picture? It's right here. It describes all characters here that could potentially, let's see if I'm pushing it hard. It describes all terminals that C could possibly start with, that subtree. And then if I look at subtree B, this is exactly what I'm looking for. I'm looking for what are the terminals that could possibly come after B. I've already calculated the first of C, so I know I can add that first of C minus epsilon to the follow of B. Let's change it slightly, and we still have this rule. And what if we say that the first of C is now little C and epsilon, and let's say the first of D is the set containing little D. So what's the follow of B? The first of C minus epsilon. And then because there's an epsilon in the first of C, then I can add the first of D. To the follow of B, yes. These do get a little tricky. And so I'd also do this here, obviously, if I was trying to calculate the follow of A, I'd use the same thing. I would say, okay, look at this rule. Add the first of B minus epsilon to the follow of A. If there's an epsilon in the first of B, then add the first of C to the follow of A, and so on and so forth. So these are basically rules four and five, right? Is that big C over there? Where? Here's big C, this is little C. 
And in the follow of B above the line, that's a little C. Anything in the sets, right, are terminals. They've got to be terminals, so those are little C's. These are non-terminals, also non-terminals. Which one did you define as rule four? Rule four is the simple case, just always add the one after you, right? So if you want to calculate the follow of B, you should add the one after you. Add the thing that comes after you. Add their first set minus epsilon to your follow set. Done. Always do that. You never have to think. The only time where there's another possible case is if there's an epsilon here. And of course, if there's something else after here, like E, and there's also an epsilon in the first of D, you would add the first of E to the follow of B. It's just a simple recursive thing you keep applying. Questions on the intuition or ideas behind these rules? We can look at what they exactly mean. We're going to see if the order is different. So if we have a non-terminal A, we want to calculate the follow of A. What we saw, we first need the first sets, right? So when you're doing your homework and stuff, you want to make sure that your first sets are pretty rock solid, right? Your follow sets depend on the first sets. So if your first sets are broken, it really doesn't make too much sense to spend a lot of time coding follow sets when you've got something fundamentally wrong. Okay, so we're going to do this in the exact same way. So it's going to be, hopefully, fairly straightforward if we've been following the first set calculation. Follow sets should be easy. So we first are going to initialize everything to the empty set, right? Okay, all the follow sets are empty sets. And then we apply these five rules that we just derived until nothing changes, right? We're going to do that in steps. So the first rule is the rule we talked about, right? For the starting symbol of the grammar, add the end of file to its follow set, right? Which makes sense, right? 
If you're the root, then end of file has to come after you. Okay, we did do it in the right order. That's good. Cool. So if we have some rule of the form B goes to alpha A, right? What was alpha that we've been using as kind of a symbol? A sequence of symbols. Yeah, a sequence of symbols. Non-terminals, terminals. We don't care, right? Just anything. This just means the rightmost one. Epsilon could also be a sequence of like zero, right? In the case that there's just B goes to A, right? Then we add follow B to the follow of A, right? So the way I like to think about it, this first rule, base case, right? Always do this. These next two rules tell you how to propagate follow sets, right? So this says, hey, I can always add the follow set of the left-hand side of my production rule to the follow set of the rightmost symbol. And this next rule says, okay, what if A has an epsilon in its follow set? Or in its first set, sorry. If A has epsilon in its first set, then we're gonna add the rightmost symbol of alpha, right? And this just says we can keep doing that as many times as we want. So for all of these, let's call it, we're using C here, right? C zero through K, right? As long as there's epsilon in all of those first sets, we can continually add the follow set of B to the follow set of A here. We can do this as many times as necessary. So these are just about propagating follow sets, right? So you only need the first sets here to deal with this third rule. So our next rule says, okay, what do we do when two things are next to each other in the rules? So this just says, if we have whatever rule here, and actually in this case, we don't even care about B at all. B has absolutely nothing to do with what we're trying to do here. We say, okay, if we want to calculate the follow of A, we just take the next symbol, right? Add the first of the next symbol, whatever it is, non-terminal, terminal, right? Take the first of that minus epsilon and add it to the follow set of A. 
Yeah. Question on rule three. So for example, if C one, C two through C K are just epsilon, and then C zero is a little C or epsilon, then if you do follow of A, it will be little C or end of file, right? That's what the next rule gets to, is what to do. What to do if there's an epsilon in the first of C zero. But in this case, you just always add, this you always do, so it doesn't matter what's in the first. They just said that you always add. Yes. Exactly. Yeah. So all these rules keep adding and changing things. Yeah. They don't tell you what it exactly is, right? Even this first one. Right. It doesn't say that the follow of S has to be just the set containing end of file. Exactly. So then the next rule just says, okay, if there's an epsilon in the first of C zero, then you can add the first of C one minus epsilon to the follow of A. And then if there's an epsilon in the first of C one, then you can add the first of C two minus epsilon to the follow of A. And you can do that for as many times as there's epsilon in those first sets, right? Because we calculated, okay, we know that there's an epsilon in the first set of C zero. So we know there's some combination of rules where A can be followed by C one, by whatever the first of C one is. And then the same thing, C one could go to epsilon and so, okay, then we can add C two, right? This just says it in a mathematical way. That means we can do this as many times as we want for all possible Ks that meet this criteria. Questions on this? All right, let's run through an example. The nice thing about follow sets, so if you think about it, right, do these two change based on what the current follow sets are? Does the first one change? So this is only adding first sets to follow sets. It's first sets to follow sets. The first one is just adding dollar sign to the follow set in this particular case, right? So actually rules two and three are the only ones that change. And basically these say how the follow sets should propagate. 
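For reference, the five rules as described verbally can be written out as set inclusions. This is a summary reconstructed from the spoken description, so the exact indexing and numbering may differ from the slides. Here $S$ is the start symbol, $A$ and $B$ are non-terminals, $\alpha$ and $\beta$ are arbitrary (possibly empty) sequences of symbols, and each $C_i$ is a single terminal or non-terminal.

```latex
\begin{align*}
\text{Rule 1:}\quad & \$ \in \mathrm{FOLLOW}(S) \\
\text{Rule 2:}\quad & B \to \alpha A
  \;\Longrightarrow\; \mathrm{FOLLOW}(B) \subseteq \mathrm{FOLLOW}(A) \\
\text{Rule 3:}\quad & B \to \alpha A\, C_0 C_1 \cdots C_k
  \text{ and } \varepsilon \in \mathrm{FIRST}(C_i) \text{ for all } 0 \le i \le k
  \;\Longrightarrow\; \mathrm{FOLLOW}(B) \subseteq \mathrm{FOLLOW}(A) \\
\text{Rule 4:}\quad & B \to \alpha A\, C_0\, \beta
  \;\Longrightarrow\; \mathrm{FIRST}(C_0) \setminus \{\varepsilon\} \subseteq \mathrm{FOLLOW}(A) \\
\text{Rule 5:}\quad & B \to \alpha A\, C_0 C_1 \cdots C_k\, C_{k+1}\, \beta
  \text{ and } \varepsilon \in \mathrm{FIRST}(C_i) \text{ for all } 0 \le i \le k
  \;\Longrightarrow\; \mathrm{FIRST}(C_{k+1}) \setminus \{\varepsilon\} \subseteq \mathrm{FOLLOW}(A)
\end{align*}
```

Rules 1, 4, and 5 depend only on the grammar and the first sets, so they contribute the same terminals on every pass. Rules 2 and 3 copy one follow set into another, so they are the ones that can add something new on a later pass.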
And these other ones kind of say how to populate the follow sets from the first sets. So I don't have like a proof for it, but all the times I've done it, follow sets kind of converge very quickly. So you apply the rules, and then you do a few steps and then you're kind of done. So let's look at an example. I think it's the grammar we've been working with so far in the slide examples, right? We have S goes to big A, big B, big C, big D. A goes to big C, big D, or little a, big A. B goes to little b. C goes to little c, big C, or epsilon. D goes to little d, big D, or epsilon, right? So we've already calculated the first sets of all of these, right? So what's the first part of our algorithm for calculating the follow sets? Initialize them all to zero. Not zero. Empty set. Empty set, right? So we initialize them all to the empty set. We have our first sets. We have our rules. I need to really do that better. Okay, right? So we do our first rule. We say, okay, let's calculate the follow of S, right? So we say, does the first rule apply? Yeah. And then we say, is S used in any of the right-hand sides of these production rules? No. Exactly. So the other rules, right, rules two through five, only apply if it's used on the right-hand side. If it's not, then, hey, don't worry about it. Okay, what about calculating A? Which of the one, two, three, four, five production rules am I going to look at? One, two, three, four, five. What am I looking for? How do I know? I want to calculate the follow of A here. You look for uppercase A on the right-hand side. Yes, I look for uppercase A on the right-hand side. So which of these rules has uppercase A on the right-hand side? Rule one. Right? Let's start with rule one. This is the first one. Right? So we only look at this and we say, is it the starting non-terminal? Nope. Is it the rightmost symbol? Nope. Are there epsilons in the first sets of all the symbols from after A to the end of the right-hand side? First of B is little b, so no. Right? So this rule can't apply.
This says there has to be epsilons in all of them, B, C, D. Yeah, B, C, D. Does rule four apply? Is there something after A? So what's the symbol that's after big A here? B. B. So let me add the first of B minus epsilon to the follow of A. So we have little b. Then is there an epsilon in the first of B? Nope. So we're done. This rule doesn't apply. So now we've got to move to our second production rule. We have to look at every place that A is used and we say, okay, we know it's not the starting non-terminal, right? Is it the rightmost symbol here? Yes. Yes. So which of the follow sets am I going to add to the follow set of A? Yeah, the follow set of A, right? How do I know the follow set of A? I'm calculating the follow set of A. Because A is also the left-hand side of that rule. Right, so this says you can add the follow set of the left-hand side to the follow set of the rightmost one, right? This B is just a placeholder that stands for the left-hand side of this rule, right? So here I can take the follow set of A and add it to the follow set of A. How do I know the follow set of A? Currently empty. Currently empty. I've calculated it, right? This is why I pre-calculated it. So I add the empty set to there. It doesn't change anything. Then we say there's nothing after us, so we can't add anything that comes after big A here, right? So this rule doesn't apply. This rule also doesn't apply. So the follow set of A is little b. Then calculating the follow set of B, right? I look at this rule, the top rule, right? So I'm going to say it's not the starting non-terminal, right? Is it the rightmost symbol? No. Are there epsilons in the first set of everything after B to the end of this rule? Yes. Yes. So I have to add the follow of what to the follow of B? Follow of S. Follow of S, right? So the follow of B is going to have end of file in it, right? Which we calculated here, the follow of S.
There's an epsilon in what comes after it, from right after it to the end of the rule, right? So we check the first of C and the first of D, and there's epsilons in there. So then we say, okay, this rule just says add the first of whatever's after it to the follow of big B. So what's the symbol that comes directly after it? C. C, capital C. So we take the first of C minus epsilon, add it to the follow of B. So what can follow B right now is end of file and little c. And then we say, does rule 5 apply? Can we move on? Yes. Yeah. Yeah. All right, we can move on because there's an epsilon in the first of C. So then which symbol do we take, and what do we add it to the follow of? Yeah, the first of D minus epsilon to the follow of B. Right, so we're going to add little d and we're done. Right, we've gone through there. We have dollar sign, little c, and little d. For the follow of C, which rules do we look at now? One and two and four. Yeah, one, two, and four. Right, one, two, and four. So here we say, we know it's not the starting non-terminal. We say, is it the rightmost symbol? No. Is there epsilon in the first sets of all the symbols after us to the end of this rule? Yes. Yes. Right, so then we add the follow of S to the follow of C, which gives us the dollar sign. And this rule says, add the thing that comes after us. What's the thing that comes after us? D. Yeah, so take the first set of D, subtract epsilon, so add little d there. Is there anything after D? No, so this rule can't possibly apply. So we have, I forgot it already. Dollar sign and little d, right? Then we have to look at the next rule where it's used. So we say, is it the rightmost symbol? No. No. Are there epsilons in the first sets of all the symbols after C to the end of this rule that we're looking at now? Yes. Yes. So then we add what to the follow of C? D. The first set of D. The follow of A. Yeah, remember these two rules are about propagating the follow of the left-hand side, which here is the follow of A, right?
So we add the follow of A to the follow of C, which is little b. And then we would add the first of big D to the follow of C, right, which we've already done. There's no symbols afterwards, so this can't possibly hold. And then here we go through this again and say, okay, right, same thing. This applies. It's the rightmost non-terminal. So we take the follow of C and add it to the follow of C. There's nothing after us. So none of these apply. So we have the follow of C is dollar sign, little d, and little b, right? And this is an important thing, right? This little b came from the follow of A. So we're five minutes over. I'll just continue doing this and record it and put it online if you want to, or you can leave. I don't know. It's up to you. We can go to the last one real quick. You got to go, you got to go. But it'll be recorded. Okay, now I want to calculate the follow of D, right? So which production rules do we have to look at here? One, two, and five. One, two, and five, yeah, right? Any place there's a big capital D. So we say does this rule apply? Is D the rightmost symbol? Yes. Yes, so we add what? The follow of S. Follow of S to the follow of D, exactly. Does this rule apply? Is there anything after it? No. Does this rule apply? Is there anything after it? Nope. Does this rule apply? Nope. So we look at the next rule, right? So we've added the dollar sign. Then we say, okay, is it the rightmost symbol? Yep. Yes. So we add the follow of A. The follow of A to the follow of D, right? So we have dollar sign and little b. We go through this again. The rightmost. Right? It doesn't have it. It doesn't have it. Then we look at the next one. We say, is it the rightmost? Yes. Yes. So we add the follow set of D to the follow set of D, right? So we add the empty set to what we're calculating. Doesn't change anything. There's nothing afterwards. So we get dollar sign and little b.
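The walkthrough above can be checked mechanically. What follows is a minimal sketch in Python, not the project's required C or C++ and not the course's reference solution. The names `GRAMMAR`, `first_of_sequence`, `compute_first`, and `compute_follow` are hypothetical, the grammar encoding (an empty list standing for an epsilon production) is an assumption, and the loop simply reapplies the rules until no set changes.

```python
EPS = "ε"  # marker for the empty string (assumed encoding)
EOF = "$"  # end-of-file marker

# Example grammar from the walkthrough. Each right-hand side is a list of
# symbols; an empty list stands for an epsilon production.
GRAMMAR = {
    "S": [["A", "B", "C", "D"]],
    "A": [["C", "D"], ["a", "A"]],
    "B": [["b"]],
    "C": [["c", "C"], []],
    "D": [["d", "D"], []],
}
START = "S"


def first_of_sequence(symbols, first):
    """FIRST of a symbol sequence; holds EPS only if every symbol can vanish."""
    result = set()
    for sym in symbols:
        # A terminal's FIRST set is just the terminal itself.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # ran off the end: the whole sequence can derive epsilon
    return result


def compute_first(grammar):
    """Grow the FIRST sets until a full pass adds nothing new."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs, first)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    """Apply the five FOLLOW rules repeatedly until nothing changes."""
    follow = {nt: set() for nt in grammar}  # initialize to the empty set
    follow[start].add(EOF)                  # rule 1: base case
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:  # terminals have no FOLLOW set
                        continue
                    tail = first_of_sequence(rhs[i + 1:], first)
                    new = tail - {EPS}      # rules 4 and 5: FIRST of what follows
                    if EPS in tail:         # rules 2 and 3: tail empty or nullable
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, START)
for nt in GRAMMAR:
    print(nt, sorted(FOLLOW[nt]))
```

One design note: this sketch updates each follow set in place, so a set changed earlier in a pass is visible later in the same pass. The hand calculation reads some sets as still empty at that point, so intermediate steps can differ, but both approaches stop at the same sets: dollar sign for S, little b for A, dollar sign, c, and d for B, dollar sign, b, and d for C, and dollar sign and b for D.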
And so cranking through these all again, we would get the same thing, which I'll let you practice on your own. Also, I'll be releasing the practice midterm today. I will say the slides have a pretty good example. We won't go over it, but maybe touch on it a little bit in class, and I won't go super in-depth on it. But it turns out emails are super complicated. So there's an example of a context-free grammar for emails, and we walk through calculating this simplified grammar's first and follow sets. So one way to self-check is to take this grammar, generate the first and follow sets on your own, and then check with the slides to make sure you're doing it right.