All right, everybody's here today. We're having some technical difficulties. I haven't looked at it yet. All right, cool. Sorry for the late start. Technology, all these computer things. Anybody have any questions before we get started? The basic idea here is that we'll go over and discuss practice midterm one, and feel free to ask questions. I'm going to be asking questions and kind of leading it, but this should be a discussion-section type of study session to get you ready for Friday. So, any questions before we start? Yes. Is the format of this telling us, like, no multiple choice, no true/false? No, not necessarily. For this one, more or less, yeah. I guess I should state at the beginning the questions I hate. "How many questions are going to be on the midterm?" is a question I hate getting. Because on this one there are, like, four questions, but not really four, because there are a lot of sub-questions. If there's one question on the exam, you can bet that it's going to be a very difficult question. If there are a hundred questions on the exam, they're probably going to be easy. So I'm going to make it reasonable, right? But it will test that you're paying attention and are up on all the concepts that we're talking about, that you can actually apply these things and think with regular expressions and context-free grammars. So in general, that kind of question. So, number of questions, yeah. Format: I probably wouldn't do true/false here. We've covered some stuff, and we can have some cool examples that will test you on things. Any others? So you have 50 minutes, no notes, no nothing, pencil, pen, and that's it. There'll be these FIRST set and FOLLOW set rules here, so you don't have to, well, I'll put it this way.
So if you're referring to this a lot when you're doing the exam, you probably should have studied more so that you understand these rules, right? They're just for reference, so you can go and check the rules to make sure you're applying them correctly. Any other questions before we start? I was studying with someone from another section who has an exam coming up, and their homework is basically super different from ours. What they're expected to do when they do their FIRST and FOLLOW sets is to write, each time they process a specific symbol, the rule that they used. For our exam, are we going to be expected to do that? What does that mean? Oh, write the rule? Yeah, so they're not doing tables like we learned in our class; they use exponents to show how they processed them. Yeah, just showing which rule you're applying. So you're not necessarily expected to write down the rule, right? If you get the right answer, you get the right answer. I don't care if you look at it and just write the answer down, but I will tell you that that is not an effective strategy, because when you make a mistake, I just say that's wrong and you've lost 20 points, right? The reason I asked is just because I looked at their homework, and it was all a bunch of exponents and stuff, and I was like, okay, I don't know if we should do that on the exam or not. No, you don't have to do that, but if it helps you, you can totally do it. I'll be able to read that, so it's totally fine. Anything else? Yeah, I mean, they're doing different homework assignments, and they'll have different exams. I think they're slightly ahead of us, but that's fine. I'm not worried. From a code perspective, when we write out code, do you care if it's, like, valid C, or can it be just pseudocode? For this exam, you won't have to write out code.
For the next ones you will, when we get into how to create a parser, a predictive parser based on FIRST and FOLLOW sets. That will be mostly pseudocode. I'm not going to ding you if you miss a semicolon or something, but if things aren't in the right order and you're doing things out of order, then that's not cool. C- or Java-like pseudocode, basically. We don't have to worry about that here. Any other questions? So that this isn't super boring with just me doing a midterm, we're going to do this together.
All right, so problem one says: consider the following regular expressions. We have seven regular expressions. Some of them build on each other, right? Letter is a, b, c, or d. Capital letter is capital A, capital B, capital C, or capital D. Digit is four, five, six, or seven. Alpha is a capital letter or a digit, concatenated with digit star, concatenated with capital-letter star. I hope nobody was confused by this dot. It's just kind of hard to read the period. This is the concatenation operator. Do you have any questions? These are the type of questions I'm totally happy to answer during the exam, right? If you have a question about what a symbol is. If you have a question like, what am I supposed to do when there's a star here? That's not a question that I will answer. I'll just tell you to do your best. It happens. So rho is a digit or a lowercase letter, followed by any combination of digits or capital letters, right? Zero or more digits or capital letters. Phi is capital-letter star, any number of capital letters, followed by a digit. Omega is a digit, followed by a capital letter or a digit, followed by any number of lowercase letters. So, one thing to look at: do we define what sigma is? Technically, by defining letter, capital letter, and digit, we did. All of these regular expressions are written in terms of one, two and three.
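For reference, the seven definitions read out above can be written as Python `re` patterns. This is a sketch reconstructed from the spoken description, not copied from the exam sheet. The names `TOKENS` and `in_language` are made up for illustration, and `LETTER` stands for the capital-letter class.

```python
import re

# Character classes, as read aloud (reconstructed; the exam sheet is not in the transcript).
letter = "[abcd]"   # lowercase letter
LETTER = "[ABCD]"   # capital letter
digit = "[4567]"

# Token definitions in list order. The order matters later for tie-breaking.
TOKENS = {
    "letter": letter,
    "LETTER": LETTER,
    "digit": digit,
    "alpha": f"(?:{LETTER}|{digit}){digit}*{LETTER}*",
    "rho": f"(?:{digit}|{letter})(?:{digit}|{LETTER})*",
    "phi": f"{LETTER}*{digit}",
    "omega": f"{digit}(?:{LETTER}|{digit}){letter}*",
}


def in_language(name: str, s: str) -> bool:
    """Return True if the whole string s is in the language of the named expression."""
    return re.fullmatch(TOKENS[name], s) is not None
```

A call such as `in_language("alpha", "A44")` asks the same question as the fill-in-the-blank part of problem one. `re.fullmatch` is used because membership requires the entire string to match, not just a prefix of it.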
So we can use the tokens here to say, okay, I know every symbol is going to be either a, b, c, or d; or capital A, capital B, capital C, capital D; or four, five, six, seven. Our alphabet could be bigger, but for these regular expressions, these are the only symbols we care about. If there's any other symbol, then the string is definitely not going to match any of these regular expressions. All right, so what is the question asking us? Exam-taking skills, right? Read the questions carefully. Personally, what I do is read every single question first. If there's anything that I can immediately jot down and answer, I'll jot it down. Otherwise I read everything and then go back to the start. That way my brain starts thinking about the later problems even while I'm actively working on the first problem. It's a technique that works for me; it might work for you. Okay, so it says: for each of the following, fill in the blank with "is an element of" or "is not an element of". Recall that L(R) for a regular expression R is the language of R. So what is this asking us to do? Is A44 in the language of alpha? Yeah: is the string capital A, four, four in the language described by the regular expression alpha, or is it not? Yes, that is what it's asking. So how do we decide? Can we match this with alpha? It's asking, is there some combination here? It's also saying, if we try to generate every string, then every string in the language described by alpha starts with either a capital letter or a digit, so it could be, let's say, A; then there can be any number of digits, four or five; and then there can be zero or more capital letters, right? This is obviously just one example. This is an infinite set; it goes on and describes every possible string that alpha could generate. So the question is asking: is this string in that set? Yes. Yes. Why?
Because it uses a single capital letter, which... Single capital letter? Matches here? It uses two digits, four and four, which satisfies digit star. It uses the empty string, which satisfies capital-letter star. There we go, right? So this one goes to nothing? And the empty string is just assumed? Yes, yes. Because if this goes to the empty string, if capital-letter star goes to epsilon, then concatenating it with whatever came before is just what came before. All right. So the string 47 and the language described by rho. We first look here: it's going to start with either a digit or a lowercase letter. A digit. We match up the digit, and then we have digit-or-capital-letter star. Matches. It matches one occurrence of a digit, and then we stop. Yes. This is definitely in. 47: is it in the language described by phi? No. No. It's only allowed to have one digit. But I have a digit here. It only allows that one. Two digits. Two digits. Yeah, right. So this capital-letter star clearly has to go to zero, because there are no capital letters in here. That means the next thing is this digit. The four matches this digit. Right? There's nothing left at the end of the expression. Exactly. There's nothing here. There's nothing else that could possibly match the seven. So there's no way the string 47 is in the language described by the regular expression phi. This is phi, right? Yes. It has to start with a letter, right? No, it doesn't have to start with a letter. It can be a single digit. Oh, epsilon, never mind. Yeah. So it could be four, right? Is four in the language described by phi? Yeah. Yes. So 47 is definitely not. Okay. Seven four. Is that in the language described by rho? Yes. So I have the digit... or no, this one is omega. Sorry. This other one is rho. All right. The digit seven matches. Then we go to the next character. Is the next character a capital letter or a digit? Yes. Yeah. It's a digit, so that matches. Epsilon. And this goes to epsilon.
There's nothing more, so that matches. So exciting. Capital A, D, A, E. Is that in the language described by alpha? Yes. No. So I have a capital letter followed by digit star. The digit star has to go to zero, followed by any number of capital letters, right? So this should be in it. No, E. There's no E. That may have been why we did this. Yes. Right? Capital letter only matches the symbols capital A, capital B, capital C, capital D. Tricky. Tricky indeed. Checking whether you're paying attention. I can't believe it. So now you're going to check every one, and maybe there won't be another one, or maybe there will, because I just said that. So E does not match capital letter, and the string is not in the language. Next I'm looking at lowercase a, capital A, four, capital D, lowercase a. Does that match rho? So does it start with a digit or a capital letter? Lowercase letter. Lowercase letter, right? Every string that rho generates has to start with either a digit or a lowercase letter. So is this string in the language described by rho? No. Because the last letter is a lowercase letter. Yeah. Exactly, right? And what we know just from looking at the very first character is that all strings rho generates start with either a digit or a lowercase letter, and this is a lowercase letter. Yeah, it's the last character that's the problem. I'm getting ahead of myself. All right, so that matches. The letter here matches the letter here? Yep. Perfect. Okay, then we match any number of digits or capital letters. This capital letter? Yes. The digit, yes. Capital letter, yes. But that last one, no. This is a no because it's lowercase. So it's not in there. Almost. All right, phi. Does this match phi? No. We're batting about 50% as a group. No. So we have capital-letter star, right? Capital A, capital B, capital A: do those all match capital-letter star? Yeah. Does four match? Yes. Digit?
But then there's another digit. There's another digit. Nothing afterwards. Yeah, exactly. So it does not match, right? This seven does not match anything; that's why. Okay, the language described by omega. So a digit, then a capital letter or a digit, followed by lowercase letters? Yes. Oh, "dad". So that's definitely in there. Feel good? Everybody agree? We just got 100% as a group. On this part? Yes.
Okay. So, what's the next part of this question asking us? Not parsing. Tokenize it? Yes, tokenize it, lex it, yeah. Parsing is turning it into a tree. Well, it's sort of parsing, but we want to be specific since we have specific names for these things. Okay, so we call getToken repeatedly until the end of input is reached, and the sequence of tokens returned is what we write down. Assume longest prefix matching is used. If you don't know what longest prefix matching is, you should absolutely look that up. And ties are broken in favor of tokens that appear first in this list. So let's think about what this question is asking. What am I supposed to write here on this line? A list of tokens. Yes, a sequence of tokens, right? So what are the tokens here? All of those. The left-hand side, left of the arrow. Yes, the left-hand sides, right? So specifically, I don't want to see any lowercase a's, uppercase A's, fours, fives, sixes. Those are not going to go in there. We only want to see the names; these are our tokens. Okay, because of screen issues, I can't see all the regular expressions. I wish I could move this stuff around. Maybe with that screenshot tool. Yeah, I probably could. Ah, that's too much work. I don't think it's worth it right now. I'm going to write it in later. So you want to double my work on this one? Leave the rules up there. Yeah, I'm going to do this.
So can somebody read out the string that we're talking about? Four, seven, lowercase a, lowercase a, lowercase d's. Too many d's. What was that? You've got too many d's. No. No, don't stress about it. Four, seven, a, a, lowercase d, capital D, capital A, four, a, d, four. The last part is a, d, four. You're all fired. So these are our regular expressions, right? And this is our input. So we're going to make a table. What are the columns of our table? We'll copy these down one character at a time, right? So we have 4 for our string. Before we start, what has the potential to match? All of them. All of them, right? Any of them. Yeah. Do we have to make the table? In case you're showing your work. Yes. Yeah, absolutely. Particularly because if you mess up the first getToken, and let's say you end up consuming all of this when really you should have only consumed the 4 and the 7, then that's going to throw off all the rest of your answers. So if you just drop an answer down here, it's going to be completely wrong from the start. At least the other way we can say, okay, we'll knock off some points for that first mistake, and we can see that after you made that mistake, you were doing it correctly. But of course, if you mess up on all of them, that's not good. How many points will that be worth? I don't know. It's like, I have a hard time using the table. I can get the right answer without it. We were just talking about that earlier. Sometimes I leave out a potential one or something. Is that going to count against me? Yeah, I mean, that's part of it. To me it's a good tool to help you make sure that you're doing it correctly, right?
Because otherwise you have to think about all of these possible matches for all of this input at once and try to work out which is the longest match. But doing it this way, you're only doing it one character at a time. So to me it's a lot easier to think, okay, I'm going to just match 4. Who matches 4? Well, digit, right? Digit matches 4. Especially if you have to backtrack or anything like that, if you go farther, to me it's much easier this way. I mean, I've done it maybe a little bit more than you, but not that much. Yeah? For this particular one, you're just grading the answer, so if it's right, you're not going to go look at our table? Correct. But you're taking the gamble. As for the question of how many points it's worth, I have no idea, I can't say now, but it will be on here. So you'll know exactly how many points every question is worth, in case you want to do some weird game theory thing while you're taking the test, like where to spend your time. Then you end up spending too much time doing that. Yeah. Okay. We match 4. Digit matches; do any of the others match completely? No. Let's take it one by one, because this is where we can get in trouble. Letter clearly doesn't match just a 4. Capital letter? No. Digit, yes. Does alpha match? Yes. Yes. This is why it's important to go through these. Does rho match? No. Yes. Yes. Does omega match? Yes. How do we know if it matches? It has to go all the way to the end of the regular expression, right? Exactly. So omega only has the potential to match. So now we have to do the potential. We're doing this separately. Does digit have the potential to match? No. No, right? We've gone to the end of this regular expression. If we consume any more characters, it's definitely not going to match. Right? Yes.
Yeah, so very good. Does phi match? Yes. Yeah. So now we want to think about the potential. Definitely not letter, not capital letter, not digit, because digit can't match anything longer than one digit. What about alpha? Yes. Yeah, it has the potential, right? We matched this first digit, but we could match more digits and more capital letters after it. We don't know that we're actually done there. So this has the potential. What about rho? Yes. Yeah, it also has the potential. And phi? No, right? Phi matches, but it doesn't have the potential to match anything longer. Omega? Yes. Yes, right? It still has the potential to match. So which one is the longest match? And how do I know? It's digit. It's the first one. All right, so I have four things that are matching here, but specifically because of the rule that ties are broken in favor of tokens that appear first in the list, which is the assumption we've been making from the start, it's digit, length one. Yep. Then we look at the string 47. Which of these seven regular expressions do we compare against the string 47? The ones in the potential list. Yeah, just these three, right? You don't even have to look at anything else. So we look at alpha and ask, does alpha match 47? Yes. Digit, then digit star. Yeah, matches. Does rho match 47? Yes. Yeah. Does omega? Yes. Digit, digit. Yeah. And then? All of those also have potential. Alpha: this is a star. Rho: this is a star. Omega has letter star after it. Take rho: it matches one digit here, and because this digit is inside a star, it could keep matching more digits or capital letters, right? But the next character is a lowercase a. That's why you use the table. Yeah, this is why you use the table, right? You're not looking at the input string at all. We're just looking right now at 47. This is the string that we're considering.
So I just have a question about the process that you're doing right now. When you first did the table in class, for the match column you just put one of them. You didn't put all of them, like you're doing right now. I think before only one actually matched, I believe. Here it matches all of these, so we put all of them. I mean, in class I remember you saying that for potential you put all of them, and then for the match, you put the one that matches. I mean, I guess we could do that. We kind of did that up here, right? So this is just an extra step. So I'm asking, what's the point of the match column, if you're putting all of them? Well, I put all of them that match. You could put just the one that wins, which would be the one higher up on this list, but I want it to be complete so we go through and think about all of them. So I'm putting all of them there. It's just that when you write down the longest match, you've got to know how to resolve these four that match. So what are we looking at for the potential column? The potential means, starting with this string, does the first part of the regular expression match it, and if there is more input after this, could we still match this regular expression? So it's basically like a prefix. It's asking, okay, is 47 a prefix of some string in there, and could there be more? What the match column says is: is the string 47 exactly in the language described by this regular expression? And let's say we haven't hit the end of the string. There could always be more. Yeah, exactly. We don't know that yet, and that's why we can do this without even looking at the rest of the input: we operate just on this string here. This helps prevent you from trying to look forward to see what's going to happen and who's going to have what potential. Four seven a now.
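The two table columns described above can be sketched in Python as follows. This is only an illustration: the patterns are reconstructed from the spoken definitions, and `matches`, `has_potential`, and `table_row` are hypothetical helper names. "Potential" is approximated by brute force, trying every non-empty extension of up to two more symbols. Two is enough for these seven expressions (the longest mandatory part is omega's two leading symbols), but that bound is an assumption tied to this problem, not a general method.

```python
import re
from itertools import product

SIGMA = "abcdABCD4567"  # the only symbols these expressions use
letter, LETTER, digit = "[abcd]", "[ABCD]", "[4567]"
TOKENS = {  # list order, reconstructed from the spoken definitions
    "letter": letter,
    "LETTER": LETTER,
    "digit": digit,
    "alpha": f"(?:{LETTER}|{digit}){digit}*{LETTER}*",
    "rho": f"(?:{digit}|{letter})(?:{digit}|{LETTER})*",
    "phi": f"{LETTER}*{digit}",
    "omega": f"{digit}(?:{LETTER}|{digit}){letter}*",
}


def matches(name, s):
    """Match column: is s exactly in the language of the expression?"""
    return re.fullmatch(TOKENS[name], s) is not None


def has_potential(name, s, max_extra=2):
    """Potential column: can s be extended by 1..max_extra symbols into a match?"""
    for k in range(1, max_extra + 1):
        for ext in product(SIGMA, repeat=k):
            if matches(name, s + "".join(ext)):
                return True
    return False


def table_row(s):
    """Return (tokens that match s, tokens that still have potential), in list order."""
    match = [n for n in TOKENS if matches(n, s)]
    potential = [n for n in TOKENS if has_potential(n, s)]
    return match, potential
```

Calling `table_row("4")` and then `table_row("47")` reproduces the first two rows built on the board, without ever looking at the rest of the input. The longest match so far is the first name in the match list, since ties go to the token earlier in the list.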
So okay, who's the longest match? The first one in the list. Yeah, the one higher up on this list, which is alpha, with length two. All right, four seven a. Does four seven a match alpha? No, it doesn't match alpha. Does it have the potential to match alpha? No. No. Rho: does rho match four seven lowercase a? No. Omega? Yes. Digit, digit, lowercase letter. Yeah, so omega matches and it has the potential to match. So we write omega, three. Four seven a a. Same thing. Omega, omega, omega four. I'll say that this was a problem on the actual midterm one last semester. Some people stopped back here and said, okay, this is the longest match, which completely messes things up. If you take the token there and start from here, the rest is going to be completely different. The lowercase d is included, so four seven a a d. Omega. Still omega, right? Yeah. Yeah. Then with the next character, does this match omega? No. So nothing matches and nothing has potential. And the longest match is omega, five. So we know this is going to be the first token returned. It goes back? Yes. It goes back to the five, right? So we take this five and count from the start of the input string: one, two, three, four, five. These, we say, are omega. On the answer line, you just put omega and not the five. And just because I have weird space constraints, because the resolution here is terrible and we want to look at these rules, I'm going to erase these. We're starting fresh with the next iteration. We'll start as if we were right below this. And we'll start from capital D, capital A, four. Cool. All right. So we take the string capital D. Which have the potential to match before we start? Everything. Everything has potential, yeah. Okay. Does lowercase letter match? Does it have the potential to match? No. Does capital letter match?
Yes. Does it have the potential to match? No. No. Does digit match? No. Alpha: does alpha match? Yes. Yes. Does it have the potential to match? Yes. Yeah. Does rho match? No. No, because it has to start with a digit or a lowercase letter, and it has no potential. Does phi match? Yes. No. It doesn't match, right? Phi has to end in a digit, so a single capital D can't match. But it has the potential to match, because the prefix matches. Omega? No. No. It doesn't match and has no potential to match. Right. So just alpha and phi have potential. And what's the longest match so far? Now we only have to look at these two regular expressions, right? So alpha: does alpha match capital D, capital A? Capital letter? Capital letter. Yes. Oh, yes. Yes. Does it have the potential to match? Yes. Yeah. Right. So the first capital letter matches the D. The digit star has to go to zero occurrences, to epsilon. And then capital-letter star matches the A, but we could always add more capital letters at the end, so we have the potential to match, too. And phi? It does not match, but it has the potential. Exactly. So what's the longest match so far? Alpha. Alpha with two. Right. So capital D, capital A, four. Does it match alpha? No. No, right? Why? The first capital letter matches the D. Because the second character is a capital A, the digit star has to go to epsilon, which means right now we're matching capital-letter star, which matches the A. But capital-letter star does not match four. So alpha doesn't match, and does it have the potential to match? No. No. There's no possible way, right? So then what about phi? Capital letter, capital letter, digit? Yep. It matches. Does it have potential? No. No. So it's phi, three. Should you call it at that point because you know there's no more potential? Yeah. At this point there's no more potential, and that's how we did it before: we went until there was no more potential.
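The whole getToken loop (longest prefix match, ties broken by list order) can be sketched as below. This is a hedged illustration, not the course's reference implementation: the patterns are reconstructed from the spoken definitions, `get_token` and `tokenize` are hypothetical names, and raising an error when nothing matches is an assumption, since the session only says the exam would specify what to do in that case.

```python
import re

letter, LETTER, digit = "[abcd]", "[ABCD]", "[4567]"
TOKENS = {  # list order, reconstructed from the spoken definitions
    "letter": letter,
    "LETTER": LETTER,
    "digit": digit,
    "alpha": f"(?:{LETTER}|{digit}){digit}*{LETTER}*",
    "rho": f"(?:{digit}|{letter})(?:{digit}|{LETTER})*",
    "phi": f"{LETTER}*{digit}",
    "omega": f"{digit}(?:{LETTER}|{digit}){letter}*",
}


def get_token(text, pos):
    """Return (token_name, length) of the longest match starting at pos."""
    best_name, best_len = None, 0
    for name, pattern in TOKENS.items():
        # Try the longest candidate prefix first and shrink until one fully matches.
        for end in range(len(text), pos, -1):
            if re.fullmatch(pattern, text[pos:end]):
                # Strict '>' keeps the earlier token in the list when lengths tie.
                if end - pos > best_len:
                    best_name, best_len = name, end - pos
                break
    if best_name is None:
        # Assumption: the exam would say what to do here; this sketch just fails.
        raise ValueError(f"no token matches at position {pos}")
    return best_name, best_len


def tokenize(text):
    """Call get_token repeatedly until the end of input; return the token names."""
    pos, out = 0, []
    while pos < len(text):
        name, length = get_token(text, pos)
        out.append(name)
        pos += length
    return out
```

Running `tokenize` on the input as reconstructed from the walk-through, `"47aadDA4ad4"`, is one way to check a hand-built table. Checking every prefix with `re.fullmatch` is slower than a real lexer, but it avoids relying on how Python's backtracking engine orders alternatives.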
So here we see there's absolutely no more potential. You could do a little self-check to make sure that phi doesn't match capital D, capital A, four, lowercase a. That's essentially what you're claiming by saying it has no potential: it doesn't matter what the next symbol is, it's never going to match. So then we'd say, okay, this is our token, and the three means three characters from here. So I know my next one is phi. Let's see if I can do this here. I'm trying to think of how this works out, but I actually don't remember. So any other questions about what we're doing or the process? Okay. The string lowercase a. Before we start, all of these are possibly valid, right? So does it match lowercase letter? Yeah. Does it have the potential to match letter? No. No, right? It's at the end. Letter can only ever match a one-character string. Capital letter? No. Digit, no. Alpha? No. No. Rho? Yes. Yes, and potential. Does it match? Yeah. And it has potential. You've got potential, rho. Phi? No. No. Omega? No. No, right? It has to start with a digit. There's no possible way. Cool. Okay. So what do I put as the longest match? Letter. Letter, one. Okay. Lowercase a, lowercase d. Now we just have to look at rho. Does this match rho? No. Everybody see that? It does not match rho. Does it have the potential to match rho? No. No, right? No match, no potential. So what's the longest match? Letter, one. Right, so a possible mistake: some people would return rho here instead of letter, by not properly applying the ordering rule. It's important to remember that all of these are possible tokens that we can return. So you've only eaten a and d, right? Ah, so what have I eaten? Just a. Right, because of this one. Letter doesn't match a d, right? It only matched the lowercase a. So then we know the next token starts here. Erase this. Lowercase d. Letter: it matches, so letter. Capital letter, digit?
No match, no potential. Rho: it's a match and it has potential. Phi? No: it's capital-letter star, then digit. No. Omega? No. So the longest match so far? Letter, one. Then lowercase d, four. Does d four match rho? Yup. So it matches; does it have the potential to match? Yes. Yes. We don't actually know we're at the end of the string. So the longest match so far is rho, two. And now we're at d four and the end of the string. There's nothing more here, so we stop, obviously. And what's the longest match? It's rho, two. Cool. Questions? Should be pretty simple. You'll have plenty of scratch paper, so you don't have to worry about space too much. There'll be three extra pages. So when you hit the end of the string, it's just whatever you had? Yes. Whatever the longest match is up to the end of the string. Right. So, some things to check yourself on, too. Unless we explicitly say how to handle a token not being found or something like that, right? I think I had somebody ask me during the test, what do I do if there's no token that matches? I answered that question by saying, well, I probably wouldn't give you a question where nothing matches a token unless I specifically told you what to do in that case. So think about those things, too.
All right. Can we jump to FIRST and FOLLOW sets, since they take a long time? Sure. You want to do this one? This one's fun. Problem three? No. Okay. We can do that. So we have this grammar. We'll just do FIRST and FOLLOW. We have FIRST. Let's start here with S. The FIRST sets all start out as empty sets. So I want to calculate FIRST of S. I look at these two rules, and I see, okay, FIRST of S here is going to get FIRST of D, right? FIRST of D minus epsilon, added to FIRST of S. The FIRST of D is empty.
So I add nothing and I don't change anything. I look at the next rule. I add FIRST of F minus epsilon to FIRST of S. That does nothing, so it remains the empty set. Then I look at D. When I look at D, I only look at these three rules. I look at this first rule, and FIRST of D gets, oh, this little d. FIRST of little d is the set containing d; take that minus epsilon and add it to FIRST of capital D. Do I go on? I go on only if there's an epsilon in that set. There's clearly not. This second rule adds the same thing, d. And this third rule adds epsilon. Then I go to E. I just look at its rules. Basically the same thing as D. From the first one, I get little e. And from the next one, I get epsilon. For F: little f, epsilon. For G, I get d, g, epsilon. Am I done? Nope. Nope. Can't stop. Okay, S. We're going to add FIRST of D minus epsilon to FIRST of S. That gives us little d. Is there an epsilon in FIRST of D? Yes. Yes. So we've got to go on to F. Add FIRST of F minus epsilon to FIRST of S, which is lowercase f. Is there an epsilon in there? Yes. So then I go to the next one. Add FIRST of F minus epsilon to FIRST of S, which is just f again. Is there an epsilon in there? Yes. So I add FIRST of lowercase g, which is lowercase g. Do I go on to capital G? No. Wouldn't you also check the second S rule? Yes. Now I have to also check the second S rule. S goes to capital F, capital G; F adds the f. Then we say, is there an epsilon in there? Yes. So we go on to G and add FIRST of G minus epsilon. That adds d and g. The d is already there. The g is also already there, so we would not write those again. Then we say, okay, we've reached the end of the rule. Is there an epsilon in the FIRST sets of all these symbols? Yes. Yes, which means we have to add epsilon to FIRST of S. All right. D is not going to change.
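The repeated passes over the rules amount to a fixed-point computation, which can be sketched in Python as follows. Only the two S rules (S → D F F g G and S → F G) are recoverable from the session. The rules for D, E, F and G below are hypothetical stand-ins, chosen so that they reproduce the FIRST sets stated aloud; the actual exam grammar may differ. `EPS`, `GRAMMAR`, and `first_sets` are illustrative names.

```python
EPS = "ε"

# Non-terminals map to a list of right-hand sides; [] is the epsilon rule.
GRAMMAR = {
    "S": [["D", "F", "F", "g", "G"], ["F", "G"]],  # as described in the session
    "D": [["d", "D", "f"], ["d", "E", "e"], []],   # hypothetical stand-in
    "E": [["e", "E"], []],                         # hypothetical stand-in
    "F": [["f", "F"], []],                         # hypothetical stand-in
    "G": [["d"], ["g"], []],                       # hypothetical stand-in
}


def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}  # all FIRST sets start out empty

    def first_of(sym):
        # FIRST of a terminal is the set containing just that terminal.
        return first[sym] if sym in grammar else {sym}

    changed = True
    while changed:
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                before = len(first[lhs])
                all_eps = True
                for sym in rhs:
                    f = first_of(sym)
                    first[lhs] |= f - {EPS}   # add FIRST(sym) minus epsilon
                    if EPS not in f:          # only go on if sym can vanish
                        all_eps = False
                        break
                if all_eps:                   # every symbol (or none) can vanish
                    first[lhs].add(EPS)
                if len(first[lhs]) != before:
                    changed = True
    return first
```

The `while changed` loop is the "do it one more time to make sure" step: the computation stops only after a full pass in which no set grew.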
E is not going to change. F is not going to change. G is also not going to change, right? Yeah. This is why you add the epsilon to S. Rule five says that if there's an epsilon in the FIRST sets of all of the right-hand-side symbols, then I add epsilon to FIRST of S. Okay. Is there an epsilon in FIRST of F? Yep. Is there an epsilon in FIRST of G? Yep. So then I add an epsilon. Yeah. Why did you stop at the little g in the first rule? Because it's a terminal? Because the FIRST set of lowercase g does not contain epsilon. That's why. It's a terminal, yes. There is no epsilon in the FIRST set of a terminal, which means I can't go on and add FIRST of capital G to FIRST of S there. Exactly. I'll leave it to you to do this one more time to see that it converges here. I believe somebody checked, and I'm pretty sure it's going to converge here because D, E, F and G don't change. But this is something that you should not assume on the test. You should do it one more time to make sure.
Okay, FOLLOW sets. What do I always add to the starting non-terminal's FOLLOW set? End of file. Dollar sign, yeah. I could even do that here if I wanted to. That's totally fine. Okay. So I want to calculate S's FOLLOW set. Where do I look in these rules? Where S is followed by something. Where you have S on the right-hand side, right? That's the important thing. So you completely ignore the left-hand side. Well, we care about it a little later, but for deciding where to look, we don't care at all. We just scan through the right-hand sides, and we see that there are no S's. So I say, okay, great. That's it. So now D, the FIRST set of D. The FOLLOW of D, sorry. Now I need to go through every single rule here and find where D appears. Right? So I look here.
So I first ask, is it the last symbol of the rule here? Nope. Are there epsilons in the first sets of all the symbols from after D to the end of the rule? Nope, because of this lowercase g. So then I add to the follow of D the first of F minus epsilon. So I add f. Then I ask, is there an epsilon in the first of F? Yes. If so, I add the first of the next symbol to D's follow set, which is f again. And I ask one more time: is there an epsilon in there? Yes. So I add the first of lowercase g to the follow of D. The first of lowercase g is g, by the first rule of first sets. Is there an epsilon in the first of lowercase g? No, so I don't add the first of capital G. And you don't add epsilon. Okay, that's just this rule. Actually, that's just this occurrence of D, which we'll see is an important point. Then we look at this D and ask, is it the rightmost symbol of the rule? No. Are there epsilons from after it to the end of the rule? No. So rule two and rule three don't apply. So we take whatever's after D, and we add the first of that minus epsilon to our follow set. The first of lowercase f is f, so f goes here. We can't go on anymore: there are no more symbols after that, plus there's no epsilon in the first of lowercase f. I look through my rules, and there are no more D's, so this is good. Okay, now E. We look for E here. We see it here. Is it at the end? No. Are there epsilons all the way from here to the end? No. So we add the first of whatever symbol comes after it into our follow set. The first of whatever comes after it is e. And then I can't go on anymore, so I stop. Then I look at this rule. Here's an E. Is it the rightmost symbol of this rule? Yes.
Yes. So I add the follow of E, the left-hand side, to the follow of E. I use the last known value of the follow of E, which is just the empty set, so that doesn't change anything. There's nothing after it, so that doesn't change anything either. The follow set of F: this is where it becomes very tricky. This is the thing that messes people up. We have to consider every instance of F on the right-hand sides of the rules. That's what we were doing before: we considered D here and D here. But here we have to consider F every place that it occurs. So when we look at this first F, we ask, is it the rightmost? No. Are there epsilons in the first sets of all the things after F to the end of this rule? No. So we add the first set of whatever comes after F, minus epsilon, to our follow set. We have to add the first of capital F minus epsilon, so we add f here. Then we ask, is there an epsilon in that? Yes. So we add the first of the next symbol to the follow of F, which is g. Then we do the exact same process again with this second F. It's important that you do both, because if you only did it with this one F, you wouldn't get that f is in the follow of F. That's because there are two F's after each other. If we do it for this one, it will just add g, so it doesn't actually change anything. Then we look at this rule and ask, is it the rightmost? No. But are there epsilons in the first sets of everything from after it to the end of the rule? Yes. So we add the follow of S to the follow of F. This question I think came up on the mailing list: you should just use the last known value. These follow sets we're calculating are never wrong.
So we should take the last known value of whatever that is. So here we add the dollar sign. Then we add the first of capital G minus epsilon to our follow set, which is going to be d and g. Just d, because g is already in there. Good. And that's it for there; there's no possible way to move on. I look for more F's. Here I see an F. Is it the rightmost? Yes. So I add the follow set of F to the follow set of F, which is the same thing. Then for G, we look for all instances of G. Here I add the follow of S to the follow of G, which is end of file. There's nothing after it, so I don't add anything else. The same thing here: I add the follow of S to the follow of G. G appears here, so I add the follow of G to the follow of G. When we have epsilon, do we put epsilon in the follow set? We never have epsilon in follow sets. Follow sets contain only terminals and end of file, because the rules always take out epsilon before we add anything. OK. I don't think we've reached complete saturation. I think they change one more time, but I can't really remember, and I'll let you work out what happens here. When there are two non-terminal F's, does it matter if we consider the second one? Yes. You have to consider both of them. It doesn't matter what order you consider them in. Just think about it if it were like this: S goes to D F a F b. You need to make sure you consider the first one. If you only consider the second one, it's going to be a problem. Shouldn't the follow of E have more in it? Well, E never reaches the end of S, so we never add the follow of S to E. But it's still the rightmost. Yes, it's the rightmost symbol here, so we add the follow of E to the follow of E. You add the follow of the rule's left-hand side.
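The follow-set procedure walked through above can be sketched the same way. Again this is an illustration under assumptions, not the exam's grammar: GRAMMAR is a hypothetical stand-in pieced together from the discussion, "$" stands for end of file, and the function names are invented for the sketch. Every occurrence of a non-terminal on a right-hand side is visited, including both F's in the first S rule.

```python
EPS, EOF = "ε", "$"

# Hypothetical stand-in grammar (an assumption, not the exam's exact rules).
GRAMMAR = [
    ("S", ["D", "F", "F", "g", "G"]), ("S", ["F", "G"]),
    ("D", ["d", "D", "f"]), ("D", ["d"]), ("D", []),
    ("E", ["e", "E", "e"]), ("E", ["e", "E"]), ("E", []),
    ("F", ["f", "F"]), ("F", []),
    ("G", ["d", "G"]), ("G", ["g"]), ("G", []),
]


def first_sets(grammar):
    first = {lhs: set() for lhs, _ in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            before = len(first[lhs])
            for sym in rhs:
                sym_first = first[sym] if sym in first else {sym}
                first[lhs] |= sym_first - {EPS}
                if EPS not in sym_first:
                    break
            else:  # no break: every symbol had epsilon (or rhs is empty)
                first[lhs].add(EPS)
            changed = changed or len(first[lhs]) != before
    return first


def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {lhs: set() for lhs, _ in grammar}
    follow[start].add(EOF)  # the start symbol always gets end of file
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, sym in enumerate(rhs):
                if sym not in follow:
                    continue  # terminals have no follow set
                before = len(follow[sym])
                for nxt in rhs[i + 1:]:
                    nxt_first = first[nxt] if nxt in first else {nxt}
                    follow[sym] |= nxt_first - {EPS}  # epsilon never enters
                    if EPS not in nxt_first:
                        break
                else:
                    # Rightmost, or everything after it can vanish:
                    # add the last known follow set of the left-hand side.
                    follow[sym] |= follow[lhs]
                changed = changed or len(follow[sym]) != before
    return follow


for nonterminal, symbols in sorted(follow_sets(GRAMMAR, "S").items()):
    print(nonterminal, sorted(symbols))
```

Under this assumed grammar, E never picks up "$" because E does not appear in any S rule, while F picks it up through the second S rule, where the G after it can go to epsilon.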
Like up here, for G: because it's the rightmost symbol on the right-hand side of this rule, we add the follow of S to the follow of G. If there were a D here, we would add the follow of G to the follow of D. It's not a special case for S. Cool. Well, you guys can go, or we can go over problem three real quick. If people need to leave, you can leave. This will be recorded, and I'll just keep rolling. All right. So this problem says we want to design a regular expression called IP to match IPv4 addresses. So hopefully we have a description of what exactly an IPv4 address is, otherwise you have no way to do this. It says an IPv4 address is composed of four octets separated by periods, and you'd probably have to look really carefully, but this period is in a slightly different font than the other periods. I thought about removing all the other periods, but then it was going to be weird. Anyway, each octet is an integer between 0 and 255 inclusive, which means we include both 0 and 255, with no leading zeros, just like our hex values, just like our numeric values. We don't want to capture leading zeros. You're probably familiar with the IPv4 address 127.0.0.1. Are you familiar with that? Yes, very good. Actually, fun fact, it's the entire 127 subnet, so anything that starts with 127 also goes to your localhost. It's neither here nor there, but it's an interesting fact. Okay, there's a note in bold. It says to represent a regular expression that matches the period symbol as slash dot, so as not to confuse it with the regular expression concatenation operator, because dot is already our concatenation operator. You also do not have to write the concatenation operator if you choose not to.
But since I explicitly called it out here, this is what you should do to match the period. Mathematically speaking, couldn't you also use a double bar? A double bar? Yeah, because I've seen that used for concatenation too. Oh, mathematically speaking, it all depends on what you define as your symbol. PHP actually uses dot for string concatenation, which is incredibly annoying when you come from another language. Okay. So let's go to the end. I think this is one of those cases where we can write the final regular expression pretty easily. What do we know IP is? Octet dot octet dot octet dot octet. And we know that octet is not something we already have, so we're going to have to define a regular expression called octet. But we can see that our final regular expression is actually pretty simple. As long as we can write a regular expression to match an octet, we can say IP is octet, slash dot, octet, slash dot, octet, slash dot, octet, with implicit concatenation. Do you want a dot at the end? No. It's four octets separated by periods, so the period at the end of the sentence is not separating octets. In 127.0.0.1, for example, can both of the zeros in the middle be anything between 0 and 255? Yes. Every octet is an integer between 0 and 255. But it can be 0, right? Of course it can be 0. 0 is 0. You can't have 00, and you can't have 010. No leading zeros. Leading zeros are unnecessary zeros at the front of the number. So how do we write this? Let's talk about what we need to be careful of when we write this regular expression.
Making it so we don't have a leading zero. Making sure that the number can't go over 255. So making sure there are no leading zeros, that's one thing. What's the other thing? Making sure that the number can't go over 255. Yeah, it's really easy to write a regular expression that matches IP addresses. We want to make sure we write our regular expression so that it doesn't match things that are not IP addresses. If we have to write the set of numbers between 0 and 255, do we have to write out all 256 of them? Is there a notation you would prefer that we use? Yeah, you could do that. It's kind of like when you're being interviewed on a whiteboard question: your very first instinct would be the stupid brute-force approach. Stupid is kind of a bad term. It works. It's correct. If you write out every single number from 0 to 255 in a regular expression, it's just not very elegant, but I will absolutely give you 100% of the points. Can you also do dot, dot, dot? No, you have to write everything. You can't just dot, dot, dot your way out of it. Because there's a better way. The whole point is to make you think about how to do this in a better way that doesn't require writing out every single one of the 256 numbers. But hey, if you want to do that, it is correct, so absolutely. So let's start with something that's not correct. I'm probably going to need a helper regular expression. A digit? Digit, P digit. So is this correct? No. What's wrong with this? There are two things wrong. We could get numbers outside the range. Right, exactly. So we can have a number over 255, like 300 or 256.
If we allow 256, that's wrong. This is from our description: 256.0.0.1 is not a valid IP address, and we specifically don't want to match that. So think about it in terms of numbers. What does digit match? Digit matches any of the first 10 numbers, zero through nine. So if we think about it, we want to match essentially the range 0 to 255, and this expression matches 0 to 9. So we can try to split this up into different ranges and write regular expressions that match different chunks of the range. But we have to be careful, because we can't just say 0 to 255; we're dealing with digits. Somebody said we'll probably need a P digit, which is the same thing as digit but without the zero. So digit matches 0 through 9. How can I write a regular expression to match the next range? P digit digit. What do you want to call that? Let's call it tens, because that's now 10 through 99. So then can I just do a hundreds and write P digit digit digit? No, because then I get the same problem of going over 255. So then what do I do? There are a couple of ways. Just do hundreds as 1 digit digit. But isn't the problem that you could get, like, 299? No. If you just do 1 digit digit, it's anything between 100 and 199. So are we going to do 1 followed by tens? No, that doesn't do 101, because tens starts with a P digit, which specifically doesn't allow the zero, so you would only get 110 through 199. You could also do 1 0 digit or 1 tens, but I think 1 digit digit works just fine too, and it's a little cleaner. So what does this match? 100 through 199. Right, we're getting there. And look how much less we had to write than all 256 numbers. So now what do I have to do, though?
Can I just do the same thing and put a 2 here? No, because then I'll match up to 299, which I don't want. So I think you'd have to create another sub-digit set that's only 0 to 5. 0 to 4? No, 0 to 5. Wouldn't it be easier to define zero by itself, because we cannot have a leading zero? Then for the hundreds place you can have either a 1 or a 2: 100-something or 200-something. But then you can't rule out 256. So how are you going to write it so that it matches 199 but not 256? You define a few sets: zero is just 0, first is 1 or 2, second is 3, 4, or 5, and third is 6, 7, 8, or 9. Then if you want to write 199, you do first with any combination of third or second. I'd have to look at that; I can't check it in my head. I don't know about that "any combination", because some cases could mess it up. But we can keep going in this direction. We have to do something with a 2. So we can do 2, then 0, 1, 2, 3, or 4, followed by a digit. That gets us 200 to 249; call that alpha. And then you also need 2 5 followed by 0 to 5; call that beta. That's 250 to 255, which now covers the whole range. So now we can say that an octet is a digit, or tens, or huns, or alpha, or beta. I don't think we wrote out 256 numbers. Just to clarify, you said we couldn't do the dot dot dot because it was just cheating? That's not a regular expression.
What's the regular expression dot-dot-dot operator? It seems reasonable to write something like 0 dot dot dot 255, no? The problem says to design a regular expression, and that's not a regular expression. A regular expression is finite. Can't we split this into three components? In what sense? For example, for the first digit, we know it will start with either a 1 or a 2, if we treat the single 0 as a separate case. So that will be our first component. Then everything from 0 to 9 for the second one. No, because that would match 299. This is the problem: if you just have a regular expression like 1 or 2 concatenated with digit digit, you can get 299. At the end of the day, you can't treat 1 and 2 the same, because the things that come after them have to be different. So for that specific case, the 200s, we can treat it as its own case and combine it with the others using an or. Yeah, that's what we did here. We defined all the cases and then combined them with or at the end, so an octet has to be one of these things. So you recommend doing it that way? The purpose of the question is to get you thinking about how to design a regular expression. Ideally, you want to write it in the most precise way possible. Is it better to break it into named rules like that, or to write it all as one expression? You mean writing one thing called IP that's this whole expression spanning multiple lines? I would say that's probably not the clearest way to express your thoughts. If it's going to span more than about two lines, I would not do that. But yeah, this is only one possible way, right?
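The range-by-range construction can be checked mechanically. The sketch below is one possible rendering in Python's re syntax, which is not the class notation: alternation is written with | and the period is escaped as \. here. The helper names follow the ones used in the discussion (digit, pdigit, tens, huns, alpha, beta), with alpha assumed to be the 200 to 249 range and beta the 250 to 255 range; is_ipv4 is a made-up name for the sketch. Only alternation and concatenation are used, in keeping with the restriction to the operators covered in class.

```python
import re

DIGIT = "(0|1|2|3|4|5|6|7|8|9)"        # 0 through 9
PDIGIT = "(1|2|3|4|5|6|7|8|9)"         # digit without the zero
TENS = PDIGIT + DIGIT                   # 10 through 99
HUNS = "1" + DIGIT + DIGIT              # 100 through 199
ALPHA = "2(0|1|2|3|4)" + DIGIT          # 200 through 249
BETA = "25(0|1|2|3|4|5)"                # 250 through 255

# An octet is any one of the five ranges.
OCTET = "(" + "|".join([DIGIT, TENS, HUNS, ALPHA, BETA]) + ")"

# Four octets separated by literal periods.
IP = OCTET + r"\." + OCTET + r"\." + OCTET + r"\." + OCTET


def is_ipv4(text):
    """True when the whole string matches the IP expression."""
    return re.fullmatch(IP, text) is not None


for candidate in ["127.0.0.1", "255.255.255.255", "256.0.0.1", "01.0.0.1"]:
    print(candidate, is_ipv4(candidate))
```

The last two candidates are the cases called out in the discussion: an octet over 255 and an octet with a leading zero.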
It's not the only possible way to write this. So, on the problem that asks us to do first and follow sets, I've gone through it about three times, and this is what I'm getting for my follow sets. I just have a hard time believing that it would be the same set for all of them, except for that one. Is it? If it's not right, I would like to know what I'm doing wrong. If you're applying all the rules correctly, then it could absolutely be the case that that happens. There's nothing inherent in the grammar that says it can't. It just doesn't seem like a typical homework problem, where the answer is same, same, same. It's not the answers that matter so much as the steps you take to get there, and making sure that you're applying all the rules. Well, with the test on Friday, if I'm doing the steps wrong, I would like to know. The best way to check is to go through these examples and make sure that the steps you followed match up with what we did here. I did these last night and this is what I got. Perfect. I think that's fine. I have a question about the leftmost derivation. Sure. Should I include the left-hand side when I write it out, or can I just include what remains at the end? You don't have to include it, but you can. Either way is fine. Okay, thank you. For problem four on the homework, how do you propose that we show that our regular expression is correct? You say to show examples of strings that it accepts and strings that it doesn't accept? Yes. It's very similar to this first part here, right?
It's like, look, here's a string that's in the language of this regular expression, and here's a string that isn't. You don't have to do all of them, but I want a few examples that show that you're testing your regular expression in some sense. I gave two examples here. For this string, here's the string broken up into parts, and here are the matching parts of the regular expression. So that's b-a-b, which is the same as b-a, one occurrence, followed by b. And then here I just gave a description: since this doesn't start with alternating a's and b's, it's not accepted. It's a kind of self-check I put in there, because some people submitted hex expressions that matched things that weren't hex numbers. So I want you to look at it and show, okay, here's an example where it matches, here's an example where it doesn't match, and that is consistent with the description. Okay, I wasn't sure. It's the same thing we did here: 256.0.0.1, we want it to not match, because it's really easy to write something that matches both the valid addresses and that one. I didn't know if there was some sort of formal way. No, no, it's not a proof that it's correct. On the homework, it asked for just the sequence of tokens, and I actually threw in the values too; I didn't notice until we did the practice exam. Am I going to get docked for that? No, absolutely not. If you do that on problem one, it's totally fine, as long as I can still figure out your answer. But if I have to dig a lot to work out what you're trying to say and what answer you actually have, that's not good. Okay, thanks.
So, for the last problem on the homework, you just want us to show a couple of examples? Yeah, just give me some strings. I want you to think about what strings match and what strings don't match, so that you're thinking about more than just matching. That's the key thing: you can write something like any character, star, and it will match everything. But that's not the purpose. You want to match just the language in the description. We're talking about problem four on homework one? Was that the last problem? Yeah. Are we allowed to use braces, like in regex, where braces give a repetition count? No. You have to write regular expressions as we talk about them in class. You can't use different operators. Aren't braces just another way of writing the same thing? I guess. They're slightly more compact; it's nice to write brace, two, comma, brace for two or more. Absolutely not. So this part says to show your regular expression is correct by showing strings that are in the language and strings that are not in the language. That means write the regular expression and then do a self-check to make sure that you're matching things that are in the language and not matching things that are not in the language. How many strings do you want to see? Honestly, this is more for your benefit, because for homework one people submitted expressions that didn't actually match what they should, or that matched too many things. I want you to demonstrate that you've thought about it. It's like designing test cases for your homework: design the test cases that should match and the test cases that shouldn't match. I didn't get that one, so I couldn't figure it out.
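The self-check idea can be sketched as a tiny test harness. This is only an illustration: self_check, ACCEPT, and REJECT are names invented here, the test strings use the octet description from the practice exam rather than the homework problem, and the patterns are written in Python's re syntax (which allows operators such as ? and character classes that are not part of the class notation).

```python
import re


def self_check(pattern, should_match, should_not_match):
    """Return (missed, leaked): strings wrongly rejected and wrongly accepted."""
    missed = [s for s in should_match if not re.fullmatch(pattern, s)]
    leaked = [s for s in should_not_match if re.fullmatch(pattern, s)]
    return missed, leaked


ACCEPT = ["0", "7", "42", "199", "255"]   # strings in the language
REJECT = ["256", "299", "00", "010", ""]  # strings not in the language

# "Match everything" passes every positive test but leaks every negative one.
print(self_check(r".*", ACCEPT, REJECT))

# One to three digits still leaks values over 255 and leading zeros.
print(self_check(r"[0-9][0-9]?[0-9]?", ACCEPT, REJECT))
```

An expression is only doing its job when both returned lists are empty, which is why the negative examples matter as much as the positive ones.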
So I'll give it another shot, and then I can email you my answer and you can tell me if I did it right. Yeah, or you could come to office hours or something. Any other questions? Yeah, it says office hours are Thursday, 3 to 4. Didn't one of these questions say to find a string with two different parse trees and draw the two parse trees? The string g alone is a really good one. There are a couple of different examples. How do you go about that? The idea is that you're showing essentially that this grammar is ambiguous, by showing that there are two different parse trees for the same string. Okay. So we start with S? It has to start with S, it always starts with S, yeah. So basically you're going to have two trees that both start with S and have the same leaves, but they can't be the exact same tree. Okay, I see. There are multiple right answers. If there are three different parse trees, you only have to give two of the three. I really like these kinds of questions, because they actually challenge you to think: you have to understand the grammar, see what it does, and try different things. It's not just like computing a first or follow set.
I'm really just telling you guys why we ask these questions, so you don't think I'm an arbitrary, horrible person. Yeah, so these could all go to epsilon: epsilon, epsilon, epsilon, epsilon. That's one tree. And then for the other one, the important point is that I have to choose this other S rule. But how does g work? We have to include g here. No, you're good. Oh, because I choose the other rule. Okay, do you guys like it better like this, where there are two separate S rules? I like it better with the bar. When I first saw this, I thought, that's really different. I think it's easier to do first sets this way, but it's easier to understand the grammar the other way. Yeah, I agree with that. It's just that we saw the bars first, so we're more used to them. If we had seen this first, we would probably like this one. With this one I'm like, wow, there are a lot of rules. The solution uses the form I have up now, so it's easier to compare. So either way, here we get the string g, and here we also get the string g. Okay, two different parse trees, same string. Is there an easy way to go about that? Could you just build parse trees and see what you get? You can, but it gets tricky, because for every possible rule combination there's a different combination you could choose. I would try to find a string first: look for the simplest string and see if I can make two trees for it. At this point I have to go.
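The "find the simplest string and see if it has two trees" strategy can be sketched as a small parse-tree counter. This is an illustration under assumptions: RULES is a hypothetical stand-in grammar pieced together from the discussion (not necessarily the exam's), and count_trees and count_splits are invented names. The approach relies on every recursive rule in this assumed grammar starting with a terminal; a left-recursive grammar would make this simple recursion loop forever.

```python
from functools import lru_cache

# Hypothetical stand-in grammar; () is an epsilon rule.
RULES = {
    "S": (("D", "F", "F", "g", "G"), ("F", "G")),
    "D": (("d", "D", "f"), ("d",), ()),
    "F": (("f", "F"), ()),
    "G": (("d", "G"), ("g",), ()),
}


@lru_cache(maxsize=None)
def count_trees(symbol, text):
    """Number of distinct parse trees that derive text from symbol."""
    return sum(count_splits(rhs, text) for rhs in RULES[symbol])


@lru_cache(maxsize=None)
def count_splits(rhs, text):
    """Ways to divide text among the symbols of one right-hand side."""
    if not rhs:
        return 1 if text == "" else 0
    head, rest = rhs[0], rhs[1:]
    if head not in RULES:  # terminal: must match the next character
        return count_splits(rest, text[1:]) if text.startswith(head) else 0
    # Non-terminal: try every prefix it could derive.
    return sum(
        count_trees(head, text[:k]) * count_splits(rest, text[k:])
        for k in range(len(text) + 1)
    )


for candidate in ["g", "f", "fg"]:
    print(candidate, count_trees("S", candidate))
```

Under this assumed grammar the string g has two trees: one from the first S rule with every non-terminal going to epsilon, and one from the second S rule with G producing the g. Any string with a count above one is a witness that the grammar is ambiguous.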