We've started for today, so I'll just reiterate some questions that were coming up about homework 3. I also just posted to YouTube a recording of the office hours where we talked about homework 3. We talked about Hindley-Milner type inference at a high level, so we didn't talk about specifics of any problem, but we also talked about type equivalence and how you can tell whether two things have the same type. All right, so to reiterate some questions that people came up to me with beforehand. Say we're doing type inference and you see something used as a function in an apply with one parameter, and later you see it used as a function with zero parameters in a different apply. Does that type check? No. No, because that function can't be both a function that takes one parameter and a function that takes zero parameters. So it definitely does not type check. And then on your homework, if you get to a point where the constraints can't be satisfied, what do you have? A type error. Yeah, exactly. And you should probably explain why you think there's a type error, just in case you're wrong, so we could maybe give you some partial credit. Any other questions? So those were the questions that just came up. Anything else on homework 3? Yes. What kind of work do you want us to show exactly? What kind of work do I want you to show? What would be good is labeling every node, showing its type, and then showing what the constraints are. Whether you do it as you go through or not is up to you. You can do it like we did with FIRST and FOLLOW sets, where you see how that type changes. So you have, you know, T1 is equal to Ta, and you know that that's a constraint. As long as you're keeping track of that in some kind of order we can understand, that'll help a lot. Yes. So if there's some error, is it okay to just interrupt? If what?
If there's some error, a type error, is it okay to just interrupt? Well, it's an error because the types don't match, right? So one case is you have an A, and the type of A is an integer over here, and somewhere else in the tree the type of A is a boolean. So that's one. Or you might try to add a string to an integer. Boom. Say type error, and say exactly that: this has the type integer and this has the type string. And the alpha? You say type error. I mean, you say type error and explain it. You said that three years ago. Like, I'm not talking about that. Oh, oh, oh. In the project? Yeah. Oh, in the project? No, I mean the homework. Oh, in the homework? No, you don't have to keep going, right? As soon as there's a type error, it doesn't matter what else happens. There's a type error. Okay. Any other questions? On types? Okay.
And now we go back to the land without types. So this is where we left off on Monday. We're talking now about the runtime environment: how does the code that we write actually get translated to assembly, and specifically how do local and global variables work? Yeah. Sure, depends on what it is. Yeah. Where to start on type tracking? You need to start keeping track of the types, just like you do when you're manually doing Hindley-Milner type inference, right? So you need to keep track of what things have what type. And where you do that is up to you. The quick way, implementation-wise, is to modify all the parsing functions to keep track and update some global data structure. If you want to do it super clean, which may take longer, you create functions like the print functions, which all walk the entire tree. You can make a new set of those that deal with type checking and do it that way.
But the easiest way, implementation-wise, is to put it in the parsing code. Yeah. So you first have to get the parsing code working, and you verify that that's working. Then you need to think about what data structures you need in order to know about the types, and then how you fill in those data structures as the program is being parsed. Anything else? Cool. Okay.
So we're talking about the runtime here. This is the environment where our code lives. We said that we can use static memory for storing global variables, and we can use the stack to store variables that are local to our function. And we looked at the function frame. We looked at how we're going to use this frame pointer, or base pointer, that's different for each function invocation. That way local variables are offsets from it rather than some static, fixed location. This is where we were, right? Okay. Perfect. At least vaguely familiar. Okay. So we have this program. It has three local variables, A, B, and C. We set A, B, and C to things. We add A to B, store it in A, and then we just return. So we looked at what this looks like in pseudocode. The compiler just gives each local variable A, B, and C some offset from the base pointer. It's going to say you're at offset A, you're at offset B, you're at offset C. And then in pseudocode it's going to look like: the memory at EBP plus A is equal to 10, the memory at EBP plus B is 100, the memory at EBP plus C is 10.45, and then we add A and B together. And by looking at the code that the compiler generates for x86 on CentOS 6.7, the same thing we've been using, it decides that A is located at offset negative C, B is located at offset negative 8, and C is located at offset negative 4. That's just how it decided to do it. And so we looked at the code. It's first moving the stack pointer into EBP. So what is this doing? Yeah, so it's, exactly.
So it's establishing the frame. The stack pointer, when it gets to main, is going to be pointing somewhere in the stack. We don't know where. But we want that EBP register, which is what we're going to calculate all the offsets from, to point to wherever the stack currently is. Then we need to make room for these local variables on the stack, because otherwise we may end up overwriting them. So here we're subtracting hex 10, which is 16 in base 10, from the stack pointer, moving the stack pointer down. Then we're going to move hex A, or the value 10, into EBP minus C. And we know EBP minus C is the variable A. Then we're going to move 100, which is hex 64, into EBP minus 8, which is B. And then we're going to move that crazy floating point representation of 10.45 into EAX. Then we move EAX into EBP minus 4, which is where C is located. And then finally we are going to do the add instruction. So we're going to move B into EAX. And then we're going to add what's in EAX, which is the value of B, to what's at EBP minus C, which is A, and we're going to store that in A. So what's going to be in A should be A plus B, which should be 110. So you have the stack pointer and you're moving it down. The stack pointer and the base pointer start at the same point, and then you're moving the stack pointer down. And then you're setting the variables relative to this base pointer. But you're not actually moving the base pointer. You're just placing the values of A, B, and C relative to that position. So we're going to see an example. The basic idea here is that throughout this function's life, and obviously if our program was longer or more complicated there would be more code, the base pointer for our function main is never going to change. That way, anywhere in this function, the compiler knows where the location of C is: it's EBP minus 4. It doesn't matter where it is in that function.
Because as the function executes, we may push or pop things off the stack depending on what we need to do. So the stack pointer can change. But because we set up this base pointer, we know that that's a fixed offset for the lifetime of the function. I don't know 100%. I think it is some compiler optimization. Yeah. Or some GCC thing says that it's more efficient to do this when you have to specify the whole value here, all 32 bits, because that's going to be in the instruction. So, cool. Okay.
So we stepped through the code. Now let's look at a visualization of what the stack is and how it changes as this code is executed. We have the exact same code that we just saw, here on the right. We have the top of the stack, which is going to be all Fs, and at the very bottom of the stack? Zeros, yeah. So we have our stack, right? We're going to start, again, at 10,000 in hex, so our stack pointer is going to be pointing here when this first instruction is executed. So what's in the dots, the triple dots, above us on the stack? Remember, that's what we don't want to mess up, right? Because somebody may have called main. We know the stack pointer is here, and we know that what's above us is stuff that we shouldn't touch, because somebody's using that. What about the stuff below us? Yeah, it's garbage, right? We don't care about that stuff. Okay, so we have the three registers that are important here: EAX, ESP, which is the stack pointer, and EBP. So right now, which of these three registers do we know for certain the value of? Which one? Why ESP? So then what's going to be the value of ESP? We don't know. You just said you knew. You know the location. So what's going to be in the register ESP? 10,000, right?
So it's kind of a circular argument, but basically I said, okay, the stack pointer is pointing here, right? But there's no arrow in the machine that literally points here. The only reason we know the stack pointer is actually here is because there is the value 10,000 in ESP. But do we know what's in EAX or EBP? No, it could be anything, right? We'll see how that plays in later. But right now we don't care. So we start our execution. We're going to start right here at this very first instruction. This is going to move ESP into the base pointer. So after this instruction executes, what's going to be the value of EBP? 10,000. 10,000, right? Yeah. So this instruction says move ESP, whatever the value is there, into EBP. Well, isn't it already there at 10,000? When you call this function, you start by setting up the stack pointer and the base pointer. The stack pointer already exists at the bottom of the stack. The stack pointer already exists, yeah. So right now the stack pointer is pointing to the base pointer, right? The base pointer could be anything. We don't know what we get here. Exactly. That's essentially the base pointer of whoever called us. So what do we want to do here? All of our offsets are based on the base pointer, the frame pointer. So we need to set it to wherever the stack is now, because we know everything below that is free memory. So we're going to say EBP is at 10,000 hex right now. And that's all the semantics here. We take the value in this register and put it in that register. Pretty simple. So now we're going to subtract hex 10 from ESP. So what's 10,000 hex minus 10? Smaller number. Smaller number? 9,000. That's 16. 16 smaller? Yeah. FFF0, it should be. Is that right? I feel like it's maybe wrong now, but... Well, wrong. How is that 0? Subtracting 10 from 16 is 6. Wait. Wait. But it's actually 16. It should be C. No, C's not enough.
C would be only subtracting 2, right? So 3, 4. This is when we pull up our handy calculator. The fifth thought is what I did. So 10,000 minus 10. Yeah. FFF0 plus 10. Oh. Math. No, actually it's more like calculators. They're handy. Good. A room full of computer scientists, right? I didn't get to do it. Okay, so yeah. Who does hex math in their head? It's crazy, right? I don't know how to do that. So anyway, we calculated the value. The value is FFF0. So we're currently pointing here, and each of these lines is four bytes, right? We've decreased ESP, so which direction is the arrow going to go? Up or down? Down. Down, right? So it's going towards smaller addresses. So how many lines are we going to move? Four. One, two, three, four. So hopefully everything goes correctly. It should end up here. Does everybody agree? Yeah. Any disagree? So this would be, yeah. Okay. Yay. Okay. Good. So this is where the stack pointer is. But where's the base pointer? Still at 10,000. Right. Still at 10,000. So now we're going to have two arrows. I hope this isn't super confusing, but you can always look at the registers here to see what each value is. So the stack pointer is now pointing further down the stack. Based on the semantics that we talked about, everything below FFF0, what is it? Garbage. Garbage. Yeah, replaceable. But the stuff above is important, right? Yeah, good stuff. And we talked about this on Monday: you can think of this as an allocation. We've essentially said, hey, I'm going to use these four values. And so this is how I want to keep that information. Yeah. I think we'll see why later. I'm not 100% sure in this example why. Oh, I see. Yeah, I don't know. You want to store the previous data? Yeah. But we'll get to that in a second. That's a good question. I'm not 100% sure.
Okay, so we subtracted 10 in hex from ESP, and the answer surprised us all. Then we're moving a value. Remember, the dollar sign here means a constant value, so this is 10 in decimal, A in hex, and we move it into EBP minus C. So where's EBP? At 10,000? And do you want to try 10,000 minus C? Maybe use process of elimination to figure out where it is, right? Think it's F? FFF, what was that? FFF4. So which one is that? Tell me when to stop. One, two, three. Should I say I'm on three? I was like here. Here or here? Here? Here? Yeah. But this is FFF0. That's the stack pointer. So one more? Okay, yeah. So the other way to do it in your head is to say, okay, I know this was minus 16 and this is minus 12, which is C. So I know it should be four above, the next one above that one. And now I'm going to make it a lot easier by putting the actual addresses here so we can see what these all are. So we can verify that these are actually the correct addresses. Looks good, good. I like that. Okay, so then we're going to move the constant value hex A into FFF4, which is EBP minus C, right? So which variable is EBP minus C? What was that? A. A? Yeah, and remember we said A is equal to 10, right? So that's another way to think about it. Okay, great. So we've executed this instruction. Now we have the next instruction: move hex 64, which is decimal 100, into EBP minus 8. What's EBP minus 8? FFF8. That's a lot easier with the addresses over here, right? Okay, so we move the constant value 64 into that memory location, and then we move on to the next instruction. So have we changed the stack pointer or the base pointer? No. Should we have changed the base pointer? No. No, we should never change the base pointer during our function's execution.
Cool, okay, so now we're going to move this floating point value into EAX. Now we actually know the value that's in that register, so we essentially just erase whatever was in it and don't care about it. And then we're going to execute this next instruction to move whatever's in that register into EBP minus 4. Where's EBP minus 4? FFFC, which is which variable name in our original program? C. C, and what's the type of C? Float. Float, yeah, which makes sense, right? So here we have the integer 10, the integer 100, and a floating point value: A, B, and C, the local variables of our function main. So wait, that long hex number represents the original value of C? Yes, that is the floating point representation of 10.45. Is there a difference between mov and movl? Yes, movl is move a long value. I don't know why the compiler decided to use one versus the other. Okay, so then we move that value onto the stack, which is what we just did. So these two instructions took care of the C code C is equal to 10.45. Now we're going to move EBP minus 8. So what's the value at EBP minus 8? Yeah, 64, which is 100. So essentially B. We're taking B and moving it into EAX. And I have A, B, and C here to relate the slots to the variables. Remember, this binary, this assembly code, doesn't care at all about variable names or types or anything. All it's doing is moving bits around with these assembly instructions. So we're going to move that into EAX. Then we go on to the next instruction, and this final instruction is where we do the original line, which I believe was A is equal to A plus B, right? So here we're going to take EAX, add it to whatever is at EBP minus C, and store the result at EBP minus C. So this is a kind of shorthand notation.
Whatever the destination is on an add is where you're going to put the result. So EBP minus C is the variable A. Does that make sense semantically with what our program wanted to do? Yeah, our program is A is equal to A plus B, and so we got the value of B in EAX. Then we take that value, add it to A, which is what we wanted in our expression, and then we save that value back into A, so A is now forever changed. We're going to execute this. It'll be 6E, I believe. Feels like it's wrong, but I think it's right. Hex math doesn't make any sense. Okay, is it right? Okay, so then we get here, and then we're all done. So, questions on this example? Yeah. Would anything change if you said A plus-equals B as opposed to A equals A plus B? Nope. A plus-equals B is basically syntactic sugar for A is equal to A plus B. Okay. So the same instructions are going to be generated. In this case, syntactic sugar just means that whatever that syntax is can be expressed in other ways in the language. So they're basically synonyms in the programming language, and compilers for different languages do that. C# and Java actually do that a lot. So when they introduce something new, I don't know, like, have you seen C# with LINQ queries? Nobody? It's not necessarily a lambda function, but you can essentially write a SQL-like query, select blank from blank in blank, and it looks like crazy code. And it is code, but the compiler actually turns it into a series of function calls. So you could do the exact same thing yourself before. It just has a nice syntax to do it now. Similar things happen with lambdas in the newer Java and C#. Those also translate in a certain way so that they're backwards compatible. So you could have done it before, but they make it a lot easier. Cool. Any other questions on this function frame? Yeah.
So the problem is basically that we are using the base pointer to allocate a frame for the function, right? And we are allocating 16 bytes here, but maybe the function will need more than that in the actual case. In the second step we are allocating 16 bytes to this function, and maybe it will need more than that. So this is C code, right? Where are you supposed to declare variables in a function in C? At the top, right? I think in later versions you can declare them anywhere, but suffice it to say that by looking at the code you can see what all the local variables in that scope are. So as a compiler you know all the variables in this function, and you know the types of all those variables, so you know exactly the size needed to store them. So yeah, you can do this all at compile time, and this is actually exactly why, when you declare a local variable like an array, a buffer, you have to specify the exact size. It has to be a char array of 50 or 100. That way the compiler knows it needs to reserve 100 bytes here on the stack for your buffer. Otherwise, if it was dynamically sized, it would completely mess things up. Yeah. So you said in newer versions of C, say I do operations, I have a bunch of loops and conditional statements and so on. Two pages for one function. At the very end I declare a new local variable, like an int value, and I return that value. So that value wasn't declared at the beginning of the function. So how does that work with the C standard? It just makes the parsing slightly more complicated, right? Basically you can look at the whole function once, look at all the declarations, pull those out, and now you know exactly how many variables there are and what their types are, and you're done. So instead of doing it as you're going, where, bam, I know exactly how many variables there are because they're at the top, I can go through it all once and then know what to put there.
Any other questions? Yeah, so all the compiler really needs to do is make sure that it allocates enough room on the stack here for the local variables, and that each of the offsets from the base pointer is used consistently. So everywhere it uses the variable C, it's using EBP minus 4, and then everything works awesomely. So this is what the local variables look like, and this is what the frame pointer is used for with local variables.
But now we're going to talk about functions a little bit, so we can see how we actually use this to call functions: who creates these frames, and how does this actually work? So this is a little bit of a review, right? We've talked about the semantics of functions. When you declare a function, what things are you declaring about that function? You can think of it as meta information about that function. Yeah, what it returns. Yeah, what else? Somebody else? Who said that? What about parameters? How many, their types, the order? Yeah, exactly, what else? What was that? Name. Yeah, name, right, the name of the function. Good. So we have the function name, and we have the names and types of the parameters. We're going to make a distinction in terminology: the formal parameters are the parameters in the function. The function defines the names of the parameters, X, A, B, C, whatever, and their types. So when we talk about formal parameters, we're talking about the parameters in the function. That's to separate them from the parameters we pass in when we call a function. If we just use the word parameters for both, we get confused about which one we're talking about and where. Does that make sense? So formal parameters are in the declaration there. I don't know, maybe you can think of them in tuxedos or something. They're very formal parameters.
And the return type, right, so that's pretty much it. And I guess you have the body and the code and everything, but that's kind of a given there. So do you need to know all this information to be able to call the function? You're shaking your head very vigorously. You don't need the return type. I don't know, I think I disagree, because if you're calling it and assigning the result to a variable, you probably want to make sure that those types are correct. You don't need to know the names of the formal parameters. Yeah, that's a good point. You don't actually need to know the names of the formal parameters, because especially in languages like C and C++ the order of the parameters is what matters, not their formal names. But pretty much you need all this information, and this is why you need a function to be declared before it's used in C and C++. Questions on these? So, to call a function. We refer to these as invocations here, invoking a function, which is kind of cool. It kind of sounds like magic, right? Invoking some spell that you've written somewhere else. And this is the syntax, so this is very basic, right? We're calling a function f and we're passing it x1, x2, all the way up to xk. So what are x1, x2, and xk? The input parameters. Input parameters. Actual parameters. Actual parameters. Somebody's looking ahead, that's cheating. Nobody can come up with that by themselves. So, are they just variables? No, because you can put in just raw data, like you could pass in a string if that was a static string that you always wanted to pass. Yeah, exactly. So you can put variables as parameters, you can put constants. You could put constant values in there. What else? Other functions. You could put other functions in there, as you're finding out, in some languages. What about an expression, right? What about 5 plus 10?
Right, so generally you can think of each of these as an expression. It can be a constant integer, it can be a constant string, it could be a variable, it could be A plus B minus C, divided by 10 times 132, whatever, any complicated expression you want. And as somebody just ruined the whole surprise for everyone, what we're going to call these is the actual parameters. These are the actual parameters that go to the function. So then if we want to invoke a function, the question becomes: who creates the frame for that function? And where do the parameters live? What are the parameters, right? Are they variables? I mean the formal parameters, inside the function. Would you just store their addresses so that the compiler knows where they are rather than re-creating them? Maybe. So from what we've just said, first, are they variables? No? Yeah, maybe. Maybe? Based on what? Depends on whatever I've defined. That's a good answer. I think they're variables that you have to store in a specific function frame, because we don't want to reference stack frames somewhere else because you don't want to store some operations because that goes through everything. So first thing, are they variables? Can you use them to add things and subtract things, and can you pass them to other functions? Yes. Can you assign to them? Yes. If you have a function foo that takes in a parameter x, you can say x is equal to 10 and then use that x later on, and it's going to have the value 10. So you can assign to them, and they're separate for each function invocation. So are they global variables? Global variables. Yeah, so that's the other thing. What's the scope of the variables? Of the formal parameters? Yeah. They're just scoped in that function. They're just scoped in that function.
So where do we want to store them? Not too technically, but generally. Inside. Inside? Inside where? You don't know. I would assume within the function frame, somewhere in the stack frame? Where's the function frame stored? Below the base pointer. Above the stack pointer. Too specific. So would we want to store them like global variables, in some global storage area that's always static for each function invocation? If you ever recurse in a function, wouldn't you write over it? Like with local variables, right? With local variables, if we recursively call ourselves and we just have one global static allocation, that's going to get overwritten, exactly. And it's the same thing with parameters. So where do we want to store them? On the stack, yes, as part of the function frame. But who's going to put those parameters there, on the stack? The compiler. Technically true. That's kind of true. What is it? The processor? I mean, that's also in the same vein, but yes. It's eventually going to be the processor. Is it the function itself? Which function? We have the caller, the thing that's invoking the function, and the callee, the function itself. The callee? So should the function being called allocate that space? Caller. Like I said, let me just go in one direction and we can talk about whether it's right or not. Well, the callee doesn't know what's inside of the function it's calling? So the callee is the function being called. Here that's the function f, right? So f does know how many parameters it has, and it knows the size of them. But it just knows the formal parameters: it knows their formal names and types. Every time it's called, it would make an area for them. But the other way seems odd. It almost seems like it would just call itself forever. Okay, let's go up to the first thing that you said. We'll ignore going forever because I don't want to go forever.
So, yeah, the caller, right? One way to think about it: the caller knows what the actual parameters are. The thing being called has no idea whether you passed it an expression or whatever, right? All it cares about is that it has values it can compute on. But the caller knows how many parameters there are, and it knows the types and sizes of those parameters. So, yeah, the invoking function, the caller, is actually the one creating enough space on the stack to hold the actual parameters and putting those parameters in there. So, okay, oh, I think these are a little bit out of order, but that's fine. Okay, so we looked at just the frame itself. This is the frame that stores the local variables that we just saw, right? We have the function with local variables A, B, and C. So when we're calling another function, what information do we need or want? We're invoking some function. Yeah, that's good. When we're calling a function, we want to know where to come back to. Yeah, exactly. Because we want to keep that call chain intact, right? Yeah. So that's one, definitely. I don't know if it's on this list, but yeah. Yeah, where the code for the function actually starts. Where is this code living that we're going to call? Yeah. We need to know the parameters, exactly. And we need to somehow make space for those parameters on the stack, right? What else? Yeah, the return type and value, right? How do we get that value back from the function? Oh, yeah. So we have the return value. We have the parameters. What else? What was the first thing that that main function did when we looked at it? The name? Yeah. Local variables, right? Yeah, we want to store information, so we talked about that, but yeah, that's part of it, right?
We have some local variables that we want to store on the stack as well, along with the parameters, right? How many EBP registers are there? Just one, right? So in the same way that we want to make sure we come back to where we were, this function is going to have its own base pointer. Or rather, it's going to use the EBP register to point to its frame. So we should probably save our EBP, our base pointer, so that we can get it back after that function executes, because we can't have more than one at a time. So yeah, we want to save our frame pointer. Somebody talked about the return address. That's good. Local variables. What other space might a function need to execute? I think we can talk about it a little bit. Variables that are allocated on the heap. No, not quite. Close. So what happens if we have a lot of variables? We said we have eight registers that we can use in the CPU. The stack pointer and base pointer are already used. So what happens if we need to compute on more than six, seven, or eight values? Yeah. Are you talking about the clobber list? Maybe, why don't you describe it? So basically, if you have a value inside of a register and that register is going to be used in the function, then you need to save that value so that you can put it back after you come back out of the function, so that you don't lose it. Yeah, that's actually a good point that I didn't talk about. So I'll put temporary variables on here. It will allocate memory for them. The other thing you may want to do is exactly that. We saw that main just copied a value into EAX, right? It just decided, I'm going to take this value and put it in EAX. Did it care if there was a value already in EAX? No, it didn't care, right? And so, yeah, the battery died on the mic. Hello? Oh, it did. Can you not hear me? People always say that. A little bit not? Okay.
Let's see if we can fix the mic. Oh, yeah, it's flashing fast. It's flashing fast. Probably not. No, it's fine, because they put two batteries right up here so I can just keep going. And now which way did it go? You have to leave one more battery in case I do that. How does it not say? It's got to be only one way. 50-50 chance? That one also says battery. Did I get it right or no? Ah, it doesn't close if it's in the wrong way. See, that's good design. Hello? Better? Awesome. No, it's still flashing fast. Okay, so back to the clobber list, right? So if I'm calculating some value, and as part of calculating that value I want to call a function, but I've stored my temporary variables in EAX or EBX, that function has free rein over using those registers. So I would actually want to store those values first onto the stack, call the function, and then when that function returns, pop those values back off the stack into the registers. I need to save them somehow. How exactly I do it depends. The compiler knows, because it's outputting all of these instructions, whether it's using or not using EAX, EBX. So it keeps a list of, basically, registers in use and registers not in use. Yeah, so there's a whole thing about compiler optimization and stuff that we're not going to talk about. Yeah, it's a whole thing. And then the question becomes, can you actually use fewer registers? Because spilling onto the stack is really slow, because now you're going from the registers that are right on the chip out to memory. Can you optimize the code to reduce the number of registers you use? This actually contradicts something I asked earlier: what if you send in the address of a local variable from another function and you want to operate on that specific value at that address? So if you send the ampersand of an int variable, that local variable belongs to another function's frame. How do you change that local variable from another function? You basically can't, right?
So that goes back to the semantics of local variables and how long their allocation is valid for. So they're only valid for their scope. And when local variables are outside of their scope, they're deallocated, and so whatever happens, happens. If you somehow get a pointer to them, right, it's garbage. There's no way to guarantee that that value stays in there. But you have to have a way to do it here, because what if I send in the address of an int variable a? So from one function, right, if you pass in pointers to a function, that function can dereference them, assuming that memory is still allocated. That's totally fine. The problem comes when you return the address of a variable that's on the stack from your local function. That's when it becomes garbage. Cool. Alright. You're actually referencing the memory above your function frame when you're doing that. Yeah, exactly. That's totally fine. Totally fine. Because it's the parameter to your function, right? So that means whoever's calling is giving you permission to use and edit that memory. Exactly. Cool. Alright. So the question is, what order do we do this in? So we just talked about a bunch of things we stick on the stack. So what order does this happen in? And who does what? I don't know. Is there a natural order from here? Almost. So yeah, the other question is, well, what kind of functions do you want to call, right? So in a program that you have all the source to and you're compiling, it kind of doesn't matter. You can settle on a convention, right? The compiler can just do something. But the problem is, if you compile something with Microsoft's Visual Studio into some object file, and I want to use that in my program and call your functions, I should be able to do that. But in order to do that, we want to make sure that we're speaking the same convention as to who does what, and in what order all these things are put onto the stack, and what that means.
So that's as known as a calling convention. So this is actually a really important concept to know that there, okay, it is essentially, you know, human decided. People have to decide on a convention. But then once it's decided, okay, then now as long as you know how to speak that calling convention you can call that code that function can execute and it can return. So all basically, when we talk about all that information has to be stored in the stack in a specific order. So the other thing, all that information has got to be stored. So who's storing that information? Yeah, so kind of there are some things that naturally fall into one of the other, right? Like the function itself, the callee should allocate space for its local variables, right? Because to invoke another function, all you need to know is the name the number of parameters and the type of each and the return type, right? Do you need to know how many local variables it uses? No, because otherwise that's crazy, right? You have to pass all this information and it really doesn't matter to you from the outside calling this function, right? So this is where you kind of split responsibility and say, okay, the callee function the function that gets called should allocate that space that it needs. So then it becomes a question of, okay who stores the, then we have this base pointer, right? So who stores my base pointer, the caller's base pointer? Is there a natural way to put that? Like you said, the caller does it. What makes sense one way or the other? The caller, why, who said that? Yeah, why the caller? Yeah, so that's a pretty good argument, right? It's like it's my base pointer so if this other function messes it up it's going to affect me, so maybe I don't want to do that. What about, is there an argument for the other way? Maybe the callee should save the base pointer? Yeah. Yeah, so similar to that, what if the callee doesn't need to use the base pointer and doesn't touch it? 
What if it has no local variables? Right, so then maybe it's wasted work if every time you call my function you're saving the base pointer, but I never need that base pointer, so I don't care about it. So the point is these are just things that need to be decided, right? So you have to have a convention so that we know exactly whose responsibility is what. And so yeah, we need to decide who stores what onto the stack. And we talked about how it can definitely vary based on the compiler. It really varies, it's pretty crazy. So it can vary based on the processor, so, like, x86 I think has a different calling convention than 64-bit, which has a different calling convention than ARM, which probably has a different calling convention than MIPS. I don't know, but I'm pretty sure that's probably the case. So the processor definitely impacts that, and the processor really informs the assembly language. Then the operating system: system calls, like Linux system calls, actually have a different calling convention, called the syscall calling convention, as opposed to calling other C functions. So that makes things more confusing. The compiler, the compiler can do it differently. And the operating system, definitely. Like, Microsoft has its own calling convention, so all Windows programs basically use a different calling convention than Linux programs. And even the type of call, that's right. So you have a program running on Linux: it can make fast system calls to the operating system, which have one set of calling conventions, versus the system call convention, versus the standard Linux x86 calling convention. So you just need to specify. And actually it's kind of interesting, I believe in C++ you can specify, when you extern a function, what calling convention it requires, so you can actually use code from maybe a Windows machine, or any function, assuming the calling convention is correct.
Questions on this? Okay, so we're going to look at the x86 Linux calling convention. Specifically, it's called cdecl. So this is the standard, so this is everything that's got to happen, in this order, on the stack. So the caller first pushes the arguments to the function onto the stack in right-to-left order. So think about the function call, right? The rightmost parameter is pushed first, and then the second-to-rightmost parameter is pushed, and the third-rightmost parameter is pushed, up to the leftmost parameter. So then which one, rightmost or leftmost, is higher or lower on the stack? Rightmost is lower on the stack? No. Well, okay, that's not a bad guess, but it's the other way, right? So you've got to think about pushing. We've kind of envisioned the stack like this, with the top at the top, so when we push things on the stack, the stack's growing towards the bottom. So yeah, we're going to take the rightmost parameter and push it on the stack, take the parameter after that, push it on the stack, then the parameter after that. So why might we want to do it like that? Yeah. So you're saying it's wrong to push the rightmost value to the lowest position, but it shouldn't matter, because the stack pointer isn't even moved down all the way there, so it's the farthest from reaching. So then the question is how does the callee know how to access those parameters, right? To access a local variable, we saw it takes the base pointer and it subtracts. What we're going to see is, because this is specified and it has to be in this order, it knows that to access a certain parameter it takes the base pointer and it goes up the stack to access it, and it knows the exact offset because it knows everybody that calls it is going to do it this exact way. So it's got to be the same so that they both know, basically. So this is what we mean by pushing onto the stack right to left. So we take the rightmost, push it, it's here; take the next one, push it; take the next one, push it; take the next
one, push it. So the stack's growing down, and the rightmost one is at the top. So why might you want to, well, think about some crazy C functions that you've used. Why might you want to do it like this as opposed to the other way? If you've used the printf function, please, everyone, raise your hand. Okay, thank you. How many parameters does printf take? As many as you need it to. It takes an infinite amount of, well, not infinite, it takes a finite number, but as many parameters as you want. We're not going to get into exactly how it does that. But if you think about the stack of parameters, I have the leftmost parameter as the first one on the stack, and the next parameter next, next, next. So I'll always be able to identify the first parameter no matter how many parameters you give me. So I know that first parameter is going to be a string, and then essentially what printf does is it parses that string and walks up the stack, and how far it goes depends on how many percent signs you have in your format string. But if you had it the other way, starting at the rightmost value, how does it get to the string? You'd have to walk through everything else, and there's nothing on here that tells you the number of arguments. You'd have to get the number, know how many arguments there were, walk the stack that many, get the string, and then walk back down. It's a little bit more natural to go the other way. This allows you to write functions that take a variable number of arguments in C. But it's not impossible to do it the other way, you'd just have to change things. OK, what else does the caller know that the callee does not know? The return type? The callee knows that. Somebody mentioned it earlier, we were just talking about it in this context. We have the arguments, what else? The base pointer? It's not the base pointer. It's the arguments plus one more thing. What was it?
The return address, the next instruction to be executed in that function. So it's going to then push the address of the next instruction to be executed in the calling function. OK, so that's the caller's responsibility. So actually this is all the caller needs to do, which is kind of nice, it simplifies things. Now it's the callee's responsibility to push the previous frame pointer onto the stack, right? And so that's its job. And then, as we saw in the other example, it's going to create space on the stack for the local variables that it needs. Then there's another key point which we didn't mention. So OK, we know from this that the callee is saving the previous frame pointer. It should put it back by the time it returns. So we need that: when a function we call returns, our frame pointer is the same as when we left. What else do we really want to make sure is the same from when we call a function to when that function returns? Yeah, so the callee is going to put the base pointer back, which is the frame pointer, but what about the other one? The stack pointer points to the bottom of the stack. So yeah, say we call a function and it moves the stack down, and it didn't free up and properly deallocate all the memory that it used. Now our stack is all messed up and the whole program is going to crash. So it's up to the callee to ensure that the stack pointer is at the same place when it returns as when it was called. And then, just like I mentioned, we talked about the return value. So one way to do it would be to basically have the return value as one of the arguments on the stack, right?
that the caller could allocate a space on the stack for the return value the callee would know to copy the return value into there but this calling convention ensures that the return value is always in the EAX register so that's just by convention so the caller knows after it calls a function it can grab whatever is in EAX and that's the return value so if there's more than one return value so can there be more than one return value? Yeah, 5 We're not talking about 5, we're talking about C because so that's the other thing so it's pretty clear so C as a language was designed to be very close to the assembly code C was actually started as basically portable assembly code with some abstractions but that's why you have to worry about allocating pointer, allocating memory pointer to the allocation, all this stuff so in C can you return what was that? Structure, yeah, so what's a structure? So structure is one variable in memory what does it look like? Yeah, it just looks like a contiguous chunk of memory with each variable so actually that's a good point I actually don't know 100% what happened today I think the compiler will know that it's that the value can't be stored in EAX so it would do it some other way, I don't know the other way and you'll probably put it on the stack, I guess Yeah, it's a good point so what if the value is bigger than what fits in EAX so that's kind of the downside of using this register because it works great in 90% of the cases but those weird side cases you have to somehow deal with Cool Questions on this? So I don't think we'll get through this but hopefully we'll get it Okay, so now we're going to look at how exactly this works and how the stack works in this case. 
So here we have our main function. We have a variable a. Main is going to call some function callee and it's going to pass the parameters 10 and 4, sorry, 10 and 40, and then we're going to set a to be the return of that function, and then we're going to return a. So let's say we're not using C, let's say we're doing some type inference. What would you be able to infer about the type of callee? What is it? Same as the return type of main. Correct, which is int, exactly. What else do you know about callee? What was that? Yeah. It's a what? No. What type is it? Is it a return type? No. It's a function. And then how many parameters does that function have? What are the types of the parameters? int. And what is the return? int. Done. Yeah, we can tell that just by looking at this. So we also could say a is an integer, and we know that, because it's assigning the result of callee to a, that has to be an integer. But then we also know that main is returning the return of callee, so we know that that has to be an integer as well. What does this function look like? Super simple function. It takes a and b and adds them together, then adds one to that and returns it. So pretty simple code here. This is very simple C code, this is much simpler than any project or anything you've done so far. Now we're going to look at what the assembly code looks like. So, in main. Remember, main is just another function, so it has to act at the start as if it was called by something. So what's the first thing that we said the callee does? Pushes the previous frame pointer on the stack. Yeah, saves the previous frame pointer. So the callee's responsibility is to save the frame pointer. So the first thing it does is push EBP onto the stack, which is the base pointer, and then, as we saw, it's moving the stack pointer into the base pointer. So this is what we saw from before, where it's setting the base pointer to be where the current stack pointer is, because we know we've saved EBP, so
everything's good with EBP. Then we create space on the stack, ESP minus 18 hex. Maybe we can talk about why it's doing that. So doesn't it need whatever 18 is in hex on the stack for the local variables here? I think that's why it's doing that. That's what? That's why. It looks like it works, and we know that it works here. Yes, I hope so, because if we found a bug in GCC that would be a very big problem, especially on a super simple example like this. But just looking at this, we know there's how many local variables in there? Just one. What's the size of an int, how many bytes? Four bytes, right? But here, what's 18 hex? What's 16 plus 8? 24? We're moving 24 bytes. The question is why. Well, maybe we'll be able to tell by the next one. There's five local variables and they account for one extra space. Are there five local variables here? Well yeah, because there's a, there's callee's a and b. And then, you don't know about a and b yet. Right, a and b don't exist yet, right? Those are the formal parameters. So good. Let's look at the next instruction and see if we can come back to this question. So here we're moving the constant value 28 hex into ESP plus 4. So we just moved the stack down 18 hex, so we're allocating some space on the stack, right? And when we go up from ESP, is that going into garbage or going into good memory? Going into memory that's valid. Yeah, going into good memory that's valid. So what's 28 hex in decimal? Yes, so it should be 40, right? And then we're going to move 10, hex A, into where the stack pointer currently is. Right, so what did we just do, in essence? Where did we put 10 and 40? At the end of the memory we allocated. In what order? Right to left, 40 then 10. So then what's the next instruction going to be?
So the stack pointer now is pointing right at the value 10, and above that is the value 40 on the stack. So essentially these two statements pushed onto the stack 40 and then 10. And then we're going to call callee. And so the call instruction basically does two things at once. It pushes the address of whatever the next instruction to be executed is onto the stack, so that's where it stores the next instruction pointer, and then it jumps to callee and starts executing callee. So at this point the stack is going to hold the return instruction address, the address we want to return to, this next line of code here, and then 10 and 40. So then I'll ask again, why do we subtract 18 hex here? So the address of the next instruction we didn't actually push ourselves, that's done by the call instruction. But in essence the compiler was smart in this case and said, okay, I know I need 4 bytes for a, and I know the first thing I'm going to do is call a function, and that function has two ints, so I know 16 bytes there, and I guess it gave itself another 4, which I'm not 100 percent, or another 4, yeah, another 4. I don't exactly know why, I'll have to look into exactly why it's doing that. But the point is it kind of optimizes this and says, okay, I'm going to allocate this, and then I know the current stack pointer is essentially what I'm going to pass to callee, and I'm going to put 40 and 10 on the stack there as parameters. So basically, they could have just subtracted enough space for a at first, and then this could have been push 28 onto the stack and this could have been push 10 onto the stack, which would have moved the stack, and then we could have called callee. But it decided to kind of do this all at once. Maybe it's faster somehow. So remember, plus goes up into the good area, because high is at the top and low is at the bottom. Plus goes up into the good area, subtracting goes down into the bad area,
but we subtract from the base pointer because we know it's already in a good area. All right, and then we have what we're going to do afterwards, right? So what's in EAX after this call instruction? The value of a? Almost, not quite yet. It's the return of callee. Exactly, yeah. So this instruction is taking the return of callee, which is in EAX, and moving it into EBP minus 4, which we haven't seen yet. But what do we think EBP minus 4 is? It's got to be a, right? Then it's going to move EBP minus 4 into EAX. So why is it doing this? Close. Yeah, the return of main, right? So main has to put whatever it's going to return in EAX for whoever called it. Then it executes the leave instruction. So the leave instruction is also kind of confusing, it does two things at once. It sets the stack pointer to the base pointer and then pops what's currently on top of the stack into the base pointer. So up here we have push EBP, right, so that saves EBP on the stack, and then here we moved the stack pointer into the base pointer, and leave basically does the reverse. We'll see exactly how it works, that's what it does for now. And then ret, we're going to return to whoever called us. So which lines of this code actually belong to main, that is, code that we basically wrote? Did we write this? Did we write this? Did we write this? Did we write these? Yeah, we wanted to call a function and we put 10 and 40 down there, right? So we basically wrote these. What about these ones? More or less, yeah, returning a. We didn't really write this. So this is what's known as a function prolog. So these three instructions in particular should be in almost every function that uses the cdecl calling convention, right? Because this is the callee's responsibility: to save EBP, create a new base pointer, and allocate space for local variables. So this is the prolog. What would this one probably be called? Yes, the epilogue, perfect. So yeah, the function
epilogue is the end part of a function, which is in every function, that cleans up and gets things ready for the function that called it. Okay, so then callee. We'll go over this briefly because we'll come back to this on Monday. So it's going to first push EBP, right, because that's what it has to do. It's going to set the new base pointer. It's going to move one of its parameters, EBP plus C, into EAX. It's going to move EBP plus 8 into EDX. It's going to add EDX plus EAX and put the result in EAX, add 1 to EAX, and then it's going to pop EBP off and return. This too has a prolog and an epilogue. But very quickly, why doesn't callee's prolog have a subtraction from the stack pointer? Yeah. We already did it in main? But that's main's frame. Each frame is unique to each invocation of each function. No, they're completely separate. Yeah. Yeah, there are no local variables, right? So we don't have any local variables here in callee, just the parameters. I don't need to change the stack pointer to make space for my local variables. So cool. Alright, let's stop here.