All right, good morning, everyone. What is it, Monday? The 4th? Yes. It's Tuesday. Tuesday? Oh, good. Then your project's already due, and so nobody's going to have any questions on it today. Yeah, you can see we have a reduced crowd. Who likes jobs? Having jobs, having internships, that kind of stuff. There'll be a representative, I can't remember his exact position, from US Foods. They're like the 10th largest private company in the US. He's going to come and speak for 15 minutes on Wednesday morning. He's going to talk to you about the cool tech development positions they have, and that they're actively looking for interns. So show up on Wednesday. Be proud, and then we'll continue with our stuff. Questions on that? You know we're in the final stretch. We only have a month left. Why are you reminding us? There's not enough time. Not enough time for what? For what? To pass my exams. To pass your classes? That started at the beginning of the semester. You had a lot of time. That's awesome. There's going to be one more project. One more project, one more midterm. The project will be released later this morning. Are there any others? Christmas is also one month away. Yes, there'll probably be two more, I'll tell you. That way we can make sure we cover all the stuff. We can do one on type systems, we'll do one on lambda calculus. When you say midterm, do you mean final or midterm? Midterm. Three midterms, no final. Why do they call them midterms? Because it's in — yes, not midway through the term, but in the middle of the term. I don't know, man. That's just what they call it. I didn't come up with it. What else would we call them? Exactly. That didn't scare you enough. Yeah, exactly. You'll be terrified into studying. I'm pretty sure people have studied that. You could have like one quiz, one test, and one exam. Or you could just have one final. Would you like that? One final for your entire grade? No.
Like in other areas? I'd take a class like that; it's kind of terrifying. They do that in law school, right? They have one final test, and 100% of your grade is that one test. We could do that. How many of our professors do that? Why don't they? No, some of them do. Well, it's not fun. It has pros and cons. Okay, so we've learned about type systems. We learned about all the ways we can specify types in a type system. We've talked about how we can actually automatically infer the types in a program, which is what you're doing for project four, and hopefully have mostly already done, instead of spending from 10 AM to midnight to finish. Now we need to talk about: how does the compiler actually make these constructs work? How does it allow us to use local variables and function calls and all these sorts of things? So this is really what we're going to be focusing on. So what's the difference between a location and a name, based on what we talked about with box-circle diagrams? Location is the box, and name is the thing with the arrow to the box. Yes, so location is the box. So what would that represent on the actual machine? It's bound to a memory address. Memory, yeah. So that box — we've been talking about it abstractly, but that box is a memory address on the machine. It has an address because it's addressable in the memory of the machine. So what is the name? A reference to that memory address? Kind of, yeah. Is it a constant reference? No, because you can change it. Who's that name for? Does the machine use that name? No — the machine only ever talks about the memory address. It's for the programmer. Right, so it's actually a construct created for the programmer. When you declare some variable foo, the computer only cares about memory and the values inside that memory. It only cares what's at memory address 10,000.
They don't care that memory address 10,000 is actually variable foo in your program, because the computer only cares about its view of memory. And so this is what we're going to think about. We've been looking at how we can map names to locations, right? You can see that this name foo is bound to some location, but how does the compiler actually do this? If you were writing a compiler on a desert island, how would you map names to memory locations? Similar to an enum. In what sense? An enum is a label given to an integer; the machine cares about the integer rather than the label. Yeah, that's kind of the idea, right? So it's going to give them essentially symbolic names. So this is what we're going to look at in this process. For this whole section, we're going to assume static scoping, because dynamic scoping changes significantly how we'd do this. I've debated about how to teach this section. One way is to talk abstractly about function frames and variables stored in functions. I don't know about you, but while I can do the abstract stuff, I'm much more interested in the details of how real systems work. So as an example of how this is done, we're going to look at how GCC compiles your programs to 32-bit x86 instructions and how that maps to the concepts we're talking about here. We're going to look at exactly what the resulting x86 code looks like and how it does this mapping of names to memory addresses, for all the types of variables we can have in our program. We've spent a few weeks talking about types, but more abstractly: what kinds of variables could you have in your program? Numbers. Numbers — even more abstract: values. Pointers. Pointers.
Strings. Where do they live? How do we know about the scoping rules? Let's try to frame it with scoping. They're defined in the language. They're defined in the language, yes. What are the different types of scoping rules, and how does that affect things? We're focusing only on static scoping rules, right? So can you access any variable name that's declared anywhere? No. No? So what are some of the ways that you can declare variables? Globally. Globally, so you can have global variables. What does it mean for a variable to be global? That it's accessible anywhere. So how many copies of that variable should there ever be at one point? One, there's going to be one copy of that variable. What are some other ways you can declare variables? On the stack — what's another word for that? Yeah, local, the opposite of global. So we can use local scoping to say, okay, this is a local variable that's only available in this function. And we've seen that it's actually allocated on the stack. So let's look at these things. For global variables: where can the compiler put global variables? You're a compiler writer. Where in memory are you going to put the global variables? Where? The static space. The static space — what does that mean, static space? Unmovable. Or unchangeable. But it's not — you can change global variables. Oh, it's the memory location that's static. It's in with the rest of the code. What was that? It's in with the rest of the code. What does that mean? It's in the same memory region. In the same memory region, yeah — kind of like a separate space alongside the code. Right, okay. So it's going to have a specific layout.
Right, so global variables are one thing; we can think about where to put those. But what choice does the compiler have about where to put variables in general? It can put variables in memory, but where else can the compiler put variables? Registers? Yeah, right? Every CPU has some registers — well, most CPUs; register-based CPUs have registers so that they can perform computations. So maybe the compiler's smart enough to tell, hey, I actually never need to store this variable in memory; I can just leave it in a register. Where else? Are those the only choices? What about on disk? Yeah, right? Could you write a compiler to store variables on disk? And some of you are in operating systems — we just talked about swap, swap files. What happens there? The compiler is storing variables in memory, and the operating system, if it decides you're not going to use this memory, will actually put that memory onto disk, right? And your program has no knowledge that that has happened. But there's nothing to say that variables have to be in memory. The compiler could compile your program such that your global variables live on disk. What about in the cloud? Could you imagine a crazy compiler where variables are all over the place, distributed? Maybe it uses a blockchain, huh? That sounds like it would be a mess. Yeah, lots of things are a mess. Is computing not a mess right now? So it seems. That mess seems a lot less messy than the cloud. A lot less messy — let's sort, partially sort, our messes. I mean, yeah, I guess we could do that. It could be, right? Have a variable that lives in a Dropbox file or something, so it gets passed between different computers.
Run a distributed-system compiler, like if you're compiling terabytes of data. Yeah, yeah. So, outside the scope of what we're talking about here: when companies do this, do they actually store variables in the cloud, or do they just create a local copy and talk back and forth? That's actually a tricky question to answer. Think about Bitcoin, right? Bitcoin is essentially a distributed database — you can think of it as an append-only database that's distributed across all the Bitcoin participants. So the global, quote-unquote, variables of who has which bitcoins are actually stored as transactions in this blockchain. They're also probably stored in memory, because when we do computations we don't want to go to the cloud every time to get some values. But there are a lot of cases where, yeah, you want the data to be stored remotely. So you can imagine writing a programming language that does this automatically — distributed cloud storage — so that in the programmer's API, they only have to worry about global variables. They don't have to worry about all that low-level syncing and cloud storage and all the problems of distributed systems, like how do you know if you have the right values, all that kind of stuff. I just want you to think about different ways this could be done. The fact that we currently do it by putting variables in memory doesn't mean it has to be done this way. Somebody mentioned compiling large amounts of source code. If I remember correctly, basically all Google employees have access to almost all of the company's source code, and you can check the whole thing out. They use kind of a Perforce clone, and the repository is huge.
So I think when you want to compile something, it compiles on other servers — a distributed cloud compilation — and gives you the results back on your local machine, but it looks like you're accessing it from the file system. It also does crazy tricks like caching the files you actually use locally on your disk and putting the other ones in their cloud system. So you can do a lot of cool stuff with this. So when we think about global variables: what are the constraints on them? From the program's perspective, what does it mean for a value to be global, and what constraints does that place on what the compiler can do? For instance, who can access global variables? Anything? Could my program access your program's global variables? Anything in that program. Anything in that program? What do you define as a program? We're getting a little soft here. So how do you define a program? Can you compile something that's not a program? Can you compile libraries that other programs can include and use? Can you have global variables inside those libraries? Sure. So who can access the global variables inside those libraries? Yeah, any other programs that are compiled and linked against those libraries, right? So we have to have some way of knowing — the other programs have to know what the global variables are, so that when our compiler is compiling these two files, it knows how to actually get at that global variable. Are there constraints or restrictions on global variables in something like C? Public, private, protected. Right, those are constraints you can put on class instance variables in a lot of class-based languages.
Those are usually not global, although — I mean, if you think about it, are classes global? In Java you can have package-private classes; I think there's some way to limit a class to its package, I'm not sure. So what about global variables? In C, I believe if you declare a global variable with the static keyword, it's only accessible within that file. So it's global in the sense that anywhere in that file can access it, but it actually gives you some kind of encapsulation: other files can't mess with this variable. Okay, so how is this actually done? How do we find out how our compiler works? What are some ways? Say you want to learn how GCC works, or Clang, or something. What? Magic? There's no magic. It's all confusing. You can look at the documentation. That's definitely one approach. Look at Stack Overflow. That's good. What else? You guys live in the beautiful age of open source code. Right, you can look at Stack Overflow examples, so you can have somebody else's crappy interpretation of what it does. Read the source. You can read the source code, right? That's the beauty of open source tools and open source compilers. You can read the source. If you ever need to figure out how something in Linux works, you can download the entire source. It's going to take you a while to figure out what's going on, but you have that ability. So you can do all those. What else? Is there anything else? What was that? Reverse engineer. Reverse engineer — in what sense? What does that mean? Take the machine code, the raw executable format, open it up with something so you can see the assembly code, go through that, and figure out exactly what's going on.
Yeah, so we've basically established all the different ways you can try to understand a system. You can read the documentation, you can read the source code, you can read what other people have written about that system, and you can try to reverse engineer it yourself based on the output: you give some input to the system, you see what comes out, and then you go back and try to answer, okay, why did it do this? What happens if I change things? So that's what we're going to do here. We're going to try to understand how GCC works and where it puts global variables. We have a super simple example. We have global variables a, b, and c — two ints and a float. We have our main method. We set a to 10, b to 100, c to 10.45, then set a equal to a plus b, and return zero. So for global variables — those other crazy ways of thinking about variables notwithstanding — normally you want to put them somewhere in your program's memory. How does the compiler decide where to put these global variables? ELF files — ooh, that's really good, actually. Yeah, when the program's compiled, the ELF file — I forget exactly what ELF stands for, executable and linking format or something like that — has to have some way of specifying where these variables are. But beyond just specifying where they are: should the addresses of the boxes for a, b, and c change every time this program is run? I don't understand — they need to? They can? Let's think about it in two ways.
So not only do we need to define a location for these variables in memory — what happens when the compiler wants to say a equals 10? When it spits out x86 code to do that, does it tell the CPU, hey, set a to 10? No, it says whatever the memory location of a is. Exactly, so it needs to know, right? These instructions need to know exactly where in memory to put this value 10. Where does variable a live? What is, essentially, the address of variable a? And the same for b and c, and for doing the addition. So what the compiler basically does, while it's compiling this, is just decide on memory locations for each of those global variables. I'll do it symbolically here, but we'll see real addresses that it actually picks. So it says: variable a is at memory location capital A, b is at some location B, c is at some location C. And when it's compiling these instructions in main, it generates assembly code that's equivalent to: in memory, look up address A and set the value inside it to 10. Look up address B, set it to 100. Address C, set it to 10.45. Then set memory A equal to memory A plus memory B. And once you have addresses, compiling this code becomes fairly easy, right? All you have to do is use whatever x86 instructions there are for copying memory and doing addition. So, for instance, one time you compile this it could say: a is at 0x8049634, b is at 0x8049638, c is at 0x804963c. And then it compiles all of these pseudo-instructions into real x86 instructions.
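Putting those pseudo-instructions next to real output: the 32-bit x86 walked through below looks approximately like this (AT&T syntax, reconstructed from the lecture's reading of the objdump output; exact addresses and instruction choices vary by GCC version and build):

```asm
# main, as disassembled — dollar signs mark constants, bare numbers are addresses
movl   $0xa,0x8049634          # a = 10          (0x8049634 is a's box)
movl   $0x64,0x8049638         # b = 100         (0x8049638 is b's box)
movl   $0x41273333,%eax        # bit pattern of the 32-bit float 10.45
movl   %eax,0x804963c          # c = 10.45       (0x804963c is c's box)
movl   0x8049634,%edx          # load a into EDX
movl   0x8049638,%eax          # load b into EAX
lea    (%edx,%eax,1),%eax      # EAX = EDX + EAX*1, i.e. a + b
movl   %eax,0x8049634          # store the sum back into a's box
movl   $0x0,%eax               # return value 0
```

Note that the variable names are gone entirely; only the addresses the compiler chose remain.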
So the way to read this is: move hex A — which is a constant, hence the dollar sign — into this memory location, 0x8049634. So what's hex A? 10. So just from the usage here, what do we know this is the address of? The memory location of a, right? Because we can easily map this to the C code. We know the C code moved 10 into variable a, and we can see here the constant value 10 being copied into some memory location. So just from looking at this code, we know that's where a is. Similarly, we can see it move 0x64 into this other address, which we'll call 38, and move 0x41273333 into EAX. What is this? That's the float value. The float value of what? 10.45. So how big are floats in C? 32 bits. 32 bits, right? So this moves that value into register EAX and then moves EAX into that memory location. So one question, since we're looking at this from a black-box perspective: why did it do this two-step version instead of a direct move? Let's think about it first. Does the CPU know that this is an int and this is a float? What's hex 64? It's 100. I can't do those in my head. Most operating systems have calculators, right? The macOS calculator actually has a programmer mode where you can easily switch between base 16 and base 10. Good, yes. It's awesome — I saw a student, I think in my grad class, looking up a hex calculator online. You know your OS does that, right? All right. Okay, so back to the question: is this a valid integer? Actually, what is an invalid integer? Or int — let's just say int. I don't think there is one, right? Any 32-bit pattern will be a valid int. Depending on whether it's signed or unsigned, it'll have different values. Yes — so how do you interpret that integer, right?
Signed versus unsigned may vary, but fundamentally any 32-bit pattern can be an int, right? Yeah — unless it's not 32 bits; that's the trick. Exactly, once it's bigger than an int, you need something else, like a long. So I don't know exactly what this value is in decimal, but if you wrote a = 0x41273333 — what is that, a billion-something, I'm not even going to read it out — if a were this value, then we would see in the code: hey, move this hex value into whatever that memory location is. We'd see the exact same sequence of operations. So to the CPU — or, I guess we should be clear, we're talking specifically about the x86 CPU — could there be CPUs that know about floats versus ints? Yes. Is x86 one of those? No, and I think most are not. So then why does it move the value into a register and then move that register into the memory location? Yeah? The compiler knows it's a float, so it moves it into a floating-point register so the CPU can handle it, and then moves it. Actually, that's a great point — not quite in this instance, but yes, and it's the important thing to know. The generated x86 code does not know about the types, but the compiler knows about the types, right? So if we were going to do addition on this floating-point number, we'd actually see it move this value into floating-point registers and use those to do the addition, because the compiler knows the type. But EAX is a normal register — not a floating-point register, just a 32-bit register. Still, a great point: there are special floating-point operations on the CPU that the compiler will take advantage of. So then why the two steps? Any theories, hypotheses, guesses, hunches, anything?
Probably because the instruction has to be a certain length. You can't have that long a number directly moved into a memory location, so you move it into a register first. Yeah — actually, I don't know 100%. I think you still should be able to do it. What assembly language have you studied? MIPS — and MIPS, I believe, is a fixed-length instruction set. x86 is not fixed-length; it's variable-width. The real answer is: because the compiler decided to do it this way, and it's semantically equivalent to doing it the other way. To us, the programmers, it doesn't matter if the compiler uses one assembly instruction or 100 to do whatever we want it to do — as long as by the time this thing completes, 10 is in a, or here, after this completes, 10.45 is in c. That's the only thing we care about from the programmer's perspective. So this is likely an optimization: I think you could move the value directly, but the resulting machine code would probably be longer than these two instructions. GCC has its own optimizations, and if you compile this with different versions of GCC, you're likely to get different x86 code. Yes, all kinds of stuff. That's part of the problem with reading too much into this reverse-engineering part; that's why we want a high-level view of what's going on. So the next thing that happens: it moves the value at the address ending in 634 into EDX — and what was at 634? a. Then it moves the value at 638 into EAX — and 638 is b. So what is it setting up right now? The addition. The addition, right? Then — okay, this one's a little bit crazy, although it actually makes sense once you understand what it's doing. It is doing an addition here. Load effective address is a way to easily do address computation.
I'm not going to spend a ton of time on this, but it basically computes base plus index times scale: EAX times 1 plus EDX, and moves that into EAX. It's meant for loops: think of the base as the starting address of your buffer, the scale as the size of each element, and the index as your position in the buffer. It calculates the offset into that buffer by doing simple arithmetic. So is there no straight-up add? Oh, there is. Why would it choose this? I don't know. Wouldn't add be faster? It might be. That's why we trust compiler writers to write compilers. Also, this is the particular version of GCC we're using; I think newer versions will actually change this to an add. Why isn't it putting the value into the address that was assigned to a? It is — remember, so far it's only put the sum in this register; it hasn't committed it to memory yet. We have to look at the next instruction: it moves this result into the address ending in 34, and 34 is a, exactly. Yeah, that's one of the tricky things, decoding this. But I should probably say now: you don't have to memorize how all this stuff works. We're using this to understand how real systems actually implement these techniques. Is there a way you can actually see the assembly being generated from your own C code? Yes. So let's actually look. I have it in the notes, but I'll go over it right now, all teed up, ready to go. Let's see. Live coding in front of all of you is always so awesome. It's like three lines, or like five lines. All right, yeah, we'll do it. Okay, let's see. I believe I'm in the wrong class. What class? Who are you people? What am I doing here? Do you want an exam, so I have to go? I don't think so. Ha, ha. Okay. CSE 545, if you've forgotten. I don't know how that helps you. I am right now.
Okay, here we have a, b, c, and main, return zero. So we're going to compile this: gcc with -m32, which forces a 32-bit compilation — this is a 64-bit operating system — on global_example.c. It compiles it and writes a.out. Then I can use objdump -D, which disassembles all sections of this a.out binary, and I'll pipe that through less so I can tab through it. So we can see — this terminal is way too big; the font size here kills me. Okay, close enough. All right, you can see there's a lot of junk here. This is all that ELF stuff. Part of it is ELF stuff, and there's some libc stuff in here. We can search for main, and this is our actual function. What's really cool: on the left it shows you the memory locations where the instructions, the code of your program, live when your program executes. In the middle column are the actual bytes of each of those instructions. And on the right are the x86 interpretations of those bytes. Assembly is just an easy one-to-one mapping between mnemonics and bytes, so it's very easy to go back and forth. It's a lot more difficult to go from this back to the C code, right? We can see here that we move 0xa into — oh, it did pick the same one — 0x8049634. Move 0x64 into 0x8049638. Move that guy, our floating-point guy, into EAX, and EAX into that location. Move this back into EDX, move this into EAX, the load effective address, then move EAX back into the address ending in 34. Move zero into EAX, the pop, the return, and we're done. You can do this with your own programs. It's really instructive to look at these and see, okay, how does the compiler actually do for loops and all the kinds of loops you can use? I learn a lot every time I look at these things. Oh, and then we can do — let's see — readelf, so we can look at a.out's sections.
Let's see, there's some way to show all of the sections. Yeah, dash S: readelf -S. And this will show — this font is really bad, let me fix it. Normal size, there we go. All right, can y'all read this? No? It's fine. Basically, this ELF file is just a bunch of bytes on disk, and this table tells the operating system how to turn those bytes on disk into an actual program. It says, for instance, that .text — this section — is all of the code we just wrote, our x86 instructions. If I go to the top here, it actually tells us: this offset is the offset in the file, 0x2e0 — so starting at byte 0x2e0 in the file, take 0x18c bytes, that length, and put all those bytes in memory at 0x80482e0. So the OS knows exactly how to map that into memory. Does anyone remember offhand what those variable addresses were? 0x…9634 or so — I think those will be in here, in the .bss section. Yeah, so this is where our variables are going to live; this is how it tells the OS that there will be memory at this location and it will be properly allocated. Cool. Is there anything else interesting here? There's a whole bunch of stuff that happens for dynamic linking and all that, but we won't go into it right now. So global variables are actually pretty easy, right? The compiler just decides and says, okay, this memory location holds this variable. Now, what constraints do we have on local variables? What does it mean to be local? It exists in its scope; you need to scope it. And, believe it or not, they're usually placed on the stack. But why placed on the stack? Is that just a conceptual thing, or is it an implementation detail — an actual thing that the compiler implements? Right, yeah. And they're usually temporary.
They're usually temporary — what does that mean? We know the scoping rules, so we know that after that scope exits, the variable should go away. Now, here's an incredibly tricky question: does a variable always have to be written to memory? To be very safe, yes, because — think about it — you can memory-map processes to each other so that they share memory. Then you want to make sure that every write to that variable is reflected in the other process, so you need to ensure that there's no way for the compiler to optimize that write to memory away because it thinks, oh, you're not using that. So in C there's actually the volatile keyword, and it means you always have to access that memory — you can't keep that variable in registers or anything like that. There's a whole bunch of stuff about memory safety and what kinds of optimizations the compiler can do. Okay, what other things about local variables? Can other functions access our local variables? Functions have scopes, and so do variables — I don't know if they're exactly the same. Yeah. So like, if you have an integer local variable and then you call another function and pass that function the address of that integer, it can access that local variable. Okay, say that again. If you have a local variable, just an integer, and you pass another function the address of that local variable, that function can then access that variable. Yeah, right? So by default, no, as programmers we can't just straight-up access some other function's local variables — not even the function that called us. We shouldn't be able to, right? They have to be passed as parameters to our function. So, think about it: you're writing a compiler, and we want to think broadly about the design again. Where could the compiler place local variables?
Could place them on a stack? Registers. Registers? Yeah, it could place them in registers, right? Registers are a lot faster than memory. But what about global memory, right? Why not just use the same technique we saw and, for every local variable, give it a slot in global memory? Could it then be accessed by the rest of the program? Could it then be accessed? Well, the compiler's doing the — I mean, technically, yes: if you're accessing memory directly, you're kind of violating the semantics of C, right? But under the hood, you could then get at it. Right, the compiler could still, in terms of safety — right, the compiler could still enforce security mechanisms that say, hey, only within the scope can you actually access this variable, but implementation-wise still place those variables in global memory. So why would we do this? What's the problem with it? Seems a lot simpler, right? Why use a stack? Why not use global chunks of memory? Yeah? Well, if it's a big program — if it's a big program, you use a ton of memory. Yeah, that's a good point, right? You've got functions that never get called, but you'd still have memory allocated for those functions and their local variables, yeah. And if you put your local variables in global memory, after the function is done executing you'd have to clean it out, right? Or else it keeps wasting memory. Maybe, yeah — but if we're allocating it beforehand... Actually, with both those points, that could be a good argument for doing it like this in an embedded-systems environment, or a real-time operating system, where you have hard guarantees on resource usage and all that kind of stuff, right? Here, at least, it's bounded, and you know it's only ever gonna use this much memory. Yeah.
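C actually lets you see both storage strategies side by side: a `static` local lives in one fixed slot for the whole program, which is essentially the "global memory" scheme being discussed, while a plain local gets a fresh stack slot every call. A small sketch (function names made up):

```c
#include <assert.h>

/* One fixed allocation for the whole program, like the
 * "global memory for locals" scheme. */
static int counter_static(void) {
    static int c = 0;  /* single slot; state persists across calls */
    return ++c;
}

/* A fresh stack slot on every invocation. */
static int counter_auto(void) {
    int c = 0;         /* re-created and re-initialized each call */
    return ++c;        /* always returns 1 */
}
```

The `static` version is bounded and simple, but there is exactly one copy of `c` — which is the crux of the recursion problem discussed below.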
But let's say the compiler enforces the access, right? So this is just an implementation detail: the compiler enforces that, hey, you can't access somebody else's variable, just like it does currently. You can't really write code that accesses some other function's variable without passing addresses and all that kind of stuff, yeah. I can see issues with having multiple variables with the same name and the compiler not knowing which one you're trying to access. Okay, good. Multiple variables with the same name — what does that mean? So like, if you have a for loop somewhere — pretty much any time I use a for loop, it's always gonna be i equals zero, yada, yada, yada — I probably have like 20 loops in my code, so I'll have 20 i's, but if they're global, which i am I looking at? But the compiler knows, right? The compiler always knows, based on the scope, which i you're referring to, so it could give each i a unique name that identifies it — like, append the scope to the name. That's extra work it shouldn't need to do. The issue that would happen there is if you, the programmer, make an error and try to access an i that you shouldn't — it's nice if the compiler yells at you instead of giving you the wrong data. Nice, okay. I'd still argue that the compiler could enforce that, right? Because it knows all the scoping rules. So when it sees an i access, it would know whether the scoping rules allow it, right? But what if you're confused — what if what you want is not what the scoping rules say? What if you say, I want the global int i? Well, you declare it globally, right? Just like normal. This is an implementation detail, you'd say, all right? But I mean, like, what if I want it to read my mind, to know exactly what I want? I would love the computer to read my mind, yes. So let's think about it this way.
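The scope-resolution point can be seen in a few lines of C (a made-up example): two variables both named i, and the compiler disambiguates every use purely from the scoping rules.

```c
#include <assert.h>

/* Two variables named i in nested scopes. The compiler resolves each
 * use by the scoping rules, so the names never collide. */
static int shadow_demo(void) {
    int i = 1;       /* outer i */
    {
        int i = 10;  /* inner i shadows the outer one */
        i += 5;      /* touches only the inner i */
    }
    return i;        /* outer i is untouched */
}
```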
I'm gonna argue that this is definitely something you do not want to have, because it restricts what types of programs you can write. Why would that be? Let's think about it like this. Let's look at a simple factorial program, right? What's the factorial of n? What's the base case? If n is zero, return one; otherwise, what else? Yeah, return n times the factorial of n minus one. So let's say we use global memory to store local variables — and local variables include parameters, right? Could we write this factorial function using global memory for the values? Why can't you do recursion? Because that n only exists once. Right: this n needs to be different for every invocation of the function factorial, but if it's global, you only ever have one allocation of it, right? The program is gonna be overwriting its own local variables. And this would happen anywhere in the call stack, right? I mean, recursion is one thing, where the same function calls itself, but you can also have function a that calls b, b that calls c, c that calls a, a that calls b again, right? You need to have multiple live invocations of a function. So this idea is kinda out; we don't really want that. There actually are older programming languages — I wanna say Fortran — that did this, so you couldn't actually write recursive functions. You could only have sub-procedures: you could call one procedure to do something, and that procedure could not call any other procedures. So where does it store things? In memory, yeah — with stack allocation, right? Any time a function's invoked, there's gonna be a new copy of the local variables specifically for that invocation of that function. We don't have to worry about whether that function was previously called before we call it; otherwise, there's gonna be some overwriting.
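Here's the factorial from the lecture, with a small (made-up) bit of instrumentation tacked on: by recording the address of n in the first two invocations, you can check that each recursive call really does get its own stack copy of the parameter.

```c
#include <assert.h>

/* Record where the first two invocations' n parameters live. */
static const int *n_addrs[2];
static int calls = 0;

static long factorial(int n) {
    if (calls < 2)
        n_addrs[calls++] = &n;   /* this invocation's own copy of n */
    if (n == 0)
        return 1;                /* base case */
    return n * factorial(n - 1); /* recursive case */
}
```

Both frames are live at the same time during the recursion, so the two recorded addresses must be distinct — exactly the property the single-global-slot scheme cannot provide.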
So the alternative here, instead of using global memory: let's use what we'll call scratch memory. Let's use some memory that we — well, really the compiler, the generated x86 code — are free to write to and change, knowing we're not gonna erase real data or destroy anything else that's happening. And this is where the stack comes in. The stack is essentially scratch memory for functions, right? And this scratch memory also comes in handy for another reason. How many registers does the MIPS machine have? 16? Seems like a lot. No, it's like 32, something like that — a lot more than x86, anyway. All right, this is making my point not very well. Okay, how many registers does x86 have? I don't have the exact number off the top of my head; it has about five or six general-purpose registers, and there's a bunch of other registers that do other things, right? So where does the CPU actually perform computations? In the CPU, on values in registers, right? On most machines, you can't say, add this memory location to this memory location and put the result in this memory location. You have to bring the values into registers, compute on them, and then store the results. So this is another reason why you might want scratch memory. Maybe you're computing something like a plus b plus c plus d plus e plus q minus 12 times 13, or whatever — there are a lot of intermediate results, and they may not all fit in your registers. In that case, you want some scratch memory to save values that you can come back to. So the stack is used in MIPS, ARM, x86, x86-64 — a lot of different architectures. And whenever we draw stacks in this class, they go from high to low.
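A sketch of what the compiler does with a long expression like that one: it breaks it into explicit temporaries for the intermediate results. When there are more live temporaries than registers, the extras get spilled to stack scratch memory. (The function name and operand values here are made up.)

```c
#include <assert.h>

/* A long expression broken into the kind of temporaries a compiler
 * generates; each t holds one intermediate result. */
static int long_expr(int a, int b, int c, int d, int e) {
    int t1 = a + b;      /* intermediate result */
    int t2 = t1 + c;
    int t3 = t2 + d;
    int t4 = t3 + e;
    return t4 - 12 * 13; /* mirrors the "minus 12 times 13" example */
}
```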
So, high memory to low memory: the stack grows downward, toward decreasing addresses. This means that functions can use the stack by pushing values onto it and popping them off, making sure they can store values there. On x86, the register ESP — SP for stack pointer — holds the address of what we call the top of the stack. That's thinking of the stack logically, as the place where you push things on; it's the bottom when you picture memory from high to low. So we can push things onto the stack — we can say push the contents of register EAX, which stores the value of EAX on the stack and moves the stack pointer down — and we can do the opposite: we can pop values from the stack into registers. Cool, all right. We'll continue this on Wednesday: we'll go over a detailed example of how the stack works and how x86 supports function calls.
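The push/pop behavior described above can be modeled in a few lines of C (a toy sketch, not real x86): an array of words plus a pointer that starts at the high end, and, like x86's push, first decrements and then stores.

```c
#include <assert.h>

#define STACK_WORDS 64

/* Toy model of the x86 stack: the "stack pointer" starts at the high
 * end of the region and moves toward lower addresses as we push. */
static unsigned stack_mem[STACK_WORDS];
static unsigned *sp = stack_mem + STACK_WORDS;  /* plays the role of ESP */

static void push(unsigned value) {
    --sp;          /* grow downward first, like x86 push */
    *sp = value;   /* then store at the new top of stack */
}

static unsigned pop(void) {
    unsigned value = *sp;  /* read the top of stack */
    ++sp;                  /* shrink back toward high memory */
    return value;
}
```

Popping returns values in reverse push order, which is exactly the last-in, first-out behavior functions rely on.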