Okay, so we'll start today by asking if there are any questions. Yeah? Can you post the PowerPoint before class? Can I post the PowerPoint before class? Yeah, I will try. I didn't get to it today. So, yeah. Any other questions? I guess I can reiterate a question that came up before class. The question was of the form: there's some error, or something is wrong with the lexer; I changed one line in it, and it doesn't work on this completely new grammar that it wasn't designed for. So my response is, yeah, that sounds like a problem. It sounds like you should look into the lexer and figure out why it's not doing what you want it to do. So it's a great quote. Oh, I don't know. Okay. All right, so we're going to move right along here. Yeah? Are we allowed to use, like, I modified the old lexer; that's fine, right? Yeah, you can use the lexer that was given to you previously. You're allowed to do that. You have to remember that that lexer is not for your grammar and your specific tokens and language, right? So it's up to you to make it work with whatever you want it to work with. You can do whatever you want. You can have it as an external file. You can copy that code directly into yours. You can rip out functions from there, use those directly, and throw away all the rest. I don't care. I'm just saying it's not plagiarism if you use that code. That's basically all that statement says. Any other questions? Yeah? You've got to speak up. So I don't understand a lot. So you uploaded it and the tests didn't pass? So it sounds like you have a bug in your code and you're not implementing the spec correctly. So you have to read through the specification and understand what your output is supposed to be. All right. Anything else? All right. Okay. So just to refresh everybody's memory, well, maybe I don't need to, since we just had a midterm.
So hopefully you're all full of knowledge on everything that we've done up till now. Or has it all been emptied onto the page? I just dumped all my knowledge. That's exactly what I as an educator want to hear: I learned just enough to pass your midterm and I forget everything as I go forward. Awesome. Well, more things to learn so that you can forget them later. Okay. So we talked about how we can draw memory diagrams with the box and circle diagrams, right? That helps us visualize and understand the semantics of assignment and how assignment works in a program. The one thing we didn't talk about, and this will help finish us out on semantics, is memory allocation. We kind of looked at malloc and some other things at a very high level, and we didn't really get into the details of what they mean, what the semantics are. So memory allocation refers, very generally, to how to create new locations, the new boxes in our box and circle diagram, and reserve the memory address that is associated with each location. We're not going to go into the details of the memory allocation system. But the basic idea is that our program needs to find some memory that's not currently reserved, right? The memory allocator needs to make sure that it's not giving out the same memory to different locations, because then you'll have a huge problem when you try to dereference or access that memory. So after some memory is found that's not currently reserved, it's either going to associate that memory with a variable name, in the case of a declaration, as we'll see, or it will somehow reserve that memory and return its address to our program so we can use it. So in the case of malloc, this is essentially what malloc does.
It's finding us a free, unreserved location in memory, and then it returns the address of that location. So then deallocation is really pretty easy, right? It's the reverse of this process. Deallocation is how we release a location and say this address can be reused. Because if we were never able to release memory, our program would just keep consuming memory over and over until it eventually consumed all the memory in the machine and crashed. So these are the two high-level concepts we think about when we think about memory allocation. Questions here? Okay. So there are three main types of memory allocation. These are very important because they help you think about how long an allocation is valid for. One is global allocation. This would be global variables. So think about when it is allocated and when that allocation ends. Global memory is allocated once, and it's never deallocated, because various parts of the program could use that memory. That address, or that variable, is global, right? It's accessible to all parts of the program, so it doesn't make sense to ever deallocate it, give it back, and unreserve that memory address. The other really important type of memory allocation is stack allocation. The name comes from the memory being allocated on the stack, but as far as I know that distinction isn't super critical. What's important here is that this memory is allocated automatically when a variable comes into scope and is deallocated when it goes out of scope. So this is things like nested scopes or function calls. When you have a function call, those local variables are stack allocated. So they're allocated on the stack. They're available to the program.
And then when that function ends and returns, those are automatically deallocated by the program, right? You can think of this, in some way, as implementing the scoping rules that we've been talking about. The scoping rules in C are that variables are valid in the scope, in the block, that they're defined in. And this is why. When we declare a variable, we need to have some memory, some address, some location that's associated with it. And this is done on the stack. Then when the function returns, or we leave that block, that gets cleaned up and it's no longer accessible. That memory essentially gets released back to the system so it can be reused. Questions on that one? Okay. The third one is heap allocation. This is something that you're all very familiar with. This is when allocation is explicitly requested by the program, or specifically the programmer, right? So this is using functions like malloc, or new. You're requesting: hey, I want to create a new location; create that new location and give me back its address so that I can use it. And what's not said here, what's implied, is that if your program is explicitly allocating that memory, it's up to you to explicitly deallocate it, right? That's why in C you have to call free after you malloc something, once you're no longer using it. This is unlike the previous two allocation types. Stack allocation is just done based on scope or function calls, so as soon as that scope is done, the program knows it can clean up all those variables because they're no longer being used. Whereas with the heap, when you malloc something, you're saying: hey, I want this memory, and I will be responsible for deallocating it and giving it back when I'm no longer using it. Questions there? Okay. So now we'll look at some examples and try to identify what types of allocation are being used in the program.
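As a quick recap of the three kinds of allocation just described, here is a minimal sketch in C. It is not the code from the lecture slide; the names `call_count`, `square`, and `make_int` are hypothetical and chosen only for illustration.

```c
#include <stdlib.h>

/* Global allocation: this location exists for the whole run of the program. */
int call_count = 0;

/* Stack allocation: `local` is created when the function is entered and
 * released automatically when the function returns. */
int square(int n) {
    int local = n * n;
    call_count++;
    return local;   /* the value is copied out; the location itself goes away */
}

/* Heap allocation: malloc reserves a new location and returns its address.
 * The caller owns that location and is responsible for calling free(). */
int *make_int(int value) {
    int *p = malloc(sizeof *p);   /* p itself is a stack variable */
    if (p != NULL) {
        *p = value;
    }
    return p;   /* the heap location outlives p */
}
```

A caller would use `make_int` and later pass the returned pointer to `free` exactly once; nothing needs to be done for `local` or `call_count`.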
So this example is, I believe, from the practice midterm. Let's think about some of these variables and where they're allocated. What about this variable up here at the top: which of those three types of allocation is this? Global. Global, yeah. So it's allocated once and that memory is never deallocated. It's valid throughout the entire execution of the program. Cool. That's pretty easy. Okay. Let's assume that we're in main and we're about to go into this block. So where is this char *x? Where is that location? How is that allocated? Stack? Global? Heap? Stack. Stack? Why? Yeah, we didn't request new memory. We're not calling malloc, so it's clearly not a heap allocation. And char *x is a local variable. It's a variable whose scope is this block, so we know that it's only going to be accessible during this block. After we leave this block, we know we can get rid of it and automatically deallocate that memory. I actually didn't go into it here, but what's the value inside x? Testing? No. It's a memory address, and the location that that address is associated with holds the actual bytes of "testing", followed by a zero at the end. What the compiler actually does is see this string literal and turn it into something like a global variable, and x holds its address. So that's how it does this here. Cool. But I forget the details. Okay. So then we execute some stuff. Doesn't really matter. Then when we leave this scope, what happens to char *x? It's deallocated, right? It goes away. Whatever location was associated with x is no longer there. What happens to wherever "testing" is? Does it get deallocated? Yeah, it's a global. So no. It's not explicitly global, but if you look at how this is implemented, that's actually how it's done. It's kind of cool.
Yeah? So after char *x is deallocated, can you assign another variable to the string "testing"? Would it find that "testing" in the globals, or would it create a brand new one? It depends on the implementation, I think. I think in C it will probably reuse it, because that literal is probably in read-only memory, so you can't write to it and change it. Because that's the big problem, right? If you were to use x to try to overwrite something in "testing", would that actually work? I'm not 100% sure. But some languages will make sure that any reference to "testing" points to the same object, basically. So yeah, they'll try to reuse it. That'd be a fun thing to test. Okay. But then we get into a function like, let's say, bar. So where is char c stored? What's the type of allocation? Stack. Stack? Yeah. How do you know it's stack? It's a local. So you can use process of elimination, right? It's not global, and it's not being malloced, so it's stored on the stack. Here, we actually have the value itself, right? The location associated with c holds the character value directly, as opposed to the previous case, where we had a pointer and the location held the address of the string "testing". Then we get to the end of this function. And what happens to c? Deallocated. Yeah, it goes away. Whatever address was there can be reused by the program. Okay. Then we get into the next function. So what kind of memory allocation is int *x? We'll say stack. So int *x: x is the name, and it's bound to what? A location. It's bound to a location, right? There's a box. And is that box stack allocated or heap allocated? Stack allocated, right? We have to make room for int *x, wherever we're going to put the address.
Now, the address that we end up putting into x: where is that? How is that allocated? Heap. Heap, yeah. So remember, we have to make the distinction. Malloc itself is heap allocating a location and returning the address of the location that it creates. But int *x is stack allocated, because it's a local variable just like char c. Okay, so then we get to the end of this function, and what gets deallocated automatically? int *x. Why? Because it's on the stack. It's stack allocated, right? So int *x goes away. What about the memory that we malloced? It stays, right? We told the program: hey, give me, well, technically it depends on the exact system, but on most systems, four bytes of memory, and that's the size of the value we can store in this location. By calling malloc, we're telling the program: hey, we'll free this. Don't worry, I definitely know what I'm doing. But then we end up, well, we'll get into it in a second. So this leads us into memory errors, which are everybody's favorite programming problem, especially if you're in a language like C or C++. Yeah? For global memory, is that deallocated when the program ends, or is it just picked up some time later by the operating system? It's an interesting question. So the question is, does a global variable get deallocated when the program ends? I guess it's kind of a philosophical question as to what happens when a program ends, right? The memory does go away, but you're not executing anything in the context of your program that could ever access it. So if you're thinking about it in the context of this program, there's never a point in this program where x is not allocated. I think that's how we can think about it, right?
So in that sense, no, there's no possible way that x could ever be deallocated during the execution of the program. As for your other question, yeah, technically that memory goes back to the OS as soon as our program terminates, but that's true of all the memory that we use in our program. Any other super weird tricky questions? Not that those are discouraged; they're definitely encouraged. Okay, memory errors, super fun errors. There are basically two classes of memory errors we're going to talk about here. One is what we call a dangling reference. This is when you have a reference to a memory address that was allocated at some point but is now deallocated, so it no longer refers to anything valid. If you think about it in the box and circle diagrams, we have a value in there that doesn't point to any box; it just points somewhere, right? So it's dangling. It doesn't have something that it's actually pointing to. The second big memory error is garbage. This is when we allocate some memory, typically on the heap, right? Because with global allocation we don't have to worry about deallocating, and stack deallocation is done automatically. So this is when we allocate memory on the heap, and it's never deallocated, but it's also no longer accessible by the program. There's no way for our program to access or reference that memory. We'll get into examples of this, but this is memory that we've allocated: we've said, hey, give me a certain number of bytes, the program goes, okay, here's a location, and then, as we'll see in a second, we forget about it, or we have no way later to reference that memory. And so then it's garbage. There's no way for us to clean that up or to deallocate it explicitly ourselves at any point. Questions? How does garbage cleanup work if you don't know where it is? So the question is, how does garbage cleanup work if we don't know where the garbage is?
So the difference here is that we, or the program, have no way to access that memory. If you have a garbage collection system, what it does is track memory references, and you can think of it as calculating the set of all memory you can still reference, and then the set of all memory that's been allocated, and if there's anything allocated that's not in the set you can reference, then it knows that you've allocated something you can't actually reference. So it can go ahead and clean that up for you. So it's intensive. It can be expensive, yeah, but the flip side is you don't have to worry about explicitly deallocating memory, which can be nice. So it's got trade-offs. So would you want garbage collection more on smaller programs or larger programs? What is a smaller program or a larger program? Yeah, it's tricky. There's a whole field of research in garbage collection. There are ways you can do it in parallel on another thread or in another process, and then how do you do that and coordinate with the running program, and things like that. So, well, we won't get into that, but right now we've got to deal with, at the bare minimum, what garbage means. We need to be able to find these things. So let's look at an example. Okay, so here we've got a function at the top, foo, we have a second function, bar, and we have another function, main. Main is going to declare an int * called dang, and where is dang allocated? Stack. Yes, exactly. Okay, so dang is going to be set to the return value of foo. We know that foo returns an int *, so we know that that's good. We're going to call into foo, and foo has a local variable x, and so how is x allocated? Stack. Stack. It's out here on the stack, and the value 100 is placed in the location associated with x.
Right, so we have x, which is stack allocated. Then what we're returning is the address of x. So &x gives what kind of value, an l-value or an r-value? R-value, yes. It's going to return an r-value, which is the address of the location associated with x. Right, so it's going to return that, fine. Then we get to the end of this function. What happens to x? Deallocated. Yeah, it gets deallocated, right? We've reached the end of the scope where x is valid, and so now x is deallocated. But, you know, we're still fine on all the types, right? We've returned an int *. We've returned an r-value, which is the address of an integer. So we take that r-value that we get from foo, copy it into the location associated with dang, and say, good, okay, let's continue. So now we get to the next statement. We're going to print out some information to try to see what's going on. So the %p in printf: anybody know what this does? Prints a pointer. What does that mean? Yes, it prints the value of whatever you pass, and it prints it out in hex. That's the only big difference. It's a lot easier to think about memory addresses in terms of hex values once you start looking at enough of them. Okay, so it's going to print out the value of whatever's in dang with %p, so as a hex value. And then the %d is going to do what? Print out what? A character? An integer. Yeah, an integer, right? And so *dang, what's that going to do? Print out a character? Wow. What are the semantics of what's right here in front of us? Print out the value associated with the address. Right, so the star operator is going to take the value that's in dang, look up the location associated with that address, and then this printf is going to print out the value there.
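To make the `%p` and `%d` behavior concrete, here is a minimal sketch that formats a pointer and the integer it points to, using a pointer that is still valid. The helper name `describe` is hypothetical and does not appear in the lecture's program; the exact text `%p` produces is up to the implementation, though it is commonly a hex address.

```c
#include <stdio.h>

/* Writes "<address> <value>" into buf, mirroring the lecture's
 * printf("%p %d", dang, *dang) call. Returns what snprintf returns:
 * the number of characters the full output needs. */
int describe(char *buf, size_t size, int *p) {
    /* %p expects a void *, so the pointer is cast explicitly.
     * *p reads the int stored at the location p points to. */
    return snprintf(buf, size, "%p %d", (void *)p, *p);
}
```

For example, calling `describe(buf, sizeof buf, &value)` with `int value = 100;` fills `buf` with the address of `value`, a space, and `100`.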
So yeah, that's all it's doing. Then we call bar. Bar comes in here and creates variables x, y, and z. Where are these allocated? Stack allocation, yeah. We're going to print out y and z, and then we return. When we return, what happens to y and z? They get deallocated, yeah. Okay, then we get back into here, and here's the next print statement. It's the exact same thing. We're going to print out the value that's inside dang as a hex value, and then we're going to print out the integer that dang is pointing to, so *dang. Okay, so what's going to happen when I compile this on a CentOS 6.7 operating system? Is it going to compile? Some say no, but will it compile? It'll give me a warning, maybe. We'll see, yeah. Anybody think it's absolutely not going to compile? Okay, so it's going to give us a warning, and it's going to say: in function foo, function returns address of local variable. So the compiler does kind of see what we're doing here, but it allows us to do it just fine. It still compiles. This is only a warning. So it's still going to compile this program for us. And then when I run it, what's it going to output? Are we going to know what's in here? Are we going to know this %p? Yes? No? Anybody going to argue for one side? Yeah. So what about this %p right here? This is going to be a memory address: whatever memory address is inside dang. So you can really think about it as whatever is returned by foo. So I guess the question is, can we guess what that's going to be before runtime? No? Before we run it. Right now. Could you tell me what it's going to be? No. Okay. No. Oh, did I already hit it? Okay. So *dang. Ah, that's fine. Okay. So *dang. It prints out a hundred. But does it have to print out a hundred? No? Yes? Maybe you can find it. Well, just a question.
I'm asking the questions here. I guess with respect to C, then, when you deallocate memory from the stack, it doesn't do anything with the values, right? So a hundred could still be in it, or something else could have already written to it in the background. So yeah. From the semantics, right, we're trying to access memory that's been deallocated by the program. So really, by the semantics here, anything can happen. It could give us an error because we're trying to access memory that's been deallocated. It could return the value that was in there. Or it could return anything, yeah. The program has a certain space of memory that belongs to it, so since there are no other allocations between when you store the value into x and when the value in that location is printed out, then yes, it's going to be 100. But is that a guarantee by the C standard? If you run this on every single C compiler, is it always going to be 100? No, because you can't rely on this behavior, yeah. What about the 100, I thought the 100 would be more of a local? This 100 here? Yeah. So this x here is local to this function foo. That means every time you call it, a new int x is allocated on the stack. So it's stack allocated, and 100 is copied in. And then when this function returns, that memory is deallocated and given back to the system to reuse. Not only personally, but just like we had earlier. Oh, so basically you can think about it like this: I'm returning the address of a variable that's only valid in this scope. So the high-level concept is this: dang here would be a dangling pointer, or dangling reference, because it's pointing to something that the system has already deallocated. I'm guessing this is going to be reused if you call bar, because it's a stack, and x got popped off the top, right? So when you try to allocate something else locally, it'll replace that 100 with whatever value that is.
It could, who knows? So that's kind of the thing, right? The big thing, and the reason why that's a compiler warning, is that you can't rely on this behavior being consistent across different compilers. You can say, well, maybe on this specific system I think I know what it's going to do, but the compiler could be smarter than you and could say, well, I'm going to inline this function foo so it's no longer an actual function call, so the stack never actually changes, so then maybe x is something else. It can do all kinds of optimizations. It can just return zero; that's a valid thing to do, because an address isn't valid after you deallocate it. So the compiler could just replace foo with, I don't know, return zero or return 100 or whatever. So yeah, okay, what's the next thing that's going to print out? The next line. Yeah, 10,000 and zero. I guess I'm used to these. I actually count every time. Yeah, 10,000. So it'll print out this line, right? It's going to call bar, and printf is going to output 10,000 and zero. So yes, that's what we expect, right? There's no funny business going on in here. It should work just fine. So the next line, what's going to get printed out? The same address and what? Hopefully another value. Maybe another value. Is it going to print out 100? Probably not. Probably not. I guess if you're considering the set of all possible values that it could print out, then yes, it's highly unlikely that it's going to be 100. But okay, it just so happens to be: same memory address, right? The value inside dang did not change, but when we dereference dang, it now points to something that has a value of zero. Very weird. And the reason is that that memory has been deallocated, and it just so happens that this system reuses that memory very quickly.
So it actually is accessible, and it got changed somehow when we called bar. Now, we don't know if that's exactly from this line with z, where z happened to have the same memory address as what we pointed to here. It could be, it could not be. But now we can look at another example, on the Mac here. This is running the gcc that's actually Clang. So is it going to compile? Yes. Why yes? Why do you say, like, definitely yes? Yeah, that's actually a good answer, right? It's a valid C program. It compiled on one compiler, so it should definitely compile on another valid C compiler. Otherwise we would have problems with our compilers or problems with our program. Okay, but it's also going to give me a warning. It's going to say: address of stack memory associated with local variable x returned. Now, the compiler here is being very helpful, which is kind of nice, and showing me the exact code where it's doing that, which is kind of cool. But, you know, I could be smarter than the compiler to get around these warnings, right? I could create code that looks kind of funky but is semantically exactly the same, and it wouldn't complain at all; it would just do it. Okay, what happens when I run this? Is it going to be the same address that you saw previously? No. Why is it not going to be the same address? Yeah? Doesn't it have some kind of queuing system, like, hey, you want memory, take this one, and then now you're at a different place in the queue? Ah, maybe at some level. There are a couple of different layers here, and we're looking at just this program layer. So the short answer, I guess, is no. There's no guarantee that we ever get the same memory. The layout of memory that we get is not guaranteed at all by the C semantics here, right?
It could put us at, I don't know, starting at zero and going down; it could put us wherever it wants to. Yeah, you could compile it again, right? Yeah, you could compile it again and it could be completely different. If the computer cuts you a slice of the pie, you could be in a different part of the computer, right? No. So there's also the other tricky thing, operating systems. Have you taken an operating systems class yet? No? Okay. Yeah, so there's also virtual memory. There's the physical memory, the actual bits on the memory chips in my computer. Those have addresses, but if every program that's running could reference them, programs could mess with each other's stuff. You basically want it so that each program thinks it's using all of the memory, and the operating system maps between them and says: oh, in program X, your location 1,000 is actually on this chip, in this specific memory location, whereas in program Y, your memory address 1,000 is actually in a completely different part. So you have two programs that think they have the exact same memory location, but it's really different physically. So that's kind of a long answer. I mean, they could have the same address, but really we're running on a completely different operating system, and modern operating systems have stuff like ASLR, address space layout randomization, which explicitly randomizes where things are in memory every time the program executes, not just each time it's compiled. So, yeah, here we get a different address. We get 100, which is interesting; it's the same. So then the next print, what's it going to do? 10,000 and zero, yeah, that's not going to change. So what about the next one? What's going to be printed out? Same address, and something. That's good. Okay, so same address, yeah, definitely the same address, but now it prints out 10,000. So weird.
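For contrast with the program just traced, here is a hedged sketch of two ways to hand a result back from a function without leaving a dangling reference. The broken pattern appears only in a comment, since using its result is undefined; the names `foo_by_value` and `foo_on_heap` are hypothetical and are not part of the lecture's code.

```c
#include <stdlib.h>

/*
 * The pattern from the lecture, kept as a comment on purpose:
 *
 *     int *foo(void) { int x = 100; return &x; }
 *
 * x is deallocated when foo returns, so the returned address dangles.
 */

/* Alternative 1: return the value itself, so no address escapes. */
int foo_by_value(void) {
    int x = 100;
    return x;
}

/* Alternative 2: heap-allocate, so the location outlives the call.
 * The caller is now responsible for calling free() on the result. */
int *foo_on_heap(void) {
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = 100;
    }
    return p;
}
```

With either alternative, a later call to another function such as `bar` cannot change the result, because the result no longer lives in a released stack location.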
Yeah, so specifically why does it print out 10,000? It may have to do with what other things get put onto the stack in between function invocations, it could have to do with how optimizing the compiler is, it could be a lot of things. The big point here is that we're trying to access memory that's been deallocated, so the results are undefined, unknown. This is a really bad memory error, because if you made this mistake and you're reliant on that value changing to zero, or not changing, but now it changes when there's a function call, or depending on which functions or which compiler, right, you have a program that's doing weird, crazy stuff. So this is why it's a really big memory error. So if, at a point in a program's execution, I asked, okay, here, what variables are dangling references, you'd be able to say, oh yeah, dang here is a dangling reference. Yeah? Could you use it as a random factor? It'd probably be a very poor random factor. It may be the same every time you run it; it would just be different on different systems. So it'd be a very bad random factor. Okay, so this is one way: returning the address of stack-allocated memory and trying to use it. There's another way we can get a dangling reference. So here we have some code where we're now declaring two variables. Dang and foo are both stack allocated, and they're both int *. So we have dang and foo, and then we're going to malloc, so this is what kind of allocation? Heap allocation, yeah, so we're explicitly allocating an object. We're mallocing an integer; we're saying give us an integer. It creates a location and assigns the address of that location into the value of dang. Then we say foo = dang, so after this is executed, what's going to be the value in foo?
The address of what? The value. So yeah, the line before takes the address of whatever location malloc creates for us and copies it into the location associated with dang, and this line copies that value into the location associated with foo. So now foo and dang both have an address as their value, and that address is the address of the location that malloc created. Okay, then if we do *foo = 100, what's this going to do? Yeah, set the value: copy the value 100 into the location associated with *foo, which is the location returned by malloc. Okay, then we call free(foo), so what does free do? Yeah, this is our explicit deallocation. We're telling the system: hey, we're done with this memory, take it back. So we call free on it. Now what happens when we print out *dang? What's inside the value of dang? Which address? The one returned by malloc. But is that a valid address? No. Why? Right, because it's been deallocated. That address has been freed with free(foo), right? free(foo) doesn't mean, hey, free this variable that's on the stack. It means, hey, take the address that's inside foo and deallocate that memory, and it happens that dang and foo both point to that same location. So when we print it out, it's going to try to dereference that and print something out. Then we're going to malloc a new int and set foo equal to it, and *foo = 42 is going to copy the value 42 into that location returned by malloc. Then we're going to call free(foo) to deallocate what? Yeah, the newly malloced memory location, so this malloc. Perfect. Then we're going to print *dang out again. So, when I compile this, are there going to be any errors? No errors. No errors? Yeah? Can you free a string literal? Can you free a globally defined string literal?
No, because it hasn't been heap allocated. It has to have come from malloc. The way this works internally, malloc does some bookkeeping to know the size that was initially allocated and all that stuff. So when you give that same pointer back to free, it knows how to look it up and say, oh, this means I'm freeing that pointer plus the next eight bytes or whatever, and now that's free and we can use it again. So in that case, no. So, okay. Any errors? No? I don't think so. No. So, no errors? What about warnings? Yeah, why? I think it will tell you that you're setting the value. What line is it going to give you an error on? Or a warning, let's say? Yeah. This one? Yeah. So what do you think? Who thinks it's going to give us a warning on that line? Who thinks it's not? Who's on Facebook? Okay. So it turns out it doesn't give us any warnings or anything. This is actually a very classic problem with languages like C. Here I have two pointers, dang and foo, and they just happen to point to the same location, right? For the compiler to understand that there's a problem here, it would have to keep track of the fact that these pointers actually point to the same thing, and that means when you free one, like here, the other one is now a dangling pointer and is pointing to unallocated memory. In a simple case like this it may be able to do that, but for an entire program, thinking of all the possible memory allocations, it just becomes crazy. And this is why it's actually very difficult to statically analyze a C program and say, hey, are there any security vulnerabilities? It's really difficult because these pointers can point to the same things, and you can free things and change things, and anyway, it gets crazy. So this is, I think, why it doesn't report anything here. Yeah? You mean, is it going to give us an error or not? I actually don't know. I'd have to test it. It'd be one of those things.
It definitely wouldn't give us an error. It may give us a warning, possibly. So when we run it, what's it going to output? How many lines is it going to output? Let's start with easy questions. Two. Two? Okay, what's the first thing it's going to output? A number. Correct. Most likely 100. Most likely 100. What's another likely candidate? A random number? Zero. Zero? We could just shout out numbers all day. So it turns out, on CentOS 6.7, when I compile this, it outputs zero first. And then this next star dang, what does it output? Zero again? It does output zero again. Why? I actually don't know. Well, I have a suspicion of what it's doing, but let's compare it to what happens on the Mac. So is it going to give us any warnings? Maybe. It doesn't, though. And when we run it, it actually outputs 100 first, right? Which is kind of weird, because we freed this memory. And the next output is? Zero. Zero? 42? 100? 42. So yeah, it actually outputs 42, right? Which is pretty crazy. This is because of how malloc and free work. The memory allocator knows that the address it returned before has been freed, so it can reuse that address now, right? It's perfectly valid for it to return that address on a subsequent call to malloc. Just as a little aside, I'm pretty sure the CentOS system is putting zeros in here, because it actually helps your program if the memory gets zeroed out when it's freed. In that case, if your values change to zero after you free them and you're still using them, it's probably going to cause a crash in your program or cause your program to visibly do the wrong thing. Whereas here, if it's the same value after you free it, you're probably going to have this bug in there for a long time, until something weird happens or something changes, and you wonder why that happened. So that's the small aside. Okay.
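The slide for this example is not part of the transcript, so the listing below is only a reconstruction pieced together from the narration. The function name `second_malloc_reused_address` is made up for this sketch. The two reads through `dang` after `free` are left as comments because they are undefined behavior. Instead, the sketch records the first address as a number before freeing it and reports whether the second `malloc` handed the same address back, which is the implementation detail that explained the 42 on the Mac.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the second malloc returned the address
   that was just freed, 0 if it did not, and -1 if an allocation failed. */
int second_malloc_reused_address(void)
{
    int *dang;                      /* stack-allocated pointer */
    int *foo;                       /* stack-allocated pointer */

    dang = malloc(sizeof *dang);    /* heap allocation; address stored in dang */
    if (dang == NULL)
        return -1;
    foo = dang;                     /* copy semantics: foo now holds the same address */
    *foo = 100;                     /* writes into the malloc'd location */
    assert(*dang == 100);           /* both names reach the same location */

    uintptr_t first = (uintptr_t)dang;  /* remember the address before freeing it */
    free(foo);                      /* deallocates the location; dang AND foo now dangle */
    /* printf("%d\n", *dang);          undefined behavior: dang is a dangling reference */

    foo = malloc(sizeof *foo);      /* the allocator is allowed to reuse the old address */
    if (foo == NULL)
        return -1;
    *foo = 42;
    int reused = ((uintptr_t)foo == first);
    free(foo);
    /* printf("%d\n", *dang);          undefined behavior again */

    dang = NULL;                    /* defensive: drop the stale addresses */
    foo = NULL;
    return reused;
}
```

Whether this returns 1 or 0 depends on the allocator, which is exactly why the output of the original program cannot be relied on.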
So these are the two different ways that you can have dangling references. Do you have a question? So that's the crazy thing. The value inside dang is not changing. It's the address returned by the first malloc, right? But once we call free, malloc is free to reuse that address, because the free says, hey, release this address, release this location. You can now reuse this for whatever you want. And so when we call this next malloc, it just happens to reuse it and say, hey, here's some memory, here's the address. It returns it to foo. Then star foo sets that location to 42. And it just so happens that dang and foo point to the same memory location again. So yeah, it's just a weird implementation detail. It is absolutely something you shouldn't depend on, because we have a dangling reference here in our program, and so it's outside the C specification. Whatever happens is essentially arbitrary. Any other questions? Okay. So those are two types of dangling references. Now we're going to look at another example. I can't remember where this example came from. I think it was an earlier one. Okay. So for example, we have main. And we have an int star star q. So this is what type of allocation? Global. Global allocation. Q is global. Okay. We start executing in main. We have an int star A. What type of allocation is this? Stack. Stack. Okay. And then we go into this block. We have an int star B. What's this allocation? Stack. Stack. Okay. Now here we're malloc-ing memory, let's say at memory address one, and storing the address of that location into the value in the location associated with A. Then we're malloc-ing a new piece of memory that happens to have memory address two, and we're putting that in the value of the location associated with B. Then we're setting star A to be 42. So at this point, do we have any dangling references? No.
You could maybe say that Q may be dangling because we haven't put anything in it. We'd probably specify something like we did on the midterm, where I assume all pointers start out pointing to null, so it's not really dangling at that point. Okay. So no dangling references. Do we have any garbage? No. Maybe not. No? Are you talking about Q? About B. About B. If it zeroes it out, or leaves it as it is. What do you mean by that? Yeah. So to rephrase the question: garbage means, are there any locations that we can't access? No. At this point in the program there are five locations, right? There's Q, there's A, there's B, and then there are the two locations returned by malloc. And remember, we can access the locations returned by these mallocs with star A and star B, right? So we can actually access all the memory that's been allocated at this point. Now when we hit this line, what we're doing is calling malloc again. We're creating a new location. Its memory address is three. We're copying that three, that value, into the location associated with B. Then we do star B is equal to star A. So we're taking 42 and copying it into whatever just got returned by malloc, so whatever's at memory address three now has the value 42 inside of it. Then we're taking the address of A and putting it in Q, up at the top. So we get here, to point two. Are there any dangling references? Yes. Yes. What? The first B. What do you mean the first B? There's only one B. What does dangling reference mean? We just spent two slides on it. Yeah. We're pointing to something that's been deallocated. Exactly. So do we have any addresses inside of our pointers whose locations have been deallocated? No. No, so we have Q, right?
Q is pointing to A, so it has the address of A. A is still valid. It hasn't been deallocated yet. We have B. B has memory address three. That's still allocated. And we have A, and A is pointing to this allocation, which we haven't freed. So we have no dangling references. Do we have any garbage? Yes. Yes. Which memory address is garbage? Two. Two. Right? After this assignment, we can no longer access that memory location. There's no combination of stars or address-of operators that would get us to memory location two. So then we go over here, to point three, and we ask the same questions. Are there any dangling references here? Yes. Yes? What's the dangling reference? B? So is Q dangling? What's the address in Q? A. Has A been deallocated? No. So Q is not dangling. Let's look at A. What's the address in A? Memory one. Has memory one been deallocated? No. No. So that's not dangling. Is B dangling? Yes. Is it a nonsense question because B is out of scope? Yes. Yes. So right now, if we're talking about dangling references, the only things we care about are Q and A, because B has already been deallocated, because it's on the stack. Okay. So that's dangling references. Q and A are both not dangling, but do we have any garbage? Yes. Which locations are garbage now? Memory two and three. Why not one? Yeah, because A has the address of one. So by accessing star A, we can get to memory one, whatever was returned by this malloc. But this other malloc went away already, right? We lost our reference to it. It became garbage when we assigned the new address to B. And now B itself was just deallocated, so we have no way of accessing memory three that was allocated here. So now that's gone too. Yes? It would create a location. So it would put the address of whatever was malloc'd into your int star star. And then it would just create a location with no value.
So you would get an int star back as a pointer, but it's not pointing at anything yet. Does that make sense? Okay, so yeah. At point three, we have no dangling references and two garbage locations. Yeah. You could think of these as symbolic addresses. Maybe you're trying to be too clever: could we take the address of one of them, at one, add or subtract something, and maybe get to the other one? Right. Yeah. No. The thing to remember is that it's still garbage, because we can't access it. These could be completely random numbers. They could have no relation to each other. For our purposes, they are pretty much symbolic. All right. Any other questions on this? When the program ends, will the heap memory be deallocated? Yes, the heap memory will get freed when the program ends. It's not like it's going to hang around forever. But you need to be careful, because a lot of the programs that you write are long-running programs, right? If you're writing a GUI application or a server application, which are two of the main types of applications, it's going to be running for a long time. Any garbage that you create is going to accumulate over time. And then all of a sudden your program, which was using 50 megs, is now using five gigs, and it's destroying your whole system.
All right. So now we're going to briefly touch on one more thing. It's going to feel like we're going back to assignment semantics, because I want to show you a different type of assignment semantics than what we've been talking about. We've been talking about copy semantics up until now. So think about A is equal to B. What are the semantics of this operation? Does anyone want to go? Yeah. Yeah, so I'll be more specific.
So: copy the value of the location associated with B into the value of the location associated with A, right? Exactly. That's what we've been talking about. This is copy semantics, where the meaning of assignment is copy. Take that value, copy it to this other value. There's another type of semantics that maybe we haven't thought about, and it's called sharing semantics. In this case, we have A is equal to B, but now, instead of copying values, we're going to bind the name A to the location associated with B. So in our circle-box diagram, we're going to change the binding of A to a different location. And a famous language that has this kind of semantics is actually Java. So we'll look at an example. I will admit right now that I did not compile this Java. If you see any mistakes, let me know. I'm pretty sure it's mostly right. So let's see how sharing semantics changes things and how we draw the circle-box diagram for sharing semantics. We have an object A. So we know we have a name A, right? Do we have a binding for this A? Not yet. Not yet, because we haven't assigned it to anything. And do we have a location associated with this A? No, it's just the name. So yeah, we have A bound to basically nothing. We get to B, same thing. We have B bound to nothing. And now here we have the new operator. It's similar to malloc. It's creating a new object for us, a new location. And our sharing semantics say bind the left-hand side, bind A, to the location returned by new. Let's say it's some object, whatever. So now we finally have our binding. Then we do the same thing with B. New is going to create a new location with an object and its values, and it's going to bind B to that new location. Now when I assign a new object to A, I'm getting rid of this binding.
I'm changing this binding dynamically at runtime. I'm going to say, okay, create a new object like this, and then I'm going to bind A to that new object. So have I created any garbage at this point? Yes. Boxes one, two, three, starting from the top. Which one is garbage? Box one. Yeah, the top. There's no way I could reference this location anymore. Okay, then I get to B is equal to A. Before, with copy semantics, I would copy the value in A into the value of B. But now with sharing semantics, I'm going to change the binding of B to bind to whatever A is bound to. So now I change B, and now it's bound to A's object, and they're both bound to the same object. Any changes I make through A are going to show up when I access the object through B. And so now you have two pieces of garbage. So now I have two pieces of garbage, right? Object one and object three. Is this what you remember about Java? No. No? Okay. So at this point, if I do A equals 20 and print B after that, would I get a 20? Yeah, so if you change the object through A, accessing A is going to access this location, and then accessing B afterwards is going to access this same location. So if you call a setter on object A and then call the getter on object B, you're going to get back the same value that you put in, because it's the same object. Do most programming languages work this way? Like, does C plus plus work this way when it comes to objects? I think the answer is, it depends. Java doesn't have raw pointers, and that's why you can think of its semantics in these terms. You can also flip it around and think of it as having copy semantics where everything is a pointer except for primitive types. But this is kind of a cleaner way, I think, of thinking about it with the object model. I believe in C plus plus it's still copy semantics.
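To make the contrast concrete, here is a small C sketch of the "flip it around" view mentioned above: copy semantics on plain values, and sharing emulated by copying addresses. It is not the Java example from the slide. The `struct counter` type and both function names are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct counter { int value; };      /* hypothetical object type */

/* Copy semantics: b = a copies a's value into b's own location. */
int copy_then_modify(void)
{
    struct counter a = { 1 };
    struct counter b = { 2 };
    b = a;                          /* b's location now holds a copy of a's value */
    a.value = 20;                   /* changes only a's location */
    return b.value;                 /* still 1 */
}

/* Sharing emulated with pointers: b = a makes both names reach one object. */
int share_then_modify(void)
{
    struct counter *a = malloc(sizeof *a);
    struct counter *b = malloc(sizeof *b);
    if (a == NULL || b == NULL) {
        free(a);                    /* free(NULL) is a no-op */
        free(b);
        return -1;
    }
    a->value = 1;
    b->value = 2;

    struct counter *old_b = b;      /* kept only so this sketch can free it */
    b = a;                          /* rebind b; without old_b the second object would be garbage */
    a->value = 20;                  /* a change made through a ... */
    int seen = b->value;            /* ... is visible through b: 20 */

    free(old_b);
    free(a);                        /* a and b hold the same address, so free it once */
    return seen;
}
```

In Java the programmer would not keep `old_b` around: the unreachable object is simply garbage, and the garbage collector reclaims it. In C, losing that address without calling `free` is a leak.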
That is why, in C plus plus, you have to deal with pointers and everything, right, when you're creating new objects and all that kind of stuff. Any other questions here? All right, good, we made the time. Okay, we're going to finish semantics. Can I ask a project question instead of starting new material? You had your chance at the beginning of class, when I asked for questions. Okay, good.
Type systems. Super important. So we talked about semantics. I guess technically type systems are a part of semantics, but the semantics we've been talking about are really kind of basic semantics. What does assignment mean? What do all the pointer operations mean? What is the meaning behind these programs that we write? Type systems are all about, well, they kind of restrict, in some sense, the set of programs you can write. If you think about all the possible programs you could ever write, the type system restricts what you can do. There are some programs you just can't write because the type system won't allow you to write them. So is that a good thing? Maybe? I might think it is. Is that a good thing, type systems? Do you find yourself loving the type system? Do you find yourself hating it and fighting with it? It depends. Depends on what? Yeah. Sometimes I feel like I have to jump through hoops to get my programs to work how I feel like they should, especially when I start getting into pointers and addresses. So yeah, there are times where maybe you feel like you know what you're doing and the type system is getting in the way of what you want to do. I remember the opposite case too, where you start off thinking you know what you're doing, and you fight the stupid type system until it lets you do whatever you want to do.
And then you realize later that, oh, actually I now have a bug, because I cast this to a thing when it's not really that thing, and now I have a huge problem. So, man, I should have listened to the type system. Does that ever happen? Yes. No? Yes. You're a perfect programmer? Okay. Yeah, so it kind of works both ways, and it really depends on the specifics of the type system. Sometimes it can be overly restrictive, to where you feel like you're in a straitjacket and can't code properly. Sometimes it can be almost too permissive and not restrictive enough, where it's easy to write programs with bugs in them, or maybe you don't know there's a bug until you actually try to run the program. I actually feel that way more and more with Python. Not that the type system is bad, but it doesn't do the type checking statically before the program runs. So oftentimes I fat-finger a variable name or something, so the types don't match, right? And I don't know until I run and hit that piece of code, which may happen today or may happen a week from now. So yeah, it kind of depends. But here we're going to talk about type systems generally: what are all the components inside of a type system, and what kinds of decisions does a language designer have to make about the type system? Informally, a type in a programming language specifies some set of values and the operations that can be applied to those values. A type can be associated with either a variable or a constant. And when we talk about values, we're not necessarily talking about numeric types. We can specify function types, which we'll see in a second. And so there are four, is it four? Okay, looking ahead. Four major components when we talk about a type system. These are all decisions that the designer of the language has to think about, and that you, the programmer, have to think about in order to understand: will my program type check?
Is it compatible with what the language specifies? So the language is going to define some basic types, right? What are some basic types? Integer. Integer. In what language? Does C have an integer? Would you define it as an integer? What about a long, or a wide int? Yes? Are these all integers? They are all basic types, right? So this is kind of the difference. C has the basic types int, char, and long, among others. Once again, each of these types has a set of values, right? An int can store a set of values, and which values can be in there is actually machine dependent. Most of the time it's a 32-bit integer, but on different systems it can actually be different, which is kind of crazy. I don't know about the modern languages, but I know Common Lisp actually has an integer type that represents all possible integers. You don't have to worry about, ah, is this going to be too big to fit in 32 bits, so that I have to cast it to a double or a long or a 64-bit int? You don't have to worry. The compiler and the system will take care of changing and dealing with the representation for you. So sometimes the language designer has to deal with the machine and make that trade-off. On one hand, what are the types on the raw machine? Because if I give the programmer that power, maybe they'll be able to write more performant programs. On the other hand, really, if I want an integer, then I don't want to stop at 32 bits. I want to represent all possible integers. So, are basic types alone enough for a good type system? What do you need on top of that? Is this all you do every day, just use basic types? Yeah, user-defined types. So the programmer needs a way to construct types, right?
Because if you can't create your own types, then you're stuck dealing with whatever the language designer thought of, and that's going to restrict you, and it may not be expressive enough. Another thing to think about is, can the compiler or the interpreter infer types? We'll get into exactly how this is done. We're actually going to go really in depth here. But even in the case of, let's say, the addition operator: what's the type of the result when you add an int to a float? The language has to have some way of inferring, oh, that's the type of the result of this operation. And then the big thing here is, how does the language define type compatibility? When are two types the same, and when is it going to give you an error and say, hey, this doesn't type check, this is wrong? So, type declarations. The programming language is going to have basic types, which we talked about, which are included in the language and available to any program written in that language. You don't have to do anything special to use these types. Type constructors, on the other hand, are a way for a programmer to define new types, and they really help you express exactly what you're trying to do. Okay, so type constructors. We're going to define some general type constructors here, just so we can talk about these things and think about type systems without thinking specifically about the C type system or the Java type system, even though we will talk about those too. We can construct a type that's a pointer to T, where T is some type. In C the syntax for this is int star, int star star, int star star star, and so on. The star indicates it's a pointer to whatever the type is without that star. We may also want to be able to define a struct. So how do you think of a struct? As a basic type, or what? I don't know, I'm just curious.
How do you think about a struct? Do you think about a struct as a new thing? I always think of it as like a package of information. When you use it you're saying, give me all of this information. Right. So think about a struct as having fields, where each of those fields has a type. You're grouping up these types because you want them together in this specific group. The classic example: whenever I talk about a point on a plane, I'm always going to talk about a double X and a double Y, and those are going to be my field types. More generally, we can define a struct as having some fields a1, a2, all the way to ak, that have associated types. The syntax here is field name, colon, type, semicolon, next field name, colon, type, and so on, where every ai is a field name and every Ti is a previously defined type. We can also talk about defining arrays. We can define an array type as array, range, of some type T, where the range can be either single- or multi-dimensional, and the range is where we specify the length and how we address the elements. I'll get into this in a second with an example. But here we're specifying that all the elements in that array are the same type, and they're this type T, whatever we say here. Then we can talk about functions. A function type is written as function of T1, T2, all the way to Tk, returns T. This means that we're defining the type of a function where the types of the parameters are T1 through Tk and the return type is T. We'll get into this more on Wednesday.
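As a rough illustration, the four general constructors (pointer, struct, array, function) map onto C declarations as sketched below. The lecture's notation is language-neutral, so this is only one concrete rendering, and every name in it (`int_ptr`, `point`, `triangle`, `point_fn`, `cross`, `sum_over_edges`, `write_through_two_stars`) is made up for the example.

```c
#include <assert.h>

/* pointer to T: each extra star adds one level of "pointer to" */
typedef int *int_ptr;               /* pointer to int */
typedef int **int_ptr_ptr;          /* pointer to pointer to int */

/* struct: named fields, each with a previously defined type */
struct point {
    double x;
    double y;
};

/* array [0..2] of struct point: every element has the same type */
typedef struct point triangle[3];

/* function of (struct point, struct point) returns double */
typedef double (*point_fn)(struct point, struct point);

/* One function whose type matches point_fn. */
double cross(struct point p, struct point q)
{
    return p.x * q.y - p.y * q.x;
}

/* Applies f to each pair of consecutive vertices and sums the results. */
double sum_over_edges(triangle t, point_fn f)
{
    double total = 0.0;
    for (int i = 0; i < 3; i++)
        total += f(t[i], t[(i + 1) % 3]);
    return total;
}

/* Uses both pointer types: writes to n through two levels of indirection. */
int write_through_two_stars(void)
{
    int n = 5;
    int_ptr p = &n;
    int_ptr_ptr pp = &p;
    **pp = 7;
    return n;
}
```

Passing `cross` to `sum_over_edges` type checks only because its parameter and return types match `point_fn`, which is the kind of compatibility question the type system has to answer.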