So, how was Project 3? Awesome? Awesome because it's over? On to Project 4, awesome. Cool, yeah, I'll release those details today so you can get started on that one. Actually, it dovetails very nicely with what we're talking about today, type systems. Cool, alright. Let's see, any other housekeeping stuff? Alright, let's get back to it. Are they a third of the way to being graded? I have no idea. No, that would be extra work. [inaudible] Do you want me to say it's a high number or a low number?

Alright, so here we return to the example C code that we've been looking at to study memory errors. What are the specific types of memory errors that we looked at on Friday? A dangling reference? Yeah, what does that mean? From somebody else, who doesn't just say a dangling pointer. Yeah? Isn't it when we have a pointer pointing to something, and the thing it points to goes out of scope, so it's pointing to some memory address that could be overwritten? Could it also still have the same value as before? It could, yes. So a dangling reference in general means we have a reference, one of our pointers has a value in it, and that value points to memory that has been deallocated. But deallocation could have happened in two different ways, right? It could be deallocated due to scoping rules, so the memory that it pointed to went out of scope. Or, the second alternative, the programmer manually freed the memory, but we still hold a reference to it. Either way, as soon as that happens we have a dangling reference. It's not dereferencing it that makes it dangling; dereferencing it is a problem, but the pointer is a dangling reference in itself. Cool.

What's the second type? Dereferencing null pointers. Dereferencing null pointers, which is closely related to a dangling reference: in both cases we dereference a pointer that does not refer to valid, allocated memory.
What's another type of memory error, one that happens when your program, let's say, accumulates memory over time? Garbage, yes. Right, so what's garbage in relation to memory errors? A memory leak? Yes, and what does that mean? You don't have access to the memory anymore? Right, so we have memory that is allocated, but we have no way to access it. There is no series of pointers we can follow to reach that memory. And because that memory can no longer be accessed, we can no longer free it, so it's going to exist for as long as the program runs. Cool.

So we have this little program here. We have an integer pointer called dang in the main function. It gets the result of foo, and foo returns the address of x, where x is a local variable inside foo.

Is it considered a memory error if we inadvertently corrupt something through aliases? Does that fall into that category? If we have some pointers that are aliases, and we change the value in one place and it magically changes somewhere else and we don't see the connection, is that considered a memory error or is that just a mistake? I think I would file that as a bug, a bug due to aliasing. I don't think there's a specific name for it; you'd call it an aliasing problem. And I don't know that I'd necessarily lump it in with memory errors like these ones. These are very specific, very concrete things where we can say, yes, this is wrong because of this. With aliasing, it may be that you wanted to do that, right? The problem there is that your intentions as the programmer are different from what's actually in the code, whereas a memory error exists in the program and you can say definitively, just by looking at the code, yes, there's an error here, or no, there's not. I think that's it. Don't hold me to that. Maybe there's some fancy classification of all these errors.

Okay, so we have foo. foo returns an integer pointer. It's returning the address of x, which is an r-value.
We store that in dang, and then we print out the value that's inside dang and also what dang points to, right? We dereference dang and print out whatever it points to. Then we call bar, which has two other variables, y and z, and prints out y and z. And then we print out dang and *dang again.

We saw this on Friday, so we're not going to run it again; we just want the results. We saw that when we compiled this, it actually does give us a warning, which is nice. Compilers are pretty smart. It's telling us that function foo returns the address of a local variable, which we know is a problem because that local variable's address is only valid within the scope of that function, right? So by returning it, we're essentially guaranteeing that the memory of x is deallocated by the time we use the return value of foo.

When we run this on this specific machine, which is 64-bit, we got some memory address, then 100, and then we saw 10,000 and 0. It's actually what we expect so far, right? The address of x is inside dang, which is being output right here. And then what does it point to? It points to 100, which is what we expect. I mean, we expect it incorrectly, because you should not be able to rely on this. At this point dang is a dangling reference, so there are no guarantees about what that value is going to be when we access it. Inside bar, we see 10,000 and 0, right? y is 10,000, z is 0, and it prints out y and z, so 10,000 and 0 makes sense.

But then when we return, we see that things have changed, right? So how come dang itself, this first output, did not change? It's still pointing at that memory address; it's not going anywhere. So where is dang allocated? What type of allocation? Stack. It's allocated on the stack. On the stack, at some memory address that we don't know, which we can just call the address of dang for now, it holds the value 7FFE3E680FFC, which was the address of x, right?
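For contrast with the program being discussed, here is a minimal sketch, not the lecture's code, in which the 100 lives on the heap instead of in foo's stack frame. The names foo_heap and value_after_bar are hypothetical; bar follows the shape described above. Because a heap box stays allocated until free is called, the call to bar has no way to disturb it, which is exactly the guarantee the stack version lacks.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical heap-based variant of foo: the box holding 100 is created
   by malloc, so it outlives the function call. */
int *foo_heap(void) {
    int *x = malloc(sizeof(int));
    if (x != NULL) {
        *x = 100;
    }
    return x;
}

/* Same shape as bar in the lecture: two locals on the stack, printed. */
void bar(void) {
    int y = 10000;
    int z = 0;
    printf("y: %d, z: %d\n", y, z);
}

/* Returns the value seen through the pointer after bar has run,
   or -1 if the allocation failed. */
int value_after_bar(void) {
    int *p = foo_heap();
    if (p == NULL) {
        return -1;
    }
    bar();            /* bar's locals cannot overlap a live heap box */
    int seen = *p;    /* still valid: the box has not been freed */
    free(p);          /* heap memory must be released manually */
    return seen;
}
```

Calling value_after_bar() from main and printing the result should show the value is unchanged by the call to bar, as long as the allocation succeeds.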
And because these function calls don't change the value inside dang, inside this variable, it's clear that that output should not change. But we see that, maybe surprisingly, maybe not, what dang points to changes. And we'll see that this is not consistent across different implementations, like a different operating system. So that was on CentOS 6.7. Here it is on the Mac. Compiling it also gives a warning, which is nice. And when we run it, we see a different address, obviously, which makes sense. It prints out 100, the same thing. It prints out 10,000 and 0. Then it prints out the same address, and 10,000. So we can see that in this case what was the address of x is now probably the address of y in this call to bar. But we can't guarantee that that's going to remain the case. These are also going to be very tricky to debug, because you just have something that you're pointing to and it's changing seemingly at random throughout the program. Questions here?

Let's go through another example. Here we have two variables, dang and foo, inside main. dang and foo are both integer pointers. We're saying dang = malloc(sizeof(int)). So what's malloc going to do for us? It's going to give us a new memory location, right? It's going to give us a new box. And what does malloc return? What specific r-value? The address of the new box, exactly. So it creates a new box and returns the address of that box, and then we store that into dang. Is everything good so far? Can we say foo = dang? Also good. Then *foo = 100. Cool. Then we free foo. Is that also valid?

Let's do a box-circle diagram here, because I think that's going to help. Say we've gotten to right here. We have dang. We have foo. Boxes are attached to them, with circles for the names. On this line, malloc creates a new box, which I'm going to draw here. It has an address. Let's call it alpha.
So malloc(sizeof(int)) gives us four bytes or eight bytes, depending on what system we're on. It returns alpha, and we put it into dang. Then here we say foo = dang. So we take the value inside dang and copy it into the location associated with foo. It's going to put alpha here. So now, what is *foo in this diagram? Pretty close. What does star always return? A location, an l-value. Dereferences always return a location. So when I dereference foo, I'm actually talking about this box, the new one. I set that equal to 100, right? And then I have free(foo), and what happens? It deletes which box? foo's box, this one? No, the new one. So then what does *foo point to now? Nothing. Yes. Perfect. So at this point, what dangling references do we have? foo and dang, right? Exactly. So here, when I try to print out *dang, the problem is that even though it was foo I freed, dang still has this address in it, so dang is a dangling pointer at this point, a dangling reference. Cool. So it's going to output something.

Then I say foo = (int *) malloc(sizeof(int)). malloc creates a new box for me; let's call its address beta. It's going to copy beta into foo. Now when I say *foo = 42, what's it going to change? The value inside this new box. And then I free foo, the box goes away, and then I print out *dang. What do you think it's going to output? What can it output, semantics-wise? Anything. Why anything? Something else might have taken over that memory. Yeah, we're pointing to memory that's been deallocated, right? dang is a dangling reference, so dereferencing it now means there could literally be anything in there. The semantics make absolutely no guarantee about what that value is.

So let's see, all the way through here. Running this on CentOS 6.7, and unlike before, we compile this with all warnings enabled and we get nothing, right? Nothing to tell us that we're doing something wrong.
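The aliasing in this example can be sketched in a way that stays within defined behavior. The program on the slide reads *dang after free(foo), which has no guaranteed result, so the sketch below only reads through dang while the box is still allocated and then clears both names. The function name and the step of setting both pointers to NULL are assumptions added for illustration; the lecture does not prescribe them.

```c
#include <stdlib.h>

/* Returns 1 if the alias behaved as described and both names were cleared,
   0 otherwise, or -1 if the allocation failed. */
int aliases_cleared_after_free(void) {
    int *dang = malloc(sizeof(int));   /* new box, address "alpha" */
    if (dang == NULL) {
        return -1;
    }
    int *foo = dang;                   /* foo now holds alpha too: an alias */
    *foo = 100;                        /* writes into the shared box */

    int seen = *dang;                  /* read while the box is still valid */

    free(foo);                         /* the box is gone: BOTH names dangle */
    foo = NULL;                        /* clear every alias, not just the one */
    dang = NULL;                       /* that was passed to free */

    return seen == 100 && foo == NULL && dang == NULL;
}
```

The point of the sketch is that free takes an address, not a name, so every pointer holding that address becomes dangling at once, and the compiler gives no warning about any of them.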
So when I ran it, one time it output 0 and 0. Is that what you expected or not? What would you expect? Maybe that it would be the same, like 100, every time? Let's try a different operating system. Running this on the Mac, again, no warnings, right? This is very tricky behavior. It comes back to the aliases, right? We're freeing foo, and that automatically means that dang is now a dangling pointer. And now when I run it, I get 100 on the first output, which is kind of what we expect. But then the next time, I get 42. So why do you think that is? What was that? Yeah. When we talk about it symbolically, we said malloc returns some new address, some brand new box, which is the case abstractly. But what actually happens here is that once you call free, malloc is free to reuse that memory location. So the next time you ask, hey, give me four bytes, it says, great, I've got four bytes right here, here's alpha. And so by changing the value in there, we're changing the same memory location. Questions here? I thought those were kind of cool.

So it doesn't change the value with this compiler, but it does with the other one? Well, free and malloc are libc functions, so it actually doesn't have anything to do with the compiler itself. It depends on the library. So yeah, it could be different library versions. Some library versions could zero things out when they free them, but you want to run fast. By the C standard, you don't have to do that, so you'd just be doing extra work, and your malloc and free would be slower than somebody else's who doesn't do that. It may make sense for small things. But think about it: you could malloc a huge chunk of data, in the gigabytes range, and that will work. If you had to zero that whole thing out, you'd be wasting a lot of time.

Are you guaranteed a segmentation fault if you try to write to a dangling pointer?
Ooh, are you guaranteed a segmentation fault? No. Here I'm essentially accessing the dangling pointer. I'm reading, and I could also write to it the same way and it would work. A segmentation fault has to do with the operating system: it tells your program which memory segments it can use, and you fault when you try to access something outside of those segments. Here you're accessing something that your program has been allowed to access. It's inside a segment, the heap segment specifically. So there's no error from the operating system when you read or write to that. A segmentation fault can come from reading or writing.

If you write to a dangling pointer, is it still a dangling pointer? Now you know what it is and you know where it is. It is still dangling, because you shouldn't be accessing that memory. You are changing memory that is no longer allocated to you. You want to try it real quick? The worry is that at any point it could get written over. You're never guaranteed anything about what you put there. You could write to it and then immediately read from it, and you have no guarantee that it's going to be the same value you just put there.

Somebody shout out a number. [inaudible] I want to see what values it returns. This could be interesting. We want to change it. No errors still. So we can see here, the first output of the printf is the first thing that malloc returns. So this is some memory address on the heap, 7f8690c05390, whatever. Then we output 100; we're outputting what dang points to. Then we change it to 3, and on the next line, even though it's dangling, we're still just writing to memory that we have read/write access to. So this works and it outputs 3. Then we call malloc again, and we can see we actually get back that same memory address, because we called free. Then we can set *dang to 53, and it outputs 53 even after we've freed it. So if we output *foo right there, it's going to give us 53 as well.
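The address reuse seen in this experiment can be observed without touching freed memory. In the sketch below (the function name is hypothetical), the first address is saved as an integer before the free, and it is then compared with the address returned by the next malloc. Whether the two match depends on the C library, so the function reports the outcome rather than assuming one.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if the second malloc handed back the address that was just
   freed, 0 if it returned a different one, -1 if an allocation failed. */
int malloc_reused_address(void) {
    int *first = malloc(sizeof(int));
    if (first == NULL) {
        return -1;
    }
    uintptr_t first_addr = (uintptr_t)first;   /* remember the address as a number */
    free(first);
    first = NULL;                              /* do not keep a dangling pointer */

    int *second = malloc(sizeof(int));
    if (second == NULL) {
        return -1;
    }
    uintptr_t second_addr = (uintptr_t)second;
    *second = 42;                              /* valid: this box is allocated */
    free(second);

    return first_addr == second_addr;          /* library-dependent result */
}
```

On the machines used in the lecture the result would presumably be 1, but nothing in the C standard requires that, which is why a dangling pointer may appear to track a newer allocation on one system and not on another.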
Which one, right here? Cool. Because they're both pointing to that same memory location. Can you throw another malloc in there? Don't even assign it to something, because it gave you back the same one, right? Yes. So we can do that, maybe right here. Before? Sure. So now we can see that the memory address changed. Here it's 940030 and here it's 00020, and we can see they're not pointing to the same thing anymore, because we malloc'd more memory in between.

Let's see if we can be really tricky. What's the difference between those? We have to do hex math very quickly. [inaudible] Yeah. What is that? 16? 16. So we could do something very not good. We could take one pointer plus 16. I'm doing something wrong with it. We should be able to do this in one direction. Just keep experimenting until we find the correct way. [inaudible] Part of the problem is that it's doing pointer arithmetic when we add to foo. It knows that foo is an integer pointer, so it's moving it by a multiple of the size of an int. [inaudible] Yeah. Huh? But it's 16 bytes from the other one, and I think an int pointer should be 8 bytes. Yeah, so that's what I think. I think foo minus 2, but foo is the larger number. Yeah, I feel like that should work. [inaudible] Cool. Any questions on this?

[inaudible] This is from one of them. We have a program. We have a global variable q, and we have a scope here. We have an int pointer b. We see a = malloc(sizeof(int)), and what is this comment, memory 1, saying over here? It's giving the box an address. We're saying malloc is returning memory location 1, so that's the name we can use. So how many boxes do I already have before I get to that point, when I'm right here? Three boxes: q, a, and b. When I get to this line, malloc creates a box, and the comment is telling me that its memory address is 1.
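Stepping back to the pointer-arithmetic experiment for a moment: adding n to an int pointer moves it by n * sizeof(int) bytes, not n bytes, so a 16-byte gap corresponds to 16 / sizeof(int) steps (4 steps where an int is 4 bytes). The sketch below measures that scaling inside a single array, where the arithmetic is well defined; walking from one malloc block to another, as attempted live, is not. The function name and the array size are assumptions for illustration.

```c
#include <stddef.h>

/* Returns how many BYTES an int pointer moves when advanced by `steps`
   elements. `steps` must be between 0 and 8 so the pointer stays inside
   (or one past the end of) the array. */
ptrdiff_t bytes_moved(int steps) {
    int cells[8] = {0};
    int *start = &cells[0];
    int *moved = start + steps;              /* scaled by sizeof(int) */
    return (char *)moved - (char *)start;    /* distance measured in bytes */
}
```

For example, bytes_moved(4) gives 16 on a system with 4-byte ints, which matches the gap between the two addresses in the experiment.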
So I know I have some new box here, memory address 1. Do I know what's inside that box? Nope. No idea. But the return value of that call is stored where? In a. So there's a 1 inside of a. I then execute this next line, where I'm doing the same thing. I'm mallocing a new memory location, 2, and I'm storing that inside of b. Then I set *a to be 42. So how's that going to change my diagram? The value inside memory 1 is going to be 42. So one of the things you should be able to do is draw the box-circle diagram at these various program points. And I can even draw *a and *b and label those lines as well. Is that required? Possibly; it depends on what the question says.

After this next line, I say b = (int *) malloc(sizeof(int)), memory 3. So I just called malloc and created a new location. How else does this diagram change? What else do I do? Put it into b. And then I say *b = *a. What's *a? 42. The value associated with this location gets copied to where? Memory 3. So this is *b, and I'm copying 42 into there. Then I say q = &a. We didn't give these variables addresses, so we'll just call them address of a, address of q, address of b. I think they'll usually be given in a comment. So how does q = &a change things? q now points to a's box.

So now we're at program point 2. What memory is garbage at point 2? Memory location 2. What are the aliases at program point 2? *q and a. So *q points to here, and a also refers to that same box. So at point 2 we have *q and a; these are aliases. What else? What about *a and *b? Are those aliases? Why not? Aliases are different names for the same box. Here *a refers to one box and *b refers to another. Even when they have the same value, they are different locations. Cool.

Now we get to point 3. Except q and a, all the other memory blocks are deallocated.
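For reference, the program being traced can be reconstructed from the spoken description roughly as follows. This is a sketch under assumptions: the exact layout of the original, the wrapper function trace_program, the NULL checks, and the cleanup at the end are not from the lecture, and the comments summarize the answers worked out in the discussion.

```c
#include <stdlib.h>

int **q;   /* global: allocated for the whole run of the program */

/* Runs the traced statements and returns **q at program point 3,
   or -1 if an allocation failed. */
int trace_program(void) {
    int *a;                            /* local to the outer scope */
    {
        int *b;                        /* local to the inner scope */
        a = malloc(sizeof(int));       /* memory 1 */
        b = malloc(sizeof(int));       /* memory 2 */
        if (a == NULL || b == NULL) {
            free(a);
            free(b);
            return -1;
        }
        *a = 42;
        b = malloc(sizeof(int));       /* memory 3; memory 2 is now unreachable */
        if (b == NULL) {
            free(a);
            return -1;
        }
        *b = *a;                       /* copies 42 into memory 3 */
        q = &a;
        /* program point 2: *q and a are aliases, **q and *a are aliases,
           memory 2 is garbage */
    }
    /* program point 3: b is deallocated, memory 2 and 3 are garbage,
       there are no dangling references */
    int result = **q;                  /* follows q to a to memory 1 */

    free(a);                           /* cleanup added for this sketch only */
    q = NULL;                          /* a is about to go out of scope */
    return result;
}
```

Memory 2 and memory 3 are leaked on purpose here, since that is what the traced program does; only memory 1 is still reachable at the end.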
All the memory blocks are deallocated except q and a? So like 1, 2, and 3? No. So which specifically is deallocated? b. Cool.

A question about the aliases: can we say that **q and *a are aliases? Yes, that's a good point. **q and *a are also aliases. And since this happens on tests, we've seen it a lot: are q and &a aliases? No. &a is an r-value; it can't be an alias. What definitely does not count, and I guess we should get this clear although I've never had it come up, is something like *&a and a. That doesn't count. [inaudible] It would not be clever at this point, since I just warned you. Cool.

So at program point 3, what memory is garbage? 2 and 3, right? The way to think about it is to ask: starting from the variables that I know, is there any way I can follow their pointers? What memory can I access? Starting with q, I follow it to a and then get here, to memory 1. Starting from a, I can follow it here. But there's no way I can get to and access 2 and 3. It's impossible. So 2 and 3 are both garbage, because at program point 3, b went out of scope. What about dangling references? Do I have any? No. Are the aliases the same? Right: *q and a, and **q and *a, are still aliases. Now there's no more code.

Is memory 3 still in scope? Memory 3 was manually allocated on the heap by the programmer, so it has to be manually deallocated with a call to the free function. But because b is on the stack, as soon as we leave its scope it is automatically deallocated. The same goes for a: when we return from main, a will be gone. q will remain for the whole run of the program; it's global allocation.

We talked about various types of semantics, right? We talked about assignment semantics. One of the semantics that we saw was for a = b.
Copy the value in the location associated with b to the location associated with a. Is this true in every single programming language you've ever used? No? Which is it not true in? Semantics-wise; that's a syntax issue. [inaudible] So what about if I did something like this, assuming there's a print function. It prints another test? But why? Semantics-wise, what should happen here? They're referencing the same object. Referencing the same object, but I don't have pointers in Java. Is it a syntactic thing? They're all essentially pointers, or references. Oh, it's a syntactic thing? You want to argue that a little bit more? I mean, they're just already references. You don't have to declare them as pointers. It's just how it works.

But then what if I change this? Well, it's going to mess it up, but let's consider a very similar example. Those are primitives, not objects. So make it capital-I Integer, the object wrapper around the primitive. There you go. This doesn't really work, right? So what's it going to output? I actually don't know. My Java is a little rusty. What am I doing? How do I get an output? Is that it? [inaudible] So it outputs 42, 42, and 25. [inaudible] 42, 42, 25. That's interesting. Objects are tricky here. We'd have to have attributes, and we'd have to be able to change the attributes of the object. That's why it's not changing. String is also like that. Isn't String a primitive? String is an object in Java. String is an object in Java? They're still immutable, like Integers.

So, okay. To make it even more confusing, right, this actually follows the assignment semantics that we talked about. It's essentially copying this value 42 into bar, and they are both separate locations that store values inside them.
Whereas, even though we're probably not going to go through and run it, what would this output, assuming an Object has some baz field? It should output another test, right? But that means the copy semantics are different. Not only different from this Integer case; they're different from what we normally think about when we think about copy semantics, right? If assignment were using copy semantics like we talked about, it would copy the object foo into the object bar. We'd get a brand new copy of that object, and then changes to one object, like here where we're changing bar, should not change foo's baz.

And then Java of course confuses things, because not everything is a class: some things are primitives, and some things are class wrappers around primitives, just to make it even more annoying and weird, and some data types are mutable and some aren't. Actually, if you want to really explore object-oriented programming, I suggest you check out Ruby. Literally everything is an object in Ruby, even numbers. Numbers have methods. The syntax of this isn't exactly right, but every integer has a method called upto, which does a loop: it goes from that integer up to whatever parameter you give and executes a chunk of code each time. So it's like a custom for loop built into every integer, and they have other things too. It's kind of nice because it strips all that away and says, okay, let's not deal with this nonsense that Java has, where sometimes you have Integer objects and sometimes primitives; let's treat everything as an object.

Okay, so what does Java actually do? When we see this variable foo, what is it doing? Does this completely change things? Do we have to change the way we draw our diagrams or think about box-circle diagrams and all that? So in the case of objects, what is it doing here? There are actually two ways to think about this. One way is to think about what Java is doing under the hood.
Under the hood, foo and bar are pointers, and every time you access any field of foo or bar, or pass it into a function, you're passing a pointer to that object.

Another way to think about it, which works out the same, is to change the way we think about assignment semantics. Normally when I say I have foo, that means it has a location associated with it. Here, when I declare an object foo, I just have a name foo. When I set it equal to a new object, the new expression actually creates the new location, and the assignment operator in this case does the binding: it says bind the name foo to that object. Then here, when we see object bar, we just have a name bar, with no location associated with bar yet. This line sets foo.baz to test; we'll just put testing here. And now when we say bar = foo, the assignment semantics do not mean copy. They mean bind the name bar to the same thing, and so it's bound like this.

So you can see it in two different ways, depending on whatever is easiest or best for you when thinking about the semantics of Java. But it does work this way. It's called sharing semantics. You're allowing your semantics to say that the assignment operator rebinds names to locations rather than copying things around. So we have a and b. We can say a is a new object: we create a new object and bind it to a. We can say b is a new object. We can say a is now another new object. And then finally we assign b to a. And so now we know exactly what happens when we manipulate the objects that are associated with either a or b. Cool. Questions on this?

So if you change something through b, it'll also change the value seen through a? Exactly. So if you want to copy an object, you need to specifically clone that object to get a brand new object. But does the assignment clone it or just point to it?
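The difference between copy semantics and sharing semantics can be modelled in C, following the "under the hood they are pointers" view described above. This is only an analogy, not how Java is implemented: struct Obj, its baz field, and both function names are assumptions made for the sketch. Assigning one struct to another copies the whole box, while assigning one pointer to another only copies the reference, so both names end up bound to the same object.

```c
#include <string.h>

/* Hypothetical stand-in for an object with a single baz field. */
struct Obj {
    char baz[32];
};

/* Copy semantics: bar is its own location, so changing bar leaves foo alone.
   Returns 1 if foo still holds its original value. */
int copy_keeps_foo(void) {
    struct Obj foo;
    strcpy(foo.baz, "testing");
    struct Obj bar = foo;                 /* the whole struct is copied */
    strcpy(bar.baz, "another test");
    return strcmp(foo.baz, "testing") == 0;
}

/* Sharing semantics, modelled with pointers: foo and bar name the same
   object. Returns 1 if a change made through bar is visible through foo. */
int share_changes_foo(void) {
    struct Obj object;
    strcpy(object.baz, "testing");
    struct Obj *foo = &object;
    struct Obj *bar = foo;                /* only the reference is copied */
    strcpy(bar->baz, "another test");
    return strcmp(foo->baz, "another test") == 0;
}
```

Both functions return 1: in the first, foo is untouched by the write to bar; in the second, the write through bar shows up when reading through foo, which is the behavior the Java object example exhibits.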
Essentially, a and b are referring to the same object, so any changes made through a will be reflected when you access b. Pass by reference? We're going to get to that when we get to the runtime environment. I'd like to go a little more in depth into the runtime environment, so we can talk about exactly what the stack looks like and where the compiler puts things in memory, and then that gets into how parameters are passed and all those kinds of things. So we'll learn about pass by value, pass by reference, pass by name, and the different ways the compiler can do those.

For now, we're going to switch to type systems. So how are type systems and semantics related? What is a type system? Good question. How an operator works could change: the built-in operators may behave differently depending on the types of their operands. In some languages, with the plus operator, if its arguments are integers it will do addition, and if its arguments are strings it will do concatenation. There are some languages that are super annoying, where plus is only defined for integers, so if you want to add floats you have to use a different operator. That makes it incredibly annoying to program in those.

So what are type systems useful for? Do you wish you'd never had a type system and didn't have to worry about it? Probably? Typed languages are safer. Safer how? You could have made your computer explode in your face? Have you ever had a computer explode in your face? Well, maybe if you have a Galaxy Note 7. [inaudible] So do you like it or not like it? I think that's a fundamental question. I personally like it. One nice thing about statically typed languages is that you can do a lot of refactoring, and your development tools can take advantage of the types. So there definitely is a lot of benefit.

Okay, so we're talking about type systems. The idea is that if every variable has a type and every function
knows what type of parameters it takes, you can actually do pretty cool things. You can say, in this method, change this variable to a different type, or change the types a function takes, and do all kinds of cool refactoring. In Java, some of the fancy stuff is that you can highlight some part of your code and say, refactor this into its own method, and the tool will figure out the inputs and outputs depending on what uses what.

Now, what is a type system? Does it help you be more expressive? Does it restrict what you're trying to do? It keeps you from writing programs that do weird or unexpected things. So semantics is about very precisely defining exactly what everything in the programming language means. With type systems, we're now saying, okay, we know what the program does, but what kinds of things are allowable? I actually think of it as restrictive. The type system tries to stop you, the programmer, from doing things that could end up harming yourself, like trying to add a string to a number, which doesn't make any sense. Or, in Python, you can multiply a string by a number, which actually is handy but also makes no sense: it repeats that string that many times. Like, what happens if you have a negative number? See, it doesn't make sense. It's just weird.

What's one of the drawbacks? You guys have just been praising strongly typed languages. It's slow to write. It's what? Slow to write. Why? You have to type it all in. If you think about it, you're annotating, right? The computer doesn't necessarily need to know the types of all the variables, but you're having to tell the computer, for every variable in your program, what you think the type of that variable should be. So you're literally typing. What else? Speed. Memory management. In what sense?
If you know the type of something, you know how much space to allocate for it, and it'll never need more space than that. Because if an object can randomly change types, if you have a variable that in an instant changes from an integer to a string, you're going to have to figure out how to allocate more space for it. So there actually are performance benefits. What else? You write a bunch of code and it gives you very cryptic type warnings and type errors. Casting always works? Casting is the way that you can intentionally trip yourself up: you say, I definitely know what I'm doing here, and it turns out five hours later that you didn't, and you have to go back and take that out and fix it. So there are pros and cons, and I want you to think about that.
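The memory-management point can be made concrete with a small sketch. With a static type, the compiler knows a variable is an int and reserves exactly sizeof(int) bytes. A variable that may hold an integer at one moment and a string at the next needs, at minimum, a tag saying what it currently holds plus room for the largest alternative. The struct below is one hypothetical way a runtime might represent such a value; the names and the 16-byte string buffer are assumptions, and real dynamic languages use more elaborate schemes.

```c
#include <stddef.h>

/* Which kind of value a dynamically typed variable currently holds. */
enum Tag { TAG_INT, TAG_STRING };

/* Hypothetical representation of a variable whose type can change at
   run time: a tag plus enough space for the largest alternative. */
struct Dynamic {
    enum Tag tag;
    union {
        int as_int;
        char as_string[16];
    } value;
};

/* Space needed when the type is known statically to be int. */
size_t static_int_size(void) {
    return sizeof(int);
}

/* Space needed when the variable may be an int or a short string. */
size_t dynamic_size(void) {
    return sizeof(struct Dynamic);
}
```

Comparing static_int_size() with dynamic_size() shows the fixed overhead, and every access to a struct Dynamic would also have to check the tag first, which is part of where the speed cost comes from.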