Today is the 7th, so I will repeat. Thanks, I appreciate that. The slides should now be online, ready to go. Are there any other questions that are not about not being able to access the slides?

Okay, so now we're going to start right where we left off, I believe, on Monday. So we're talking about type systems. We're talking about how a language designer, how a language, defines what types are valid and what operations on those types are valid. And we talked about the four things that are necessary for a type system. Anybody remember what those four things are? Maybe you can go back in the slides you're looking at right now. What's the first thing? What do you need in any type system? Basic types. Yeah, you've got to have basic types, right? Otherwise you can't build any other types on top of that. So you need basic types. What do you need on top of that? Type constructors, yes. The title of this slide, good guess. The third thing, what was it? Type inference. Yes, we're going to get into that. And the fourth one, type, what was it? Declaration. Type declarations? Sorry. Compatibility. Yeah, I was thinking it's too small to see. Okay, yeah. So those are the four things that we need.

Okay, so now we're talking about type constructors. So how do we as programmers say we want new types in the system? One way is we can just use a basic type. Another is we can declare that we want a pointer to some type T. This is the syntax we're going to use for now, talking about type systems in the abstract: pointer to T, where T is some type that's already been defined. Okay, the next thing is structs. So we're going to talk about structures.
And so here we're going to represent that very similar-ish to C, where we have curly braces, then we have various fields, a1 through ak, and each of those fields has an associated type. So ai is the field name and Ti is some type. It could be previously defined. Questions on structs? Right now we're just defining how to build them. So you can build structures by combining any number of previous types in any order, which is fun.

Okay, then we have arrays. So we have an array of a specific range of some type, where the range can be single or multi-dimensional, and we'll see in a little bit today an example of what this range looks like. So does C have a way of defining an array of a range of a type? Or C++? Yes, no, there's no wrong answer. There are a lot fewer of you today, so there should be more interaction, right? So the question is, can you define an array of a range of values, either single or multi-dimensional, in C? You mean like, how much memory we have? No, can you define a type that is an array? An array of some type. You can guess, it's okay to guess, we can talk about it. Yes, no? Anyone in the class can guess. Yeah. I vote yes. You vote yes.

So how do you define an array of a range of types in C? So I'm assuming an array means a memory space. Is that what we're talking about? A range would be the number of elements in the array, and you can even say how to index those elements. If it's zero through five, it would be a six-element array. If it's one through five, it would be a five-element array. Can you have negative indices? So can you do negative five to five or something like that? Yes. So yes, you can? Not so much. Yeah. I don't think you can do the negative five. You definitely can't do the negative. Can you set a lower bound on the index of an array? You can just set the upper bound. You can set the upper bound, right?
But what does that type actually compile to? So when you declare an int, then some name, then bracket 10, does it actually have a type of int array? It's like an int with a size of 10. Strictly, the declared type is an array of 10 ints, but in almost every expression, and whenever you pass it to a function, it decays to an int star. So what you end up working with is just a pointer to a chunk of memory that happens to hold 10 ints. So C is a little bit weird, because you can kind of define this, but the array-ness mostly disappears when you use it, and you never get to choose the lower bound. Whereas in other languages you actually can. You can say, hey, here's an array, it's specifically got five elements, and you can even give the indices of that.

It's actually a const star, right? Is it a const pointer? Yes. It's a const star. Yes, I believe that's... And so you can't assign a star to it. Assign a star to it? Like if you wanted to make the array equal to some other int star, you can't do that. It'll fail at compile time. Possibly. Okay, so it's adding other type information on top of the pointer, right? Okay, that's good. Yeah, we should test that.

Okay, and we can also define functions, right? So functions have a type. What would the type of a function be? How would you want to type a function? Return type. Return type? Definitely important. Is that all you need? Yeah. A parameter type? Yeah, all the parameter types, right? So those all together can make up the type of a function. So here we're just declaring that we have a function, and it accepts parameter types T1 through Tk, and returns a type T. So this is saying that the type of whatever we're declaring as a variable is a function, and it has these input parameter types, T1 through Tk, and it returns some type T.

Oh, no, I don't have animations on the next slide. Okay, we'll be fine. Okay, so there are two kinds of declarations. When we declare, we can declare, in some sense, aliases of types, or we can declare a variable of a certain type.
So one way to do it, in a language like C, is typedef. So we can say, hey, typedef cm, so here this would be centimeter, which has as its base an integer, right? So as programmers we can define new types, even if they're not combinations, even if they're not pointers, structures, arrays, or functions, right? We can create new types. So here we've created a type centimeter that's really an integer. We're creating a type RGBA, which is an array of zero through four, oh, I think I may be off by one there, of type integer. Yeah, that's fine. So it's an array of integers, and that's how, in this type syntax, we're specifying the range. But I can build on that type, right? So I can build an array of zero to 256 of RGBA values, where each element in there is an array from zero to four of type integer. So questions on that? Declared types. We can declare new types, which is important.

Okay, we can also have types with no names, right? So we can have anonymous types. So here we're saying that we're declaring a variable x. And what's the type of x? What was it? Almost. So you can think about it kind of like C, where we have the type, and then we have the variable name. What was it? Yeah, so it's an array of zero to four, so it's a five-length array from zero to four, and each of those elements is type int, right? And we're calling it x. But the difference is, up here we defined a new type, so we gave this type a name so we could refer to it later. Here we're just saying that, hey, we have this variable x and it has this type, but this type has no name. So this is definitely possible in C as well. Did you ever wonder why you have to put a semicolon at the end of a struct definition? Is it C and C++?
Yes, because it's the exact same way you declare a variable: you're technically declaring a type, you can give the struct type a name, and you can also declare variables of that type right there. So that's why you need the semicolon, whereas you don't need it with functions because those are something completely different. So this is actually valid C syntax, you can go put this into C or C++ and it will compile. So here we're defining a variable y, and what's the type of y? What kind of struct? Yeah, a struct that holds an int in field a and a char in field b. Does this structure have a name? Oh, so y is the variable name, so we're declaring the variable. If I wanted to give the type a name I'd put it here, I'd say struct something, and then I'd add the definition. So this is where anonymous comes in, right? Here we have a type with no name, and we can still define a variable as having that specific type.

Do you think it's useful? Yes. Yes, when is it useful? Yeah, if you only ever need one of that type, it could be useful to define it maybe for a specific function. When would you be in that situation where you only need one variable of a type? A shrug? You started off with it's useful and then... How would that be useful? It could be useful. Yeah, it could be a singleton, maybe you only ever need to declare it once, that's good. You could use it as a singleton, maybe to define some global state or something like that, in a game or something, but then you'd have other ways of saving that state and loading it.

Okay, so does everyone get the difference here between these two kinds of declarations? So what do these three all have in common? Arrays? This one doesn't have an array. Close, yeah. Yeah, they have names. So they're all types that are defined and they have names. Whereas these types here don't have names. There are variables that have these types, but the types themselves do not have names.
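As a rough sketch of the abstract syntax used so far (not taken from the slides), the type constructors and the named versus anonymous declarations can be modeled as plain data. The class names, the two dictionaries, and the type name `IMAGE` are assumptions made for illustration; the lecture does not name the array-of-RGBA type.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Basic:
    name: str                      # a built-in type such as "int" or "char"


@dataclass(frozen=True)
class Pointer:
    target: object                 # pointer to T


@dataclass(frozen=True)
class Array:
    low: int                       # first index of the range
    high: int                      # last index of the range
    element: object                # array [low..high] of T


@dataclass(frozen=True)
class Struct:
    fields: Tuple[Tuple[str, object], ...]   # ordered (field name, type) pairs


@dataclass(frozen=True)
class Function:
    params: Tuple[object, ...]     # T1 .. Tk
    result: object                 # T


INT = Basic("int")
CHAR = Basic("char")

# Named types: each declaration binds a name to a type expression.
type_names = {"cm": INT, "RGBA": Array(0, 4, INT)}
type_names["IMAGE"] = Array(0, 256, type_names["RGBA"])   # hypothetical name

# Variables with anonymous types: the type expression appears only in the
# variable declaration, so there is no type name to refer to later.
variables = {
    "x": Array(0, 4, INT),
    "y": Struct((("a", INT), ("b", CHAR))),
}

print(sorted(type_names))                        # the types that have names
print([name for name, _ in variables["y"].fields])
```

Nothing here decides whether two of these types are compatible; it only records which types carry a name and which do not.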
Okay, so maybe you're thinking this is really pedantic and crazy and weird, but it is very important when we come to talk about type compatibility. So type compatibility really boils down to what assignments are allowed by the type system. So the question is, if you have an assignment where a gets the value of b, or a is assigned b, is this allowed by the type system? So if a is an int and b is a float, is it allowed? You've answered enough. Anybody else? Yeah, it depends on the language. Yeah, so what language would this maybe be allowed in, and where would it not be? Ruby would allow this? I actually don't know, that would be a good... Ruby does allow a lot of things. That's a good point. So that would actually not surprise me. In C you can do this as well: you can assign a float to an int.

So what's the big problem with allowing this? Who said that? Lost information. What information are we losing? Stuff that a float can store that an int can't. Yeah, so a float has some decimal part, which you really can't represent in an integer, and integers don't have a decimal part. So fundamentally, if you just think about it like that, you're losing some information. So in this case, it's up to the language designer to say, hey, do I want to allow this? And just say, hey, you have to be careful: if you're ever assigning a float to an int, you're going to lose those decimal places. The other thing is, well, how does this conversion happen? Is the float rounded to the nearest integer? Is the decimal part just thrown away? So these are all decisions that have to be made. And if you allow this, you have to be very clear in the language: hey, this is exactly what happens.

Let's go the other way around. No, can't do that? Why? You started off by saying no, and you were so sure. Did you talk yourself out of a no? Yeah.
Do you want to defend the yes now, or do you want something else? Yeah? I mean, doesn't C define a float very differently than it defines an int? Because you just use basic binary to define an int, but when you're defining a float, the way that it works with the decimal point, I forget, there's a certain way that it does it differently, right? So the question, if I can sum that up, is whether floats are defined in some weird way as opposed to ints. So at the basic level, they're all just bits, right? And there are 32 bits, I believe, in a float and an int, but I actually don't know if that's true, if it's 32 or 64. But I think what you're referring to is that the int is going to be stored in two's complement integer form, whereas the float is going to be stored in whatever the IEEE floating point specification says, because it has to keep track of what's the decimal part and what's the non-decimal part, which has a name which I don't remember right now. But if the compiler knows how to convert between an int and a float, it can insert the instructions to do that here.

I guess another way to think of it is, are you losing information going from an int to a float? No? I mean, yes? So yeah, the big question is what are the ranges of floats and what are the ranges of ints? And is there any possibility where max int can't be represented as a float? So this I don't know 100%. We'd have to look that up. Are you shaking your head from experience? Yeah, I'm almost sure that you could get an int into a float without losing precision. So you can do it without losing precision? Yeah, because every int can be represented, it just requires more data. Okay, these are all good points. So the real thing, as it goes back to the original answer, is that it depends on the type system, right? So the type system either allows it or disallows it.
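The open question here (can every int be stored in a float without loss, and what happens going the other way) can be checked with a small experiment. This is a sketch under stated assumptions: a 32-bit int and a 32-bit IEEE 754 float, which is common but not guaranteed by every C implementation. The helper name `through_float32` is made up for this example, and Python's `int()` is used only to show one possible conversion rule, truncation toward zero.

```python
import struct


def through_float32(n: int) -> float:
    """Store n in a 32-bit IEEE 754 float and read it back."""
    return struct.unpack("<f", struct.pack("<f", n))[0]


INT_MAX = 2**31 - 1      # assumed 32-bit signed int

# int -> float: exact up to 2**24, not guaranteed beyond that.
print(through_float32(2**24) == 2**24)           # exact
print(through_float32(2**24 + 1) == 2**24 + 1)   # not exact
print(int(through_float32(INT_MAX)), INT_MAX)    # max int changes value

# float -> int: here the fractional part is simply thrown away.
print(int(3.9), int(-3.9))
```

Under these assumptions a float has only 24 bits of precision, so large ints are rounded on the way in, and the float-to-int direction drops the fractional part rather than rounding to nearest.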
I think in C it would allow this and do it automatically. But I'm not 100%. Shouldn't it allow it, though, because I think a float type is given more memory than an int type? In that case, if the compiler or whatever knows how to convert between ints and floats, then basically you're just getting more space to store that int into. So yeah, I think that's the big question, whether it's more space. I'm not sure it is. On most platforms a float is 32 bits, the same as an int, and it's the double that's 64 bits, so a very large int may not be exactly representable as a float. But we'd have to look at it. Anyway, that's the whole point, right? Because we'd have to look at it, and you as a language designer have to define, hey, what things are actually allowed? Are these types compatible? Can I assign one to another? And if you do allow that, well, then you have to define how that conversion takes place for the built-in types.

Okay, type inference. Type inference is very important. This is actually one of the coolest things, and we're going to get into it a lot in maybe the next lecture, but this is how the compiler or the interpreter is able to infer the type of an expression. So remember, expressions are sequences of operations, right? They're very broadly defined. And so, for instance, what is the type of a plus b? What's the type of the result of that expression? Is it explicitly specified by the programmer here? No. So, assuming that the types for a and b are already declared, can we infer what the type of a plus b is? Or is it the same? Yeah, so whether it's allowed or not depends on what the language semantics are and what the language type system says. But specifically, that's what type inference rules talk about. So, okay. We have a plus b. Can we infer what the result of that type is?
Because we may be using that in another expression, right? We may do a plus b times c divided by 20. So we want to know what the type of the result of that expression is. So if the type of a is integer and the type of b is float, what's the type of this expression in, let's say, C? Float. Why? Because it retains the decimal portion. Because it retains the decimal portion? So, yeah, if we cast the float b to an integer and then did the addition to get an integer, well, then we've lost the decimal value that was originally in b. So in C, it actually returns a float. But different languages do this completely differently. In languages like the ML family of languages, it's actually an error. They have different operations, like addition and division, depending on whether the operands are integers or floats. And that forces the program writer to be very aware of what the types of their variables are and how they can be used, so that this accidental type conversion doesn't take place when they don't intend it to.

What about this one, a times b? So that's an expression. Yeah, a question? Doesn't it matter what you're storing that expression into? I mean, if you said int c = a + b versus float c = a + b, wouldn't it depend on how much memory you're allocating? So the question is, does it depend? Remember, right now we're not talking about memory at all. We're talking about abstract type systems. And in the type system, each type defines a set of values that that type can represent. So specifically here in type inference, we're just saying, hey, what's the resulting type of this expression? Then, to decide on that assignment, we'd go back to the type compatibility rules for that language. Is the type of the result of the expression compatible with whatever you're trying to put it in?
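The two inference policies just described for `a + b` can be sketched as two small functions over type names. This is a deliberately simplified illustration: the function names are made up, only `int` and `float` are modeled, and the ML-style rule is reduced to "operands must have the same type" rather than modeling separate integer and floating-point operators.

```python
NUMERIC = {"int", "float"}


def infer_add_c_like(left: str, right: str) -> str:
    """C-like rule: a mixed int/float addition has type float."""
    if left not in NUMERIC or right not in NUMERIC:
        raise TypeError(f"cannot add {left} and {right}")
    return "float" if "float" in (left, right) else "int"


def infer_add_ml_like(left: str, right: str) -> str:
    """ML-like rule (simplified): no implicit conversion between operands."""
    if left not in NUMERIC or left != right:
        raise TypeError(f"cannot add {left} and {right}")
    return left


print(infer_add_c_like("int", "float"))     # float
print(infer_add_ml_like("int", "int"))      # int
try:
    infer_add_ml_like("int", "float")
except TypeError as err:
    print("type error:", err)
```

Note that neither function looks at where the result is stored. Whether `int c = a + b` is accepted is a separate type compatibility question about assigning the inferred type to `int`.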
All right, so going back to this, if I said int c = a + b, a + b returns a float, and you're trying to store a float in c, so that would be fine? Yeah, it would be fine. It would depend on whether a float can be assigned to an integer.

So a times b, what if a is a string and b is an integer? What's the result type of this expression? It depends on the language, like in Python. It depends on the language, so give me a language. Python. Python. It's a string. Yeah, so what is it going to be in most other languages? An error. An error, right? In most languages, if you try to multiply a string by an integer, it really doesn't make sense, so it's going to be an error. Python is very weird, though it does come in handy sometimes, in that it's actually going to return a string: it's going to return the string a repeated b times. So the result type is string. But the point is that the type inference rules here are different depending on the language that we're talking about. Perl does that too. That's probably where Python got it from, is my guess, but it still seems very evil. Ruby does it as well. Interesting. Any other languages? What about JavaScript? I think JavaScript does not. I think I tried it in JavaScript. I thought it would have worked, actually, but surprisingly it doesn't.

Okay, so we just talked about type inference, right? Type inference is how we infer the type of the result of an expression. Now, type compatibility, and this is where we're going to go into detail. The principal question in type compatibility is what types are actually equivalent, meaning what does the type system think of two types being equal?
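For the string-times-integer case mentioned above, Python's behavior is easy to confirm directly. The variable names are just for illustration.

```python
a = "ab"
b = 3

result = a * b                    # repetition: the string a, b times
print(type(result).__name__, result)

# The rule is specific to integers; a non-integer count is rejected.
try:
    a * 2.5
except TypeError as err:
    print("type error:", err)
```

So in Python the inferred type of `a * b` for a string and an integer is a string, while multiplying a string by a float is a type error.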
So if we have a type centimeter, which we define to be an integer, and we have a type inch that we define to be an integer, and we declare a variable x that has type centimeter, and we declare a variable y that has type inch, can we assign x = y? There's a no. Why no? Yeah, but they're still integers. Right, but we don't see that part. We only see that it's a centimeter. So give me an argument for why you wouldn't want them to be assignable like this. Because one centimeter has a certain length, and that's different than one inch. I mean, one inch is like two and a half centimeters. Right, so I'm representing semantic information in my program, where the variable x is some number of centimeters. It really doesn't make sense to be able to just take 10 centimeters and copy that into a variable that's inches, because then that's 10 inches, which definitely isn't the same value.

Yeah? Is that handled by the processor? This is handled by the compiler or interpreter. After it does all the parsing and checks all the semantics, then it checks the types, basically. So that's another step in the chain of things. Yeah? Where is the information stored when a conversion is allowed, when it's converting from one type to another? Is it stored in the inches or is it stored in the centimeters? I don't understand. So, you know how you can change a float to an integer? Ah, okay. And you do that by writing a specific conversion in that particular class, so is it stored in the inches or in the centimeters? I think that's definitely an implementation-dependent question, because, yeah, it depends specifically on the implementation. For basic types, those are defined in the language, right? So the language has to define which types you can convert from one to the other without saying anything.
And then if you do it explicitly, exactly what the semantics are for all the basic types. And then, for types that are defined like this, that's a whole other level that completely depends on the language itself. In a language like, let's say, Scala, if you define a way to convert between centimeters and inches, it'll actually allow you to do this and automatically apply that conversion to this assignment statement.

So, okay, we found out here why this assignment statement shouldn't be valid and shouldn't type check. But are there examples, when would you want it to? Yeah. Like in C, you want to give ultimate control to the programmer, so if he chooses to do that, then he has the right to do that regardless. Right, so I guess the C argument can be that it's a low-level language, you want more direct access to memory and all those things. Another argument would be, it could be really annoying. You'd probably not define types like this if every time you tried to use them or convert between them, it gave you an error. Does anybody actually do this when they write their programs, or maybe projects in this class? Did you define special types for everything, like, I don't know, the enum that represents whether a symbol is a terminal or non-terminal? Yeah, some people. I'm trying to think of other examples where this might be good. So it's good in the sense that by stopping this, right, we can restrict the set of programs that the programmer writes, and so maybe we're forcing them to write safer programs. But then the flip side is, we could let them have their way, and maybe the programmer does want to convert directly from some integer to another integer, and wants to do that, so.
So what we're getting to now is all the different cases, all the different kinds of type compatibility that define whether this is allowed or not allowed. So the first one is name equivalence, which is exactly what it sounds like. The types have to have the exact same name to be equivalent, right, otherwise they're not equivalent. So in the case we saw previously, we have a type centimeter, a type inch, x, y. So can we assign x to y, or an inch to a centimeter? Why? Yeah, they're not the same type, they don't have the exact same name, so that's going to error out, and it's going to block this.

So now the question is, what happens with anonymous types, types that have no name? So here we're defining a as an array of 0 to 4 of integers, and then we define a variable b as an array of 0 to 4 of integers. So now I'm going to ask the question, can we assign a = b? Intuitively, what do you think it should be with name equivalence? Defend your position? Yeah. Why? So yes, but we're using name equivalence. What's the question here? Yeah, what's the name of the type of a, and what's the name of the type of b? So do they have names? The compiler probably knows that they're integers. Does name equivalence say anything about the compiler needing to be smart to figure out that the types are equivalent? You've got to keep that in mind. We're getting there. But what does it say? They have to have the exact same name. I guess it's a question: if you don't have a name, can you have the exact same name? Is the name a null string, so that any anonymous type would always be equal to any other anonymous type? So who thinks it should be allowed? Who thinks it shouldn't be allowed? The rest of you have no opinion? You just want to move on. Okay. So, not allowed under name equivalence, right? a and b both have anonymous types. The type of a is some anonymous thing. The type of b is some anonymous thing. Those can never be equal.
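A minimal sketch of strict name equivalence, assuming each variable's type is recorded only by its declared type name, with `None` standing in for an anonymous type. The function and dictionary names are made up for this example.

```python
from typing import Optional


def name_equivalent(name_a: Optional[str], name_b: Optional[str]) -> bool:
    """Strict name equivalence: both types must be named, and identically."""
    return name_a is not None and name_a == name_b


# Declared type name of each variable; None means the type is anonymous.
declared = {
    "x": "cm",
    "y": "inch",
    "a": None,     # a : array [0..4] of int, with no type name
    "b": None,     # b : array [0..4] of int, with no type name
}

print(name_equivalent(declared["x"], declared["y"]))   # cm vs inch
print(name_equivalent(declared["a"], declared["b"]))   # two anonymous types
print(name_equivalent("cm", "cm"))                     # same declared name
```

The check never looks inside the types, so two anonymous arrays with identical shape are still rejected.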
So we can never have these be equal under name equivalence. So now we have this question. We're defining two variables a and b that are both of type array, 0 through 4 of integer. Same thing, but now the declarations are on the same line. So can we set a = b, using name equivalence as we just described it? You can argue one way or the other, you don't have to be right. Yes. Why? They both technically have the same name even though it's anonymous. But it doesn't have a name, because it's anonymous. I need to think about it. I could see arguments honestly for both ways. Yeah, why it should be allowed is because even if it's anonymous, the compiler could assign it a randomized number, and you can say that both are instantiated with the same randomized number, and then essentially they would be equivalent. However, if you just say, hey, they're both anonymous, not named and not given identifiers, then no, they wouldn't be.

So the argument is that the compiler has to give them some name. They're anonymous, but the compiler has to have some way of knowing which variables have which type, and so it's got to have some way to assign an internal name to that anonymous type. But I guess the question is, aren't we then specifying something about the compiler's behavior, that it always has to give the same type to variables in the same declaration? Because I could see this and compile it as a declaration of a with type array followed by a separate declaration of b. Semantically this is identical to the previous one, right, where you have two declarations. So is there a set of rules for name equivalence that has to hold across all compilers? Yeah, it's got to be the exact same name, that's the rule. So I want you to think about it, because it's actually a great way of leading into the next one.
But under what you'd consider strict name equivalence, or just name equivalence, it's not allowed, because array of 0 to 4 of int is not named. It has no name, therefore they can't be assigned to each other. It would be a very interesting question if we had a = a, but we'll leave that aside for now. But if we define a named type as array 0 to 4 of int and declare both a and b with that type, now can we assign a = b? Yes. Why? They have the same, not type... the name. Yes, they have the same type name. So it's allowed because the types of both a and b have the same name.

And okay, you're probably thinking, well, this is probably overly strict, right? I think that's the discussion we just had, that this seems overly restrictive. So we can relax name equivalence a little bit, which gives us something very similar to what we've been talking about, called internal name equivalence. Right? So if the, I'll call it the program interpreter, so the compiler or interpreter or type checker, whatever, gives the same internal name to the types of two different variables, then they share the same type. Does this work the same as name equivalence for types that have names? Yeah, the compiler is going to give internal names to every type that has a name. Right? So same name, same type. This is definitely going to pass. The only time it ever comes out differently is in the cases we've been talking about with anonymous types. So here we have variables a and b declared together as array 0 to 4 of int. And we have c, declared separately, which is an array of 0 to 4 of type int. So now we want to ask, can we assign a = b? Yes. Yeah, because now we're specifying that the compiler is going to give one internal name to this anonymous type, and it's going to give that to both a and b when we declare them.
So we're going to say yes, this is allowed. It's going to give it some internal name, it doesn't matter what. But can we assign a = c? No. Why? It has a different internal name. The compiler or the interpreter is going to give this a different name because it's a different declaration. It doesn't go back and check, oh, have I seen this type before? It sees a new anonymous type, so it's going to give it a new internal name. Questions on that?

Now we're getting to something a little bit more interesting. We can actually relax this even more, and this gets back to, why don't we have a smarter compiler that can figure out that these structures are really the same type? Because that's kind of what we want, we want the compiler to be a little bit smart about things. So instead of just talking about the type names, why don't we look at the structure of the types, see if the structures are the same, and then say that they're the same type. So we have five rules to decide structural equivalence, and it brings up some really interesting issues.

So one is built-in types. The same built-in types are the same, right? So int is structurally equivalent to int, float is structurally equivalent to float. Right now we'll keep it very basic, they're only equivalent to themselves, and then we'll have more complicated rules on top of that. What about pointers? When are two pointers structurally equivalent, what would that mean? Yeah, they both point to the same basic type, right? So if you remove the star from the pointer, then those types must be the same. So they're pointers to structurally equivalent types. So is this like a recursive definition? If you have a pointer to a pointer to an int, how would you tell whether that's structurally equivalent to a pointer to a pointer to a pointer to a float? Should it be structurally equivalent?
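Going back to internal name equivalence for a moment, here is one way it might be sketched, assuming the checker hands out a fresh internal name for every declaration that uses an anonymous type and reuses the declared name otherwise. The class name, method names, and the `$anon` naming scheme are invented for this example.

```python
import itertools
from typing import Dict, List, Optional


class Declarations:
    """Tracks the internal type name given to each declared variable."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.internal_name: Dict[str, str] = {}

    def declare(self, variables: List[str], type_name: Optional[str] = None) -> None:
        """One declaration; every variable in it shares one internal name."""
        if type_name is None:
            # Anonymous type: a fresh name per declaration, never reused.
            type_name = f"$anon{next(self._counter)}"
        for variable in variables:
            self.internal_name[variable] = type_name

    def assignable(self, target: str, source: str) -> bool:
        return self.internal_name[target] == self.internal_name[source]


decls = Declarations()
decls.declare(["a", "b"])        # a, b : array [0..4] of int (one declaration)
decls.declare(["c"])             # c : array [0..4] of int (separate declaration)
decls.declare(["x"], "cm")
decls.declare(["z"], "cm")

print(decls.assignable("a", "b"))   # same declaration
print(decls.assignable("a", "c"))   # different declarations
print(decls.assignable("x", "z"))   # same declared name
```

As in the lecture's description, the checker never goes back to ask whether it has seen the same shape before, which is exactly the restriction structural equivalence removes.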
Yeah, so they're pointing to two different basic types, right? So you can keep applying rule two to keep removing references until you get to the underlying basic type. At the outermost level you have a pointer to some type, it doesn't matter what it is. Rule two says that if those are pointers to structurally equivalent types, whatever the type is without the outside pointer, then they're structurally equivalent. So let me say, okay, they're both pointer types, so let's remove one of the pointers. Now we have a pointer to something else, and I know I can remove that for both of them, and I keep removing for both of them until finally I get to an int and a float, and I say, hey, these aren't the same, these aren't structurally equivalent.

Okay, let's look at another example. So we're defining a type centimeter as an integer, and we're defining a type inch as an integer. So this is the same example. So now can we say x = y? Yeah, right, now we can say that x = y. We can copy this, we can do whatever we want, because they have the same structure. They're the same basic types underneath.

Okay, then we can look at another pointer example. So we have an int star a, and we have a float star, a pointer to float, b. So can we assign a = b? No. Some of you are checking ahead. Why not? Different base types. Yeah, exactly. The pointers are pointing to different base types, which means the pointers are not structurally equivalent. Okay, yeah? What about an int star star c? Assigned to what? To a. So what do you think, if we had int star star c, could we set a = c? We remove one pointer, right? So they're structurally equivalent only if the internal types are equivalent, right?
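Rules one and two can be sketched as a short recursive check. In this illustration, which is not the lecture's own notation, basic types are plain strings, a pointer type is the tuple `("pointer", T)`, and declared names such as `cm` are looked up in a dictionary and replaced by their definitions before comparing.

```python
def resolve(t, typedefs):
    """Replace a declared type name by its definition, repeatedly."""
    while isinstance(t, str) and t in typedefs:
        t = typedefs[t]
    return t


def structurally_equivalent(t1, t2, typedefs) -> bool:
    t1, t2 = resolve(t1, typedefs), resolve(t2, typedefs)
    # Rule 1: a basic type is equivalent only to the same basic type.
    if isinstance(t1, str) or isinstance(t2, str):
        return t1 == t2
    # Rule 2: pointers are equivalent when the pointed-to types are.
    if t1[0] == "pointer" and t2[0] == "pointer":
        return structurally_equivalent(t1[1], t2[1], typedefs)
    return False


typedefs = {"cm": "int", "inch": "int"}
int_ptr = ("pointer", "int")
float_ptr = ("pointer", "float")
int_ptr_ptr = ("pointer", ("pointer", "int"))
float_ptr_ptr_ptr = ("pointer", ("pointer", ("pointer", "float")))

print(structurally_equivalent("cm", "inch", typedefs))            # same base type
print(structurally_equivalent(int_ptr, float_ptr, typedefs))      # int vs float
print(structurally_equivalent(int_ptr_ptr, float_ptr_ptr_ptr, typedefs))
```

Each recursive call strips one pointer from both sides, which mirrors the "keep removing references" procedure described above.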
So for this pointer, the type is pointer to an int, so the internal type is int. On an int star star, the internal type is an int star, a pointer to an int. So we can compare those and say, oh, those definitely aren't the same: this is a basic type, this is a pointer; these are clearly not the same type. Yeah, good question. Okay, now we have the third rule. And these aren't just arbitrary rules; they talk about how we deal with each of the types and type constructors in our type system. So third is, how do we deal with structs? How could we define that two structures are equivalent? What do you think? Their members are all the same type. How do you define members? One structure has three different members that are attached in... Yeah, so at a very high level we want the types of the members to be the same. Exactly. The order of the members? So a structure has different fields, and that's kind of a question: do we want the fields to be in the same order and have the same types? Is there anything else that we may want to be identical, or that we could want to be identical? Yeah, functions will be rule five, I want to say, so we'll look at that in a second. So we talked about how the two structures definitely need to have the same types; the fields within one structure don't all have to be the same type. We talked about the order. What are some other things in structures? What are the properties of a structure? The structure itself has a name, but what about inside the structure? An ordered list of types. There's one other thing that's missing, though. Size? That's like a compiler detail; we can get that from the length of the list. So you have field zero, field one; you can think about the order or whatever. What does each of those have? We have a type, but they also have something else. Yeah, they have a name, right? So you also have the names of the fields. So do the names have to match? Yes? No? I don't know? I can visually see, in my mind, it's got to go. So it's a question for you: two
structures have the same fields and the same field names and the same types, but in a different order. Are those structurally equivalent? Why not? That's a tricky question. If you only ever reference the fields by name, it would never matter to you if they switched one from the third field to the tenth field. So yeah, it's actually tricky; it's actually just something we need to decide. I mean, I say we, but I've already pre-decided, obviously. You're thinking about how a struct is actually implemented in C, but it doesn't have to be a contiguous block of memory where each type takes specifically that much memory. It could work the other way: if we know the types beforehand and we know the fields, then we know how to generate code to properly move those around, so it should be fine. So when you reference field A, which is the first one in this type, and in the next type field A is actually the third element, you can swap those around. That's not crazy; you have all the information to do that. So the purpose of this discussion is that this is one of the things we need to decide: what do we care about with structures? Do we care about the names, or do we care about the order? Yeah, so that's another question: what level of benefit do we get from that added level of complexity? And I guess I'm being maybe a little facetious, or maybe playing devil's advocate, I haven't really decided, but we are actually driven by what the performance characteristics of our program are, right? For a language like C, when we copy structures we don't want to have to move all the fields around. What we care about is how that is actually represented in memory, and can I just copy that chunk of memory from one place to another to copy those structures. So exactly, that's what we're going to go with: the order of the types matters, and the names of the fields
don't matter at all. So we'll go with more of a C-style definition. You said the order of the types matters? The order of the types matters; that's the only thing that matters. The names of the fields don't matter at all. I guess this does also help you if you ever did have a way... well, I guess that would make sense: could you construct a structure with an anonymous field? I don't know, probably not. Yeah? What about unions? We're not getting into those. For that you would have to make sure that the types in each of the unions were the same types that it could possibly be. Unions really mess up a type system, because you can interpret one value as another value later, so that's why we're not really going to touch them. So yeah, this is how we're going to define structural equivalence for structures. Here we have structure one; it has field names x1 through xk, and each of those has a type, w1 through wk. Then we have structure two, which has field names y1 through yk and types q1 through qk. Structures one and two are structurally equivalent if and only if what's true, based on what we said? wk is structurally equivalent to qk. For just the last one? Well, all of them. Yeah, so all of them, one through k. Are there any other restrictions we need or want? There's the same number of fields, the same k. Yeah, I'll say that's implied in what we're doing, right? If you're going through one through k for both of them and they get out of step, they're not going to match. So yeah: w1 is structurally equivalent to q1, w2 is structurally equivalent to q2, all the way down to wk is structurally equivalent to qk. So once again, you can see in this definition that we don't care about the x1's and the y1's, or the xk's and the yk's. We don't care about the names of the fields; we just care about the types and the order of those types. Okay, so if we had a structure A, it has types int and float,
and structure B has types int and float. We declare an A called foo and a B called bar. Can we set foo equal to bar? So are they structurally equivalent? Yes? Say yes. Why? Okay, what about the names? Yeah, we don't care about them; they don't matter. Exactly. So even though structure B has a field a that is a float and structure A has a field a that is an integer, we don't care about that. We only care: is the order of the types int and float, int and float? Boom, done. That's all we care about. All right, and similarly, if we had a situation like this, can we set foo equal to bar here? No. Why? Yeah, the order doesn't match, right? Int, float; no, doesn't match. Boom, done. All right, questions on determining equivalence of structures? Yeah? Why is order important? If we can see there is an int, it should contain an int, so why is order important? I think it's two reasons. One is that it's a historical artifact: these used to be called records, and records had no field names; they just had offsets, basically zero through whatever. You can also kind of think of it as a tuple in that sense: it's a certain number of elements. And the other reason is that it's coming from the memory itself. In a low-level language like C, a structure is literally however many bytes each of those fields takes, contiguous in memory; they're just grouped together. So it's a way of grouping eight bytes: instead of having an int here and a float somewhere else, they're always right next to each other, and the compiler knows the offsets of all those things. So to copy one to the other, like in this case, we could just copy that memory over and it's fine. Otherwise, if we had to worry about field names, then we'd have to copy the memory to the right places in the structure, and we're kind of breaking apart the structure to do that. I was going to say you couldn't define it like that, but for our purposes here we only care about order. So we just talked about structures, so now we want to say,
okay, how do we define arrays as being structurally equivalent? We have two arrays, so when would two array types be structurally equivalent? What do you think? Not saying use a different word; that's on the slide, it has the title. They're structurally equivalent, right? So the types of the elements in the arrays have to be structurally equivalent, right? And what about, what is that? The length. The length, so the range. Yeah, the range has to be the same. So we have two array types: T1, which is an array of range one of some type t1, and T2, which is an array of range two of some type t2. And we can say they're structurally equivalent if and only if the ranges have the same number of dimensions, right, since we have multi-dimensional arrays, and each of those dimensions has the same length. Because it doesn't make sense for arrays of different lengths to be the same type; you couldn't ever copy one over the other, because you'd lose information, or not gain but be missing some. Yeah? How do you handle types that change their array length? So here we're defining an array as a type in the type system, and the way we define it, the type always knows the range, and it doesn't change throughout the program's execution. Using something like a vector, you can create another class; I mean, if you're talking about C, you can create a pointer and all that stuff. That would be another type that wouldn't have this restriction. In C++ you have a vector class, and you can assign a variable of one vector type to a variable of another vector type with the same element type, right? But it doesn't check whether they have the same lengths; it can't do that at type-checking time. This is basically: what if we baked the size into our type system? Then we could check for these kinds of things. So there'd be cases where you'd want to, where you know the array is only ever going to be this long, and there'd be cases you wouldn't want to, when
it's going to be dynamic depending on the input. Okay, and as we said, the element types t1 and t2 have to be structurally equivalent. As we can see, every time we're comparing two types, we're always saying that their parts have to be structurally equivalent, so we can break this down by pulling apart all these types and applying all these rules to say whether things are structurally equivalent. We can have an array of structures, and each of those structures could have structures as some of its fields, so you just break it apart to make sure the basic types match up. So, the fifth one: functions. When are functions structurally equivalent? We have two functions, and remember the type for functions, where we have the parameters 1 through k and then we have some return type. So when would these be structurally equivalent? The what? The parameter types are structurally equivalent. So the types t1 through tk are structurally equivalent to the types v1 through vk. What else? Yeah, the order is the same? Yeah, definitely important. The return types are the same? What was that?
The name of the function? So here we're not even thinking about the name. Just like our anonymous structures, we can have function types that are not actually assigned a variable name. Okay, yeah, so our function types T1 and T2 are going to be structurally equivalent if, for all i from 1 to k, t_i is structurally equivalent to v_i, so the parameter types match up in the specific order, because the order of the parameters to our functions matters. We can't just say two functions are the same when, oh, this function accepts an int and a float and this other function accepts a float and an int; those types aren't the same. And the return type t is structurally equivalent to the return type v. So these are actually all the rules we need in order to determine structural equivalence between any two types in this little type system that we've created. And so what is the goal? Our goal is that for every pair of types in the program, we want to determine: are they the same, are they structurally equivalent? So, I mean, "the same" is pretty straightforward. For this case we're thinking about structural equivalence in terms of being able to assign one variable to another. So this would be thinking: if you had some function as a variable, with some signature including parameters and return type, could you assign that to another variable and use it in the same way? Unless those functions return structurally equivalent values, you can't do that, because one piece of code thinks, oh, this function returns a string, whereas the other one thinks, oh, this returns an integer, or something like that. So in this case, specifically for doing this, we really care about the return types of functions, whereas when we're doing overloading we don't care about the return types. So yeah, those are actually separate concepts. And in a language like C we
can define function types for variables. So anyways, we'll see examples later. Cool. So yeah, it seems pretty straightforward, right? We have these five rules, we just keep applying them, and at each point we know how to pull apart that type. If it's a function, we compare each of the parameter types to see if they're structurally equivalent. If it's a structure, we compare the types of the fields, in order, to see if they're structurally equivalent. And if it's a pointer, we kind of break off the pointer part and see what's inside of it, right? So we can do this for any arbitrarily crazy complex type, and we just keep doing this until we hit the base case of rule one or two, or really I think the base case is just rule one. But the question comes up: how do we handle the following case? T1 is a structure; it has a field a, which is an integer, and a field p, which is a pointer to some type T2, where we define T2 as a structure whose first field, named a, has type int, and whose second field is a pointer to T1. So can we apply the five rules to this? Are T1 and T2 structurally equivalent? No? But p is a pointer to T2, right? p in T1 is a pointer to T2, and p in T2 is a pointer to T1. So how do we determine the structural equivalence of pointers? We look at the internal thing, what they point to. The pointers in here point to T2 and T1, so the question is, are T2 and T1 structurally equivalent? But why not? So then we look at the structures, and we see, oh, the integers match, and the second fields are a pointer to T2 and a pointer to T1. So are those pointers structurally equivalent? To decide that, we look inside the pointers, we look at T2 and T1, and we ask, are those structurally equivalent? So yeah, we get into an infinite loop. I was wondering how long you guys would keep going with that; you should have gone longer. Anyways, so yeah, we just keep applying these rules and we get into this infinite recursion, because we have a loop where T1 really
depends on the type of T2 to see if these are structurally equivalent. So just looking at this, are T1 and T2 structurally equivalent? Do you think they should be? They are? Why? So if you don't... So that's easy; that's why you never have infinite loops or infinite recursion in your programs: you just stop doing it, right? So yeah, I mean, it's a good way to think about it. It's kind of like, okay, we look at this and, well, they're very similar, right? The only thing they differ on is T1 and T2, and those depend on each other. So they look very similar. So I guess another question could be: could we define a structure where a is an integer and p is a T2, and T2 has an a and a p that is a T1, without the pointers? You're in the same problem with type checking as before, but I guess the question is, could you ever... So in the case we're thinking about, the types are in this loop, right? We're trying to type check one thing and then type check the other thing. But, like if you're talking about C, we could lay out the memory to fit this, right? We know an integer is a certain number of bytes, and we know a pointer to something is a certain number of bytes, so each of these has a fixed number of bytes. But without that pointer, now the structure itself is cyclical and infinite, so you could never allocate that much memory. You're trying to say, okay, a T1 is first an int, okay, I can do that, but then it's a T2. Okay, what's a T2? Well, a T2 is an int and then a T1, which is an int and then a T2. So you can't calculate how big that is; you're never going to get that to stop. So that's why we need to use the pointers here, so that we could actually compile this if it type checks. So the way we get around this paradox is that we first assume that all the types are structurally equivalent, and then we go through trying to apply the rules, trying to prove that they're not structurally equivalent because they violate one of
these rules. So if we did that here, if we start off assuming T1 and T2 are structurally equivalent, then we say, well, are they both structures? They both have two fields; that's good. Let's compare the first fields: two integers, that's good. Let's compare the second fields: okay, they're both pointer types, that's good. Let's compare the types inside the pointers, T2 and T1: are those the same? Yes, because we assumed that they were at the beginning. So we say they're the same, and then we say, great, these are actually structurally equivalent types. Okay, so that's the way to think about it. When we're comparing these types, we start off by assuming that they're all the same, that they're all structurally equivalent, and then we change that assumption if we find out that it's not true by comparing the types. So our goal here, and we're going to go through an algorithm, is that we want to create an n-by-n table, where n is the number of types in the program, and each entry is true if the two types are structurally equivalent and false if they're not. And in order to support these cyclical definitions, we first initialize all the entries in the table to true. So essentially you can think of it as: we're assuming that all the types are structurally equivalent unless we have some proof otherwise. And the algorithm here is fairly simple, so I'm going to briefly go over it and then we're going to look at an example. We set every entry to be true. Then, while the table has changed, we go through each entry in the table, some i and j, and we say, okay, check those two types; if those two types are not structurally equivalent, then set that entry to be false. And that's it: something changed, so we do it again, and we keep going. And here, T_i and T_j are the i-th and j-th types of the program. Yeah? Because we can't prove that they are equal directly, because if they're cyclical, we're going to get to this whole thing where
we keep trying to check and check and check and check. But if we start off by assuming that they are equal, then we go through and say, ah, these are not equal because these things are different; great, those are not equal. And then we keep going until we've found everything that's not structurally equivalent. Could you do it that way? Maybe. This is actually really simple and it gives you the same answer, so this is how it's done. Okay, so let's look at a quick example. We have T1, T2, T3. So, eyeballing it right now, are these all structurally equivalent? No. Are T1 and T2 structurally equivalent? Let's find out. So we're going to build our table, T1 through T3, and we're going to set everything to true. Then we just go through the table. We start with T1, T1. Is this ever not going to be true? No, a type has to be structurally equivalent to itself; otherwise nothing makes sense in the world. So we're going to ignore the diagonal. So we check T1 and T2. Okay, are T1 and T2 structurally equivalent? We check the fields: check the integers, check the insides of those pointers, T2 and T3. We look up in the table and we see T2 and T3 are structurally equivalent, so right now they're structurally equivalent. Great. Then we move on to T1 and T3, and we say int, float; ah, that's not right, these are not structurally equivalent, so we're going to change that to false. But remember, the table is symmetric, so T1, T3 is going to be the same as T3, T1, so we're going to change that value to false as well. Then we check the other entry here, T2, T3: check int against float and say nope, not a match, they're not structurally equivalent. We set that to false and set the corresponding one to false. So we changed something, right? So just like with first and follow sets, we're going to go through it one more time, and we say, okay, check T1, T2: check integer against integer, then check the pointers, where the inside types are T2 and T3. Is T2 equivalent to T3? No, false; we just
updated it to be false. Not structurally equivalent, so we're going to update our table. We're going to go through it one more time and see that nothing changes. So we can see that these types are in fact not structurally equivalent. So that's it. See y'all later, have a good small parade.
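The table algorithm walked through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's reference implementation: the tuple encoding of types, the names `check_once` and `structural_equivalence`, and the helper entries `P1`, `P2`, `P3` (one named row per pointer type) are all made up for the sketch. The field names and types of T1, T2 and T3 are reconstructed from the spoken walkthrough, since the slide itself is not in the transcript. Basic types get table rows too, so every component comparison is a plain table lookup and no check ever recurses.

```python
from itertools import product

# Assumed encoding: a definition is a tuple whose first element names the
# constructor. Component types are always referred to by name, so each one
# has its own row in the table ("pointer to T2" is the separate entry "P2").

def check_once(x, y, table):
    """Apply the five rules one level deep, trusting the table for components."""
    if x[0] != y[0]:                  # different constructors never match
        return False
    kind = x[0]
    if kind == "basic":               # rule 1: same built-in type
        return x[1] == y[1]
    if kind == "pointer":             # rule 2: pointed-to types match
        return table[x[1], y[1]]
    if kind == "struct":              # rule 3: field types match in order, names ignored
        return len(x[1]) == len(y[1]) and all(
            table[tx, ty] for (_, tx), (_, ty) in zip(x[1], y[1]))
    if kind == "array":               # rule 4: same dimension lengths, element types match
        return x[1] == y[1] and table[x[2], y[2]]
    if kind == "function":            # rule 5: parameters in order, then return type
        return (len(x[1]) == len(y[1])
                and all(table[p, q] for p, q in zip(x[1], y[1]))
                and table[x[2], y[2]])
    raise ValueError(f"unknown constructor: {kind}")

def structural_equivalence(types):
    """Start with every pair marked equivalent; falsify entries until nothing changes."""
    names = list(types)
    table = {pair: True for pair in product(names, repeat=2)}
    changed = True
    while changed:
        changed = False
        for a, b in product(names, repeat=2):
            if table[a, b] and not check_once(types[a], types[b], table):
                table[a, b] = table[b, a] = False   # the table is symmetric
                changed = True
    return table

BASICS = {"int": ("basic", "int"), "float": ("basic", "float")}

# The two mutually recursive structures from the lecture.
cycle = {
    **BASICS,
    "T1": ("struct", (("a", "int"), ("p", "P2"))),
    "T2": ("struct", (("a", "int"), ("p", "P1"))),
    "P1": ("pointer", "T1"), "P2": ("pointer", "T2"),
}

# The three-type example (field layout reconstructed, see the note above).
chain = {
    **BASICS,
    "T1": ("struct", (("a", "int"), ("p", "P2"))),
    "T2": ("struct", (("c", "int"), ("q", "P3"))),
    "T3": ("struct", (("a", "float"), ("p", "P1"))),
    "P1": ("pointer", "T1"), "P2": ("pointer", "T2"), "P3": ("pointer", "T3"),
}

print(structural_equivalence(cycle)["T1", "T2"])   # True
print(structural_equivalence(chain)["T1", "T2"])   # False
```

In the `chain` example the first pass still accepts T1 and T2, because the T2/T3 entry has not been falsified yet when they are compared; the second pass rejects them, which mirrors the extra trip through the table in the walkthrough.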