As we discussed in the prior video, every programming language has some set of data types built into the language, and this set usually includes numbers, strings, booleans, and perhaps a few other basic types. On top of these built-in types, programmers can define additional data types as they require. Confusingly, these are often called user-defined data types, the user here being the programmer, the user of the language.

The most common kind of user-defined types are formally called product types, but informally are called structs, short for structures, or classes, depending upon the language. A product type aggregates data elements called fields, which each have a name and a type. For example, we can define a product type we'll call person, which let's say is made up of three fields: a string called name, a number called age, and a number called weight. The order of the fields, by the way, is not significant. Having defined this person type, we can then create person values, each of which will have its own name, age, and weight. We call this a product type because the set of all its possible values is the product, the combination, of all possible values of its fields. Such types are very useful because we commonly want to aggregate multiple pieces of disparate data into a single coherent unit. In this example, whatever information we want to store about each person in our program, we include that information as fields of our person type.

What's formally known as a sum type is informally known, depending on the language, as a union, an enum, short for enumeration, or an ADT, an algebraic data type. A union is defined as a set of types, an enum is defined as a set of discrete values, and algebraic data types, found in languages like Haskell, can be composed of types like a union, but also discrete values like an enum.
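As a rough sketch of these ideas in Java (the choice of language is an assumption; the video does not name one), a person product type, a union named Cat, and an enum named Friends might look like the following. Java has no direct union type, so Cat is approximated here by an interface that Tiger and Panther implement. The concrete field types (int for age, double for weight) and the sample values are illustrative guesses.

```java
// Product type: aggregates named, typed fields.
final class Person {
    final String name;
    final int age;        // assumed numeric type
    final double weight;  // assumed numeric type

    Person(String name, int age, double weight) {
        this.name = name;
        this.age = age;
        this.weight = weight;
    }
}

// Sum type as a union: every Cat value is either a Tiger or a Panther.
// Java has no built-in union, so this interface only approximates one.
interface Cat {}

final class Tiger implements Cat {}

final class Panther implements Cat {}

// Sum type as an enum: a fixed set of discrete values.
enum Friends { MONICA, PHOEBE, ROSS }

class TypesDemo {
    public static void main(String[] args) {
        Person p = new Person("Alice", 30, 61.5);
        Cat c = new Tiger();                       // a Tiger is a valid Cat
        Friends f = Friends.PHOEBE;

        System.out.println(p.name + " " + p.age + " " + p.weight);
        System.out.println(c instanceof Tiger);    // true
        // Java numbers enum values by position: 0, 1, 2.
        System.out.println(f + " " + f.ordinal()); // PHOEBE 1
    }
}
```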
This example union here, called cat, is a sum of two other types, tiger and panther, and so the set of possible cat values is all the tiger values plus all the panther values. For an enum, the members are discrete values rather than types. This example enum, called friends, has three discrete values: Monica, Phoebe, and Ross. Commonly, enum values are represented as unique integers or as strings, but it doesn't necessarily matter how these values are represented. All that really matters is that they can be distinguished from each other, so if Monica, Phoebe, and Ross are represented as integers, they must be represented by different integers, for example, say, 0, 1, and 2.

Perhaps the most important design choice of a programming language is whether it should be statically typed. In a statically typed language, when you create a variable, you must decide what types of values it will store and label it as such. Likewise, you must decide what types of inputs and output each function should have. Once you have specified these types, the compiler enforces these constraints. For example, if you declare a variable to have type string, then assigning anything other than a string to the variable will trigger a compilation error. Likewise, if you declare a function to take as input a string and a number, then calling that function with any arguments other than a string and then a number will trigger a compilation error. The built-in operations are also checked at compile time for valid inputs. For example, a division operation with inputs that are not both numbers will also trigger a compilation error.

Programming languages that aren't statically typed are said to be dynamically typed. In a dynamically typed language, we don't specify the types of things, and so the compiler cannot check for improper values in operations, assignments, or function calls. Only at runtime, when operations are executed, does the language check the types of the inputs.
For example, a division operation with inputs that are not both numbers will trigger a runtime error that aborts the program. Performing these operation type checks at runtime not only incurs extra runtime performance costs, it means that our simple mistakes are not caught by the language until they actually happen when the code runs. In a statically typed language, if our code successfully compiles, we know our code is free of type mistakes, and when we do have such mistakes, the compiler tells us exactly what the mistakes are and on which lines of code. This is why many programmers prefer statically typed languages. On the other hand, some programmers argue that dynamically typed languages are less verbose and more flexible. Among the most popular languages in use today, statically typed languages seem to be somewhat more commonly used, but we have no really reliable statistics, and usage varies greatly among different domains of programming. Another split between languages is whether they are weakly typed or strongly typed. The precise meaning of these terms is not universally agreed upon, but broadly speaking, strongly typed languages impose restrictions that make certain errors impossible. In a memory safe language, for example, arbitrary access of memory is disallowed, and this restriction prevents a whole category of bugs stemming from accessing the wrong parts of memory. Strong typing also generally implies that the language makes invalid operations and invalid function calls impossible. Static languages check for such errors during compilation, but most dynamically typed languages will check for these errors at runtime, and so can be considered strongly typed in this regard. Lastly, strongly typed languages tend to be stricter about conversions between data types, requiring you to make the conversions explicit. This can feel more verbose and less convenient, but it helps prevent errors. 
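To make the compile-time checks and explicit conversions concrete, here is a minimal Java sketch. The function name repeat and all sample values are hypothetical. The commented-out lines are the kind of code a static compiler rejects. Note that Java is only one point on the strictness spectrum: it still permits some implicit conversions, such as widening an int to a double.

```java
class StaticTypingDemo {
    // Declared to take a String and then an int, and to return a String.
    static String repeat(String text, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String greeting = "hi";
        // greeting = 42;        // compile error: an int is not a String
        // repeat(3, "hi");      // compile error: arguments in the wrong order
        // int bad = "ten" / 2;  // compile error: '/' needs numeric operands

        // Conversions that could lose or reinterpret data are spelled out.
        int count = Integer.parseInt("3");         // String -> int
        double ratio = 7.9;
        int truncated = (int) ratio;               // double -> int, fraction dropped
        String label = String.valueOf(truncated);  // int -> String

        System.out.println(repeat(greeting, count)); // hihihi
        System.out.println(label);                   // 7
    }
}
```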
A method is basically a function that belongs to a particular data type. The method has a special parameter, usually called this, which is of the same type as the data type to which the method belongs. So say, for example, we have a class cat, and we give cat a method named meow, which takes a number as input. The method actually also takes as input a cat, but that is left implicit. To call this method, the syntax is a little odd. The regular arguments, here a single number, are put inside the parentheses like normal, but the cat value is placed before the name of the method, separated by a dot.

So what's the point? Well, one small stylistic nicety of methods compared to regular functions is that different types can have methods of the same name. Say the classes cat and dog could both have their own method named sleep. Sleep of cat and sleep of dog would be separate methods that just happen to have the same name. When we call a method, the language determines which method to call based not just on the name, but also on the type of the special argument before the dot.

The more significant rationale for methods, though, relates to subtype relationships. So what are subtype relationships? Well, if a data type B is a subtype of data type A, then values of type B are considered to also be values of type A. This means that anywhere in code a value of type A is expected, a value of type B is valid as well. In dynamically typed languages, the concept of subtypes is kept informal. If B is a subtype of A, then B should have all the same fields and methods as A, but a dynamically typed language does not enforce this. In statically typed languages, however, subtype relationships are formally declared in code and enforced by the compiler. So say for a variable of type A, we can assign it values of type A of course, but we can also assign it values of type B if B is declared to be a subtype of A. So how do we create subtype relationships in a statically typed language?
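Before turning to that question, here is a minimal Java sketch of the method syntax described above. The class names cat and dog and the method names meow and sleep come from the narration; the name field and the returned strings are assumptions added so the sketch has something to show.

```java
class Cat {
    final String name;

    Cat(String name) { this.name = name; }

    // The Cat placed before the dot arrives as the implicit parameter 'this';
    // 'times' is the one explicit parameter.
    String meow(int times) {
        StringBuilder sb = new StringBuilder(this.name + ":");
        for (int i = 0; i < times; i++) {
            sb.append(" meow");
        }
        return sb.toString();
    }

    String sleep() { return this.name + " curls up"; }
}

class Dog {
    final String name;

    Dog(String name) { this.name = name; }

    // Same name as Cat's sleep, but an entirely separate method.
    String sleep() { return this.name + " sprawls out"; }
}

class MethodDemo {
    public static void main(String[] args) {
        Cat mittens = new Cat("Mittens");
        Dog rex = new Dog("Rex");

        System.out.println(mittens.meow(2)); // Mittens: meow meow
        // The type of the value before the dot selects which sleep is meant.
        System.out.println(mittens.sleep()); // Mittens curls up
        System.out.println(rex.sleep());     // Rex sprawls out
    }
}
```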
Well, there are two kinds of subtype relationships: inheritance and interface implementation. If we create a class dog, which inherits from another class animal, dog then implicitly has all of the fields and methods of animal in addition to any fields and methods we define in dog itself. We are effectively saying that a dog is a specific kind of animal, and so it should have everything that makes up an animal, but then also have more stuff specific to dogs. If then we create a variable X of type animal, we can assign a dog value to X because a dog is a kind of animal. Assuming animal has a method eat, we can call this method on X and the dog value stored in X will be passed into the call. However, assuming dog has a method bark that it doesn't inherit from animal, we cannot call bark on X because the compiler knows X to be an animal, not specifically a dog. We may happen to know in this case that X is currently storing a dog, but the compiler only knows that it stores some animal. In cases where X stores something other than a dog, calling dog's bark method would be invalid, and so the compiler won't allow it even though the call would be valid in this particular case. As a rule, the compiler never presumes to know the value of anything because in most cases it can't know. In this case, the compiler could figure out that X at this point will always store a dog, but for consistency, it doesn't bother identifying these trivial cases.

Now, when a class inherits a method, it can choose to override it, meaning it can provide its own definition of the method. Here, dog is overriding the eat method, which it inherits from animal. Now, when we assign a dog to animal variable X and call eat on X, the compiler knows the call is valid just like before, but it still doesn't know whether X currently stores an animal or a dog, and so it can't decide which eat method to call.
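The inheritance and overriding setup just described could be sketched in Java roughly as follows. The class and method names follow the narration; the string return values are assumptions, there only so the calls produce something visible.

```java
class Animal {
    String eat() { return "animal eats"; }
}

class Dog extends Animal {
    // Overrides the eat method that Dog inherits from Animal.
    @Override
    String eat() { return "dog eats"; }

    // Defined only in Dog, not inherited from Animal.
    String bark() { return "woof"; }
}

class InheritanceDemo {
    public static void main(String[] args) {
        Animal x = new Dog();        // allowed: a Dog is a kind of Animal
        // x.bark();                 // compile error: Animal has no bark method

        // The compiler accepts this call but does not pick the version of eat.
        System.out.println(x.eat());

        x = new Animal();
        System.out.println(x.eat());
    }
}
```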
That choice is deferred to when the call is actually executed at runtime, and in this case, because X will store a dog when the call is made, it's the eat method of dog that will be called. If next we store an animal in X, a subsequent call to eat calls the eat method of animal itself. So the rule is: which version of the method gets called depends upon the type of the value at the time of the call.

Aside from inheritance, we can also create subtype relationships with interface implementation. An interface is considered a data type, but strangely, interfaces have no data themselves. Instead, an interface is a set of what we call method signatures. A method signature specifies the name, parameter types, and return type of a method, but has no actual code. Here we have an interface sea creature, which has three method signatures. The first, named swim, has no parameters and returns nothing. The second, named eat, has a parameter of type food and returns a number. The third, named sleep, has a parameter of type time and returns nothing. By itself, an interface isn't useful because we can't create values of the interface type itself. However, a class that has methods matching all of the interface's signatures is allowed to declare that it implements the interface, making the class a subtype of the interface. If, say, we have a variable X of type sea creature, and dolphin is a class that implements sea creature, then we can assign a dolphin value to X. We can then call methods of the interface on X. The call to swim here invokes the swim method of dolphin, passing the dolphin value as the special argument. Once again, the compiler doesn't ever presume to know what actual value is stored in any variable, so it won't let us call the dolphin method squeak on X because squeak is not part of the sea creature interface.
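A minimal Java sketch of this interface (written SeaCreature here) and two implementing classes follows. The Food and Time classes are hypothetical placeholders, since the narration names the parameter types but not their contents, and the values returned by eat are arbitrary.

```java
// Hypothetical placeholder types for the method parameters.
class Food {
    final int calories;
    Food(int calories) { this.calories = calories; }
}

class Time {
    final int hours;
    Time(int hours) { this.hours = hours; }
}

// An interface: method signatures only, no data and no code.
interface SeaCreature {
    void swim();
    int eat(Food food);
    void sleep(Time time);
}

class Dolphin implements SeaCreature {
    public void swim() { System.out.println("dolphin swims"); }
    public int eat(Food food) { return food.calories; }
    public void sleep(Time time) { System.out.println("dolphin rests " + time.hours + "h"); }

    // Not part of the SeaCreature interface.
    void squeak() { System.out.println("squeak"); }
}

class Whale implements SeaCreature {
    public void swim() { System.out.println("whale swims"); }
    public int eat(Food food) { return food.calories * 2; }
    public void sleep(Time time) { System.out.println("whale rests " + time.hours + "h"); }
}

class InterfaceDemo {
    public static void main(String[] args) {
        SeaCreature x = new Dolphin();
        x.swim();          // runs Dolphin's swim
        // x.squeak();     // compile error: SeaCreature has no squeak method

        x = new Whale();
        x.swim();          // runs Whale's swim
        System.out.println(x.eat(new Food(10))); // 20
    }
}
```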
In this case, yes, the call would be valid because X stores a dolphin, but again, the compiler never presumes to know that, and in other cases, X might store some other kind of sea creature that doesn't have such a method, so the compiler can't allow the call. If, though, we assign another kind of sea creature to X, say a whale, then calling swim on X invokes the swim method of whale. As with overridden methods in inheritance, the determination of which version of the method to call happens when the call actually executes.

Understand that interfaces are generally not formalized in dynamically typed languages. With dynamic typing, any variable can store anything, and so the compiler can't check whether method calls are valid. At runtime in these languages, calling a method only triggers an error if the type has no such method. If we're using a dynamic language, the call to squeak here is valid because when the call is made, X will store a dolphin, which does have such a method. Likewise, with our inheritance example, if we're using a dynamically typed language, the call to bark here succeeds because at that time X stores a dog, which does have such a method.

So let me reiterate why subtypes are useful. Say we have some function foo that takes an animal as input and returns a sea creature. Thanks to subtypes, we're not limited to taking in just one specific type and returning just one specific type, even in a statically typed language. This function operates on all kinds of animals, and it can return different kinds of sea creatures in different branches of its code. Collections in statically typed languages also greatly benefit from subtypes. Say, for example, we want an array that can store both cats and dogs. Well, in a dynamically typed language, we can just put any values we want into the slots of any array, but in a statically typed language, we have to specify what the array stores.
Without subtypes, we'd have to choose whether an array stores only cats or only dogs, but with subtypes, we can choose a type of which both cat and dog are subtypes, allowing us to store both cats and dogs in the same array. Assuming here that cat and dog are subtypes of animal, our animal array can store both cats and dogs.

In a statically typed language, having stored, say, a dog in an animal variable, what if we want to access the value as a dog so we can access the fields and methods that dog has but animal does not? Well, this requires what's called a downcast operation. Here, the compiler only knows X to be an animal, but we know in this case that X stores a dog. The downcast operation, denoted by the target type in parentheses, gets us the value as a dog rather than an animal. Be clear, the value is not being transformed in any way. Rather, the downcast simply performs a check at runtime to verify that the value stored in X at this moment is actually a dog. The check triggers a runtime error if it fails, but otherwise, execution continues as normal, and the compiler considers the value returned by the downcast to be of type dog, which is why it allows us to assign it to a dog variable. By requiring this explicit downcast, the compiler is putting the responsibility on us to ensure that the value stored in X at this moment is actually a dog. The compiler just has to trust us, but the language verifies at runtime.

In a statically typed language, having made our code general purpose using subtypes, we sometimes want to constrain the code to more specific types. By doing so, the compiler in some cases can help us avoid dumb type errors and also spare us from cluttering our code with downcasts. Enter what are usually called generics. A generic function or method uses one or more type parameters as stand-ins for one or more types in its code. We then specify actual types to plug in for these type parameters in each call to the function or method.
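The animal array, the downcast, and a generic function can be sketched together in Java as follows. The function name foo is taken from the narration; plainFoo, bark, and purr are hypothetical helpers added for contrast and for visible output.

```java
class Animal {}

class Dog extends Animal {
    String bark() { return "woof"; }
}

class Cat extends Animal {
    String purr() { return "purr"; }
}

class GenericsDemo {
    // Non-generic version: the compiler only knows the result is an Animal.
    static Animal plainFoo(Animal a) { return a; }

    // Generic version: T stands in for whichever Animal subtype the caller picks.
    static <T extends Animal> T foo(T t) { return t; }

    public static void main(String[] args) {
        // One array holds both, because Cat and Dog are subtypes of Animal.
        Animal[] animals = { new Dog(), new Cat() };

        Animal x = animals[0];
        Dog fromCast = (Dog) x;               // downcast: verified at runtime
        System.out.println(fromCast.bark());  // woof

        try {
            Dog wrong = (Dog) animals[1];     // animals[1] actually holds a Cat
            System.out.println(wrong.bark());
        } catch (ClassCastException e) {
            System.out.println("downcast failed");
        }

        Dog viaPlain = (Dog) plainFoo(new Dog());       // downcast needed
        Dog d = GenericsDemo.<Dog>foo(new Dog());       // no downcast needed
        Cat c = GenericsDemo.<Cat>foo(new Cat());
        // Dog bad = GenericsDemo.<Dog>foo(new Cat());  // compile error: expects a Dog
        System.out.println(viaPlain.bark() + " " + d.bark() + " " + c.purr());
    }
}
```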
In this example, we have a function foo with a single type parameter named T, and the function takes a T as input and returns a T. The parameter T is defined as an animal, so in the code of the function, everywhere T is used, it's basically the same as if we just used animal itself. With the type parameter, however, when calling the function, we specify an actual type for T, which can be any kind of animal. When we plug in dog for T, the function returns a dog, not an animal, and so note that we don't have to downcast when we assign to the dog variable d. When we plug in cat for T, the function returns a cat, and so we can assign the result to cat variable c. The last call here expects a dog, so we can't pass in a cat.

We can also make our types generic with type parameters. Here the class pet owner uses a single type parameter named T, which is defined as an animal. The class has a field named pet of type T. Again, be clear that the field actually just stores an animal, but for particular values of pet owner, we specify a subtype of animal to plug in for T, and then for that particular value, the compiler constrains T to that subtype. If, say, we create a variable of type pet owner specialized for cat, then the pet field can only be assigned a cat, not a dog or any other subtype of animal.

Generics may not seem especially valuable in small examples, but in the context of large programs, they give us greater confidence in the type correctness of our code while also sparing us some extra verbosity from downcasts. Note also that the term generic is a bit misleading. Generics do not allow us to make our code more general purpose. Subtypes already accomplish that. Instead, generics help us leverage the compiler in a statically typed language to better enforce type safety. If a value is supposed to contain only certain subtypes, we don't want to accidentally give it the wrong subtypes. With generics, the compiler can catch when we make such mistakes.
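As a closing sketch, the generic pet owner class might look like this in Java. The class and field names follow the narration; the constructor and the empty Animal, Cat, and Dog classes are assumptions made to keep the example self-contained.

```java
class Animal {}

class Cat extends Animal {}

class Dog extends Animal {}

// T is bounded by Animal: each PetOwner value fixes T to one Animal subtype.
class PetOwner<T extends Animal> {
    T pet;

    PetOwner(T pet) { this.pet = pet; }
}

class PetOwnerDemo {
    public static void main(String[] args) {
        PetOwner<Cat> owner = new PetOwner<>(new Cat());
        owner.pet = new Cat();     // fine: for this owner, T is Cat
        // owner.pet = new Dog();  // compile error: a Dog is not a Cat

        Cat c = owner.pet;         // no downcast needed
        System.out.println(c == owner.pet); // true
    }
}
```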