In a previous unit, we discussed how the address space of a process is split into three parts – the text section, the stack, and the heap. The text section simply stores the code of the process, and so is generally made read-only. The stack is a contiguous chunk in which we keep track of the chain of function calls and store their local variables. The heap is effectively the rest of the address space, where we can store any data other than the local variables. Whereas memory for the text section and the stack is effectively allocated automatically by the operating system, memory for the heap must be manually managed by the programmer. When the process begins, the heap portion of the process address space isn't backed by any actual memory. The programmer must explicitly request from the operating system each chunk of the heap address space which they wish to use. When finished with a chunk, the programmer should notify the operating system so that the operating system can allocate the memory for other uses. Most modern interpreted languages, however, such as JavaScript and Python, have a feature called automatic memory management, or automatic garbage collection. With this feature, the programmer can simply create new objects as they like, and the interpreter will request more heap memory from the operating system as needed. The interpreter also keeps track of when objects are no longer reachable, that is, when no references point to them anymore. Once an object is no longer reachable, the interpreter knows that the programmer is done with the object, and so the interpreter can give the heap memory occupied by that object back to the operating system. Automatic memory management is not only convenient; it also helps prevent memory leaks. A memory leak is a bug in which a program allocates heap memory but fails to give it back once the program no longer needs the memory.
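To see reachability in action, here's a small Python sketch (the `Widget` class and the variable names are made up for illustration) that uses the standard `weakref` and `gc` modules to watch an object get reclaimed once its last reference disappears:

```python
import gc
import weakref

class Widget:
    """A throwaway object so we can watch the collector reclaim it."""
    pass

obj = Widget()             # this reference keeps the object reachable
probe = weakref.ref(obj)   # a weak reference does NOT keep it alive

print(probe() is obj)      # True: the object is still reachable

del obj        # drop the last ordinary reference
gc.collect()   # ask the collector to run (CPython's reference
               # counting usually frees the object immediately anyway)

print(probe() is None)     # True: the object has been reclaimed
```

The weak reference lets us probe for the object without itself counting as a reference; once the last ordinary reference is deleted, the probe comes back empty, which is exactly the "no longer reachable" condition described above.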
Especially for long-running processes, this creates a problem as the process allocates more and more chunks of memory without giving them back. At the very least, this wastes memory that could otherwise be used by other processes. Worse, your process will likely fail once it runs out of address space, or once the operating system is unwilling to give it any more memory. When the programmer has to keep track of the chunks of memory they've allocated in order to give them back once finished with them, that's one more thing the programmer might get wrong. Even when programmers follow good practices to keep track of their memory usage, it's still a naturally error-prone task. What makes memory leaks especially devious is that they're a very difficult kind of bug to track down. When running a program, memory leaks rarely manifest at the start of the program, so leaks can lurk in code undetected for months or years. Even when we know we have a memory leak, tracking down the culprit lines of code is extremely difficult, because a leak from one part of code generally looks like a leak from any other part of code, and the business of memory allocation is generally scattered all throughout a code base. So automatic memory management spares us a lot of these headaches, but understand that it doesn't completely eliminate all memory leaks. What can happen in a language like JavaScript is that the programmer might unintentionally leave around references to objects which they no longer need. For example, the programmer might add items to an array, assign the array to a global variable, and then forget about it. Until that global variable is assigned a different value, the array will stick around, and likewise all the items in the array will stick around as well. This is a memory leak because the program is effectively keeping around a bunch of objects which it doesn't need anymore.
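Here's what that forgotten-reference leak might look like in Python (the `cache` variable and `handle_request` function are hypothetical names for illustration):

```python
# A module-level (effectively global) list the programmer meant
# to use "temporarily" and then forgot about.
cache = []

def handle_request(payload):
    record = {"payload": payload}
    cache.append(record)   # every record is added... and never removed
    return record

# Simulate a long-running process handling many requests.
for i in range(1000):
    handle_request(i)

# All 1000 records are still reachable through `cache`, so the
# garbage collector cannot reclaim any of them, even though the
# program no longer needs them. That's a leak.
print(len(cache))
```

Nothing here is "wrong" from the collector's point of view: the objects are still reachable, so it must keep them. The bug is entirely in the programmer's intent, which is why garbage collection can't catch this kind of leak.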
I did say that automatic memory management is a feature of interpreted languages, but conceivably we could implement the feature in a compiled language. It generally wouldn't make sense, though, because compiled languages aim for high performance and a high degree of programmer control. Adding automatic memory management to the C language, for instance, would add performance overhead and interfere with the programmer's ability to use memory exactly as they please, thus largely defeating the whole purpose of using the C language. It's a fact of life that programmers are going to make mistakes. When writing thousands and possibly millions of lines of code, mistakes are just inevitable. One common kind of mistake we call a type error, which occurs when we perform an operation upon the wrong kind of data. For example, in this Pigeon code, we define a function which computes a factorial. In the third line of the function, the parameter of the function n is used in a greater than operation. So, if we improperly invoke the factorial function with a boolean argument instead of a number, a type error is triggered in the function because the greater than operation only works with number operands, not booleans. At least, this is the behavior of a dynamic typing system, such as in Pigeon, JavaScript, and Python. In such languages, the programmer can pass any type of value to any function, but when a built-in operator, like the greater than operator, receives the wrong kind of argument, an error occurs, which in most dynamic languages means an exception is thrown. The goal of a static typing system, such as in the C and C++ languages, is to allow us to programmatically detect type errors in code without running the code. In a language with static typing, the compiler or interpreter can find and report type errors, and in fact, will generally refuse to compile or interpret the code once it finds such an error.
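Here's roughly how that plays out in Python (one caveat: Python happens to treat booleans as numbers, so this sketch passes a string rather than a boolean to trigger the error):

```python
def factorial(n):
    result = 1
    while n > 1:           # the TypeError surfaces HERE, not at the call site
        result = result * n
        n = n - 1
    return result

print(factorial(5))        # 120: the normal case works fine

# Dynamic typing lets the bad call itself go through...
try:
    factorial("oops")
except TypeError as e:
    # ...but the > comparison throws once it receives a non-number.
    print("type error:", e)
```

Notice that the exception points at the comparison inside `factorial`, not at the bad call, so the programmer still has to trace backwards to find the real mistake.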
The cost of this type checking is more restrictive use of data types. In a static language, every variable, every function parameter, every function return type, and every collection must declare a single data type. A variable can only be assigned values of its declared type, a parameter can only be passed arguments of its declared type, a function can only return values of its declared return type, and a collection, such as a list, can only store values of its declared type. So, say we wanted to turn Pigeon into a statically typed language. We could do so quite simply by requiring type declarations for every variable, every function parameter, every function return type, and every created collection. Let's say the declarations are written as type names prefixed with a colon. So, here the factorial function's name is prefixed with num, denoting that the function can only return numbers, the parameter n is also prefixed with num, denoting that only numbers can be passed to n, and the variable val is also prefixed with num in its first assignment, denoting that only numbers can be assigned to val. So now a compiler or interpreter can detect the real source of our type error, which is the call to factorial with the Boolean value. Our factorial function's parameter n was always meant to only receive number values, and now with a type declaration, any call to factorial that erroneously passes any other kind of value can be detected before running the code. Even if the argument expression is a variable or function call, such as here a call to some function foo, an erroneous type can be detected because with type declarations the type of every expression is known. Here function foo itself must have a declared return type, and so the compiler or interpreter knows whether this call to foo will return a number. Now, I also mentioned that static typing requires any collection we create to have a single declared type.
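We can't run Pigeon's num prefixes directly, but Python's optional type hints give a rough approximation when paired with an external static checker such as mypy (a sketch under that assumption; the interpreter itself ignores the hints at runtime):

```python
# The annotations below play the role of Pigeon's num prefixes:
# `n: int` declares the parameter's type, `-> int` declares the
# return type, and `result: int` declares the variable's type.
def factorial(n: int) -> int:
    result: int = 1
    while n > 1:
        result = result * n
        n = n - 1
    return result

ok = factorial(5)          # fine: an int goes in, an int comes out

# bad = factorial("oops")  # a static checker flags this line before
#                          # the code runs: str is not an int

print(ok)
```

The key difference from the dynamic version is where the error is reported: the checker points at the bad call itself, before the program ever runs, rather than at the comparison deep inside the function at runtime.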
What this might look like in Pigeon is that we would have different operators for creating lists and dictionaries that store different types. For instance, the numList operator would create a list that can only store number values. Effectively, any two lists which store different types would themselves be considered different types of values, e.g. a numList would be a different type than a list of Booleans. Likewise for dictionaries: a dictionary with string keys and number values would be a different type than a dictionary with number keys and Boolean values. We could, however, still use the same get and set operators for all these different types. Here, the compiler or interpreter knows that foo is a numList, so it knows that this get operation will return a number value and it knows that this set operation improperly attempts to store a string in the list. While requiring each collection to store just one data type is very restrictive, removing the requirement would create a huge hole in our static typing system, and we would no longer be able to programmatically determine expression types before running the code. What then does a programmer do when they need, say, a list of items of various types? Well, as we'll see when we look at static languages like C and Java, there are workarounds for this problem, but they are very cumbersome relative to the straightforward approach of dynamic languages like Pigeon and JavaScript, where we can just stuff anything into a list. Similarly, requiring each function to have just one type of returned value can be quite restrictive, as is requiring each parameter to have just one type of accepted value. Why would a programmer accept these restrictions? Well, static typing allows all of our type errors to be detected reliably without running our code, meaning type errors won't lurk hidden in our code, as can happen with dynamic typing.
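In the same spirit as numList, here's a sketch using Python's type hints, where the annotation `list[int]` declares a list that a static checker like mypy will only allow to hold numbers (the `total` function is a made-up name for illustration):

```python
# `list[int]` is the analog of Pigeon's numList: a list declared
# to hold only numbers.
def total(xs: list[int]) -> int:
    running: int = 0
    for x in xs:
        running = running + x   # safe: every element is known to be a number
    return running

nums: list[int] = [1, 2, 3]
nums.append(4)             # fine: an int goes into a list of ints
# nums.append("five")      # a checker flags this: a str cannot go
#                          # into a list[int]

print(total(nums))
```

Because the checker knows every element of `nums` is a number, it also knows every read from the list yields a number, which is exactly what makes the get and set operations on foo checkable in the Pigeon example.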
Whether eliminating this one class of bugs is worth the restrictions of static typing is one of the never-ending debates among programmers.