What's called a map, or sometimes a dictionary, or most generically an associative array, whatever we call it, it's a collection of key-value pairs. The items are stored with no sense of logical order; rather, values are stored and retrieved by their associated key, which must be unique in the collection amongst all the other keys. Perhaps the simplest way of implementing a map closely resembles a linked list, but rather than storing just a single value in each node, we store both a key and a value. The obvious downside of this implementation is the performance cost of the basic operations. Say we want to insert a key-value pair: first we have to search the whole list to see if there's a node already with that key, because keys are supposed to be unique, so we can't just add a new key-value pair without checking first. In the special case where we want a map whose keys only need to be integers in a limited range, from zero up to some number, we can implement the map much more efficiently using just an array. Here, for example, we can use a five-element array for a map, as long as we're happy with keys limited to the range of integers zero to four. We then store the value for each key in the slot associated with that key. So, for example, the value for key two is stored at index two of the array. Now, while a much more efficient implementation, it is much, much more constrained. Ideally, what we want is a map with the efficiency, or near efficiency, of this array implementation, yet which allows us to use any possible key. The solution to this problem is what's called a hash table, which is a map that uses a hash function to map keys to slots in an array. What, though, is a hash function? Well, a hash function is a function which takes any input value and returns an output value, a hash, which is constrained to a smaller, finite range.
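The array-backed map for small integer keys might be sketched in Python like this (the class and method names here are illustrative, not from the original):

```python
class SmallIntMap:
    """A map whose keys must be integers in the range 0..4."""

    def __init__(self):
        self._slots = [None] * 5  # one slot per possible key

    def set(self, key, value):
        if not (0 <= key <= 4):
            raise KeyError("key must be in the range 0..4")
        self._slots[key] = value  # the key itself is the array index

    def get(self, key):
        if not (0 <= key <= 4):
            raise KeyError("key must be in the range 0..4")
        return self._slots[key]
```

Setting and getting are single array accesses, which is exactly the efficiency a hash table tries to approximate for arbitrary keys.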
So, for example, a hash function might take in any integer as input, but then the output might be constrained to a finite range, like say zero to 100. Now, because the set of possible inputs to a hash function is always larger than the set of possible outputs, inevitably multiple inputs will have to produce the same output, the same hash. When different inputs produce the same output, that's called a hash collision. Perhaps the simplest case of a hash function is one that simply returns a modulus of the input. So here, for example, our hash function mod5_hash takes an input value and returns simply the modulus of that value by 5. So the inputs 5; 102,135; negative 95; and 10 all produce the hash 0. The inputs 21, negative 4, 6, and 81 all produce the hash 1. And 7, negative 303, 92, and 12 all produce the hash 2, and so forth. And you see that there are actually only five different possible outputs: 0, 1, 2, 3, and 4. Though a very simple function, a simple modulus is generally a very appropriate hash function for a hash table, because given a set of random inputs to the function, the distribution of the outputs tends to be uniform. So say we started feeding random numbers to our mod5_hash function: the outputs we get would be quite evenly split between 0, 1, 2, 3, and 4. Each output would occur on average 20% of the time. Again, this is assuming totally random input. A downside of using a simple modulus is that it can only be applied to integer inputs. Ideally, we want a hash function which can take as input any kind of object. We could quite simply, though, modify our hash function to accept strings as input. All we need is some way of converting the string value into an integer value. Here we have a function, string_to_int, which does just that.
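As a sketch, the mod-5 hash described above is a one-liner in Python:

```python
def mod5_hash(value):
    # Constrain any integer to the finite range 0..4.
    # Note: Python's % always returns a non-negative result for a
    # positive modulus, so negative inputs hash cleanly too.
    return value % 5
```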
It takes an input string s, loops over the individual characters of the string, gets the numeric character code value of each character, and adds that value to a sum. In the end, we return that sum. Note here that ord is a built-in Python function which translates a single-character string into the ordinal value, the numeric value, of that character. In any case, once we have the string_to_int function, we can then, in our mod5_hash function, first test if the type of the value is a string. If it is, we first convert the input value to an integer using string_to_int, and then we can simply take the modulus of that value. So, for example, if we now pass the string "hello", all in lower case, to mod5_hash, what we get back is the value 2. Now, if we want a truly generic hash function, the obvious thing to do is to create a hash function that works by processing the individual bytes that make up the object. One of the simplest such techniques is called Pearson hashing, in which we define a substitution table where every possible byte value maps to a different byte value. So here, for example, we have the value 0 map to 72, 1 to 94, 2 to 8, 3 to 204, 4 to 238, 5 to 27, and so forth with the rest of the values not shown. Just understand that if 0 maps to 72, then no other value should map to 72. In other words, no two keys in this table should map to the same value; the table is a permutation of the byte values. Also understand that this table is meant to be random, and there's no standard table for Pearson hashing; it's up to you to create your own such table. And understand that this means the very same object fed to two different Pearson hashing functions will probably produce a different hash. Only when two Pearson hashing functions use the same table do they produce all the same hashes.
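A minimal Python sketch of string_to_int and the extended mod5_hash, matching the behavior described above:

```python
def string_to_int(s):
    # Sum the ordinal (character code) values of every character.
    total = 0
    for ch in s:
        total += ord(ch)
    return total

def mod5_hash(value):
    # If the input is a string, first convert it to an integer,
    # then constrain the result to the range 0..4.
    if isinstance(value, str):
        value = string_to_int(value)
    return value % 5
```

For "hello", the character codes 104 + 101 + 108 + 108 + 111 sum to 532, and 532 mod 5 is 2.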
In any case, we initialize the hash we're going to return to 0, and then we loop through the bytes of our object. For each byte, we do a bitwise XOR of that byte with the current hash value to get an index into our table, and then we assign the value found at that index to the hash. Once we've gone through every byte of our object, we return the hash value, which will be a value between 0 and 255. Now, you might wonder why we have to loop through all the bytes. Why don't we, say, just use the first byte and call it good? Well, what we're looking for is a uniform distribution in the output of our hash function, and so given two objects which are very similar in their contents, bit for bit, except maybe for one or two bits here and there, we want the outputs from the function to be very different. If we just used the first byte, or the first 10 bytes, or whatever, then the small differences between objects, which might be buried, say, in the middle of the object or at the end, wouldn't get reflected in our output. And then if we ended up feeding quite similar objects to our hashing function, we wouldn't get a uniform distribution. We would probably end up with hashes which are clustered; that is, we'd see the same few values over and over, or the same small range of values over and over, when what we want is a uniform distribution even when our inputs might be quite similar. In any case, Pearson hashing is one simple way to get a hash out of any object. The version we showed here is notably limited in that the hash range is only from 0 to 255; sometimes we want a much larger hash value range. And I should also mention that hashing objects simply by reading their bytes as input is problematic in some cases, because objects very often are composed of one or more references, i.e. addresses. And the thing about addresses is that they tend to be kind of incidental.
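A sketch of Pearson hashing along these lines. The substitution table here is an arbitrary random permutation of the byte values; since there is no standard table, the specific hashes produced depend entirely on the seed chosen for this example:

```python
import random

# Build a substitution table: a random permutation of 0..255, so no two
# byte values map to the same output. A fixed seed keeps the table (and
# therefore the hashes) repeatable across runs of this example.
_rng = random.Random(42)
TABLE = list(range(256))
_rng.shuffle(TABLE)

def pearson_hash(data: bytes) -> int:
    h = 0
    for byte in data:
        # XOR the byte with the running hash to index into the table.
        h = TABLE[h ^ byte]
    return h  # always in the range 0..255
```

Because every byte feeds back into the running hash, a one-bit difference anywhere in the input tends to produce a completely different output.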
When you allocate an object in one run of a program, it's generally unlikely to be given the same address in any other run of the program. But even if addresses were consistent from one run of a program to the next, they're still not really the true content of the object. Say, in Python, when you have a list of numbers, those numbers are not stored directly in the list; they are stored via references. The list object itself, in memory, is just a series of addresses. So even if we have two identical lists with all the same number values, their hashes are likely to be different, because they don't necessarily reference the same number objects. You very well might have two separate objects representing the same number, one referenced in one list and the other referenced in the other list. So even though the two lists appear identical, a hash of the bytes of these lists as they are stored in memory will probably come out different. Depending on exactly why you are hashing, that very well could be undesirable behavior. Getting back to hash tables now: assuming that we do have a proper hashing function with a nice uniform distribution, and say that the output of our hashing algorithm is constrained to the range 0 to 4, we can now store any value in our hash table if, instead of directly storing the values in our array, we add them to lists which themselves are stored in the array. So, for example, if you want to add an object whose key hashes to 2, you append it along with its key to the list in slot 2. We can't, of course, just store the values; we must store the keys, because it's the keys in the hash table which identify the items. And in fact, to make sure that a key remains unique, when we add an item we have to first search the list and make sure that there is no such key already in the list. If there is, we don't add a new key-value pair; we rather modify the value of the existing key-value pair.
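Putting the pieces together, a minimal sketch of the scheme just described; Python's built-in hash stands in for the hashing function, and all names here are illustrative:

```python
class ChainedHashTable:
    def __init__(self, num_slots=5):
        # One list (a "chain") per slot.
        self._slots = [[] for _ in range(num_slots)]

    def _hash(self, key):
        return hash(key) % len(self._slots)

    def set(self, key, value):
        bucket = self._slots[self._hash(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key exists: modify, don't add
                return
        bucket.append((key, value))      # new key: append the pair

    def get(self, key):
        bucket = self._slots[self._hash(key)]
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)
```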
Now, the virtue of this kind of implementation of a map over the implementation which is simply a single list is that here we have split that single list up into five. And assuming that we don't happen to add keys that all hash to the same list, the distribution of items among the five lists should be on average even. So when it comes time to retrieve an item from the hash table, we first hash the key we're searching for to determine which list to look in, and then we can search that shorter list rather than having to search a much longer list. In this example, we're only hashing to five different lists, but in practice hash tables are commonly much larger: they can have a hundred slots or a thousand slots, each with its own list. Generally, the larger the array of lists, the fewer items you'll tend to have in each list, and so the fewer items to search through when we add or set or retrieve a key-value pair. This technique of using a list in each slot is called chaining. An alternative strategy is called open addressing. In open addressing, we don't add extra lists into the mix; we just store key-value pairs directly in the array. But in the event of a hash collision, where we want to store two keys in the same slot because they produce the same hash, we simply look for the next available slot and store the key-value pair there. So here, for example, say we have a key-value pair already stored at slot 323 because its key hashed to 323, and say that we get a hash collision: we have another key-value pair we want to add to the table whose key also hashes to 323, yet which is actually a different key. We can't store it in slot 323, so we simply search for the next available slot, which in this case is 325 (slot 324, say, being already occupied), and we store it there. So when it comes time to retrieve a value from the table, we look first in the slot to which the key hashes, and if a different key is already in that slot, then we have to search through the rest of the table to see if the key is found elsewhere.
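A sketch of open addressing with linear probing, under the same assumptions as before (illustrative names, Python's built-in hash, no deletion support):

```python
class OpenAddressingTable:
    def __init__(self, num_slots=8):
        # Key-value pairs are stored directly in the arrays, no chains.
        self._keys = [None] * num_slots
        self._values = [None] * num_slots

    def _hash(self, key):
        return hash(key) % len(self._keys)

    def set(self, key, value):
        i = self._hash(key)
        for _ in range(len(self._keys)):
            if self._keys[i] is None or self._keys[i] == key:
                self._keys[i] = key
                self._values[i] = value
                return
            i = (i + 1) % len(self._keys)  # collision: probe the next slot
        raise RuntimeError("table is full")

    def get(self, key):
        i = self._hash(key)
        for _ in range(len(self._keys)):
            if self._keys[i] == key:
                return self._values[i]
            if self._keys[i] is None:
                break  # an empty slot means the key was never stored
            i = (i + 1) % len(self._keys)
        raise KeyError(key)
```

With 8 slots, the integer keys 3 and 11 both hash to slot 3; the second insert probes forward and lands in slot 4, and retrieval follows the same probe sequence.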
Now, you may be wondering what happens when this table gets full. Unlike in our chaining example, where we have lists which can presumably grow as large as we need, in this arrangement we have a limited number of slots because the array is fixed in size. Well, in the event that we run out of free slots, the solution is to resize the hash table, which means to create a new, larger hash table and simply copy all the key-value pairs from the old one into the new one. Obviously, this is not a desirable thing to do on a regular basis, because it's very expensive. Also note that in the new table, because the array of slots is larger, the hashing algorithm has to be tweaked to produce hashes of a larger range. Say our original table had a hundred slots, and so we used a hashing algorithm that spit out values from zero to 99; if our new table has a thousand slots, then it needs to spit out values from zero to 999. This tweak to the hashing algorithm means that the key-value pairs as stored in the original table are not necessarily stored in the same slot in the new table. When we resize the table, items get moved around. Lastly, another possible variant of chaining is to use a second level of hash tables rather than lists. These hash tables themselves may use chaining with lists, or maybe even another layer of hash tables, or they may just use open addressing. Conceivably, you could have many levels of hash tables within hash tables within hash tables. The choice ultimately comes down to what's most efficient given your storage needs. Very importantly, though, these nested hash tables should not use the same hashing function as your top-level table. If they did, then every value inserted into one of these nested tables would hash to the same value. They'd all cluster into one slot, which really defeats the purpose of hash tables.
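The resizing step might be sketched like this for a chained table (illustrative, using Python's built-in hash); note how each key is re-hashed against the new size, which is why items move around:

```python
def resize(old_slots, new_size):
    # Create a larger array of buckets and re-insert every key-value
    # pair, re-hashing each key against the new number of slots.
    new_slots = [[] for _ in range(new_size)]
    for bucket in old_slots:
        for key, value in bucket:
            new_slots[hash(key) % new_size].append((key, value))
    return new_slots
```

This is a linear pass over every stored pair, which is why resizing is expensive and best done rarely.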
So, for example, if you insert a key-value pair where the key hashes to two, and then when we insert into the hash table in slot two the same hashing function is used, then every item that goes into slot two of our top-level table is going to end up in slot two of the hash table in that slot. So the nested hash table should use a different hashing function so that that doesn't happen. Again, we want a uniform distribution in these second-level hash tables, just like we want a uniform distribution in our top-level hash table. And this principle applies all the way down: if we have three levels of hash tables, the third-level hash tables within our second-level hash tables should use yet another, different hashing function. Otherwise, again, everything would cluster in just one slot. What we call a graph is a set of nodes, in this context also called vertices, and in this set any one node can be associated with any other node via a connection called an edge. So in this diagram, for example, we have a vertex with the value five which is connected to the vertices one, two, and four. Now, what these connections signify is up for grabs; depending upon context, they may mean different things. In some graphs, we give these connections values, called weights. And in some cases, edges are directed, meaning that, say, the connection between five and two is asymmetrical. Just as in a relationship between a parent and child it matters which is the parent and which is the child, the same is true of vertices connected by a directed edge. What the nature of that asymmetry is is also up for grabs; it can differ from one graph to the next.
For example, imagine a graph in which the vertices represent physical locations, and the edges have both weight and direction: the weight might signify the distance between those locations, those vertices, and the direction might signify that travel between those two locations can only go in that one direction rather than the other. Such a graph, representing locations, travel distances, and travel directions, is what we would use if we wanted to calculate optimum travel routes. Now, one thing to be clear about is that a diagram of a graph is not necessarily meant to accurately reflect the respective locations of the vertices. In fact, in many graphs, vertices don't represent locations in two-dimensional space at all. All that really matters is the set of vertices and the edges between them. In this diagram here, for example, if we were to take that vertex six and draw it instead on the right side of the graph, well, that doesn't change anything, as long as we still see an edge connecting six and four. So again, in these diagrams, the depicted relative locations of the vertices are just a convenient fiction that makes the graph easy to look at. As I mentioned, graphs are used for many different purposes, such as the example I gave of computing optimum routes. As for the implementation of a graph, its storage in memory, there are several different techniques, one of which is to keep a list containing just the vertices and another list of all the edges. Alternately, especially in the case where you want directed edges, you could create a node type that contains not just one reference to another node but potentially many, and you could then represent a graph as a set of these nodes, with the references from the nodes to other nodes representing the edges. A very common special case of a graph is called a tree.
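Both representations just mentioned can be sketched briefly; the specific vertices and edges here follow the diagram's example (5 connected to 1, 2, and 4; 6 connected to 4), and the names are illustrative:

```python
# 1. A list of vertices plus a list of edges, each edge a pair of vertices.
vertices = [1, 2, 4, 5, 6]
edges = [(5, 1), (5, 2), (5, 4), (6, 4)]

# 2. A node type holding references to other nodes; the references
# themselves are the edges. A natural fit for directed edges.
class Node:
    def __init__(self, value):
        self.value = value
        self.neighbors = []  # references to other Node objects

    def connect(self, other):
        self.neighbors.append(other)

five, one, two, four = Node(5), Node(1), Node(2), Node(4)
five.connect(one)
five.connect(two)
five.connect(four)
```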
A tree is a graph in which any two vertices are connected by just one path; that is, to traverse from any one vertex to any other, you can only get there by following one set of edges. So here, for example, to get from vertex one to vertex three, the only possible way is to go from one to four and then four to three; there's no other route. The most common kind of tree we deal with is called a rooted tree, in which one node is designated as the root and all the edges point away from, or in some cases towards, that root. So here, for example, the vertex with the value two is our root node, and all the edges are pointing away from that root node. In the terminology of a rooted tree, we also have what are called leaf nodes, which are nodes other than the root node which have only one edge. It's possible to have a rooted tree in which the root node also has only one edge, but the root node is never itself considered to be a leaf node. Rooted trees are obviously very useful because lots of data is by its nature hierarchical. For example, a file directory structure is a hierarchy; it's a rooted tree, hence the term root directory. As with the more general graph, there are many ways we could implement a tree, again the most obvious being to use two lists, one to store the edges and one to store the vertices, or we could use a system of nodes. There are many variations, but we won't get into them here.
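A rooted tree built from a node type might look like this. The arrangement of values below is one plausible reading of the example (root 2, with the path from 1 through 4 to 3); the names are illustrative:

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []  # edges pointing away from the root

    def add_child(self, value):
        child = TreeNode(value)
        self.children.append(child)
        return child

    def is_leaf(self):
        # A leaf node has no children. (By convention the root is never
        # called a leaf, even if it has only one edge.)
        return len(self.children) == 0

root = TreeNode(2)
four = root.add_child(4)
one = four.add_child(1)
three = four.add_child(3)
```

With this structure, the only path from 1 to 3 runs through 4, matching the single-path property that defines a tree.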