Welcome back to the Python and programming course. This is chapter five, bits and numbers. In this chapter, we talk about numbers: what they are, and in particular how they are modeled in memory and what that means for us as data science practitioners.

Let's start simple with a data type we have known since the first chapter, the int type. How do we create an integer? That's easy. We just type the corresponding number as a literal, because Python understands the literal notation. In this case, we assign it to the variable a. And a, of course, is an object of its own: it has an identity, it has a type, namely int, and it has a value. The value we see here, the 42, is what we humans think of the number to be. As we will see in this chapter, the number 42 is, of course, not what is inside the object in memory. There is something else, namely ones and zeros, and we will try to understand how ones and zeros can be used to express the number 42.

Regarding the literal notation of numbers, what else is there to know? One nice feature of Python is that we can use underscores to group digits together. Here I write the number 1 million, and to make it more readable, I put underscores between groups of three digits, so that it's immediately clear to the reader that this number is 1 million. If I look at the output here, I have a hard time seeing whether that is 1 million, 10 million, or maybe 100,000. And imagine the number gets even longer. So we just use an underscore as a separator. There are hardly any syntactic rules regarding the underscores, as long as each one sits between two digits. So we could technically put an underscore between every digit in the literal, and Python would totally ignore the underscores.
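As a minimal sketch of the literal notation described here: the names million and same_value are chosen only for illustration and need not match the notebook.

```python
a = 42                   # an int literal assigned to the variable a
print(type(a))           # <class 'int'>

million = 1_000_000      # underscores group the digits for the reader
print(million)           # 1000000

same_value = 1_0_0_0     # legal, but harder to read than 1_000
print(same_value)        # 1000
```

The output never shows the underscores, which is why they only matter for readability of the source code.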
It's just for our convenience: we can put them wherever they make the number look nice and readable in source code. One thing we must understand about the literal notation is that we cannot use leading zeros. We already see that Jupyter Notebook makes the first zero here red, and that is an indication that it doesn't like this. If I execute this code cell, I see a syntax error. So Python doesn't even know what this number means; it cannot even read it. We will see soon why we cannot do that.

Another way to create integers is to use the int constructor. In chapter two, we saw what constructors are: built-in callables that take some object as their argument and create a new object, in this case of type int. That's what the int constructor does. Here it takes a float, and to make an integer out of it, it simply cuts off the decimals. So we get back 42. Now, what happens if we have a float that we would round up in math? The answer is that in this case as well we get back 42, because the .87 is just cut away and not rounded. Rounding is something different. And if we use negative 42.87, we get back negative 42. Maybe you remember from the first chapter, when I talked about the floor division operator, the double slash, that it behaves as if we always rounded the result towards negative infinity. There is a subtle difference here: the int constructor rounds the floating-point number towards zero. So if we are above zero, the int constructor rounds down; if we are below zero, it rounds up. But that's just a subtlety to observe.

We could also pass the int constructor a string. So the string with the 42 would work.
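The difference between cutting off the decimals, flooring, and rounding can be sketched as follows. The variable names are assumptions for illustration, and math.floor is used here only to make the contrast with the int constructor visible.

```python
import math

positive = int(42.87)          # 42, the .87 is cut off, not rounded
negative = int(-42.87)         # -42, truncation goes towards zero
floored = math.floor(-42.87)   # -43, flooring goes towards negative infinity
floor_div = -42.87 // 1        # -43.0, floor division behaves like flooring
rounded = round(42.87)         # 43, rounding is a different operation

print(positive, negative, floored, floor_div, rounded)
```

For positive numbers truncation and flooring agree; the two only differ below zero.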
The int constructor can actually make a real integer object out of that. However, when we pass in a string, we have to be careful not to include any characters it doesn't like, for example, the dot. So if we format the string as if it were a float, it doesn't work. Leading and trailing whitespace, however, gets ignored. So the int constructor is quite versatile. Whenever we load data from some outside source and we know that the data is supposed to be integers, we can go over all the data as we read it into memory and use the int constructor to explicitly cast it to the int type. That's what we can use the constructor for.

Now to the question of how numbers are stored in memory. How are zeros and ones in memory put together to mean the number 42, for example? All of this is done with this encoding table here. Before I go over the table, I will give you an example, because with examples it's a lot easier to understand. Let's say we have the integer one. How would it be modeled? It would be modeled with a sequence of zeros and ones such that the corresponding numbers down here add up to one. So which of the numbers in the last row do we have to add together to obtain one? The answer is, of course, that we only need this one here. And now we ask at what position it stands. We read numbers from right to left, so this one is in position zero, the first position, because here too we start counting at zero. That means we express one as a sequence of eight bits where the first seven bits are zero and the last bit is one. This is a binary sequence of zeros and ones that means the number one in decimal, in base 10. Let's go further. How would we model two?
The decimal number two. That's also easy. Which sequence of eight zeros and ones would encode two? It's all zeros and just a single one, and that one is here in position one. So we would write zero, zero, zero, zero, zero, zero, one, zero, and that's the sequence of zeros and ones that together mean the number two.

You may wonder how these numbers down here are chosen. They are chosen to be powers of two. In the middle line here, we see how the position of a bit corresponds to the number that we use in the summation that I just used to express the numbers one and two. One is, of course, two to the power of zero, and two is two to the power of one. This is why it's worthwhile to start counting at zero in this line, because then we can just raise two to the power of the position that the bit stands in. And we simply call the zeros and ones the bits of the number.

Let's do some more examples. How would we model three? Three is the sum of two and one. So we model the decimal number three as zero, zero, zero, zero, zero, zero, one, one. This is what three means in binary. Let's also do four. The four we can just read off the table: the third bit would be set to one and all the other bits would be set to zero. So this would be the number four in binary.

And here we see a pattern. Whenever we need the next bit, we set that bit to one and all the bits to its right to zero. We start with the least significant bit set to one. Then we want to increase the number by one. How do we do that? We set the zero next to it to one and all the bits to its right, which is only one bit, to zero. And then we want to increase the number by one again.
How do we do that? We just flip this bit from zero to one and we get the number three. How do we get to the number four? We flip this bit from zero to one and then set all the bits on its right-hand side to zero. So there is a pattern behind this, and it's not so hard to do this encoding manually.

But let's do it for a larger number, maybe the number 12. How can we encode 12? 12 is of course the sum of eight and four. So the third and the fourth bit would be one and all the other bits would be zero. We would write this as zero, zero, zero, zero, and then one, one, zero, zero. That's how it works. And this is all the magic that is needed to convert the ones and zeros inside a computer's memory to some number that we as humans can understand.

Now you may wonder why these numbers were chosen as they are. It can be proven, and we won't do this here, that to model the numbers from zero to 255, it's sufficient to use these eight numbers. We can express every number between zero and 255 as a sum of some numbers out of this sequence, and that sum is unique. So there are no two combinations that mean the same integer. In other words, this is a one-to-one mapping in both directions.

And now you may wonder why I chose to show you eight bits here, and why the numbers from zero to 255 are maybe a little bit more important. Well, a group of eight bits is called a byte. And bytes are a very common unit in which data is transferred between programs, but also stored, let's say on disk and so on. It would not be efficient to think of data as just bits, because one bit of information is just not a lot.
But if we have eight bits of information and group them as one byte, and we only look at data in terms of bytes, in groups of eight bits, then we can think of every byte as more than just a yes or no. It makes sense to think at a bit of a higher level here.

So how can we get this binary representation of an integer if we don't want to do that by hand? We call the built-in bin function in Python. For example, we call bin, pass in the integer three, and get back 0b11. In our manual list of encodings, we also have a one, one here, and what Python obviously does is strip away all the leading zeros. That makes sense, because when we write a number like three, we write three and not zero three. So as humans we don't write leading zeros either, and Python does the same here. And now we understand why starting an integer literal with a zero results in a syntax error. It's because the leading zero, together with the b, indicates that whatever follows is to be interpreted as a binary sequence.

But wait a minute, what did bin return here? It returned a string. So let's see what we can do with the string. We can, of course, go back to the int constructor. We pass the string in together with base equals two, and int creates an integer object. We see here the number three that was returned. This is a way to ensure that we can go in both directions. What happens if I leave out base equals two? We get an error, so Python cannot understand this. And now let's take it one step further. What happens if I type 0b11 into a code cell without the enclosing quotes? I get three again. And this is actually the reason why we cannot start an integer literal with zero, right?
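The manual encoding with powers of two and the round trip through bin and int can be sketched like this. The function bits_to_int is a hypothetical helper written for this illustration; it is not a built-in and not necessarily part of the notebook.

```python
def bits_to_int(bits: str) -> int:
    """Sum 2**position for every bit that is set, reading right to left."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


three = bits_to_int("00000011")    # 2**1 + 2**0
twelve = bits_to_int("00001100")   # 2**3 + 2**2
print(three, twelve)               # 3 12

as_text = bin(3)                   # '0b11', a string without leading zeros
back = int(as_text, base=2)        # 3, the base tells int how to read it
literal = 0b11                     # 3, the same digits typed as a literal
print(as_text, back, literal)

try:
    int("0b11")                    # without base=2 the prefix is not understood
except ValueError as error:
    print("ValueError:", error)
```

The helper does by hand what int with base=2 does for us, so in practice the built-ins are all we need.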
So bin returns a string, but regardless of whether this sequence of characters is a string or a literal that we enter into a code cell, Python understands it. This is a way to type binary data right into a Jupyter notebook. And of course, we can combine this with underscores. If I type 0b_11, I also get back three. I sometimes do this, especially if the sequence of ones and zeros gets a bit longer. I usually group the ones and zeros in groups of four to keep it more readable, and sometimes I also put an underscore between the prefix and the actual ones and zeros. But this is just style. So this is the minimalistic way of doing it, and this may be a little bit nicer to read.

Okay, let's look at some numbers that are more important to know. What's the binary representation of zero? It's of course a sequence of all zeros. I didn't write zero here, but zero can be encoded as simply eight zero bits. And as I said before, 255 is just all ones. So maybe I write it here: dot, dot, dot, 255, and this is mapped to a sequence of all ones. That's kind of a standard. You may have seen numbers between zero and 255 when dealing with colors in graphics programs. Oftentimes colors are expressed as a tuple of three numbers, and all the numbers are between zero and 255. Zero means that one of the colors is not in use at all, and 255 means full brightness. The three colors that are commonly used are red, green, and blue, and you can create any color out of those three base colors by choosing some mixture of red, green, and blue. You do that by choosing a number between zero and 255 for each, which you can think of as somewhere between zero and 100%. So you have 256 shades per color, if you want.
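A small sketch of the two boundary values of a byte and of the underscore style mentioned here. The format specification "08b" pads the binary digits to eight places; the tuple red and the percentage calculation are assumptions added for illustration only.

```python
prefixed = 0b_11                   # 3, underscore right after the prefix
grouped = 0b1111_1111              # 255, bits grouped in fours

zero_bits = format(0, "08b")       # '00000000', eight zero bits
full_bits = format(255, "08b")     # '11111111', eight one bits
print(zero_bits, full_bits)

red = (255, 0, 0)                  # full red, no green, no blue
shares = [round(channel / 255 * 100) for channel in red]
print(shares)                      # [100, 0, 0]
```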
So these two numbers are a bit more important. What happens in Python if we want the binary representation of a number that is larger than 255? Nothing fancy. Python just continues to add zeros and ones to the left, just as we would add digits with our common whole numbers as well.

So let's do a little bit of arithmetic with ones and zeros, just to show you what a computer actually does under the hood when it calculates. If I add one plus two, I get back three. But if I really want to know what happens here, let's do the calculation the way we are taught in elementary school, and let's do it with the binary representations. If I want to add one plus two, how would I do that? One, as we said, would be four zeros and then zero, zero, zero, one. And two would be zero, zero, zero, zero, zero, zero, one, zero. If I add those two together, this is of course three, but how can we see that in binary? Remember how you added two numbers in elementary school: you just added the corresponding digits. So one plus zero is one, zero plus one is also one, and then we end up with all zeros. And lo and behold, this is just two plus one, which is three. So this is how addition works in binary. I just want to make you aware that even though it looks kind of hard, it's really simple. It's basically elementary school level difficulty, just using two digits. And whenever a digit can only be zero or one, we call it a bit.

How can we see that in Python? I provide a code cell here which also shows the binary representations, and that's exactly what I see, right? Here I see a single one bit, here I see a one followed by a zero, and here one, one, and that's exactly what I see in the code cell too.
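The column-by-column addition can be sketched as a small function. add_binary is a hypothetical helper for this illustration, not what Python does internally; it also handles the case where a column sums to two and a one has to be carried to the next column.

```python
def add_binary(left: str, right: str) -> str:
    """Add two binary digit strings column by column, right to left."""
    width = max(len(left), len(right))
    left, right = left.zfill(width), right.zfill(width)
    carry = 0
    digits = []
    for l_bit, r_bit in zip(reversed(left), reversed(right)):
        column = int(l_bit) + int(r_bit) + carry
        digits.append(str(column % 2))   # the digit written in this column
        carry = column // 2              # 1 if the column summed to 2 or 3
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


one_plus_two = add_binary("00000001", "00000010")
print(one_plus_two)                 # 00000011
print(int(one_plus_two, base=2))    # 3
print(bin(1 + 2))                   # 0b11, Python's own result
```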
So we can mimic what a computer does in memory, or at least look at it. Now one extension of this, to show you how the elementary school math carries over: let's add one plus three, which is of course four. But let's do that one more time in binary. So this is the number one in binary, and then we add three. Three is, as we calculated before, all zeros ending with two ones. How does this work now? Here I add one plus one, which is two, and two is greater than one, so I write zero here and carry one over. Remember that from elementary school. Now I add zero plus one plus the carried one, which is again two, so I write zero and carry one over again. Now we have zero, zero, and then the carried one: a one here, two zeros to its right, and all zeros in front. If we go back to our original list, this is exactly four. I can see that in Python as well: here's a little code cell that shows us that what we did on paper is actually correct.

Okay, and now let's do one more thing, multiplication. Let's do it here on paper. Four times three is of course 12. So how does this work in binary? Let's write it down. The four is zero, zero, zero, zero, and then zero, one, zero, zero, and we multiply this with three, so four zeros and then zero, zero, one, one. Remember how you did multiplication in elementary school: for every digit of the right operand, we multiply the entire left operand by it, and then we add the partial products together. So how does this work? To multiply by the first one here, the least significant one, all I do is copy the left operand: four zeros and then zero, one, zero, zero. Then I want to multiply the second one with the entire operand, and how do I do that?
Well, first I have to write a zero because I am moving one digit, one bit, to the left. So I first put one zero at the right end of the line, and then I can copy the operand one more time in front of it. Then those two lines are added together, and this adds up to four zeros and then one, one, zero, zero. And I was lucky: here we see we have 12, which is four zeros and then one, one, zero, zero, and we have the same number here. So this is really four times three, and this is 12.

So now we have done addition and multiplication, and we will stop here with the binary arithmetic because Python does that for us. We don't have to do it on our own and don't want to waste our time; I just wanted to show you that you could look into this and understand what a computer does in memory. And here I have all the steps that are involved separately: this is the entire result, and these are the two individual lines before summing them together.

Okay, that was the binary representation; this is what happens in memory. Now let's look at another representation that is quite important, the so-called hexadecimal representation. Hexadecimal works just like binary, and just like the base 10 that we know from school, only that we don't have two or 10 digits but 16 digits. That's the only difference. How is it done? Here's a bigger encoding table, which lists several decimal numbers that we know, plus the corresponding binary number, plus the corresponding hexadecimal number. I just told you that in hexadecimal we have not 10 but 16 digits. So what happens after the nine? The convention is that after zero to nine, we continue counting the digits from A through F. So A is the digit for 10 and B is the digit for 11. The only thing that is important is that each digit is represented by exactly one letter or symbol.
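A table of this kind can be generated with a few lines of Python. This is only a sketch: the selection of numbers is an assumption and does not reproduce the exact table shown in the lecture.

```python
numbers = (0, 1, 9, 10, 11, 15, 16, 17, 31)

# Each row holds the decimal number, its eight binary digits, and its hex digits.
rows = [(number, format(number, "08b"), format(number, "x")) for number in numbers]

print("decimal  binary    hex")
for number, binary, hexadecimal in rows:
    print(f"{number:>7}  {binary}  {hexadecimal:>3}")
```

Up to 15 a single hexadecimal symbol is enough; from 16 on a second digit appears in front.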
And as I said, the convention is to use the letters A through F. We see that the numbers from zero to 15 can be expressed with just one symbol in hexadecimal, going from zero to F. The numbers from 16 to 31 also get expressed with the hexadecimal digits from zero to F, but they are all prefixed by a one. So just as in the decimal system, where after zero to nine we start with zero to nine again and put a one in front, we apply the same logic in the hexadecimal system. You can look into that in detail later on.

The reason I show hexadecimal here is that it is the format commonly used to represent data that cannot be interpreted as characters or numbers. And the reason why programmers use hexadecimal is that if I want to encode the number 10, for example, I need four digits in binary but only one digit in hexadecimal. The binary representation of zeros and ones tends to get very long very quickly, whereas the hexadecimal representation grows only slowly because it has 16 digits. So a common convention is to use hexadecimal, because then whatever information we want to express becomes a very concise sequence.

The hexadecimal representation of the number zero is 0x0. Note that instead of a b, we have an x. The x stands for hexadecimal. So just as we had 0b as a prefix before, we now have the prefix 0x, that's it. The number one is the hexadecimal number one as well. And as I said, the decimal number 10 is the hexadecimal digit a and the number 15 is f. So these are the digits a through f, 10 through 15. And for example, if I model a number that's a little bit larger, then in binary I need exactly seven bits.
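A short sketch comparing the two representations for the number 123 used in this example, with the built-in functions bin and hex. The variable names are assumptions for illustration.

```python
number = 123

as_binary = bin(number)              # '0b1111011', seven binary digits
as_hex = hex(number)                 # '0x7b', two hexadecimal digits
bits_needed = number.bit_length()    # 7
print(as_binary, as_hex, bits_needed)

from_text = int(as_hex, base=16)     # 123, hexadecimal is just base 16
literal = 0x7b                       # 123, typed as a literal
with_underscore = 0x_7b              # 123, underscore after the prefix
print(from_text, literal, with_underscore)

print(hex(0), hex(10), hex(15))      # 0x0 0xa 0xf
```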
So to model 123, I need seven bits of information, but if I want to express it in hexadecimal, I only need two digits. This is why programmers prefer the second representation over the first: it is simply shorter and expresses exactly the same thing. The exact same sequence of bits is encoded into just two digits here. We will get back to this representation in much detail in the next chapter, chapter six on text data, because text is these days modeled in an encoding called UTF-8, and in order to understand encodings of text, we have to understand hexadecimal at least a little bit. This is why I introduce it here.

So what can we do with the hexadecimal representation? I can create an integer from it again if I pass in base 16, because hexadecimal is just base 16, and I get back 123. Of course, I can also type in 0x7b as a literal, and if I evaluate this, I also get back the number 123. And note that, as I said before, I tend to use the underscore here. Some people don't. It's really a matter of taste.

Okay, that was the integer type. That's all there is to know about it: integers are expressed as a binary sequence in memory, and we have understood everything now.

Now I quickly bring up the bool type. You may wonder why I show you the bool type in a chapter on numbers. The bool type, as we know, has just the two values True and False. I can add True and False, and this may sound weird, but True plus False is equal to one. Why is that? In the context of arithmetic, which is what we express with the plus operator, True is regarded as a one and False as a zero, and if we add zero and one, we get one. So it is really an integer that we get back. If you don't believe me, you can always use the type function and verify for yourself that the one we get back is indeed of type int.
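The claim about adding the two boolean values is easy to check. A minimal sketch, with result as an assumed variable name:

```python
result = True + False   # True counts as 1 and False as 0 in arithmetic

print(result)           # 1
print(type(result))     # <class 'int'>
```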
And this is just for completeness' sake. I put the booleans here because I want you to understand that even though we don't usually think of bools as numbers, they behave like numeric objects in a context in which they have to. We can of course say 41 plus True, which is 42. We can multiply some number with False to get zero, because it's just like multiplication by zero. And if we ever need the numeric value of a boolean, we can pass it to the int constructor: int with True gives me one and int with False gives me zero. One note here: how many bits do I need to express a yes, a True? It's of course one. I only need one bit of information, and I also need one bit of information to express a no, a False.

And just to remind you, there is also the None value. None, as we saw in chapter three on conditionals, sometimes behaves like False, but we must not confuse None with False. One place where it does not behave like False is an arithmetic or numeric context: if I try to convert None into a number, I get an error message, a type error. So the int constructor, as we saw, takes many types of data, but it does not accept None.

Okay, this is also a section that I want to go over rather quickly. What I want to do here is show you some operators that we have not seen before and what they are meant to do. Later on, in a chapter on arrays and data frames, we will see those operators again when we look at matrix data; that is where they are often applied. So we have here the bitwise and operator, written as an ampersand. What does it do? If I say 11 and 13, I get back nine. So what is and really? To understand this, we look at the binary representations of 11 and 13 that are given here. The and operator basically works like this: it compares both operands bit by bit.
So it first compares the least significant bit of the left operand, this one here, with the least significant bit of the right operand, this one. If the bits of both operands are one, then the resulting bit in the resulting number is one; otherwise it is zero. So only if both operands have a one in the same position does the result have a one as well. If I look at the binary representations of 11 and 13, I see that here we have a one and a one, so that's why I have a one here. In the fourth position, I also have a one and a one, which is why I have a one here. And here we have a zero and a one, and here a one and a zero, so the bits are the opposite of each other. Because they're not both one, I see a zero here.

In terms of numbers, the fact that 11 and 13 gives nine is basically useless. I just want to familiarize you with the and operator itself, because you will see it again. It is useful if we want to check for conditions: if you want to make sure that more than one condition is fulfilled in a data frame, you will see this and operator at work in a future chapter. And of course, one, zero, zero, one in binary means the number nine, which we saw above.

Another operator that is quite useful is the bitwise or operator, which is written with the pipe character. If I say nine or 13, I get back 13. How does this work? Let's look at the binary representations again. The or operator works like this: if either the left or the right or both operands have a one in a certain position, then the result has a one in the same position. So if we look at the result: we have a one here, so we get a one here. We have a zero here and also a zero here, which is why this one remains a zero.
Then we have a zero and a one here, which gives a one, and here we have two ones, which is a one as well. So anywhere we have a one in either of the two operands, we have a one in the outcome. That's why we get the number 13: one, one, zero, one is just the number 13.

Then comes an operator that we saw briefly in the first chapter. There I told you about the exponentiation operator, which, just to remind you, is the double asterisk. I told you back then that you should not confuse the double asterisk with the caret operator here. The caret operator is also called the exclusive or operator. It works like or, but the result has a one only if exactly one of the two operands has a one. Here we have a one in the least significant bit, and here we have a one too, so that's why we get a zero here. Here we have a zero and here we have a zero, which is why this remains zero. Then we have a one here and a zero here, and that's why we have a one here. And then we have a one and a one, which also results in a zero, and the leading zero is of course cut away. So what is binary one, zero, zero? This is of course four. So this is another logical operator.

There is one more that I have a hard time explaining without going into too much detail, so I will only show it briefly. It's the tilde character. The tilde is said to flip the bits, that is, to turn the ones into zeros and the zeros into ones, but that's not the whole story. There is a convention behind it: flipping a number gives the same result as taking the number plus one and then negating it. That's the relationship. And this goes by a name, which is called the two's complement.
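The four operators discussed so far can be sketched in one cell. The variable names are assumptions for illustration. Note that the spoken "and" refers to the ampersand symbol; the keyword and is a different operator and gives a different result for integers.

```python
both = 11 & 13        # 0b1011 & 0b1101 -> 0b1001
either = 9 | 13       # 0b1001 | 0b1101 -> 0b1101
exclusive = 9 ^ 13    # 0b1001 ^ 0b1101 -> 0b0100
flipped = ~9          # same as -(9 + 1)

print(f"{11:04b} & {13:04b} = {both:04b}  ({both})")            # 1001  (9)
print(f"{9:04b} | {13:04b} = {either:04b}  ({either})")         # 1101  (13)
print(f"{9:04b} ^ {13:04b} = {exclusive:04b}  ({exclusive})")   # 0100  (4)
print(flipped)                                                  # -10

keyword = 11 and 13   # the keyword returns the second operand here
print(keyword)        # 13
```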
The two's complement is actually not so hard to understand, but I think that for a data science practitioner it's not really necessary at this point to know how it works. So I just present the operator and leave you with that. In another context, this operator will be used for the negation of conditions. When we look at arrays and data frames later on, which are basically the replacement for Excel or matrix data, we will see how to filter big data frames full of data. As I said before, we will filter with conditions, and sometimes it is necessary to negate a condition, to turn a True into a False and a False into a True. This will be done with the tilde operator. So the operators that you have seen so far in this section have a very specific meaning when we use them with integers, but they are overloaded in another context, and that context will be data frames in a later chapter.

The last two operators in this section are the left and the right shift operators. Let's look at the binary representation of seven, which is one, one, one. If we left shift it by two, we just add two zeros on the right. And this happens to be 28; here we also have the binary representation of 28. In the same way, when we right shift the bits by one, all the bits in the binary representation of seven, one, one, one, are moved one to the right and the least significant one gets lost. That's why we end up with one, one here, and the sequence one, one in binary is just three in decimal. So these are the bitwise operators, which are used to do different things in another context, as we will see in chapter nine, actually. Okay, so much for integers.
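The two shift operators applied to seven, as a minimal sketch with assumed variable names:

```python
seven = 0b111

shifted_left = seven << 2     # two zeros are appended on the right
shifted_right = seven >> 1    # the least significant bit is dropped

print(bin(shifted_left), shifted_left)     # 0b11100 28
print(bin(shifted_right), shifted_right)   # 0b11 3
```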
The takeaway is that we have looked a little bit into how integers are implemented and how we can work with ones and zeros. Now the question is how far we can go in a real-world application with just the int type. Usually, not that far. The most common numeric data type used by data science practitioners tends to be the float type. The float type, as we have seen before, models decimal numbers, but there are some caveats that are very important to know. So what is yet to come in this section may be a little bit more important than the bits of the integer type before.

How do we create a float? We use a literal notation by typing 42.0. The dot is enough to tell Python that whatever I type here is a floating-point number. I store it in the variable b, and b is of course a full object with an identity, the type float, and a value. How else can we create a float? I just said that it's enough to put in a dot, so 42 followed by just a dot also creates the floating-point number 42.0. We can of course also use the float constructor: here I pass in the integer 42 and get back 42.0. The float constructor also takes strings, and that will become important in a moment, because there are some special float values that we will see. So float with the string 42 also gives me 42.0.

How else do we get floats? We saw in the very first chapter that if I divide two integers with a single slash, the normal division operator, I get back a float. That is to be expected, and Python provides the double slash for floor division as an alternative. In this case, floor division would give zero because it cuts off the decimals. But if we use one slash, we get back a float. And if we use any arithmetic operator with one float and one integer, the result is going to be a float.
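The different ways of arriving at a float can be sketched as follows. The division example 1 / 3 is an assumption; the transcript does not say which two integers are divided in the notebook.

```python
b = 42.0                  # literal with a dot
short = 42.               # the dot alone is enough
from_int = float(42)      # 42.0
from_text = float("42")   # 42.0, strings are accepted too

quotient = 1 / 3          # true division gives a float
floored = 1 // 3          # 0, floor division of two ints stays an int
mixed = 40.0 + 2          # 42.0, one float operand makes the result a float

print(b, short, from_int, from_text)
print(quotient, floored, mixed)
print(type(quotient), type(floored), type(mixed))
```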
So here I add 40.0, the float, plus the integer two, and I get back 42.0, the float. Also the other way around, I multiply by a float and I get back a float. So floats occur very often. They occur naturally as the result of many operations. In case we want to work with really big and really small numbers, there is something that's quite helpful, which is the scientific notation. Scientific notation means I write the significant digits of some number and then I multiply it by 10 to the power of something. This is expressed in Python with an e. So 1.23e0 means 1.23 times 10 to the zero power, which is of course 1.23, right? I can of course also use scientific notation to get just the one. This would be 1e0. What I can use scientific notation for is, for example, to express thousands, which is 1e3, or to express milli, which is 1e-3, so 10 to the power of negative three. This is just something that you may need if you work with really big or really small numbers because it makes working with them quite easy. And then, as I said, there are special values. So what are special values? Well, for example, if I pass the word nan as a string to float, what I get back is just nan. And nan stands for not a number. Not a number basically means that it's a value that indicates that we don't know what the value is, but it is still a float object, right? That is due to the fact that floating point numbers are standardized, and the standard after which they are built mandates that there are some special values, among which is the not a number value. There are also two more special values, which are infinity and negative infinity. And so the question is, do we really have to care about those special values? We will see that unfortunately we have to. They are also quite useful. So first, some facts about what nan values do.
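As a quick illustration of the scientific notation and the three special values just introduced, here is a small sketch. The variable names are placeholders I chose for this example.

```python
import math

plain = 1.23e0   # 1.23 times 10 to the power of 0
thousand = 1e3   # 1000.0
milli = 1e-3     # 0.001

# The special values are created by passing strings to the float constructor.
not_a_number = float("nan")
pos_inf = float("inf")
neg_inf = float("-inf")

print(plain, thousand, milli)
print(type(not_a_number).__name__)  # still a float object
print(math.isnan(not_a_number), math.isinf(pos_inf), math.isinf(neg_inf))
```

The math.isnan() and math.isinf() functions from the standard library are one way to detect these values explicitly.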
So if I compare not a number to another not a number, what I would usually expect is that I get back a true, because if I compare some object with a certain value to the same type of object with the same value, I usually get back a true from a comparison. However, if I compare two nans with each other, I get back a false because, by definition, a value that is not a number, where we don't know what it is, compares unequal to everything, even to itself. That's by definition. Also, if I add any number to not a number, I always get back not a number. That's already a very important thing to observe. Whenever we by accident do some arithmetic and some values in between are not a number, this may destroy our entire calculation, right? That is important because where do nan values come from? Usually they come from invalid operations. For example, maybe they come from a divide by zero error which was not raised, so instead of raising a divide by zero error, we may get back some nan somewhere. And if we don't detect that and continue to calculate with our intermediate results, then the nan values introduced in between may lead to invalid results at the end. So we must be very careful here. And then of course there are also the infinity values. If I compare infinity to infinity, I get back true. However, if I have infinity and I add some constant and I compare that to infinity, I also get back true. This follows from math because, as we approach infinity, any constant, no matter how big, loses its meaning. The signs also behave as expected: for example, if I multiply negative infinity by a negative value, I get back positive infinity. So if you remember the rules for taking limits in math, these are the exact rules that you see at work here.
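The comparison rules above can be checked directly. This is only a sketch of the behavior described, with names I made up.

```python
import math

nan = float("nan")
inf = float("inf")

nan_equals_itself = nan == nan         # False, by definition
nan_spreads = math.isnan(nan + 1)      # True, arithmetic with nan stays nan

inf_equals_itself = inf == inf         # True
inf_absorbs_constant = inf + 42 == inf # True, the constant loses its meaning
sign_flip = -inf * -1 == inf           # True, negative times negative

print(nan_equals_itself, nan_spreads)
print(inf_equals_itself, inf_absorbs_constant, sign_flip)
```

Because nan never compares equal to anything, a check like math.isnan() is the reliable way to find it in intermediate results.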
Okay, and then of course, if I add, for example, positive infinity to negative infinity, I get not a number. This also follows from math. When taking limits in math, we know that I cannot add infinity values of different sign, right? And this is an example of how we could maybe by accident introduce nan values into a calculation. Imagine I don't divide by zero, but I divide by a number that is very close to zero. What happens? Well, if I divide by a number very close to zero, the result will be very, very large. Maybe it will be too large to be stored in memory, and then the number may be expressed as this special infinity value. And whenever I have one infinity value in my intermediate results, I may have another one, and if I add them together, I may end up with a nan. So we just have to be careful. We have to know that these three values exist and that they have very special rules for how we can work with them. Of course, if I subtract infinity from another infinity, what do I get? Should I get zero here? No, because how do we know which infinity is bigger? Just because two limits both go to infinity, how do you know that they approach infinity at the same speed, so to say? So we cannot tell anything about the difference of two infinities. Okay, these are the special values. The takeaway is especially the nan value. The infinities may not be so important, but for the nan value we have to consider that it exists. And now we look at a fact of life that is a bit unfortunate. What I'm about to show you may seem as if it is a bug, but it's not. If I add, for example, one to 10 to the power of 15, I get a one, many, many zeros, and then a one at the end. However, if I do the same thing but just increase the exponent by one, so if I add one to 10 to the power of 16, I get back just 10 to the power of 16. So the one gets lost here. And that's weird.
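The infinity arithmetic and the lost one described above can be reproduced as follows. The concrete operands, such as 1e300 and 1e-300, are my own choice for illustration and are not taken from the lecture.

```python
import math

inf = float("inf")

mixed_signs = inf + (-inf)   # nan: infinities of different sign cannot be added
difference = inf - inf       # nan: we cannot tell which infinity is "bigger"

# Dividing by a number very close to zero can overflow to infinity.
overflow = 1e300 / 1e-300

small_sum = 1e15 + 1         # the added one is still representable
large_sum = 1e16 + 1         # the added one gets lost

print(mixed_signs, difference, overflow)
print(small_sum, large_sum, large_sum == 1e16)
```

Note how an overflow to infinity followed by a subtraction of two such values would silently turn into nan.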
The lost one seems like a bug in the software, but it's actually not. This happens in accordance with the standard after which floating point numbers are implemented. So how else can we see this imprecision? For example, let's look at a square root. What should I get back if I take the square root of two and then raise it to the power of two? Well, of course I should get back two, right? Because that's what we learn in math. What do I get back? I get back a number that is close to, but not exactly, two. So now the question is, can we live with that? The answer is yes and no. We can live with these imprecisions if we deal with them correctly. One more example. What happens if I add 0.1 plus 0.2? Should I get back 0.3? Well, in theory, yes, but in practice, I get back a number that is close to 0.3, but not exactly 0.3. Okay, so let's go ahead and see what happens when I compare the results of those two previous code cells to the numbers that we expected. If I take the square root of two, raise it to the power of two, and compare it to the number two, I should get back a true, right? But I don't. The reason is that the expression on the left-hand side is, of course, imprecise. So the left-hand side is not really two here. That is kind of a bad thing. Again, it's intentional. So the learning is, if we ever need to compare two numbers for equality and we know that floating point numbers are involved, then it's not a good idea to use the double equals comparison operator. And then, of course, the same also holds true for the next equation here. This also should be true, but it's not. So how do we deal with that? Well, one way to do that, and that's a very popular way, is to set up a threshold, a very small number that has to be positive. It's kind of like an epsilon, right? That's the maximum deviation that we want to allow.
And then I take the left-hand side of the equation, I subtract the right-hand side, and I compare the difference, strictly speaking its absolute value, to the threshold. If the difference between the left and the right-hand side is smaller than the threshold that I'm willing to accept, then I can say both sides are equal. That's usually how it's done in practice. We can do that also for the second example. I subtract 0.3 from the sum of 0.1 plus 0.2, I also compare that against the threshold, and the answer is, of course, also true. So the takeaway is that floating point numbers are imprecise. They're imprecise by design. There is nothing to be done about this. It is not a bug. That's an important observation. And whenever we need to compare two results, we should not use the equality operator for floats. This is a bad practice. What we should do instead is define a threshold, or an epsilon, and compare differences to this threshold. So let's look a little bit at why those numbers are imprecise. Here is an example. I look at the number 0.1 and I use the built-in format function. What the format function does is it allows me to print out an object's value in the way I want it formatted. I pass a format string to it, which basically says: print out the value of the object that is created by 0.1 with its first 50 decimals. And what do we see? If I print out the first 50 decimals of the number 0.1, I see that after roughly 17 decimals, where I have all zeros and everything seems perfect, this happens: I have some random-looking sequence of digits. They are not random. They are strictly deterministic. This is basically the exact value of what is represented in memory when I create an object with this literal here. And that's the reason why this is imprecise. Let's look at 0.2. And 0.2 is also imprecise, and 0.3 as well.
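Here is a minimal sketch of the threshold comparison and the 50-decimal printout described above. The threshold value of 1e-15 is an assumption for illustration, and math.isclose() is a standard library alternative that the lecture itself does not mention.

```python
import math

threshold = 1e-15  # assumed epsilon: the maximum deviation we accept

root_squared = math.sqrt(2) ** 2
tenths_sum = 0.1 + 0.2

# Direct equality fails for both expressions.
print(root_squared == 2, tenths_sum == 0.3)

# Comparing the absolute difference against the threshold works.
root_ok = abs(root_squared - 2) < threshold
sum_ok = abs(tenths_sum - 0.3) < threshold
print(root_ok, sum_ok)

# math.isclose() does a similar job with a relative tolerance by default.
print(math.isclose(tenths_sum, 0.3))

# The first 50 decimals of what is actually stored for the literal 0.1.
exact_tenth = format(0.1, ".50f")
print(exact_tenth)
```

Taking the absolute value matters because the left-hand side may be slightly smaller or slightly larger than the expected number.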
And because of that, if I add the first two numbers together, what's the chance that I by accident get the last number? Well, the chance is pretty much zero. So that's why the equality comparison doesn't work. Let's look at one third. And let's print out the first 50 decimals of one third. Well, it's okay for roughly 16 digits and then the random-looking digits begin. Let's round it. Let's round one third to five decimals. And let's look at what happens in memory. Well, the rounding seems to work. I have five decimals with threes. So this is what we would expect. But if I go ahead and look at all the 50 decimals, I see five threes, then a bunch of zeros, and then at some point I also have those random-looking decimals here. So even rounding doesn't really help here. However, some numbers somehow magically seem to be precise. For example, 0.125 is a precise number. Let's also look at 0.25, that's also a precise number. Because of that, I can add 0.125 plus 0.125 and compare that to 0.25. And in this case, the comparison works. So how is it that these two numbers are precise and the other numbers we saw before are not? This has to do with how floating point numbers are represented in binary. Just like integers are represented in ones and zeros, floating point numbers are as well. There is a standard for that, and the standard defines that we model a floating point number with 64 bits of ones and zeros. They are structured like this. The first bit basically tells us if the number is positive or negative. Then we have 11 bits that go into the exponent. And then we have 52 bits, the so-called fraction, which are used to model the digits of the number. We also see in this equation why floating point numbers are called floating point numbers. The sign here is either zero or one, and we have negative one to the power of either zero or one. So this will be either positive one or negative one.
So this is just a sign. The fraction part has 52 bits. That's a lot more than what we saw for the integers. For the integers, we only modeled eight bits. Now we have 52 here. So this would be a very long sequence if interpreted as an integer, right? But whatever this sequence is, we will interpret it as one dot the sequence. Then we multiply this by two to the power of exponent minus something. What the minus does is it centers whatever number the exponent is. The exponent itself is an 11 bit number, so it is never negative: it will be between zero and two to the power of 11 minus one. To center it, we subtract something from it. So in other words, we have some digits that get shifted either to the left or to the right. This equation makes the decimal dot float from left to right, and this is where floating point numbers get their name from. However, the important fact is that we only have 64 bits. Whatever precision or whatever number we want to model, we only have 64 bits available. And you know from math that there are infinitely many real numbers. In between any two real numbers, there will be many, many more real numbers. And real numbers don't stop. Just think of one third. One third is a real number whose decimals never stop. How can we express something that never stops with a limited number of bits? Well, the answer is we can't. And that's exactly the problem. Because of that, by design, floating point numbers are imprecise. There is no way to make a floating point number precise. In fact, there is no way to model an infinite number of numbers in a precise way. It just doesn't work. And because of that, we have to live with the imprecision. OK, let's go on a little bit. Let's look at some more examples. Here is one eighth.
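To see the three bit fields for one eighth, here is a sketch that takes a float apart with the struct module. The helper name decompose is hypothetical, and the bias of 1023 is the "something" that the 64-bit standard (IEEE 754 double precision) subtracts from the exponent; the lecture does not state that number explicitly.

```python
import struct


def decompose(x):
    """Split a float into its sign, exponent, and fraction bit fields."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits
    fraction = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, fraction


sign, exponent, fraction = decompose(0.125)
print(sign, exponent, fraction)

# value = (-1)**sign * (1 + fraction / 2**52) * 2**(exponent - 1023)
rebuilt = (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** (exponent - 1023)
print(rebuilt)
```

For 0.125 the fraction bits are all zero and the centered exponent is negative three, which matches one times two to the power of negative three. The reconstruction formula covers ordinary numbers only, not the special values or numbers extremely close to zero.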
So one eighth is, of course, a power of 2. One eighth is 2 to the power of negative 3. And because of that, we can model it in a precise way. If I go back some slides, the question was, why are some numbers precise and some are not? Well, the numbers that are precise are precisely the numbers that can be expressed with powers of 2. So this would be 1 over 8, which is 2 to the power of negative 3. And this would be a quarter, which is 2 to the power of negative 2. So any number that can be represented as a power of 2 will be precise. Strictly speaking, a sum of a few powers of 2, like 0.75, works as well. And why do we have powers of 2? Well, going back to the integer section before, we said that if we encode integers into bits, we do that with the help of powers of 2. The same kind of encoding is behind the floating point number as well, which is why whenever we model a number that can be expressed perfectly with powers of 2, the number is inherently precise. However, any number that cannot be expressed this way will never be precise. So that's why 1 over 8 is precise. And as I said before, these numbers are usually expressed in hexadecimal notation. We don't really have to understand this representation here. It's basically these two numbers, the exponent and the fraction, in hexadecimal. And why in hexadecimal? Well, I have more than 50 bits here, and I don't want to write more than 50 ones and zeros because I couldn't read that. And if I want to check what ratio a floating point number is, there is a method on floats called as_integer_ratio. This method gives us the ratio of two integers that corresponds to the floating point number as it is stored. So let's look at roughly a third. Roughly a third is just a bunch of hexadecimal digits. We can see here that roughly a third is not a power of 2, and we see that with the hexadecimal notation. And so roughly a third is not a precise number.
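The hexadecimal notation and the integer ratio mentioned above are available as methods on every float. This is a small sketch with variable names of my own.

```python
eighth = 0.125
third = 1 / 3

print(eighth.hex())               # 0x1.0000000000000p-3
print(eighth.as_integer_ratio())  # (1, 8)

print(third.hex())
numerator, denominator = third.as_integer_ratio()
print(numerator, denominator)

# The denominator of a float's ratio is always a power of two,
# so the stored value can never be exactly one over three.
denominator_is_power_of_two = denominator & (denominator - 1) == 0
print(denominator_is_power_of_two)
```

In the hex output, the part after the p is the power of two, and the digits after the dot are the fraction bits in groups of four.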
And we also see that if we express roughly a third as an integer ratio, the closest we can get is by dividing the two integers that are shown here. This already indicates that it's not exactly a third, right? This is definitely something else. So what do we do? Well, with floating point numbers, we have to live with it. There is nothing we can do. We have to know about the caveats. That means we have to know that we cannot really check for equality, or at least we shouldn't. We just have to be aware of it when we use floats explicitly. But what could we do if we don't want to live with the limitations floats give us? Well, we can use another data type, for example, the Decimal type. And what does the Decimal type do? Well, the Decimal type is kind of like a float, but with the added feature that we can decide how precise the number will be. The fact of life holds here as well: decimal numbers, like floats, will never be precise in general. That's simply not possible with a limited number of bits. But with the Decimal type, we are at least able to influence how many digits we model. So how do we create decimal numbers? Well, from the decimal module in the standard library we import the Decimal type. I also import a helper function called getcontext, which helps us to configure how much precision we use. getcontext tells us that we have 28 digits of precision. We've seen before that floats generally have roughly 15 to 17 digits of precision. For the Decimal type, this is set to 28 by default. So by default, a Decimal is almost twice as precise as a float. But it's still imprecise, right? OK, let's see what we can do with a Decimal. How do we create one? Well, we can just pass an integer, 42, to the Decimal type, and we get back a Decimal. The representation that we get back looks quite similar to the call itself. Decimal obviously uses a string in here.
But if we copy-pasted this back into the cell, we would also be able to recreate a new object with the same value, so this works. We can, of course, use a string, like Decimal of the string 0.1. This would be 1 over 10, or one tenth. And this would be as precise as we set it up, so precise to the first 28 digits. We can, of course, also use scientific notation. What we should not do is pass a float to the Decimal type. Why? Because floats are less precise than decimals. We learned that decimals have, by default, 28 digits of precision. The float has only roughly 15 to 17. Because of that, the Decimal will take those random-looking digits that we just saw before, which are definitely not precise, and pretend as if they were given as a precise number. So if I do this, I really model a Decimal with these many digits here. This is, of course, useless because now Decimal thinks that that's the number we want to model. But we don't really want to model this. So because of that, we should never create a Decimal from a float. And what can we do with decimals? Well, we can do arithmetic. We can say Decimal 0.1 plus Decimal 0.2, and we get back Decimal 0.3. So here, all of a sudden, decimals are precise enough. If we now check for equality, this actually works. And this works because the Decimal type has a special way of storing the digits internally. It knows that by 0.1, we mean 0.1 and zeros forever. And by 0.2, we mean 0.2 and zeros forever, and so on. So it has some features built in that make this happen. Of course, if I create a Decimal from a string which has more decimals in it, like here, where we have five decimals, I get back the Decimal 0.3, but also with the four trailing zeros. So the Decimal is actually aware that if we give it five decimals in both operands, then we get back five decimals in the result as well.
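The ways of creating decimals described above can be sketched as follows. The variable names are mine, and the printed precision assumes that the default context has not been changed.

```python
from decimal import Decimal, getcontext

print(getcontext().prec)  # number of significant digits in the current context

from_int = Decimal(42)
from_str = Decimal("0.1")      # exactly one tenth
from_sci = Decimal("1e-3")     # scientific notation inside a string
from_float = Decimal(0.1)      # avoid: copies the float's imprecise digits

print(from_str)
print(from_float)

total = Decimal("0.1") + Decimal("0.2")
print(total, total == Decimal("0.3"))

# Trailing zeros in the operands are preserved in the result.
padded = Decimal("0.10000") + Decimal("0.20000")
print(padded)
```

Comparing from_str with from_float shows why passing a float defeats the purpose: the second object carries all the digits that the float stored in memory.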
And if I remove the last two zeros from one of the operands, the result still remains the same. So in a way, decimals can be used to mitigate a lot of the imprecision. We can intermingle decimals with integers. We can multiply them. We can, of course, divide them by an integer. But what we cannot do is divide them by a float, as we will see soon. So if I divide Decimal 1 by 10, I get 0.1 in a precise way, whereas if I left out the word Decimal here, I would get back those random-looking decimals again. OK, so that's the Decimal type. So what should we not do? We should not multiply a Decimal by a float, because what should the Decimal do? The Decimal knows that it's precise to 28 digits. But the float 1.0 is not that precise. So by doing this multiplication, the result would actually suffer from more imprecision than before. The Decimal with 28 digits would actually lose some digits. And because of that, it doesn't allow that. So that's actually good. The Decimal now screams at us and says, well, please don't multiply me with a float. And that's an example of what I would call a loud failure. In contrast, as we saw before with the float type and its special values, when I do some operation that leads to not a number, I just get back not a number and not an error, which would be an example of a silent failure. But here, when something goes wrong, so to say, I really do get back an error. And that's an example of a loud failure. I kind of like that because this means I cannot, by accident, continue to work with numbers that are imprecise or that are not a number. It just prevents situations where we don't see a bug in our program until the very last step, or maybe not at all. So the Decimal type has some methods that it provides, for example, the square root method.
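A short sketch of the loud failure described above, next to the mixing with integers that does work. The variable names are placeholders of my own.

```python
from decimal import Decimal

tenth = Decimal(1) / 10        # integers mix fine with decimals
doubled = 2 * Decimal("0.1")
print(tenth, doubled)

# Mixing a Decimal with a float raises an error instead of silently
# producing an imprecise result.
failure = None
try:
    Decimal("0.1") * 1.0
except TypeError as error:
    failure = str(error)

print(failure)
```

The TypeError stops the program at the point where the imprecision would otherwise sneak in.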
The reason why the Decimal type provides its own methods, such as the square root, is that the math module in the standard library is made for float objects. If we used the math module to take the square root of Decimal 2, we would lose precision. This is why we should rather use the method provided by the Decimal type. Then we get the square root, still imprecise, of course, so after 28 digits the Decimal stops. But at least up to 28 digits, the Decimal is precise. And so we can do this: I take Decimal 2, I take the square root, and I raise it to the power of 2, right? That's what we did before. I store the result in a variable called two. And I see that even the Decimal type suffers from imprecision. Instead of giving me back just 2.0, I am given back 1.99999 and so on. However, because I have 28 digits to which the precision is preserved, I can ask the Decimal to quantize the result. Quantize is something similar to rounding. So two here is the object two, which is this imprecise number, and I ask it to be quantized to four decimals. If I execute this, I get back 2.0000. So what the Decimal type does here is it looks at this imprecise number and rounds it to four decimals after the decimal point. And it does it in a way so that the rounding is correct. However, I cannot gain more precision than I had before. Because I have only 28 digits of precision, or 27 decimals in this case because I have 2 point something, I cannot round it to more decimals than I actually have available. That's why I get an invalid operation error here. OK, so much for the Decimal type. So what does the Decimal type do? First, it gives us the control to tell the computer how many decimals we want to model.
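The square root and quantize steps from above, as a sketch that assumes the default context with 28 digits of precision. The flag too_precise is a name I introduced for the example.

```python
from decimal import Decimal, InvalidOperation

# Square root with the Decimal's own method, then squared again.
two = Decimal(2).sqrt() ** 2
print(two)  # close to, but slightly below, 2

# Quantize to four decimals; the argument only acts as a pattern
# for the number of decimals.
rounded = two.quantize(Decimal("0.0001"))
print(rounded)  # 2.0000

# Asking for more decimals than the context precision allows fails loudly.
too_precise = False
try:
    two.quantize(Decimal("1e-28"))
except InvalidOperation:
    too_precise = True

print(too_precise)
```

The InvalidOperation error is another loud failure: the Decimal refuses to pretend it has more digits than it really has.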
And the Decimal type is commonly used to model accounting data or financial data. In the exercises, I actually provide one exercise where we look at a simulation of a stock portfolio. The stock portfolio is modeled using floats. And the thing is, imprecision gets even bigger if we multiply rather big numbers with rather small numbers. This happens naturally for business people. In a stock portfolio, you usually have many large numbers that represent, let's say, the market value of all the stocks of one company that you hold. But then in a portfolio, you also have to allocate your budget. And allocation is relative. So how do we allocate? Well, let's say I want to allocate 2% of all my fortune to one stock. Then I take the value of my portfolio times 0.02, because I want to spend 2% of my budget on that stock. So I easily come into a situation where maybe I want to invest 2% of my 10 billion portfolio. Then I all of a sudden end up with one number that is very big, 10 billion, and one number that is very small, 2%. And I multiply them. If I do that with a data type like float that is too imprecise, then I end up with rounding errors that are so big that we can actually show in a simulation that we incur losses in a situation where we don't have to. In other words, had we used a more precise data type, we could have made better investment decisions, and we could have avoided some losses. So the Decimal type is really helpful when we want to model accounting data or financial data and when we really need to control the precision. A relative of the Decimal type, also from the standard library, is the Fraction type. The Fraction type, as we can tell by the name, can be used to model fractions. We import it from the fractions module: from fractions import Fraction.
And then I can create a new Fraction object by calling the Fraction type and passing it two integers, one and three. This would be one third, but now with full precision. What do I mean by that? Well, I mean that the Fraction object is basically a collection of two integers, the numerator and the denominator. And the Fraction object stores both integers with full precision. So at no point will the Fraction object store the 0.333 and so on somewhere. It will just store the one and the three. And whenever we want to do arithmetic with the Fraction object, the arithmetic gets translated into what we would do with the numerator and the denominator. So the Fraction object is a bit more limited in that we can only model fractions. We cannot model any real number. We cannot model the number pi, for example. But as long as we only deal with fractions, we get full precision. For example, if I model a fraction from zero point and then a couple of threes, I get back a fraction where I have 10 threes divided by a one and 10 zeros. OK, and what can I do with fractions? Well, first and foremost, as we know, a fraction of three over two is the same as six over four. And Python knows that. Python only keeps the lowest numbers possible to model a fraction. We can also create fractions from a Decimal, for example. Why? Because a Decimal is rather precise. We should not create a fraction from a float. As we see here, floats are imprecise, and because of that, the resulting fraction is also kind of imprecise. And then we can do arithmetic. We can, for example, add two fractions. We can, of course, subtract an integer, a whole number, from a fraction. This also works. We can multiply fractions. Three times one over three would be one. We can multiply two fractions, and so on.
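The Fraction operations walked through above can be sketched like this. The concrete operands in the arithmetic part are my own examples.

```python
from decimal import Decimal
from fractions import Fraction

third = Fraction(1, 3)                   # stores just the 1 and the 3
from_string = Fraction("0.3333333333")   # ten threes over a one with ten zeros
reduced = Fraction(6, 4)                 # automatically reduced to 3/2
from_decimal = Fraction(Decimal("0.1"))  # exactly 1/10
from_float = Fraction(0.1)               # avoid: inherits the float's imprecision

print(third, from_string, reduced)
print(from_decimal, from_float)

total = Fraction(1, 3) + Fraction(1, 6)  # 1/2
shifted = Fraction(3, 2) - 1             # 1/2
one = 3 * Fraction(1, 3)                 # 1
product = Fraction(2, 3) * Fraction(3, 4)  # 1/2

print(total, shifted, one, product)
```

All results stay Fraction objects, so no decimals are ever computed and nothing is rounded.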
Of course, it doesn't make sense to multiply a float with a fraction. A fraction is inherently precise because we don't model the digits. We only model the numerator and the denominator. But once we multiply it by a float, all the troubles that we had with floats will also be here because the result of a float times a fraction will be a float. So we lose some of the precision here. And then there is one more data type, just to complete the picture, which is the complex data type. I don't want to go into too much detail here on what complex numbers are. The reason why I put them in the lecture notes is that if you want to go on and study data science, one of my recommendations would be to also study linear algebra. Linear algebra is the basis for all of data science and also all of statistics. And in order to fully understand a good course in linear algebra, of which I list one in the further resources section, you should understand the complex type because complex numbers are necessary to understand some more advanced linear algebra concepts. Because of that, I just put it in for completeness' sake. So just to remind you, for those of you who know this, what are complex numbers? Well, it all starts with this equation here: x squared is equal to negative 1. In high school, we are most of the time taught that this doesn't work. We cannot take the square root of a negative number, and in order to solve this equation, we would have to take the square root of both sides. Then we would say x is equal to the square root of negative 1. This doesn't work. However, mathematicians are very creative people. So they made up a number called i, the so-called imaginary unit. And the number i is defined to be the square root of negative 1. As you can tell, if something is defined to be, then we don't have to ask how it is calculated. It's just this by definition.
And then what you end up with in math is what is called the complex plane. Down here, we have Re, the real numbers. We have zero here. That's the origin of the number line. And then we have all the real numbers down here. So we have the whole numbers 0, 1, 2, 3, and so on. But we also have all the fractions on there, and we have numbers like pi on there. All of these numbers are part of the set of real numbers. But then there is a second arrow, like a second coordinate axis, which goes along the so-called set of imaginary numbers. And whenever we add a real number and an imaginary number, so let's say a real number a plus an imaginary number, which is b times i, then we end up with a so-called complex number. Again, I don't want to go into too much detail here because I don't expect too many of you to already know about complex numbers. I just want to emphasize that if you are ever in a situation where you need to work with complex numbers, you can of course do that in Python. The only thing you need to know is that in Python, complex numbers are expressed with a literal of their own, which uses the j. The j has been chosen because of engineering: the i is usually taken by engineers to mean something else, for example, electric current. This is why many programming languages use a j to mean i, but this is really the i from math, right? And if I type 1j and I assign that to x, then x is of course a new object. It's of type complex and it has a value of 1j, or one i, one imaginary unit. Then the question is what can we do with it? Well, for example, I can raise 1j to the power of two and check if it's equal to negative one, and indeed it is. So just like mathematicians can indeed solve the square root of negative one, we can also model this in Python code.
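A tiny sketch of the imaginary literal just described; the variable name x follows the lecture, the rest is my own illustration.

```python
x = 1j  # the imaginary unit, written with j instead of i

print(type(x).__name__)  # complex
print(x)                 # 1j

squared = x ** 2
print(squared, squared == -1)
```

The comparison with the integer negative one works because Python compares the complex result by value.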
And then for complex numbers, again, a complex number is always the sum of some real number, or at least some non-complex number, plus an imaginary number. That's a complex number. And if we add those two together, then the entire thing together becomes one complex number. So that's the integer two, of course, and that's the imaginary number 0.5j, which we could express as 0.5 i or half an i. And whenever I add an integer plus an imaginary number, the total will be a complex number as well. So the two is now part of the complex number. We can also create a complex number with the complex constructor. And then let's say I have two complex numbers. What can I do with them? I can add them. They are obviously added on a piecewise basis. So the real parts are added and then the imaginary parts are added. The same holds true for subtraction. Multiplication and division are a little bit more involved, but can also be understood quite easily. And then one more thing. Earlier, when I told you in a rough way what complex numbers are, I had this image here of the complex plane. Any number in this plane is a complex number, for example, the number at this coordinate. And then we can of course, for any number in this plane, ask the question how far away from zero we are. That's what we usually call the absolute value. And the absolute value, we know, exists for integers. The absolute value of negative one would be plus one, of course. It exists for floats and so on, but it also exists for complex numbers. What happens here behind the scenes is that the Pythagorean theorem is used. That's why the absolute value of the complex number three plus four j is five. Three squared would be nine. Four squared would be 16. The sum of them is 25. And the square root of this is five. So this is just the Pythagorean theorem at work.
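The piecewise arithmetic and the absolute value from above, as a minimal sketch. The operands other than 3 + 4j are examples I picked.

```python
mixed = 2 + 0.5j        # an integer plus an imaginary number is complex
c1 = 1 + 2j
c2 = complex(3, 4)      # the constructor takes the real and imaginary part

added = c1 + c2         # real parts and imaginary parts are added separately
subtracted = c2 - c1

print(mixed, type(mixed).__name__)
print(added, subtracted)

# The absolute value is the distance from zero in the complex plane:
# sqrt(3**2 + 4**2) = sqrt(25) = 5
distance = abs(c2)
print(distance)
```

The result of abs() is an ordinary float, not a complex number.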
And then given a complex number, I can use the dot notation to get the real and the imaginary parts on their own. So let's look at c1 as a whole. c1 is the complex number 1 + 2j. And so for c1, the real part is 1 and the imaginary part is 2. And this is what we would call an attribute. So c1 is an object of type complex, a complex number, and just as different types come with different methods, different behavior, they also come with different attributes. And the complex type comes with attributes called real and imag, for imaginary. This is just a way for us to access the two parts of the complex number individually. There's also something called the complex conjugate. I don't want to go into detail about what that is, but it's an example of a method that only makes sense for complex numbers. Okay, so let's get back to Python. We have now seen all kinds of data types that model numbers in Python. The most important ones for us are probably still the int type and the float type, but we have seen there is a lot more to learn in this regard. And then the question is, given all the data types that we have seen, how can we sort them? One way of doing that is what programmers call the numerical tower. So what is the numerical tower? Back to math for one minute. When you started math in elementary school, you learned to count. So you learned about the numbers zero, one, two, three, and so on. These are known as the natural numbers, abbreviated with N. These numbers don't include negatives. If we add the negatives, like negative one, negative two and so on, we call that the set of whole numbers. So this is the set of natural numbers, and this set here would be the set of whole numbers, abbreviated as Z. This is usually what you learn maybe in second grade. And then a little bit later, in maybe fourth grade or so, you learn about quotients, about fractions.
So any number that can be expressed as the ratio of two whole numbers is what we would call a rational number. And all rational numbers are collected in the set called Q. However, there are some numbers that are missing here, that are not part of Q. What are these numbers? Well, for example, many square roots. The square root of two is an infinite sequence of decimals. It never stops. And, that's important, it also has no repeating pattern. The 0.333 here is also infinite. So one over three would be 0.333 and so on, and the threes would never stop here either, but they repeat. For the square root of two, the decimals don't stop, and they also don't have a pattern. And the same holds true for the number e, Euler's number, and pi, of course. Numbers like these are called irrational numbers, and e and pi are, more specifically, transcendental numbers. Together with the rational numbers, they make up the mathematical set R, for real numbers. And this is usually where we end our education in math. And then, as I just said, there's also one more set around this, the set of complex numbers. So what does this tell us? This tells us that there is a certain hierarchy. Every number that is in the innermost circle of natural numbers is also in the outermost circle, and also in the circle Z and in the circle Q. In other words, every set we see here is a proper subset of the next bigger set, with the set of complex numbers being the outermost one. So there is a certain hierarchy. And that means whenever a function says that it requires a real number to work, we can just as well pass it a natural number, because every natural number is automatically also a real number, okay? And so how can we model this in Python? Well, in Python, there is a numbers module in the standard library, and if we look into what it contains, it contains several things. Let's only look at the five that I highlight here. These are what we will call abstract classes, abstract base classes.
And they happen to be five names that correspond to five of the six circles that we see here. The one thing that we don't have an abstract base class for is the set of natural numbers. So Z is the first kind of number for which we have an abstract base class. What this is, is really just a way to classify numbers. So we have Integral, and integral numbers are the whole numbers, so Z. Then we have Rational numbers, these are fractions. Then we have the Real numbers, we have the Complex numbers, and then we have something that is just called Number, and Number is even more general than Complex. So any number is always a Number, but not every number is, for example, an Integral number. Okay, so how can we use these abstract base classes in our code? There is something called duck typing. So what is duck typing? The abs function, which calculates the absolute value of something, works, for example, for whole numbers, for integers. So I have an integer, negative one, I call abs with it, I get back one, it works. Now I pass in a float. I pass a float to the abs function, and it also works. And this should make us a bit suspicious, because we pass two different data types into the function, and both work. When we talked about functions in chapter two, we put a lot of emphasis on writing good docstrings. And you remember that in the docstrings, we specify all the arguments that a function should accept, and also what their data type is. Usually we specified one very concrete data type, like this takes a list of numbers, or this takes an integer, or whatever. Now we see a function that obviously works for different types of data. So let's see one more example that we saw before. The abs function, of course, also works for complex numbers. So the abs function is very generic.
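To make this concrete, here is a small sketch. The three abs() calls use example values in the spirit of the ones above. The loop over the five abstract base classes is an addition made here to show that each class in the numbers module is a subclass of the next more general one; the list name tower is hypothetical.

```python
import numbers

# The same built-in function accepts three different concrete types.
print(abs(-1))      # 1
print(abs(-1.0))    # 1.0
print(abs(3 + 4j))  # 5.0

# The five abstract base classes, from the narrowest to the most general.
tower = [
    numbers.Integral,
    numbers.Rational,
    numbers.Real,
    numbers.Complex,
    numbers.Number,
]

# Every class is a subclass of the one that follows it in the list.
for narrow, wide in zip(tower, tower[1:]):
    print(narrow.__name__, "->", wide.__name__, issubclass(narrow, wide))
```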
And in fact, the only thing the abs function requires to work is a complex number. And as we saw with these circles, a float, which is like a real number so to say, and also a whole number, the int type in Python, belong to proper subsets of the complex numbers, so they also behave like complex numbers. And because of that, the abs function has no problem dealing with integers and floats. Okay, so in summary, the abs function is implemented to deal with complex numbers, but the int type and the float type behave as if they were complex numbers as well, even though they are not. And this is where the term duck typing comes into play. Duck typing is probably not a formal term in computer science, but I would call it a technical term in Python programming in particular. What duck typing means is that even though the abs function requires a complex number as its argument, an integer or a float walks and quacks like a complex number. And the saying goes like this: if something walks like a duck and quacks like a duck, it must be a duck. What a programmer means by saying this sentence is: I don't really care about what data type you pass in. As long as the object you pass in as the argument behaves in a certain way, we can just continue to work and we won't raise an exception. So in that sense, the concrete data types that we put into the docstrings in chapter two may be too narrow, right? Even so, I would still suggest that it is always good to have a concrete data type in mind when building a function, because we have to make some assumptions about the argument that is passed in. As you become more advanced in programming, you will be able to write functions that assume an argument to be of a certain type, but can still work if an argument of a different data type is passed in.
And whenever we build functions intentionally to be able to do that, we say that we implemented duck typing. And again, one more time: we say the abs function is built to work with complex numbers, and complex numbers, so to say, walk and quack in a certain way. And the int type and the float type walk and quack alike. So the abs function cannot tell the difference between the three types. Now let's see an example where this does not work. I want to round an integer. Maybe you ask the question, hey, how can I round an integer? An integer is already rounded. Well, you can round not only to decimals on the right side of the decimal point, but you can also round to positions on the left side of the decimal point. So if I round the number 123 to the negative second position, I get back 100. What that means is we basically round these two digits here to this one here. And because we have a two here, we round down, which is why we get 100. So I can not only round to a positive number of places, which for an integer doesn't effectively round anything, but I can also round to a negative place. This also works. So in summary, the round function works for the int type. Let's go one step further. Let's call the round function with a float. Of course it works. We have seen this before. So in a way, the round function accepts both the int type and the float type. In that sense, the integer and the float walk and quack alike. And let's go one step further. Now let's try to round a complex number. How do we round a complex number? We don't really know. I haven't told you too many details about complex numbers, but you know by now that a complex number is made of two parts, an imaginary part and a real part. So if you want to round a complex number, how do you round a number that consists of two parts? Do you round both parts, or maybe one of them, or none of them?
And the answer is, it's not really a good idea to round complex numbers in the first place, which is why the round function gives us a TypeError if we call it with a complex number. In that sense, the complex number does not walk and quack like a float or an integer. In particular, you can think of the round function as being designed to work primarily with floating point numbers, but it is also able to handle numbers from a subset of what floats model, like the integers. So the set of whole numbers is a subset of the set of real numbers, and the function is designed to work for real numbers. Because of that, it also works for whole numbers. That's the story. Okay, and that is duck typing. And duck typing is pretty common. I think explaining what duck typing is should be on any good exam that is given to people who claim to know Python. But then there's another concept which is less known, which is called goose typing. It's quite similar, but it's a little bit more abstract. And because it's a more abstract concept, I will try to explain it to you with a concrete example. So what can we do with one over 10? One over 10 will of course result in a float, 0.1, and as we saw, an imprecise float. And if I want to check what data type something is, I can use the built-in isinstance function. We have seen isinstance before. We used the isinstance function in chapter four in particular to implement input validation and type checking in our functions. You remember that the factorial function and so on must not be called with a float, so we used isinstance to check if the argument passed to the factorial function is a whole number. So here isinstance checks if the object passed as the first argument, the one over 10, is indeed an object of type float.
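Going back to the rounding examples for a moment, here is a minimal sketch of the behavior described. The value 123 is the one from above; 42.87 and 1 + 2j are example values chosen here, and the variable names are hypothetical.

```python
# Rounding an int to a negative position rounds to the left of the decimal point.
left = round(123, -2)
print(left)       # 100

# Rounding an int to a positive number of places changes nothing.
unchanged = round(123, 2)
print(unchanged)  # 123

# Floats are accepted as well.
print(round(42.87, 1))  # 42.9

# Complex numbers are rejected: the complex type does not support rounding.
complex_is_rejected = False
try:
    round(1 + 2j)
except TypeError as error:
    complex_is_rejected = True
    print("TypeError:", error)
```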
And it will answer this question with a yes, because of course the type of one over 10 is float. But we can also ask the more abstract question: is one over 10 a number? Now you would say of course it's a number, but so far we have only seen isinstance used with a concrete data type, a constructor so to say. Now I use it with the abstract Number class here. And let's run it. And Python says, well yes, one over 10 is a number. What this line basically tells us is that one over 10 is a number in the most generic of all senses. So Number is more generic than even a complex number, in a way. We can also ask the question, is one over 10 a complex number? And the answer will of course be True, because all floating point numbers are automatically also complex numbers, because the set of real numbers is a proper subset of the complex numbers. Then we can ask the question, is one over 10 a real number? And the answer will of course be True. This also must be the case because floating point numbers primarily aim at modeling real numbers. And let's go one more step. We can ask the question, is one over 10 a rational number? And now it becomes tricky. It becomes tricky because one over 10, the way it is shown here on the slide, looks like a rational number. It's one over 10. And that's just the definition of a rational number. Rational numbers are all numbers that can be expressed as the ratio of two whole numbers. That's exactly what we see here, right? Unfortunately, the answer is False. And why is it False? Well, the reason is that the float one over 10 is actually not a perfect rational number. As we have seen, because of the inherent imprecision of floating point numbers, one over 10 is not exactly one over 10. It's 0.1000 and so on, and then some other digits at the very end.
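The sequence of questions just asked can be written down like this. The name x is an assumption made here; the last line is an added illustration that prints the float with 20 decimals to make the stored imprecision visible.

```python
import numbers

x = 1 / 10

# The concrete question: is x of type float?
print(isinstance(x, float))             # True

# The abstract questions, from the most general to the most specific class.
print(isinstance(x, numbers.Number))    # True
print(isinstance(x, numbers.Complex))   # True
print(isinstance(x, numbers.Real))      # True
print(isinstance(x, numbers.Rational))  # False

# The stored value is close to, but not exactly, one tenth.
print(format(x, ".20f"))                # 0.10000000000000000555
```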
And because of this imprecision, we cannot regard one over 10, the floating point number, as a rational number in the abstract sense, because there is a loss of precision. In contrast, the Fraction data type that we saw before is actually capable of modeling a rational number, because it only stores the two integers that are divided by each other, the ratio. So because of this different way of storing the value, the Fraction data type would count as a rational number, but the floating point number would not. Okay, so note how the first line here is the concrete question: is this object of this or that data type? So is this a float? And these are the more abstract questions I ask. What I do here is I use the abstract base classes Number, Complex, Real and Rational in place of a constructor. And this is something that you will often see when goose typing is implemented. Okay, let's go on. I give you an example and trust me, that is the last big example in this chapter. After that, we will only call this function and see how it works. So this is the last really big slide here. We have a familiar example again, the factorial example. It takes two arguments, and for now we only look at the first one, which is n. The last version of factorial that we saw in chapter four checked if n is of type int, and it used the built-in isinstance function to do that. And then we said that this may not be a very good decision, because what if I call factorial not with an integer, but with a floating point number whose decimals are all zeros? This should also work, right? And in order to make that work, I changed the input validation logic here. So nowhere here do I check if n, which is passed in, is of type int. I only ask: is n, abstractly speaking, an Integral? So is it a whole number?
And if that is the case, then I know I will be fine. So if n is an integer number, a whole number in the abstract sense, an Integral, then I can immediately go down here and check if n is negative, and if not, I can calculate the factorial. However, if n is not an Integral number, then there is still a chance that it is a real number that happens to have all decimals set to zero, like the float 42.0, for example. And in this case, I still want to be able to use the function. How do I do that? Well, all I do is I take n, the input argument, and I pass it to the int constructor to get a real int object here. And don't try to understand what the strict here does yet. I will explain it to you later on. So the important thing is, if I define this function and now go ahead and try out some test cases: factorial of zero is one. It works just like in chapter four. Factorial of three would be six. It works just like in chapter four. Nothing new so far. Now, here comes the first thing that is different from chapter four. In chapter four, if I called factorial of 3.0, it would not work, because the last version of factorial in chapter four only worked if I passed in an int, and here I obviously pass in a float. So factorial of 3.0 would not work in the previous chapter, but in this chapter it works. Why does it work? Well, if we go back to the definition here, 3.0 is not an Integral. However, it is Real, and not only that. We can also convert it, or cast it, to an integer without any loss of precision. And because of that, it works. And then if I pass in factorial of 3.1, I get an error, because to calculate the factorial of 3.1, you know, what should I do? How should I calculate the factorial?
I mean, what I could do is just cut away the .1 here and calculate the factorial of 3, but then I would change the value, because 3.0 is something different than 3.1. And because of that, I see the error message. But I provide one way out here. I provide a second argument called strict, which is to be used as a keyword-only argument. If I set this to False, then the 3.1 gets converted to an int with the help of the int constructor. And we have seen earlier in this chapter that this means that the .1 just gets cut away. And then the factorial of 3 is calculated, which is of course 6. So if the strict mode is on, and it's on by default, then I cannot pass in a float that does not end with .0. But if I disable the strict mode, the factorial function still works. So what have I achieved with that? Well, I have achieved one big advantage. I can now use my factorial function not only for ints, but also for floats, if the floats can be interpreted as an int. So in a way, I implemented what is called duck typing, or more precisely, the goose typing we just talked about. I made my function a little bit more generic. I allowed it to work with more types of objects, and I made it more usable for the user, but I still ensure that I cannot pass in something that doesn't work, like 3.1, for example. Okay. And then of course, if I pass in, for example, a complex number, I get a TypeError. The case before that is also a TypeError, however, with a different error message. So if I pass in 3.1 in strict mode, I get the error message that says n is not integer-like, it has decimals. And if I go ahead and pass in the complex number, I get back the error message, Factorial is only defined for integers. And if I go back to the source code of the function, we see that I raise errors in three places. I raise a first TypeError here, a second TypeError here, and a ValueError here.
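The function on the slide is not part of this transcript, so the following is only a sketch of how such a validation could look under the assumptions described above. The parameter names n and strict come from the explanation; the exact wording of the error messages, the loop used to compute the product, and the order of the checks are assumptions and may differ from the course's version.

```python
import numbers


def factorial(n, *, strict=True):
    """Calculate the factorial of a whole number (sketch).

    Args:
        n: an Integral number, or a Real number that can be cast to an int
        strict (bool): if True (default), reject Real numbers with decimals

    Raises:
        TypeError: if n is not integer-like (in strict mode) or not Real at all
        ValueError: if n is negative
    """
    if not isinstance(n, numbers.Integral):
        if isinstance(n, numbers.Real):
            # A float like 3.0 converts without loss; 3.1 does not.
            if n != int(n) and strict:
                raise TypeError("n is not integer-like; it has decimals")
            n = int(n)  # cuts off any decimals
        else:
            raise TypeError("Factorial is only defined for integers")

    if n < 0:
        raise ValueError("Factorial is not defined for negative integers")

    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


print(factorial(0))                  # 1
print(factorial(3))                  # 6
print(factorial(3.0))                # 6
print(factorial(3.1, strict=False))  # 6
```

Calling factorial(3.1) or factorial(1 + 2j) raises a TypeError in this sketch, and factorial(-3) raises a ValueError.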
And the ValueError I haven't actually shown you, but for example, if we passed in a negative three, then of course I would see a ValueError, because the factorial is not defined for negative numbers. Okay, this is the last slide here. And this is just something we can do by using abstract base classes. Maybe you read this section again in the text version of the chapter, because abstract base classes are a little bit abstract, and they may not be super easy to understand right away, but it's actually not that hard. All we used the abstract base classes for was to talk about numbers in an abstract way. I don't want to care about how a number is implemented in code. I only want to care about how a number behaves or what a number can do. And that's the idea behind abstract classes. Okay, and this is the chapter on numbers. I am pretty sure that this chapter is one of the more theoretical ones in this course, so what are the takeaways? The takeaways are: you know that numbers are nothing but zeros and ones. You don't have to deal with the zeros and ones, the bits of a number. This is done by Python for us. However, you should still understand a little bit of what's going on in memory. And the implication of using zeros and ones is that the floating point data type is inherently imprecise. We cannot model a number that possibly has an infinite number of decimals with a finite number of bits in memory. It just doesn't work, not even in theory. And then another takeaway is that, despite these downsides, it's usually okay to still use the float type for most applications, but we also saw some ways to mitigate the downsides. So if we ever need to decide for ourselves how many decimals of precision we need and how rounding works, then we can resort to the Decimal type.
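As a quick sketch of that takeaway: the values below are common illustrations chosen here, not taken from the slides, and the variable names are hypothetical. Decimal, quantize and ROUND_HALF_UP are part of the standard library's decimal module.

```python
from decimal import ROUND_HALF_UP, Decimal

# Binary floats cannot represent 0.1, 0.2 and 0.3 exactly.
float_sum_is_exact = (0.1 + 0.2 == 0.3)
print(float_sum_is_exact)    # False

# Decimal objects created from strings keep the digits as written.
decimal_sum_is_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
print(decimal_sum_is_exact)  # True

# With Decimal, we choose the number of decimals and the rounding rule.
rounded = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)               # 2.68
```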
The Decimal type is important in particular in the world of business, if we want to model, let's say, accounting data where we cannot accept any rounding errors. And then another takeaway I just want to reiterate is: expect that data sets you get from other people possibly include NaN values. Usually when you get a real data set, let's say you get a spreadsheet, you have several columns where data for every person in, let's say, a university are collected. And some measurements are not complete. Then usually one of several things is done. Sometimes people just leave a cell in the spreadsheet empty. That's probably the best way, to just not put anything in. But sometimes people will also use dummy values. And that's important to know. One such common dummy value could be NaN, for example. So you should expect NaN values, and you should expect that, unfortunately, they don't raise errors, but they still turn everything else that they are combined with into NaN as well. And that's a big implication. And then other than that, a small takeaway at the end of the chapter is a first introduction to how we can classify concrete data types and concrete objects with abstract concepts, and how we can talk about things in an abstract way in Python. And yeah, those are the contents. The next chapter will be on text data. Text data is the next layer of abstraction that is built on top of numbers. And yeah, I will see you soon.