Hello and welcome. In this lecture, we are going to see how floating point numbers are represented inside a computer. Here is a quick recap of the relevant topics. We have already seen the architecture of a simple computer, and in a previous lecture, we also looked at how integers, both signed and unsigned, are represented in a computer. In this lecture, we are going to see how a computer internally represents floating point numbers, and how, in a C++ program, you can declare floating point variables. Now, this is a picture that we have seen earlier: the basic structure of our simple computer with its different parts. At any snapshot of the operation of this computer, you will see sequences of 0s and 1s everywhere. So, what we want to ask is: how do we represent numbers like 3.14 times 10 raised to minus 23 in a computer using sequences of 0s and 1s? These are called floating point numbers: numbers with fractional values, very small numbers, or very large numbers that cannot be represented as integers. What we are going to study today is how, when we write a number like this, we can represent it in a computer using a sequence of 0s and 1s. So, let us look at the floating point number minus 3.123 times 10 raised to minus 11. The dot here is the so-called floating point, or radix point, of the number. This number is, of course, written in decimal, and in this representation there are several parts that I want to highlight. The first is the sign: this is a negative number. Then we have 3.123, which is called the mantissa. Then we have the base, which is 10 here. And finally, we have the exponent, which is minus 11 here. Now, there is nothing sacrosanct about representing floating point numbers in decimal notation. We could also use binary notation, where the mantissa would be a binary representation, the exponent would be a binary representation, and the base would be 2.
In this decimal representation, if I look at the mantissa 3.123, it is basically saying 3 times 10 raised to 0, plus 1 times 10 raised to minus 1, plus 2 times 10 raised to minus 2, plus 3 times 10 raised to minus 3. And of course, there is a minus sign in front because the number is negative. Now, I could write a similar number in binary, with a minus sign, with the mantissa represented in binary instead of decimal, and still with a radix point, the floating point. The base would be 2, and the exponent, instead of being a decimal number, would be a binary number. So, if I look at this number represented in binary, we have, of course, the minus sign because it is a negative number. The 1 just to the left of the radix point, the binary radix point, corresponds to the least significant bit of an integer. So, here also we multiply it by 2 raised to 0, just like the least significant bit of an integer is multiplied by 2 raised to 0. And then for the rest of it, just as in decimal we multiplied the 1 by 10 raised to minus 1, the 2 by 10 raised to minus 2, and the 3 by 10 raised to minus 3, here we multiply the first bit after the radix point by 2 raised to minus 1, the next one by 2 raised to minus 2, then 0 times 2 raised to minus 3, and finally 1 times 2 raised to minus 4. If you do all of this calculation, you will find that the mantissa corresponds to minus 1.8125 in decimal. The exponent is once again a binary number, and we read it off just the way we read integers represented in binary: 1 times 2 raised to 2, plus 1 times 2 raised to 1, plus 0 times 2 raised to 0, which is 6. So, this number represented in binary actually represents minus 1.8125 times 2 raised to 6, if I have to give the decimal representation. And this gives us the core idea of how floating point numbers can be represented using bits inside a computer.
We are going to have a bit for the sign, we are going to have the mantissa represented as a sequence of 0s and 1s, and of course, we will have to agree on where to put the radix point. The base, or radix, will always be 2, since we are talking about binary representation, and we will have the exponent, which will be another binary number. Now, when we write the mantissa, the same number can be represented in different forms. For example, in decimal, 0.02345 times 10 raised to 12 is the same as 2.345 times 10 raised to 10. So, which of these should I take as my mantissa: 0.02345 or 2.345? Here we say that a mantissa is normalized if there is a single non-zero digit to the left of the radix point. So, 0.02345 is not normalized, because to the left of the radix point there is a 0, whereas 2.345 is normalized, because to its left there is a single non-zero digit. The same notion of a normalized mantissa carries over to binary representation: a mantissa is not normalized if to the left of the radix point we have more than one digit, or no non-zero digit at all, whereas it is normalized if to the left of the radix point we have just a single non-zero digit. And here is an interesting observation: if we are representing numbers in binary, then in any case we only have 0s and 1s available. And if we require a single non-zero digit to the left of the radix point, that digit has to be 1. So, there is always a 1 to the left of the radix point in a normalized binary representation of a floating point number. And because it is always going to be 1, we need not store it; we know it is always going to be 1. This gives us 1 bit of information for free.
However, as you can imagine, there will be some difficulty in representing the number 0, because in the number 0 we will not find any bit which is 1 in the mantissa; we will see how to deal with this in a couple of minutes. So, floating point numbers are represented by allocating some fixed number of bits to store the mantissa in normalized form and some fixed number of bits to store the exponent. Of course, since you are allocating a fixed number of bits for the mantissa and the exponent, you cannot represent all real numbers, and in fact there will be gaps between the real numbers that you can represent. Because we cannot represent all real numbers, some funny things happen; these are called finite precision artifacts. For example, suppose you have only 3 bits to represent the mantissa, and look at the binary floating point number 0.101 times 2 raised to 111, where the exponent 111 is also binary. This mantissa is, of course, not normalized, as you can see. But suppose this is my binary number and I want to add 1 to it; what would be the result? You will see that with only 3 bits for the mantissa, you cannot represent the exact result of adding 1 to this number: the number is 80 in decimal, the sum 81 is 1010001 in binary, and writing that in normalized form as 1.010001 times 2 raised to 6 requires 6 mantissa bits after the radix point, more than the 3 bits available. This is what we call a finite precision artifact: there are certain numbers that we cannot represent exactly, and so we represent them approximately, by the closest number we can express using the bits available for the mantissa and exponent. Now, in C++, how do we declare floating point variables? There are 2 data types, called float and double. A float uses 32 bits to represent a floating point number.
32 bits can also be thought of as 4 bytes, of which 1 bit is reserved for storing the sign of the number, 8 bits for the exponent, and 23 bits for the mantissa in normalized form. The approximate range of the magnitude of floating point numbers you can represent goes from 10 raised to minus 44.85 to 10 raised to 38.53. It is an interesting exercise to take the number of bits used for the exponent and mantissa and derive these ranges; I encourage all of you to try it. There is another data type for floating point numbers, called double, which uses 64 bits, or 8 bytes: 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa. Here you have a much larger range of magnitude: approximately 10 raised to minus 323.3 to 10 raised to 308.3. Once again, it is interesting to calculate these ranges from the number of bits in the exponent and mantissa. And as I said, we cannot represent 0 exactly if we insist on a normalized mantissa. So, special bit patterns are reserved for 0, and not only for 0 but for some other kinds of values, like positive infinity, negative infinity, and even for things called "not a number", which result from certain operations. For example, if I try to divide 0 by 0, I get something which is not really a number, and in a computer this is represented by a special bit pattern, also called not a number (NaN). Similarly, there are other special bit patterns. How do you declare such variables in C++? You write the data type keyword float and then the name of the variable, or the data type keyword double and then the name of the variable. How do you write floating point constants in C++ programs? You can write them the usual way, and in a C++ program, it is not necessary to write the mantissa in normalized form.
When the computer stores it internally, it will store it in normalized form. You can also use what is called scientific notation, in which you write a number, in this case 2357.2, followed by E minus 2; the E can be lower case or upper case. What it represents is 2357.2 times 10 raised to minus 2. Note that when we write this in a C++ program, the base is 10, so E minus 2 really means 10 raised to minus 2. Floating point constants can also be declared in a C++ program using the const qualifier, which we have already seen earlier. For example, if I write const float pi = 3.1415, it means that pi has a floating point value which is not going to change during program execution, and this is its value. Recall that we used such constants when calculating the surface area of a tank in an earlier lecture. Or I could write const double e = 2.7183. So, in summary, what we studied in this lecture is the binary representation of floating point numbers, with these components: a sign bit, a certain number of bits for the mantissa, which is stored in normalized form, and a certain number of bits for the exponent, which can again be a positive or a negative integer. And we have seen how to declare floating point variables in C++. Thank you.