In this video we are going to take a short break from the topic of iteration and look into another, unrelated topic, namely input validation. In a couple of previous videos, when we defined functions, we could already guess that if, for example, we called the factorial function with a negative integer or a floating point number, something might go wrong. Therefore, in this video we look into ways to ensure that a function gets the input it needs in order to work. So let's start by creating a new notebook file and call it input validation.

Let's briefly review the factorial example. Remember that there are two ways to solve the factorial problem. One way is to use recursion and the other is to use a looping strategy. Here we are going to use the recursive strategy. So I am going to go ahead and write def factorial, and the function takes n, and we always assume that n is a non-negative integer. Let's write a little docstring: calculate the factorial of a number. Under the Args section we specify that n is supposed to be an integer, and that it's the number to calculate the factorial for. The function returns the factorial, which also happens to be an integer. So we give the function an integer and we get an integer back.

Now let's implement the function using the recursive strategy. First we implement the base case by saying if n double equals zero, then we simply return one. The reason is that zero factorial is simply defined to be one. That's a mathematical definition; there is nothing to be calculated here. Otherwise, so in the else clause, we say return n times the factorial of n minus one. That is the recursive step. When we write that in math notation, we say n factorial is defined to be n times, in parentheses, n minus one, factorial.
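Written out in symbols, the recursive step and the base case just described look roughly like this (a standard textbook form, not copied from the notebook in the video):

```latex
n! = n \cdot (n-1)! \quad \text{for } n \ge 1, \qquad 0! = 1
```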
So this is usually what mathematicians would write in math notation, and in Python code it looks a little bit different. In several previous videos we also learned that after the return statement the function is done. So to simplify this a bit, we get rid of the else clause and unindent the second return statement. How this works now is: if we call the function with n set to zero, the condition is true, the base case is executed, and the function immediately returns one. Otherwise, if the condition is not true, the Python interpreter jumps right to the second return statement and evaluates the expression n times factorial of n minus one. The right-hand side, the factorial of n minus one, is what triggers another function call. So the function calls itself, and this is basically what a recursion is. In layman's terms, we said a recursion is a function that calls itself and has a way out. And the way out is, of course, the base case.

So let's define the function and quickly check if it works. For zero, we get back one. And if I want to calculate the factorial of three, I get back six. So what could go wrong here in terms of input? Well, let's play a little bit with what we have here. Let's call the factorial function not with three, but with 3.0, a floating point number, and see what happens. Well, we get back 6.0. So in a way, the function works and we get back a correct result. However, the data type we get back is not the one that is specified in the docstring. So this is kind of okay, but it's not ideal. Now let's also ask what happens if we call the factorial function with an argument of 3.1. Before I execute this, let's guess what could happen.
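For reference, the function as described up to this point might look like the following sketch. The docstring layout is an assumption based on the spoken description (an Args section and a returned integer); the calls at the bottom are the three checks made so far.

```python
def factorial(n):
    """Calculate the factorial of a number.

    Args:
        n (int): number to calculate the factorial for

    Returns:
        factorial (int)
    """
    # Base case: 0! is defined to be 1.
    if n == 0:
        return 1
    # Recursive step: n! = n * (n - 1)!
    return n * factorial(n - 1)


print(factorial(0))    # 1
print(factorial(3))    # 6
print(factorial(3.0))  # 6.0 (a float, not the int promised in the docstring)
```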
Well, the first time the function is called, n is 3.1, so it's not zero. So the second return statement is going to be executed. The function calls itself, which triggers another function call for n minus one, so for 2.1, because 3.1 minus one is 2.1. If the function is called with 2.1, we end up in the last line again. This triggers another function call where n is set to 1.1. So it starts again to run from top to bottom. 1.1 is still not zero, so we end up in the last line one more time. This calls the function again, now with 0.1. Well, 0.1 is also not zero, so we reach the last line one more time. And then 0.1 minus one gives negative 0.9. So at some point the factorial function will be called with negative 0.9, and negative 0.9 is below zero. From then on, the base case will never be hit. It's not even theoretically possible to hit the base case. Why? Because once n is negative, it only becomes more negative, since we keep subtracting one. So in other words, this function is going to call itself an infinite number of times, forever, basically.

So let's see what happens. We first see that Python tries to execute the cell, and it takes some time. We see the star here, which means it is still calculating. But then at some point, luckily, Python has a built-in detection for infinite recursions, and this gives us the red error message here. So Python has a fail-safe system built in. There is a limit of roughly 3,000 function calls that can be nested inside one another. If we exceed this number, Python basically says: this function has been calling itself 3,000 times already and the process has not ended, therefore I just end it for you. If Python did not do that, we would have a case where the cell here would run forever.
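The runaway recursion can be reproduced safely with a sketch like the one below. It catches the error instead of letting the cell fail, and it prints the current limit via sys.getrecursionlimit(); the exact number depends on the environment, so the 3,000 mentioned above should be read as the value seen in this notebook session rather than a fixed constant.

```python
import sys


def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)


# The maximum nesting depth Python allows before giving up.
print(sys.getrecursionlimit())

try:
    # 3.1 -> 2.1 -> 1.1 -> 0.1 -> -0.9 -> ... never equals 0
    factorial(3.1)
except RecursionError as error:
    outcome = f"RecursionError: {error}"
else:
    outcome = "finished without error"

print(outcome)
```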
And then we could only stop it with the stop button to interrupt the processing. So that is important.

Now, in order to prevent this infinite recursion, what we could do is replace the double equals sign, where we compare n to 0, with smaller than or equal to 0. What will then happen is that the first few function calls are just the same, and the last function call now returns a value instead of raising an error. However, the result is, of course, wrong. This is not the factorial of 3.1. Even though in the math world you could calculate the factorial of decimal numbers, that is not what we want to model here. According to the docstring, we only want to model the factorial of whole numbers, of integers. Therefore, this result is not good. We do see that by exchanging the double equals for a less than or equal to, we can at least prevent the function from calling itself an infinite number of times. This function will only call itself four times in total. So this would be a quick fix, so to say, but it's not really solving the problem.

So what we do is leave it at the double equals, because we know this code is correct. It does what we want it to do. But then we make ourselves a little bit of space before the if statement, and here we are going to implement some input validation. That means we first want to make sure that below here, where the problem is actually solved, the program has the data type that we expect. There are a couple of strategies. So let's, first of all, introduce a couple of terms. When we called the factorial function with 3.0 instead of just 3, the integer, it worked. And the reason is that 3.0, the float, behaves like 3, the integer.
So 3.0 as a float behaves like 3 as the integer type. That is why this function call here works. And that is an example of what we refer to as duck typing. What do I mean by the term duck typing? Well, this is a term that is often used in the Python world, probably also in some other programming languages. Duck typing basically means: if it walks like a duck and it quacks like a duck, it must be a duck. So in this case, if the floating point object behaves like the other object, then we treat it as the other object. So when we say the function needs an integer, we are not so strict, really. We only say the function needs something that behaves like an integer. It walks and quacks like an integer, so to say. And because 3.0 as a float, as we can guess, behaves like 3 as an integer, we don't get a big problem here. So sometimes this approach is enough: doing nothing, basically, and saying, OK, this is just an example of duck typing.

However, we saw that in general this does not work. So we want to use a different strategy here. We want to make sure that down here, where the calculation happens, we know that n is a non-negative integer. This is input validation. So let's look at a couple of strategies for how we can do that.

The first one is what we refer to as type checking. Type checking basically means we go ahead and check the type of something. Let me do that below here in another cell. Let's say I have the number 3 as an integer, and I want to ask Python: hey, Python, is this actually an integer? How can I do that? Well, Python has a built-in function called isinstance. The function takes two arguments, the first one being the object we want to check, so 3 in this case.
And the second one being the constructor, or the class, of the type we want to check for, so int in this case. Here the function gives me back True. And that basically means, yes, 3 is an integer. Let's do the same thing again, but with 3.0. Then the function gives me back False, because 3.0 is a floating point object. It's not an integer.

So now, to implement type checking, we write if not isinstance, and we pass in the object that we want to check, n, and the type we want to check for, int. If n is not an integer, we want to raise an error. So what does raising an error mean? Well, we have seen these red error messages a couple of times. This is always how JupyterLab shows us that Python couldn't deal with something, that some error occurred. In a real program, this would mean the program would just crash. And we can generate these red error messages ourselves. Let's see in an example down here how we can do that. We can use the raise statement, and the raise statement basically generates an error. Let's create a so-called TypeError and give it a message by simply saying wrong type. And here we see the red error message with our message, wrong type.

So in the type checking strategy, if the function is called with an n that is not an integer, we simply raise an error. And of course we raise a TypeError, because if we passed in the wrong type, then it is a type error. Then we put a nice message in here, and we could say factorial is only defined for whole numbers. Now I redefine the function. Let's call it with an argument of zero: it works. With an argument of three, it also works.
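The type checking strategy described so far can be sketched as follows. The error message follows the wording given above; its exact capitalization and punctuation are assumptions.

```python
# isinstance takes the object to check and the type to check for.
print(isinstance(3, int))    # True
print(isinstance(3.0, int))  # False


def factorial(n):
    # Input validation: reject anything that is not an int.
    if not isinstance(n, int):
        raise TypeError("Factorial is only defined for whole numbers")

    if n == 0:
        return 1
    return n * factorial(n - 1)


print(factorial(0))  # 1
print(factorial(3))  # 6
```

Calling factorial(3.0) or factorial(3.1) with this version raises the TypeError instead of returning a float or recursing without end.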
And now let's call the function with 3.0, a floating point number. All of a sudden we see the red error message here. And of course for 3.1, we also see the red error message. So this is already an improvement, because now we don't get back a wrong, meaningless result, but an error message instead. In other words, as the user of the function we now know that something must be wrong. However, what I don't like about the function is the 3.0 case. Because of the idea of duck typing, 3.0 behaves like a three, so ideally this should not raise an error. Ideally, I get back a six here, right? In an ideal world, I want the function to accept a float, handle it, and then give me back six as an integer. And remember that before, I got back 6.0 as a float here, and not the six that I get back when I pass in three as an integer.

So let's continue. Another case we need to handle is the following. Let's call the factorial function with a negative number. For example, let's calculate the factorial of negative one. What would you expect to happen? Well, negative one is an integer, so our type check here will not fire. We will end up down here. And what is going to happen down here? Well, if n is already negative to begin with, the base case will never be reached. So in other words, we are going to run into an infinite recursion again. Let's go down and prove that. I execute factorial of negative one, and we see that I get a RecursionError again. This is again Python's built-in fail-safe system, so that the program does not just run forever. It just stops it on purpose here. So here I would also like to have a nice error message, because a RecursionError can mean anything. As the user of the function, I don't know what is going wrong here.
So let's also include this in the input validation. We add an elif: if the first condition here is not true, then we know that n is an integer, and then we say, elif n is strictly smaller than zero, we are also going to raise an error. But this time we are not going to raise a TypeError; we are going to raise a so-called ValueError, because the semantic value, so to say, of the object is wrong. Negative one does not make sense for factorial, semantically speaking, so for us humans. Therefore, we say factorial is defined only for non-negative numbers. Let's go back down to the case where we have factorial of negative one. Last time I executed this, we got a RecursionError. If I execute it one more time, I now get a ValueError, and it gives me back our own custom error message. This is already an improvement, because as the user, when I get an error message, I want to know what I did wrong. And here I know it.

So this is one strategy, and we now have a full-fledged input validation scheme. However, as I said before, what I don't like is the fact that I cannot call the function with 3.0. Because of that, we can use another strategy. The current strategy is an example of type checking: I'm checking the type. The other strategy that I am now introducing is called type casting. To cast an object basically means we make sure that it is of a certain data type. Or in other words, if it is not, we convert it into this data type. And how can we do that? Well, let's get rid of this if check here, the first one. Then of course we have to change the elif into an if. And now we have to make sure that n is an integer. How can we do that? Well, very easy.
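Before the switch to casting, the type checking version with both checks in place might look like this sketch. The message texts follow the spoken wording and are otherwise assumptions.

```python
def factorial(n):
    # Wrong type: anything that is not an int is rejected.
    if not isinstance(n, int):
        raise TypeError("Factorial is only defined for whole numbers")
    # Right type, wrong value: negative integers never reach the base case.
    elif n < 0:
        raise ValueError("Factorial is only defined for non-negative numbers")

    if n == 0:
        return 1
    return n * factorial(n - 1)


try:
    factorial(-1)
except ValueError as error:
    message = str(error)

print(message)  # Factorial is only defined for non-negative numbers
```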
We can simply use the built-in int constructor and pass n to it. So in other words, if I now pass in n and it is a float, it will be converted into an integer. And from then on it works. Let's see what difference that makes. Factorial of zero works as before, and factorial of three works as before. The factorial of 3.0, which before gave me a TypeError, will now give me six. And now let's check what the factorial of 3.1 does. Well, unfortunately it will now also give me a six. And factorial of negative one will still give me the error message, because the number is negative.

So now I have basically shifted my problem, so to say. Factorial of 3.0, which follows the idea of duck typing, accepts a float and gives us back a correct result. However, for 3.1, it also gives us back six. Why is that? Well, quite briefly, if I call the int constructor with 3.1, the decimals are just cut off. Also, if I call it with, let's say, 3.9, I also get back three. So no matter what the decimals are, they are just cut off by the int constructor. So we shouldn't leave the function like this. If we did, this would be an example of a semantic error, a case where a computer program does not give us a red error message, but something is still wrong.

So let's go back up and see how else we could deal with that. There are a couple of strategies, and I will now show you one that you could use here. It already gets a bit complicated, but it's just to illustrate a point. Instead of setting n equal to the return value of the int constructor right away, I first write if n double equals int of n, but then the double equals will be changed into a not equal to.
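The truncation behavior, and the semantic error it causes in the plain casting version, can be checked with a short sketch like this one (the function body is a reconstruction of the casting variant described above):

```python
# int() drops the decimals; it does not round.
print(int(3.0))  # 3
print(int(3.1))  # 3
print(int(3.9))  # 3


def factorial(n):
    # Type casting: convert whatever comes in to an int.
    n = int(n)
    if n < 0:
        raise ValueError("Factorial is only defined for non-negative numbers")

    if n == 0:
        return 1
    return n * factorial(n - 1)


print(factorial(3.0))  # 6
print(factorial(3.1))  # 6 (no error, but not a meaningful answer)
```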
So in other words, if n does not compare equal to the integer value of n, then we know we passed in something that cannot be read as an integer, that cannot be converted without loss of precision. In this scenario, we would also raise a TypeError, because this can only happen if the user passed in some other type, and the object of the other type could be converted into an integer but lost some decimals along the way. That changes the value of the input, and this is what we don't want. So here I write an error message like n is not integer-like, and also, as an explanation, it has non-zero decimals. And then, below this line, I simply set n to whatever int of n gives me back.

So this is now a more sophisticated approach. The function basically takes any kind of numeric input. It first checks whether it can be made an integer without losing decimals. If not, it gives me an error message. Then it also checks if n is negative. And if everything passes the tests up here, we know that the calculation down here will give me back the correct result. So let's try that. Factorial of zero gives me back the correct result. Factorial of three also gives me back the correct result. Factorial of 3.0 also gives me back the correct result, and also the correct data type: now not 6.0, but six. If I call factorial with 3.1, I get a TypeError. It says n is not integer-like, it has non-zero decimals. And the TypeError tells me as a programmer that something is wrong with the data type. I should have passed in something that is an integer, or something that walks and quacks like an integer, like a float that has no decimals. And negative one still does not work. So this is basically a full-fledged example of how far you can go with input validation.
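Putting the pieces together, the final version described above might look like the following sketch. The message texts are paraphrased from the spoken wording, and the helper try_call is purely illustrative and not part of the video.

```python
def factorial(n):
    # Reject input that would lose information when cast to int.
    if n != int(n):
        raise TypeError("n is not integer-like; it has non-zero decimals")
    # Type casting: from here on, n is guaranteed to be an int.
    n = int(n)

    if n < 0:
        raise ValueError("Factorial is only defined for non-negative numbers")

    if n == 0:
        return 1
    return n * factorial(n - 1)


def try_call(value):
    """Hypothetical helper: return the result or the name of the error raised."""
    try:
        return factorial(value)
    except (TypeError, ValueError) as error:
        return type(error).__name__


print(try_call(0))    # 1
print(try_call(3))    # 6
print(try_call(3.0))  # 6
print(try_call(3.1))  # TypeError
print(try_call(-1))   # ValueError
```

One design note: the comparison happens before the cast so that the original value is still available; once n has been replaced by int(n), the lost decimals can no longer be detected.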
Now the question is: in the real world, how often do you do that? Well, this really depends on who defines the function and who uses the function. If both are the same person, and this is often the case, then it is often reasonable to simply leave out the entire input validation and keep only the calculation itself. Because if I call the function in the wrong way, and I am also the author of the function, I can either adapt my function to work with the bad input or change the bad input into good input. If I'm the same person defining and calling the function, I can usually deal with that.

However, if I'm developing code that someone else is going to use, so if I assume that someone else is going to call my function, then it is always worthwhile to make sure that the input gets validated. In case something goes wrong, the user of the function, who does not necessarily know me, gets a clear error message that really tells them what is wrong or what to do. Otherwise, they would get a RecursionError, as we saw before, and the RecursionError doesn't tell them much. It only says that the function kept calling itself until Python cut it off. Here in this example, the validation now deals with all the bad inputs we looked at.

So these are different strategies for how to validate input, and sometimes you will do that and sometimes you won't. It really depends on where the data for the parameter is coming from: whether it comes from a trustworthy source, or whether some user who does not know the code has to enter the data. So the important takeaways from this video are the term duck typing, of course, and also type checking and type casting, and here in this case, for the negative numbers, value checking. Together, we call all of that input validation. So I will see you in the next video.