Welcome back to the second chapter of the Introduction to Python Programming course. This chapter is about functions and about how we can modularize our code. We have seen functions before in chapter one, so let's start with some of the built-in functions that Python comes with.

To construct an example, we first create a list of numbers and assign it to the variable numbers. It's the exact same list that we know from chapter one. In a second step, we call the built-in function sum and pass it numbers as the argument. We get back 78, the sum of all the numbers from 1 to 12.

So what can we say about the sum built-in here? Well, somehow, magically, Python knows about sum. When we start Python, sum is already there. And what are the parentheses? The parentheses are what we will call the call operator, which instructs Python to call the object that the name sum references. So sum is basically a variable that points to an object, and this object is called. We also pass in something that we will later call an argument, which here is a reference to the list of numbers. So that's the call operator, and that's calling a function.

As I just said, sum is an object. How do we see that in code? If I pass sum to the id built-in function, I get back an address, so the sum function has an identity. The sum function also has a type, namely builtin_function_or_method. Because of that, we can assume that when Python starts a new process, it puts some objects into memory that contain code. In this case, this is a built-in function (I may abbreviate this in the diagram), and it goes by the name sum. So the variable sum references this object. In other words, everything in Python is an object, as we know by now, and even built-in functions are objects.
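As a minimal sketch of what was just described, the following lines call sum and then inspect it as an object. The list literal is taken from the values read out later in the walkthrough; the variable name total is only an illustration.

```python
# The list of the numbers 1 to 12, in the order read out in the walkthrough.
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

# The parentheses are the call operator; numbers is the argument.
total = sum(numbers)
print(total)  # 78

# sum is a name that references an object with an identity and a type.
print(id(sum))    # an integer that differs from run to run
print(type(sum))  # <class 'builtin_function_or_method'>
```

The exact number printed by id is not meaningful on its own. It only shows that the function object has an identity like any other object.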
So everything that holds for objects in general also holds for built-in functions in particular.

Now let me introduce another abstract concept. We say that any object in Python that we may call is a so-called callable. How can we find out if something is callable? Python comes with the built-in callable function, which takes one argument that may be any object and tells us, yes or no, whether the object we passed in is callable. Here we pass it the reference to the sum function, and Python confirms what we already know: the sum built-in function is callable. We called it before. But note that I don't write parentheses after sum here. I don't call the function, I just pass in the name that references the object in memory. In other words, sum is treated just like any other variable that we may create.

To contrast that, the numbers list is not callable. And this makes sense, right? What would calling a list even mean? If you want to see that the list is not callable, you can go ahead and try to call the numbers list. Then we get a TypeError that says the list object is not callable. But we don't have to try this out. We can use the callable built-in function to check whether any object in Python is callable.

We don't really use the callable function much in everyday Python. What I use it for is to make you familiar with the term callable. If you read the documentation, you will find the word callable quite often, and now you know what it is: a callable is anything that may be called. In this chapter, we will look at three distinct kinds of callables. One example of a callable is the built-in function, which is of type builtin_function_or_method.
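The check described above can be sketched as follows. The try/except block and the variable name message are additions for illustration, so that the failed call does not stop the program.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

# No parentheses after sum: we pass the function object itself.
print(callable(sum))      # True
print(callable(numbers))  # False

# Trying to call the list anyway raises a TypeError.
try:
    numbers()
except TypeError as error:
    message = str(error)
    print(message)  # 'list' object is not callable
```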
So let's look at another example of a callable, namely constructors. What are constructors? We saw in chapter one that, depending on whether we end a number with .0, for example, the number is either a whole number, an integer, or a floating point number. To get an integer out of a float, we can use the int built-in. The int built-in is obviously also callable, because what we do here is call int and pass it a float. What do we get back? We get back the integer 7. So int is definitely callable, but is it a built-in function? The answer is, of course, no. It is something else, because otherwise I wouldn't have a separate section here.

Let's look at the int constructor again and observe that I can also call it with, for example, a text string that consists of the character 7. If I execute this, I also get back an int object. So the int constructor is able to take objects of different types, and it then tries to create an integer object out of them. That's what we call casting: we say that the float 7.0 is cast as an integer here, and the integer 7 is returned.

An important thing to note: the 7.0 is a different object. In our memory diagram, we have, let's say, a big box with a 7.0 in it, and it's a float. When we pass this to the int constructor, a smaller box of type int is created, and it does not contain the decimal. In this case, we store neither the 7.0 nor the 7 that gets returned, so both objects will be garbage collected right away because we don't reference them from anywhere. But the point I want to make is that this one code cell creates two objects. So a float object can never be changed into an int object.
The only thing that we can do is create a new object of type int out of the existing float object. And the same is true for the next code cell as well.

So what does the int constructor do? For example, when I pass in 7.99, we get back just 7. This is different from rounding: the round built-in function, when given 7.99, gives us back an 8. So we have to be a bit careful here, as the different built-ins behave differently. And also here, syntactically, they are a bit different. round, you can believe me, is the same kind of thing as the sum function before: it's a built-in function. But int is something else, and we will soon see what this int constructor is.

This also goes the other way. We can, of course, take an int, in this case the int 7, and pass it to the float constructor, and we get back the float 7.0. Lastly, we also have a str constructor. We can pass it the int 7 and we get back a text string. We see that from the single quotes here: we get back a text that consists of the one character 7.

OK, so what is int? First and foremost, int has an address, so it has an identity, so it must be an object. In other words, somewhere in memory we have a box with something in it, and this box must have a type. So let's quickly find out the type of int. The type of int is just type. So there exist boxes, or objects, that are of type type. In this case, we have a name, int, and this name references this box. In other words, the int constructor also exists when we start Python. And there are many more, such as float and str, as we just saw. They also exist just like that. Everything in Python is an object, so everything in Python works this way: everything is a box on the right-hand side with a type on it and a value in it, and on the left-hand side we collect the names.
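The constructor calls mentioned so far can be collected in one short sketch. The variable names are chosen here only for illustration and do not come from the lecture.

```python
# int creates a new int object from a float or from a text string.
from_float = int(7.0)
from_text = int("7")

# int cuts off the decimals, whereas round goes to the nearest whole number.
truncated = int(7.99)
rounded = round(7.99)
print(truncated, rounded)  # 7 8

# The other direction: float and str are constructors as well.
as_float = float(7)
as_text = str(7)
print(as_float, repr(as_text))  # 7.0 '7'

# The original float is untouched; int returned a second, new object.
original = 7.0
converted = int(original)
print(type(original), type(converted))  # <class 'float'> <class 'int'>

# int itself is an object of type type, while round is a built-in function.
print(type(int))    # <class 'type'>
print(type(round))  # <class 'builtin_function_or_method'>
```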
And when we start Python, Python already populates the list of names with many names that we don't create explicitly.

Now, what is the int constructor? It is first and foremost callable. So this is our second example of a callable. We now have two examples, the first being the built-in function and the second being the type. Another word for an object that is of type type is constructor, which is why this section is called constructors. And what do constructors do? Usually, they just create a new object of a certain type, given some data. They construct objects.

Now let's look at a third example of a callable: so-called user-defined functions. These are functions that we create on our own. To show you a familiar example, I will take the same code that we saw in chapter one, where we are given a list of numbers, first filter for the even numbers, and then calculate their average. This is what it looks like.

This is a so-called function definition. We see that it's a definition because it starts with the def statement. The def line is a header line. Just as the for line and the if line that we saw before are header lines, def is another kind of header line. After def comes what we will refer to as the name of the function. Then the function definition continues with an opening and a closing parenthesis. Those parentheses are not the call operator, because here we don't call the function, we define it. Before we can call a function, we must, of course, define it. Then, like every header line in Python, we end it with a colon, and everything that follows the header line must be indented by four spaces. It is not required, but it is very good practice to document every new function that we define.
So this here is an example of a docstring. A docstring is basically just a text object in Python, a string object. Instead of normal double quotes, we use so-called triple double quotes. The reason is that triple double quotes enable us to write the string over several lines, which is what we want here.

What is a good style for the docstring? First, we have a subject line, a short description of what the function does. Then we usually have a section that lists all the arguments the function takes. These are the things that we pass into the function when we call it. This function takes only one argument, and it's called integers. In the Args section, we write down in parentheses what type we would like the user to pass in. We say that integers is supposed to be a list object that contains integer objects, and then we describe in a short message what the user should put in. We also provide a section called Returns, where we specify the type of what is given back and give it a descriptive name. In this case, the return value of the function is a float and it is an average. It makes sense for it to be a float, because if we average integers, there is a high chance that the result will not be a whole number.

The next two lines of code are just the lines that I copy-pasted from chapter one. The only thing I changed is that I wrote integers instead of numbers. This is only to illustrate a point, and towards the end of the chapter you will see the name numbers here again. Then the third line of code starts with the keyword return.
return is a statement. Once Python hits the return, it stops executing the function. The function is done, and the value of whatever expression we have on the right-hand side is returned to whoever called the function. In this case, we just return the result of the calculation.

OK, so now I define the function, and what happens? Nothing really seems to happen. I have executed this code cell and we don't see any output. But I can tell you what happened in our memory diagram. Effectively, we created a new box and put something in there, namely code. We will soon see what the type is, so I leave this blank for now. The name that references this object is average_evens, so we draw a reference from the name to the box. That is all that just happened: by executing the cell, we create a box with code in it and a name that references it.

Because of that, a variable called average_evens now exists, and I can just evaluate it. I get back something that says, hey, I'm a function, followed by something with __main__ that we don't understand yet, and then average_evens with integers in parentheses. Whenever you see these so-called angle brackets, that is a convention, and it means that you cannot copy-paste this back into a code cell. If I copy-paste this back, I get a syntax error. So far, we have only seen objects whose value we can copy-paste back into a code cell. We call such a notation a literal. This is our first example of something that has no literal, so I cannot copy-paste it back. The value of the object is basically only the function as we humans see it, as we defined it. There is no other value. And, of course, average_evens has an address and it has a type. The type is just function, so I write function here.
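A sketch of the function definition walked through above could look like this. The exact wording of the docstring is an assumption; only its structure (subject line, Args section, Returns section) and the three lines of the body follow the description.

```python
def average_evens(integers):
    """Calculate the average of all even numbers in a list.

    Args:
        integers (list of int): whole numbers to be averaged

    Returns:
        average (float)
    """
    # Keep only the even numbers, then average them.
    evens = [n for n in integers if n % 2 == 0]
    average = sum(evens) / len(evens)
    return average


# Defining the function produces no output. It only creates a function
# object and a name that references it.
print(average_evens)        # <function average_evens at 0x...>
print(type(average_evens))  # <class 'function'>
```

The text in angle brackets varies by environment. A notebook may show a form that includes __main__ and the parameter name, as described above, whereas a plain interpreter prints a memory address.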
So now we have seen our third example of a callable, and all of them are here on this diagram. We have the builtin_function_or_method type, we have the type type, and we have the function type. All three are boxes, or objects, that may be called. We see that here: if I ask Python whether average_evens is callable, Python says yes.

Now that we have defined the function, it makes sense to call it. How do we call a function? I just write the name of the function followed by the call operator, and within the call operator I pass in the argument. In this case, I pass in the list, and I use a literal notation. So I just write the list here, and at runtime, when I execute this code cell, this list is newly created in memory and immediately passed as an argument to the average_evens function. The function then has a return value, and the return value is 7.0.

So this is the same example as in chapter one. What have we done? We have basically given our code a name, average_evens. By doing so, we not only named the code, we also made it reusable. I can now execute this cell as often as I want, and the calculation is done each time. This is a big improvement over chapter one in that I can now reuse code.

Of course, I can also pass in just numbers, the name of the list that already exists, and I also get back 7.0. So what's the difference? When I execute the cell up here, the list is newly created, which leads to 13 objects being created in memory: one list object and the 12 integer objects that it references. If I pass in just numbers, the reference to the list that we defined in the first code cell of this chapter, then no new list gets created. The existing list just gets looked up. And so, of course, we can write any legal expression in here.
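The two ways of calling the function can be sketched as follows. The function body is repeated in shortened form so that the example runs on its own, and the names from_literal and from_name are only illustrations.

```python
def average_evens(integers):
    """Calculate the average of all even numbers in a list."""
    evens = [n for n in integers if n % 2 == 0]
    return sum(evens) / len(evens)


numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

print(callable(average_evens))  # True

# A list literal inside the call operator creates a new list for this call.
from_literal = average_evens([7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4])

# Passing the name reuses the list that already exists.
from_name = average_evens(numbers)

print(from_literal, from_name)  # 7.0 7.0
```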
What we cannot write in here is a statement. Remember from chapter one that I emphasized the difference between an expression and a statement. Within the call operator, I can only pass in an expression.

Usually, when we run a function, we want to capture the return value. I do so here with a variable called result. I run this, and now a variable called result exists and references the number 7.0. This is what we will usually do: we write some function somewhere, then use it again and again, for example in a notebook, and do something with the result.

OK, so let me quickly show you another tool that is nice for beginners. It's called Python Tutor, at pythontutor.com. You can go there and copy-paste the code of any problem where you are stuck. On the start page, there is a box where you can paste the code in. Then I can push the next button, and the code is executed step by step, in slow motion, so to say.

If I now click next, the first line of code, which defines the list numbers, is executed. The red arrow always shows me the line that will be executed next, and the green arrow shows me the line that has just been executed. After the first step, in my list of names on the left-hand side, I now have a variable called numbers, and it references a list object with 12 numbers. Note how Python Tutor takes the integers and puts them inside the list. In chapter one, I showed you that this is actually not the case. List objects hold references themselves, and these reference objects that are somewhere else in memory. So the numbers 7, 11 and so on are somewhere else, and the list has references to them.
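That a list holds references rather than the objects themselves can be checked with the is operator. This is a small sketch with hypothetical names (first, second, short_list) that are not part of the lecture code.

```python
# Two integer objects with their own names.
first = 7
second = 11

# The list stores references to these objects, not copies of them.
short_list = [first, second]

# The element at index 0 is the very same object that first references.
print(short_list[0] is first)   # True
print(short_list[1] is second)  # True
```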
And for simplicity, Python Tutor shows the numbers within the list, even though this is not the case.

Now I execute the second step. I define the function, and all that happens is that I get a new box in memory, and the new box is of type function. That's the type shown up here. We have a variable name called average_evens, and it references the function, just like in our diagram down here.

In the third step, I evaluate the last line down here. Python evaluates the right-hand side first. So the result equals part is skipped for now, and the right-hand side is executed first. This leads to a call to this function, and the numbers list is passed in as the argument. So let's see what happens.

Now the function starts to run, so this code is now actually run. We see a new blue box here, and this blue box is what we call the function scope. Whenever a function starts to run, it gets its own area in memory where it can keep track of its names. In our diagram, this would be like giving the function, as it is called, a new area, let's say here, where we can put the names that belong to the function. However, this blue box in Python Tutor goes away once the function is done. The scope of the function lives only as long as the function is running.

And now we see the following. Among the names that belong to the function as it runs, there is a name called integers. This is exactly the parameter name that is specified up here, and it is a reference to the numbers list. So in memory we have only one list, with two references pointing to it. This happens naturally in Python whenever we call a function and pass in an argument. In Python, whenever we call a function, the arguments are passed, as we say, by reference. And what do we mean by that?
We basically mean that whenever a function is called, the only things the function gets before it runs are references to the objects. The function does not get its own copies of the objects. That's important to understand. In a way, we already see that this may lead to difficulties. The list is accessed from outside the function via the name numbers, and from within the function via the name integers. So we have two different names pointing to one list object, and we saw in chapter one that this can lead to difficulties.

So let's go on. The first line in the function's body is executed. It runs the list comprehension on the right-hand side, which is kind of like a for loop. It goes over all the numbers in the list, and if n is divisible by two, so if n is even, we keep it, and if it's not even, we don't. This loop runs for 12 iterations because we have 12 numbers in the list.

What happens here is that we get a new blue box that says list comp, an abbreviation for list comprehension. This basically means that, as this loop runs, there is a scope that is reserved only for the list comprehension, but that's not so important for now. So let's see what happens if I click next. The first n is 7, which makes sense, and the next n is 11, which makes sense. As I keep clicking next and the loop runs, n is always set to the next number: 8, 5, 3, 12, 2, 6, 9, 10, 1 and 4, and that is the last iteration. After the loop inside the brackets is done, we get a second list, because that's what a list comprehension does: it derives a new list out of an existing one. The new list gets assigned to the name evens, which exists inside the function.
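Both points, that the function only receives a reference and that the list comprehension derives a new list, can be sketched like this. The helper identity_inside is hypothetical and exists only to report the identity of the object that the parameter references.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]


def identity_inside(integers):
    """Return the identity of the object that the parameter references."""
    return id(integers)


# The parameter references the very same list object as the global name.
same_object = identity_inside(numbers) == id(numbers)
print(same_object)  # True

# A list comprehension derives a new list and leaves the old one untouched.
evens = [n for n in numbers if n % 2 == 0]
print(evens)             # [8, 12, 2, 6, 10, 4]
print(evens is numbers)  # False
```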
So now we have two variables inside the function scope. The first name, integers, points to the numbers list that existed outside before, and the second name, evens, points to this new list, which contains only the even numbers.

Next, I call the sum built-in function and the len built-in function, and I pass each of them the reference to the evens list. So when sum(evens) and len(evens) are called, the same thing happens that I just described for average_evens, just with other functions. We basically reuse some functionality of core Python here. Python gets the sum, it gets the length, it divides with the division operator, and the result is 7.0, which is stored in the variable average. So now we see a third variable, average, with 7.0.

If I now click next, then in the instant before the function call is over, I see here in red that the return value of this function will be 7.0. That's important. In the next step, the blue box will go away, and Python will forget all the names that are inside the blue box, except for the return value. The reference to the return value is kept, and all the other names are forgotten. Because of that, the reference going to the numbers list and the reference going to the evens list will be gone. And because the evens list has only this one reference to it, the evens list will be garbage collected. The numbers list up here survives because it still has another reference to it.

So let's do it. The upper list remains, the other list is gone, and we now have a new variable called result, which exists in the global frame. OK, so I hope that you can follow this. What we learn from this is that function definition and function call are two different things.
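That the local names are forgotten once the call is over can be checked directly. In this sketch, the try/except block and the flag name_survived are additions for illustration.

```python
def average_evens(integers):
    """Calculate the average of all even numbers in a list."""
    evens = [n for n in integers if n % 2 == 0]
    average = sum(evens) / len(evens)
    return average


numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

# Only the return value survives the call.
result = average_evens(numbers)
print(result)  # 7.0

# The local name integers no longer exists after the call.
try:
    integers
    name_survived = True
except NameError as error:
    name_survived = False
    print(error)  # name 'integers' is not defined
```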
They happen at different times. Whenever we call a function, the function gets a so-called local scope, or function scope. This is just its own area where it can store names and work with them. Once the function call is over, the names are forgotten, and only the return value survives. OK, so that's Python Tutor. We will look at two more examples in Python Tutor soon.

So let's go on and discuss the scoping rules. We have seen that the function scope went away, and that is actually one of the rules that is important to know: the so-called local scope, or function scope, disappears after a function call is over. What that means is that if I now try to read the variable integers, which existed inside the function, it doesn't exist. I get a NameError. Why? Because it only existed inside the function call and only as long as the function ran. The same holds for evens, so evens also doesn't exist outside the function. So the local scope disappears. That's the rule. This is how Python behaves.

There is another rule: the global scope is everywhere. Let's look at an example. In this example, I call the function average_wrong because I intentionally put a bug in it for illustration purposes. The function works just like average_evens before. It takes one argument called integers. Now let's look at the code. What we see is that I calculate evens, but I reference numbers. So where is numbers here? It's not in the function. numbers is somewhere else, outside in the global scope. Remember that the first code cell that I executed in this chapter set numbers to the list of the 12 numbers. So numbers does exist, but it exists somewhere else. This may be a problem, maybe not. Also note that the integers parameter is not referenced anywhere.
This should also make you think: why would we need to specify a parameter up here if we don't use it? So this already shows you that this function is kind of buggy, right? Now let's see how it behaves when we call it.

As mentioned, numbers still exists, outside the function. It holds the numbers from 1 to 12, unordered, the same numbers as before. Now I call average_wrong and pass in numbers, and I get back 7.0. The problem is that this is actually the correct result, so I don't see the error here. The reason is that, by accident, I pass in as the argument the very variable that is referenced inside the wrong function.

However, what if I call the function with some other example, like the three numbers 123, 456 and 789? What should I get here? It's very easy. The function first filters out the odd numbers, so the 123 and the 789 are filtered away, and the average of one number is just the number itself, right? So this should return 456. However, this also returns 7.0.

Let's quickly look in Python Tutor at how the scoping rules work. Here we have average_wrong in Python Tutor. I run the first line, so I define the global numbers variable. In the next step, I define my new function object, which holds the code that is run, and I give it the name average_wrong. In my third step, I run the code for the example for which I know the answer should be 456. As the function starts, it gets its new scope, and inside it there is a variable called integers that references the list with 123, 456 and 789. Now we go to the first line of the function and execute it. The for loop in the list comprehension runs, and it runs over the 12 numbers that are up here. Why? Because we are referencing numbers.
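The buggy function can be sketched as follows, assuming it differs from average_evens only in the name used inside the list comprehension. The names first_call and second_call are only illustrations.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]


def average_wrong(integers):
    """Intentionally buggy: ignores its parameter."""
    # Bug: this reads the global numbers instead of the parameter integers.
    evens = [n for n in numbers if n % 2 == 0]
    average = sum(evens) / len(evens)
    return average


# Correct only by accident, because the argument is the global list.
first_call = average_wrong(numbers)
print(first_call)  # 7.0

# The even number here is 456, so 456.0 would be the right answer.
second_call = average_wrong([123, 456, 789])
print(second_call)  # 7.0
```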
If we look closely, n is now 7. As we run through the list comprehension, we see the same numbers that are in the global list out here. Once this is over, the resulting list is assigned to the evens variable. So now we have a second variable called evens inside the average_wrong function, and it references a list with the numbers 8, 12, 2, 6, 10 and 4. So this is the bug. We shouldn't have this. We should have a list here that only consists of 456, but the bug is actually hard to see. The last steps are rather trivial. We just take the sum and the length of the list, which, of course, gives 7.0. Now average is equal to 7.0, the 7.0 becomes the return value in the next line, and it is returned and assigned to the result variable here. So this is what it looks like in memory.

So what do I mean by the rule that the global scope is everywhere? Let's go to the beginning again. When we start with the first line in the function, it references numbers, and numbers does not exist inside the blue scope here. So Python automatically looks up numbers one scope above. That's the rule: if a variable does not exist in an inner scope, then the next outer scope is checked. And if it does not exist in that outer scope either, Python goes one scope further out, until it cannot go any further. The outermost scope is the so-called global scope. This is the list of names that we work with, and it also holds the built-in functions.

OK, so let's continue. There is a third rule. I call it shadowing, and what I mean by that is the situation where two variables in different scopes have the same name. Now you may wonder, how does this happen? Well, it happens rather easily. Let's look at average_evens again. It has one parameter, integers, and now I include one more line of code. What does this do?
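A sketch of average_evens with the one extra line might look like this. The argument values in the call are an assumption chosen to match the description of numbers from 40 through 44 with decimal parts; the lecture's exact list is not recoverable from the recording.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]


def average_evens(integers):
    """Calculate the average of all even numbers in a list."""
    # The extra line: this local numbers shadows the global numbers.
    numbers = [round(n) for n in integers]
    # Python looks from the inside out, so the local numbers is used here.
    evens = [n for n in numbers if n % 2 == 0]
    average = sum(evens) / len(evens)
    return average


# Assumed example values; they round to 40, 41, 42, 43 and 44.
result = average_evens([40.0, 41.1, 42.2, 43.3, 44.4])
print(result)  # 42.0

# The global list is untouched by the call.
print(numbers)  # [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]
```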
It goes over the list of numbers we pass in and rounds them. This makes sense because maybe a user wants to pass in a list of floats, and floats can also model whole numbers as long as they all end with .0. So maybe we want to allow the user to pass in floats as well. And just to make sure that the function still works if the user passes in a number that does not end with .0, we round it, because rounding makes sure that, whatever the user passed in, we end up with an int.

Then we assign this to a variable called numbers, and that's where the shadowing comes in. We have a numbers variable in the global scope and a second numbers variable inside the function scope. When we then calculate the evens list, we reference numbers, and the question is which numbers object we reference, the inner one or the outer one. You can already guess that we will most likely take the inner one, and this will be the rule. But let's verify this.

So let's pass in a list with the numbers 40 through 44, some of them with decimal parts such as .2, .3 and .4. What our function will now do is first round these numbers, then filter out the odd numbers, and then average the remaining even numbers. Of course, the result is 42, because after rounding, the 40, the 42 and the 44 remain as even numbers, and the average of 40, 42 and 44 is 42. And the global numbers list is, of course, untouched.

Now let's go back to Python Tutor one last time to also see this in the nice interactive memory diagram. We first create our global numbers list again. Second, we create our new average_evens function. Here is the box of type function with the code in it. Then we call the function with the new example, and when the function begins, what happens
is that, when the function is called, this list with the five numbers down here is created. The list is here, and it is passed by reference to the function. This is why the function internally references this list. Then we go through the first list comprehension. We loop over all the numbers in this list down here, round all of them, and assign the result to a new list called numbers. So let's do that quickly. We loop over the list of unrounded numbers, and after we have done that, we have a second list with the rounded numbers, referenced inside the function by the name numbers. Note how we now have two variables called numbers: one is in the function scope, the so-called local scope, and the other is in the global scope here.

In the next line of code, we loop over numbers, check whether each number is even or odd, keep only the even numbers, and assign the result to a new list. So how does that work? Python now needs to decide which numbers variable to use, and it has the following rule: it goes from inside out. It first tries to look up the numbers variable in the local scope, and if it finds it there, it uses that one. Because numbers exists here in the local scope as well as outside, the inner numbers list gets used. So now we loop over, as we see, the 40, 41, 42, 43 and 44, and we obtain a new list with only the even numbers, which is referenced by a variable called evens.

At this point, I want to point out something that we will come back to in chapter 7, namely that this code is not optimized for big data. Why not? Well, imagine the worst case scenario in terms of memory when I call this function. What if I call this function with an argument that contains only even numbers? What is filtered away? The answer is nothing. What this means is that I now have three different lists internally, and in the worst case scenario, if all
the numbers here were even, all of the lists inside the function would be of the same size, namely five elements here. And that means that the memory consumption is very bad. After we have created the inner numbers list, for example, why would we still need the integers list here? We don't need it anymore, right? But we still keep it in memory. So I just want to point out that this function is rather memory inefficient, and in chapter 7 we shall see several ways of dealing with that. In particular, we will find one way, called the map-filter-reduce paradigm, which enables us to do all the calculations with only one number at a time and to be super memory efficient. And this course is about preparing you for further studies in data science, so you should understand memory in a way that allows you to handle big amounts of data, and that means we should not waste memory as we do here. And then towards the end, we just average again and return the average. The average is now 42.0. And the inner lists, those three lists, will go away after the function has returned. OK, so the rule here is, as we saw, the inner numbers list shadows the outer numbers list, and Python goes from inside out. That's what we mean by shadowing. OK, so Python Tutor is maybe a nice tool for you to use for debugging in the beginning. I prefer to draw diagrams by hand, because the diagrams drawn by Python Tutor are not as accurate as the ones I put here, but at the end of the day, whatever helps you to study, you should use, and Python Tutor is definitely a good tool. OK, so one more example here that also concerns shadowing. We started out by naming the parameter here integers, but now that we know that Python is able to keep track of different variables with the same name, we can of course rename the parameter here into numbers, right? So now we have numbers here. And also note, if I wanted to round first, let me quickly
do this, this rounding step we just copy paste, and also copy paste it in here, and then of course we have to rename integers into numbers. And now what we see here in the final example in this section is: we have a parameter called numbers. Remember, we have an outside name, a global name called numbers, we have the parameter numbers, and then we create a new variable numbers, and then we use that. So we can use numbers in several stages, and Python is able to correctly use the right one. And also in this case, if I assign to numbers here, what happens is that whatever the parameter numbers points to when I call the function, that reference will be replaced by a reference to this list here; the original object itself is not changed. So Python does all of this for us behind the scenes. So if I now execute this cell, that is probably our last version regarding the variable naming. And now we will check it one more time, because whenever you change a function, you should always make sure that the code that worked before still works. So now we call average_evens with numbers, we get 7.0, and the global numbers is still unchanged. And so this is everything we need to know about what happens with function calls in memory. Now let's continue the study of defining functions a little bit, namely: how can we build functions that take more than one argument, and what kinds of arguments exist? Let's look into this. So I give you a built-in function as an example, a function called divmod, which takes two numbers, 42 and 10, and divides the first number by the second. It then gives us back two numbers. The first one is basically the result of the floor division: 42 divided by 10 is 4 point something, and the 4 in the 4 point something is kept, or extracted. And then the 2 is the result of the modulo operator: 42 modulo 10 gives me 2, because 42 divided by 10 is 4 with a remainder of 2. So this is how we read it, I read it one more time: 42 divided by 10 is 4 with a remainder of 2. So that's what the divmod function does: it does both divisions at the
same time. And now for the divmod function, the first argument is the number that is divided, and the second argument is the number that we are dividing by. So if I exchange the two numbers, I will of course get a different result. So that's the point here: for some functions there is something like a natural order in which we pass in the arguments. Let's continue our example of average_evens. Assume for now that I want to change the function a little bit, and I want to call it scaled_average_evens, and the only thing that is different with this function is that in the very last step we take the average and we scale it with a scalar. That's it. Because we are talking here in the context of a business school, some of you may say that this is too much of a toy example. But think of this: maybe you are a large international corporation, and you have an accounting system in each of the different countries that you are working in, and at the end of the year, or at the end of the day, you want to collect the sales data of the different countries and you want to do statistics with it, let's say a per-day average. Then you are faced with the problem that, let's say, you have a subsidiary in the Czech Republic, for example, and they report the balance sheet in Czech korunas, and your headquarters are in, let's say, Germany, and they report in euros. So you have to scale, or convert, the Czech korunas into euros. And how do you do that? By just multiplying with a conversion rate. In other words, what you would do in this example is you would maybe first look at individual sales data, average them after maybe filtering them, and then you would convert the result, and that's exactly what the function here does. So we take some numbers here and we clean them a little bit by rounding, we filter them a bit, so maybe we are filtering out some outliers, some sales that we just don't want in our average,
then we calculate the average performance that we are interested in, and then we convert it into euros, for example. So even though this example is a bit of a toy problem, you can easily make up a story where this fits into a management context. And so this function needs to take two parameters: first the numbers list and then a scalar. So because of that, I write two parameters here, numbers and scalar, and I also adjust my docstring here. So I say here that scalar is supposedly a float and that it multiplies the average. OK, so far so good. So how do I use this function now that it takes two parameters? Well, I can call it as before, and I just pass in two different objects here as the arguments. So I pass in numbers first and the scalar, the 2, second. So here I call the function and I get back 14.0, so twice the average from before. So that seems good. But now the question is: is that a good order? Is it natural to have numbers first and the scalar second? It depends. In this case, I would actually argue that we really don't have a natural order here. There's nothing that the user of our function could expect, and so a user of our function may, every time they want to call it, be thinking: hey, which order was it? I can't really memorize what the order of the arguments is. You can of course always go back and read the docstring, but sometimes when you write code, you don't want to go back and look at the docstring. So what can we do? Maybe we can pass in the arguments in a different way. The first way, the way that we saw before, is what we will call passing in an argument by position, and these arguments are referred to as positional arguments, because the position is how they are matched to the parameters in the definition. The second way, the alternative, is that we pass in the arguments by keyword. So the two parameters that we defined were numbers and scalar, and so I pass in numbers as numbers. So I pass in the existing numbers
list, which is now marked here with the numbers keyword. The first word here is the keyword; the second one is the actual variable that evaluates to the object that exists in memory. And then we pass in a 2 by the keyword scalar, and if I execute this, I get back the same result. So now calling the function is at least more explicit. Now I know exactly which argument we are talking about. What if I change the order? If I changed the order in the positional call to scaled_average_evens up here, this wouldn't even work; the function call would result in an error. And here I exchange the order, but I use keywords, so here Python actually has a chance to match the arguments, and of course it works. So once we use keywords, the code is more explicit about the arguments, and Python has no problem with it. But now the tedious thing is: do we always want to type out the keywords? Because then the function call can easily get very long, and also many functions that you will use from existing third-party libraries will have, I don't know, 30 or 40 arguments. Do you really want to type out all of them all the time? Probably not. So we have to study a little bit more what we can do with function calls. Of course we can combine the two approaches: we can pass in numbers first by position, and then we can add scalar equals 2, so we pass in the 2 as a keyword argument, and this also works. And this last way is probably the nicest, because it reads very naturally now: we average the evens in the list called numbers with a scalar of 2. So this is probably the best way. And yeah, so there are different ways, but not every way works. But then let's ask another question. Now that we have two functions, average_evens and scaled_average_evens, the question is: can we improve the design of the two functions a little bit? And what I mean by that is: let's look at the two functions that we defined so far. So I abbreviated the
docstring a little bit here, but the two definitions are exactly the same as on the slides before. So I first defined a function called average_evens, and then I defined the second function called scaled_average_evens. And now I have a problem with that, and the problem is this: when I read this source code, I already see that both functions consist of four lines in the body, and both functions have three lines in common. So there's a lot of repetition going on, and this is something that I don't like. Why don't I like it? Well, I don't like it because if, for whatever reason, I have to change the way I do things in one function, there is a very high chance that, whatever the change in logic is, I also want to make it in the other function as well. And two functions may be next to each other in the source code, but they may also not be next to each other. So there is a danger that if ever we want to change the logic in one of the two functions, we may forget to make the same changes in the other function as well, and that would be bad. So I would like to redesign this code to make the functions basically reuse the commonalities. So how could we do that? Here is one way. We say that averaging the evens is basically the same as calling the scaled_average_evens function with a scalar of 1. So scaled_average_evens with a scalar of 1 is the same as calling average_evens. In other words, I can, in the average_evens function, just forward the function call to the other function. So whenever a user calls average_evens, what our function does is it immediately calls scaled_average_evens and passes on a scalar of 1. So now we basically reuse the code: we have only one place in the source code where the logic is written out, and the other function just reuses it. This is a nicer way of writing the code. In the beginning, of course, you won't be concerned too much with such questions, but when you want to work on a bigger project,
then this is a question you should always keep in mind: how can you keep your code more maintainable? However, we will see that using some other features, we can actually reuse code in a different way. But first let's look at something else. So the scaled_average_evens function that we had before had one problem: we always had to pass in both arguments. Even if the scalar was 1, we couldn't really not pass it in; we always had to pass in two arguments to the old scaled_average_evens function. So what I do here is I turn around the cases, and what I mean by that is the following. The scaled_average_evens function is the more generic function, and the average_evens function is the special case. So we have the general case down here, and the special case just forwards the function call to the general case. Now that makes sense conceptually, but we can turn around the logic. We go ahead and say: well, the scaling will be used so rarely, we just assume this, that maybe we can define a function average_evens that also takes an argument called scalar, and we assign to this argument a so-called default value of 1. That means whenever we don't pass in any scalar explicitly, the scalar of 1 will be used, and then this function handles both cases. So both the general case and also the special case are now handled in only one function. And talking about code maintainability, we did make some progress by forwarding one function to another, but even more progress would be made if we could remove one of the two functions entirely. This is like the most maintainable code. And how can we get there?
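The two designs discussed so far, forwarding a call and using a default value, can be sketched side by side. This is only a minimal sketch under a few assumptions: the function bodies are reconstructed from the spoken description (round, keep the evens, average, scale), the numbers list is assumed to hold the integers 1 to 12, and the name average_evens_with_default is hypothetical, chosen only so that both designs fit into one block.

```python
numbers = list(range(1, 13))  # assumption: the integers 1 to 12


def scaled_average_evens(numbers, scalar):
    """Calculate the scaled average of all even numbers in a list."""
    numbers = [round(n) for n in numbers]       # local name shadows the global one
    evens = [n for n in numbers if n % 2 == 0]  # keep only the even numbers
    average = sum(evens) / len(evens)
    return scalar * average


def average_evens(numbers):
    """Design 1: the special case forwards the call to the general case."""
    return scaled_average_evens(numbers, 1)


def average_evens_with_default(numbers, scalar=1):
    """Design 2: one definition that could replace both functions above."""
    numbers = [round(n) for n in numbers]
    evens = [n for n in numbers if n % 2 == 0]
    average = sum(evens) / len(evens)
    return scalar * average  # scalar is 1 unless the caller passes one


print(average_evens(numbers))                  # 7.0
print(scaled_average_evens(numbers, 2))        # 14.0
print(average_evens_with_default(numbers))     # 7.0
print(average_evens_with_default(numbers, 2))  # 14.0
```

In a real project only one of the two designs would be kept; they are shown together here just for comparison.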
Well, we turn around the special and general case in a way, by saying that our special case is now the only case, and whenever we call this function without a second argument, without a scalar argument, then we basically pretend that the scalar is 1, which means we get the old average_evens function. So even though the code now contains the scalar, this function could replace the old average_evens function. Let's check if that works. I define it, and now we call the new average_evens with only one argument, and indeed it works. So we couldn't tell from calling this function here if we actually have a second argument or if we are using the old version; both functions work in the same way. However, we could also go ahead and pass it a second argument, let's say 2, and then we get back a return value of 14.0. And this is nice. However, now I would claim that some people in the long run would be confused as to what the 2 means. So we are not being explicit here in the source code. When we call the function, we could provide the keyword as we saw before, and this works. But wouldn't it be nice if we could force our user, the caller of the average_evens function, to basically have to do this? In other words, I want to forbid that a user calls the function in this way here, and I only want to allow this case and this case. What do I achieve by that? I achieve that I can use the average_evens function in the normal way without a scalar, just as if I used the old function before, but I also provide the option of using it with a scalar. But if the user wants to use it with a scalar, they have to be explicit. So I don't want to allow the user to call this function in an implicit way, so to say, and we have seen in the best practices in chapter 1 that explicit is better than implicit. And how do we do this? There is a syntax called keyword-only arguments, and this is how it works: we just take the exact same definition as two slides before, and I put an asterisk here, and what
this asterisk does is it enforces that whatever argument is to the right of it, if we want to set this argument, we must do so with a keyword. This is why this section is called keyword-only arguments: we can only specify this argument by keyword. So now let's go ahead, and again, if you go back, nothing changed except for these two characters here, the star and the comma of course, but the star is what's important. So now let's call average_evens and pass in numbers: it works as before. And now I call average_evens and I pass in scalar explicitly, and it works. And now I try to call average_evens and I pass in the scalar implicitly, and I get a TypeError, and the TypeError says the function takes 1 positional argument but 2 were given. So in the second formulation here, the first argument is passed in by position, the second one by keyword, and in this situation here, both arguments are passed in by position, and this is not allowed, and this is why we get exactly this error message here. So let's go back and summarize what we learned about function definitions. We have learned that function definitions are basically easy to do in Python. We summarize that even from the beginning on, when we write our first couple of functions, we should write a docstring. The docstring should always specify what the arguments are and what the types of these arguments are, so that we can always look up the function in, for example, a help message and know how the function works. And then regarding the arguments, most functions that we will write in the beginning will have only a limited number of arguments, but sometimes, when we get to more advanced settings, it makes sense to define other arguments as well that are seldom used. And whenever this is the case and I want to ensure that the user calls this function with this argument only in an explicit way, I can do it with the asterisk. And what have I achieved with that? Well, I have
achieved that we as the maintainers have only one function that can be used in two different ways, and so our code remains minimal and maintainable. OK, so now let's finish up this chapter with a short outlook on how we can extend core Python. What do I mean by core Python? Core Python is the Python that we install and that is basically the same for every Python installation. Python can be extended in several ways, and we will briefly go over them. So the first and probably the best way to extend the functionality of the code you write is to try to find something in the standard library that we can use. What is the standard library? Well, the standard library is part of any Python installation. So even if we only install Python from, let's say, python.org and not the Anaconda version, so that all the third-party packages that Anaconda ships with are not installed, even then we have the standard library. And the standard library is a collection of many functionalities, of which we will soon see some good examples. The standard library has the advantage that it is written in a way that the code is fast, and more importantly, many of the functions in there are rather old and have been tested extensively, so we also know that most likely whatever we use from the standard library is also correct. So whenever we can get away with something from the standard library, it's always worthwhile to do so. One module in the standard library that is very valuable for us is the math module, and the math module allows us to do some more advanced math than just plus, minus, multiplication, and division. So how do we get new functionality into our current Python program? Well, we just import it. So I just type here import math, and Python will find what math is. And now the question is: what is math? And the answer is: math is an object like anything else. So maybe let's go ahead here and also draw some memory diagrams. So after I executed the
import math statement, what happened is that in memory there is a new box with something in it, and of course always a type, and this box goes by the name of math, and we have a reference here. So what is math? We can already see it in the value. First, just remember that here we have the angle brackets, which means we cannot copy paste this back into the code; it's not a literal. However, we can still see some descriptive message here. For example, it says a module by the name math, loaded in from this path, and this is my home user path here; I have a Python installation on my local machine. And then we see there is some very cryptic file path, and it ends with .so, and .so basically means that whatever we are now using is actually written in the C language. So all the functionality in the math module is super fast; it's highly optimized. So what is math? It's an object, as I said. In the memory diagram it has a location, and of course it will have a type, and the type is module, so I will write module in here. So what can I do with this math module? Well, I can access the functionality that it provides, and it provides two different kinds of functionality. First, it provides what we will call attributes, for example the number pi. And how can I access this? I use the dot notation; we have seen this before in the first chapter. So the math module comes with a variable called pi, and to access it we just say math.pi, and then of course we get back the number pi. So what does this mean in memory? What this means is we have some box here, and I draw it a little bit small, but definitely it is a float, and this is the number pi. And somehow we have a reference from the module here, and the reference is named, maybe I write it in green here, pi. So this is what must be somewhere in memory. We don't know where exactly pi is, but pi is definitely also an object. The same goes for the constant e. So there is somewhere in memory another box, and it has the number e in it, and let me tell you, it is for
sure also a float, and e is also referenced from the module, and let's put the name e here. OK, so that's what we call attributes, which are basically variables that exist inside the module. So the dot notation allows us to follow, so to say, the references here. And then besides the constants, we also have some functionality, namely functions, for example the square root function. And the square root function of course also exists in memory, let's say square root, and this will also have a type, and I can already tell you that the type of the square root object will be a built-in function. And of course we also have a reference here, and this goes by the name sqrt. And now let's check: what can we do with the square root function? Well, if we want to know what it can do, we can for example use Python's built-in help function to read about it, and then it tells us: well, the square root function takes one argument called x, and the function returns the square root of x. So nothing fancy. And now we can call this function of course, for example with the integer 2, and we get back the float 1.4142, as expected. Of course, when we use functions from the standard library, we can not only pass in objects, but we can also pass in expressions, and this holds true for any function call. So between the parentheses of the call operator here we can write any single expression, for example 2 to the power of 2, and if I take the square root of that, I should get back 2, hopefully, and I do. Now, we will see in chapter 5 that in calculations like this we may actually not get back the number 2 exactly; we may get back a number that is very close to 2, because of rounding errors. But this will be part of the discussion in chapter 5, and it will also be very important to know that floating-point numbers are not precise. And then of course we can mix function calls. So we can call, in this case, our own function average_evens with three numbers here, 99, 100, and 101, and of course the 99 and the 101 are
filtered out, the 100 remains. If we take the average of 100, it will still be 100, because it's only one number, and then the square root of 100 will be 10, hopefully. So I get back 10 here. So what I'm saying here is I can do what is called function composition: I can write any legal expression inside the call operator, and calling another function is such an expression, so because of that, I can put a function call inside a function call. OK, that's the math module. Now there's also the random module, so let's import it: import random. And what can the random module do? The random module is there to help us model random numbers, and it has, among others, a function called random. So we see that random.random is a function: just as we followed the math module to the square root function, now we follow the random module to the random function here. And what does the random function do? Let's try it out, let's call the random function. So up here I don't call it, I just reference the function, and here I really call the function, and I get back a number. I execute this code cell repeatedly to see what kind of numbers we get back, and it seems like we always get back a number between 0 and 1. And this is exactly what the random function in the random module does: it is defined to give us back a random number between 0 and 1, and all the numbers in the interval between 0 and 1 are equally likely, so we speak of a uniform distribution. And what can we use this function for? For example, in chapter 4 we will see how we can model a coin toss, and a coin toss is of course a random event where both sides of a coin are equally likely, and we can use the random function in the random module to model that. There is another function that we often use in this course: it is the choice function in the random module. And what does the choice function do? Well, the choice function takes any so-called sequence, and we will see what that is, for example the numbers list, and it draws a random element. So if I execute this
cell over and over again, I will always get back a random number, and we just do it several times. So obviously the random.choice function draws a random element from some list, but it does not change the list. So what we are doing here is like drawing a random number from the list with replacement: we put it back. Whenever we work with random numbers, especially in a scientific environment, we want to make sure that our calculations and our analyses are reproducible, and in order to do that we set the so-called random seed. You can think of it as if we determine the first number in the sequence of random numbers. It's a little bit different, but that's the best comparison we can make here. So I set the random seed in this case to the number 42, and let's say I draw a random number again, and I get back 0.639. And if I now set the random seed again to 42 and I draw another random number, I will get the same random number. So of course computers don't know what random is; true randomness doesn't exist with computers. We speak of pseudo-random numbers because of that, and the numbers in the random module are random enough that we can use them whenever we need randomness. In business, for example, maybe in supply chain management, you want to run a simulation of something; also in data science there are many predictive algorithms that rely on randomization. So whenever we need some randomness in our software, in our simulations, we can use anything from the random module. OK, that was the short introduction to the standard library. Now let's look at the second way of how we can extend core Python, namely by so-called third-party packages. Third-party packages are things that we can download from a server on the internet, and it's basically code written by other people, and the code is not part of the standard library, for many, many reasons. But just because something is not part of the standard library does not mean that the
code is bad. In particular, the third-party packages I will show you here are super high-quality packages that you can use anytime. So for example the NumPy library. NumPy stands for numerical Python; some people pronounce the name differently, but I call it NumPy. And what NumPy does is it basically helps us to model anything that we understand as tabular data, so matrix data, basically vectors and matrices. Anything that we want to do in linear algebra can be expressed with the help of NumPy, and of course more. And NumPy is a very efficient library, so it can handle really big amounts of data. So what do we do in the beginning when we want to use a third-party library? We have to install it. You can do that with the pip install command. And whenever we have an exclamation mark in the beginning, what this really means is that this is as if I typed this directly into a command line. So let's open this scary thing called a command line again, and maybe I just go into the folder where I keep the materials for this course, which is in the project folder. And now within this folder I could, first of all, say python, and I would get a Python command line here. And now I want to make sure that NumPy is installed. How can I do that? Well, I can just say pip install numpy, and then I get a message, and I'm already satisfied: that basically means NumPy is already installed. And now I leave the terminal window again and I'm back in the notebook. And whenever I prepend the command in a code cell in the notebook with an exclamation mark, what that means is that everything that follows will basically be executed in the command line. So let's do that, and I see the exact same message again: it says that NumPy is already installed. And also one more note: if you're using the Anaconda distribution, which I said in the beginning you should use, then you should also see this message, because NumPy is already included in the Anaconda
distribution. So how can I use NumPy? Well, I import it, and because I don't like this long word numpy, I import it as np, and this is also a convention for NumPy. And then there is now a variable in the global namespace called np, and np is also a module, it's the module numpy, and this lives somewhere else on my disk. And note something here: when I checked the path of the math module before, there was a different file extension, and now the file here is .py. And .py, as we shall see soon, is the default format in which you keep Python source code outside of, let's say, Jupyter notebooks, and whenever you have code inside a .py file, you can of course import the file into a Jupyter notebook as well. So let's see quickly what NumPy can do. NumPy has its own data type, which we can create with the array function: we pass our numbers list to array, and whatever we get back we store in a variable called vec. You can already guess that this is going to be a vector. And now if I evaluate vec and ask Python what vec is, Python will basically tell me that this is an object that is an array with the same 12 numbers that I had before in a list. And now you may wonder: why would you do that? We already have the numbers in a list, why do we need them in a thing called array? It has many advantages, as we will see in chapter 9, and we also get some added functionality by keeping the data in an array. The official name for the type of an array is ndarray, for n-dimensional array, and in this example the array has only one dimension, and a one-dimensional array is just a vector. And so what can we do with a vector? We can scale it, multiply it. So 2 times vec in this case gives me a vector with 12 components, each of which is just twice the component of the old vector. Can I do that with a list? So I multiply 2 times the numbers list, and now what happens? Well, what do I get back? I get back the same numbers as before, but somehow I now have 24 numbers in the list;
somehow the list got doubled. 2 times numbers gives me the list twice. So this kind of makes sense, but still this is not really what I want. Usually, if we have vector- or matrix-like data, we don't want this to happen; we want, say, scalar multiplication to occur, or maybe matrix-vector multiplication, and so on. And the NumPy library basically enables us to use the data as a vector in this regard. And then of course, and this is an example of what we will call duck typing, we can pass vec to the built-in sum function, and what do we get back? We get back 78. We can pass vec to the random.choice function that we just saw, and the random.choice function draws a random number out of vec just as it did for the list. So notice something here: the sum function, up to now, we have only used with a list object, and the random.choice function we also used only with a list as the argument, and now I pass in an array as the argument, and both functions continue to work. So the question is: how is that the case? In other words, what we observe is that the sum function does not rely on a certain data type being passed in, and neither does the choice function. It's more like the thing we pass in as the argument has to fulfill some other, abstract idea, and in many, many ways a list object behaves in exactly the same way as, for example, an ndarray object. We will see more of that. The concept that I am aiming at is called a sequence, and we will look at it in chapter 7. And sequences are super important, because most of the data that we as business people want to analyze comes as a sequence of data. Matrices and vectors are sequences of data, and most of the data comes in this format, so we will spend a great amount of time on sequences. But the important fact here is that because both lists and ndarrays are sequences, abstractly speaking, they both can be passed to the same function, and it works. And in many other programming
languages, a function always requires the argument to be of a certain type, and it only accepts arguments of this one type. In Python, the type is not so important; it is the behavior of the type that is important. We call this duck typing. And why do we call it duck typing? Because we say that in the aspects that are important in the context of these functions here, the ndarray object and the list object walk and quack alike. The saying goes: if it walks like a duck and it quacks like a duck, it must be a duck. And because both the ndarray and the list object walk and quack alike, we say they fulfill this criterion, which is known as duck typing. They behave in the same way.
Okay, let's see what else is in the slides. So the third way of extending core Python is by writing our own modules and packages. And what do I mean by that? If you go into your project folder, then below all the notebooks, the chapters of the book, you will find a file called sample_module.py. If you open it, and I have done so on the right-hand side in a text editor called Sublime Text, you will see that this is a module that contains lots of documentation. These are all the gray letters here. And then it contains several functions, and one of them is called average_evens. It's basically the same function that we have been talking about in the entire chapter today, but it's written in a different way. What you will see if you look at the sample_module.py file in detail is that modularization is very important here. I have defined two helper functions up here that do some of the logic, and then I have defined three functions that will actually be used outside the module. These functions only make use of the two helper functions, so they reuse code: they forward a function call. And in terms of usage, the average_evens function in this sample_module.py file works exactly as the
latest version of the average_evens function we saw in the Jupyter notebook, but it's implemented in a different way.
So now we have a sample_module.py file, and the question is: how can I get the code that is in this file into my Jupyter notebook? Well, it's very easy. Because the file is named sample_module.py, we can just import it with the command import sample_module. So the .py gets removed, but the rest stays the same. And because sample_module is a rather long name, I import it as mod, short for module. Okay, that's it. And now what can I do? Now I have a new variable called mod, and I can look at it. It says mod is a module called sample_module, and it gives me the location on my machine where this file lies. We see it lies in my home folder, under my project folder, in the intro to python folder that you also should have, and there it says sample_module.py. So I can actually just copy-paste this path into my command line and open this file right away and edit it.
But now we have imported it into the Jupyter notebook here, and I want to execute some of the functions inside my own module. How do you do that? Well, you already know how to do that, because it works in exactly the same way as for third-party libraries and for modules imported from the standard library. So we just write mod with the dot operator, and then we look at average_evens. And what is average_evens?
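
The import steps just described can be tried out with a small sketch. The real contents of sample_module.py are not shown in this transcript, so the code below writes a hypothetical stand-in file into a temporary folder; the helper name _scaled_average and the function bodies are assumptions made only for illustration. In the course setup the file already sits next to the notebooks, so only the import line would be needed there.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical stand-in for the course's sample_module.py.
# The helper name and the function bodies are assumptions.
MODULE_SOURCE = '''"""Stand-in module with one helper and one public function."""


def _scaled_average(numbers, scalar):
    """Helper: average the numbers and scale the result."""
    return scalar * sum(numbers) / len(numbers)


def average_evens(numbers, *, scalar=1):
    """Calculate the scaled average of the even numbers."""
    evens = [n for n in numbers if n % 2 == 0]
    return _scaled_average(evens, scalar)
'''

# Write the file into a temporary folder and make that folder importable.
folder = tempfile.mkdtemp()
pathlib.Path(folder, "sample_module.py").write_text(MODULE_SOURCE)
sys.path.insert(0, folder)
importlib.invalidate_caches()

# The file name without ".py" is the module name; "mod" is a shorter alias.
import sample_module as mod

numbers = list(range(1, 13))  # the numbers 1 to 12

print(mod)                         # shows the module name and the file location
print(mod.average_evens)           # a function object, accessed with the dot operator
print(mod.average_evens(numbers))  # average of 2, 4, 6, 8, 10, 12
```

The alias mod is just another variable that references the module object, so everything inside the module is reached through the dot operator.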
Well, it's a function that has this name. And then, of course, I can use Python's built-in help function and pass the reference to my average_evens function in the module to it. Then I get back the docstring from the sample_module.py file, and now I know how the function works. And I can use it just like that: I call it with mod.average_evens, then I put the call operator there, and I pass numbers to it. And of course, as we have seen many times before, I get back 7.0. Okay, and of course I can also pass scalar, and it has to be passed as a keyword argument, because I am forcing the usage of a keyword argument here. Then I get back 14.0.
So this is the last code cell in chapter two, so let me quickly summarize what we have learned in this chapter. Well, this chapter was all about functions in many ways. And function is basically already a very narrow word; the more generic word would be callable. So anything that can be called in Python, this is what this chapter is about. And it was also a little bit about how I can modularize my code to make it easier to maintain, and maybe also easier to use and read in the long run.
We've seen three different examples of a callable. The first one was built-in functions like sum or len and so on. We've seen the built-in constructors like int, str, maybe float, and some more. And then we have defined so-called user-defined functions, or custom functions. We defined them with the def statement, and we basically did that for our example from chapter one.
And then we have talked a little bit about the scoping rules. The local scope that gets created when a function is called is immediately forgotten after the function call is over. Whenever we reference something in the global scope from within a function, this works. And we also saw that if there are several identical names in different scopes, Python can manage that for us, and the innermost
scope wins. We call this shadowing.
And then we have looked at a couple of ways to specify what parameters a function takes and how it takes them. In particular, we differentiated between positional arguments, like numbers here, and keyword arguments, like scalar here. And then we added one little asterisk, one star, in the function definition, and with that we forced the user to use keyword-only arguments. So that's what it's called: keyword-only arguments. In the last example here, I could not just pass in a 2; I have to specify scalar. And then we also discussed a little bit how I can design the functions that we saw in this chapter in a modular way.
And then what else did we do? We looked at how we can extend core Python. The first way is, of course, the standard library. Maybe I should mention that there is a web blog called Python Module of the Week, and they have very good tutorials for many packages and modules from the standard library. In the long run, you should make yourself familiar with the standard library, because that's always the first source of extensions that we want to look at. Then there are third-party libraries; the example was NumPy. These are libraries that we usually have to install, for example with pip install. And then lastly, we saw how we can put our own code in a .py file and import it in a Jupyter notebook. This way we have one advantage: we can write our code in a .py file and reuse it in several Jupyter notebooks. That's also very important to know. Okay, and this now concludes chapter 2.
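
As a compact recap of the parameter and scoping rules from this summary, here is a minimal sketch. The body of average_evens is an assumption (the notebook version is not reproduced in this transcript); it is written so that it returns the 7.0 and 14.0 mentioned above for the numbers 1 to 12.

```python
numbers = list(range(1, 13))  # global name: the numbers 1 to 12


def average_evens(numbers, *, scalar=1):
    """Calculate the scaled average of the even numbers (assumed body)."""
    # Inside the function, the parameter "numbers" shadows the global "numbers".
    # "evens" lives in the local scope and is forgotten after the call.
    evens = [n for n in numbers if n % 2 == 0]
    return scalar * sum(evens) / len(evens)


print(average_evens(numbers))            # 7.0
print(average_evens(numbers, scalar=2))  # 14.0
print(average_evens([2, 4]))             # 3.0, the local "numbers" wins

# Everything after the asterisk is keyword-only, so a positional 2 is rejected.
try:
    average_evens(numbers, 2)
except TypeError as error:
    print("TypeError:", error)
```

The single asterisk in the parameter list is what makes scalar keyword-only; without it, the call with a positional 2 would be accepted.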