Hello, we are back. And I hear an echo; is it you? It's not me, because it's often me. Okay. So, next up for the rest of the day we have two lessons of about an hour each. First is advanced NumPy. In past years we've given a basic NumPy lesson about the basics of using NumPy array objects. Since we've already done that, the material is there for you to read, and there is a video from the previous year, which I should make sure to link. So we're trying something different: going under the hood, looking inside the array object, how it works, and why you might need to know how it works. This is advanced, so if you're new to NumPy, basically watch and observe, and if you haven't already, you can follow the basic material later. After this we have pandas. With me here are two great co-instructors, Marijn van Vliet and Johan Hellsvik, and I'll hand it over to them.

Great, thank you, Richard. Welcome, everyone, to advanced NumPy, where Johan and I will try to teach you some things that are not covered in the basic NumPy tutorials you find, but that are very useful once you start using NumPy for any kind of serious data analysis. Throughout this hour there are three things we want to cover with you. The first is to discuss why NumPy is fast. Python is quite a slow programming language: if you try to do any serious data analysis in plain Python, it's really slow. But NumPy is really, really fast. Why is that?
But NumPy is not always fast. When you actually start using NumPy, you will sometimes encounter situations where things are way slower than you would expect them to be, and it is not at all obvious at first why. We will try to explain how that works. Then there is a third topic, which is all about memory management. Sometimes NumPy will copy data, sometimes it will not, and it's not always obvious when that happens. We will go into that, and it is relevant when you have large data arrays: if you have a few gigabytes of data loaded into memory, you really want to know whether NumPy is going to make a copy of that data or not.

All right, let's get right into the first thing: why is NumPy fast? Johan, maybe you can tell me. Python is quite slow, but NumPy can be really fast. Why is that? What do people usually say when we ask?

Yeah, there are a number of reasons. NumPy has been designed to be fast. One way to achieve this has been to write the kernels of NumPy in C, which is compiled code. Another thing is that a lot of the arithmetic performed in NumPy is standardized arithmetic implemented in mathematical libraries, such as BLAS, the Basic Linear Algebra Subprograms, and also larger packages such as the Intel Math Kernel Library. These have been optimized over many decades to perform well and be stable on a large variety of hardware.

Exactly. So, to recap: NumPy is fast because it's implemented in C. It's not written in Python; it's mostly written in C. But not just any C, as we will see. For this we have designed a first exercise in this new course. When we say NumPy is fast because it's implemented in C, let's actually put that to the test. I have a snippet of C code here.
C is a compiled language that compiles down to machine instructions, and that is why it's really fast. This snippet of C code generates 100 million random numbers and then adds them all together. Johan, maybe you can demonstrate what this code actually does; you can grab my screen. I think you have it all loaded up. For the students at home: if you have a C compiler at hand and you know how to use it, feel free to type along with this. If not, just feel free to watch this as a demo.

Yes, thanks, Marijn. So we have the C code snippet here, and as usual there is a copy button if you would like to copy the code into an editor and compile it. I have a terminal window here, with the C source code, speed_random_sum.c, and then the executables; speed_random_sum_opt is my fastest binary. Let's see how it performs: time ./speed_random_sum_opt. That took 1.4 seconds of wall-clock time; the user time was 0.96 seconds. The real time is what we care about most, I think, so this C version took 1.4 seconds to run.

Okay, so that's the time in C, and I'm going to grab the screen share back. Now, for all of you following the course, I have a challenge: a first exercise, a NumPy warm-up if you will. Can you write a little Python script, or just a little cell in a Jupyter notebook, that uses NumPy and can do this faster than Johan's laptop could do it in C? To recap, the task is to use NumPy to generate 100 million random numbers and then add them all together. A good hint here is not to use Python for loops, but to use NumPy functionality for this. Can you beat the C version? Of course you will run it on your own laptop or machine, so on
different hardware; we will compare Python versus C on Johan's laptop after the exercise. But as a warm-up, I challenge you to do this for yourself. If you find you are having trouble writing this, that's a good reminder to go back to the basic NumPy lesson after this one; please do check out the materials there. Okay, we will give you 10 minutes for this. I'm going to leave the exercise here on the screen, and I believe it's also in the HackMD. I wish you all the best of luck.

Welcome back, everybody. I see in the HackMD that a lot of you managed to do this, and a lot of you got really fast times, much faster than the C code. But of course the C code was running on Johan's laptop. So, to be fair, Johan, could you also run the Python code on your laptop, and race C versus Python on the same machine?

Yes. I do it here within a Jupyter notebook: import numpy, then print. I'll first just execute it to see that it's correct. Yes, okay, that was correct. Then I copy it, and I had a %timeit command here at the beginning; let's see. It will now be executed a few times, seven runs, and... 1.39 seconds. Oh, okay, that's not so good; it was actually more than when I tried it five minutes ago. I guess it really depends on what else is running on your machine.

Exactly, because we are not running on a dedicated resource just for this random number generation. We have the operating system, and, not least, on both Marijn's computer and my computer we have the Zoom client running to capture the video and sound, so the course can be broadcast on both Twitch and Zoom. As many of you have probably noticed, that can be quite heavy. But even so, it was faster. For the sake of comparison, we could also see what happens if we execute the Python code not in Jupyter but from the terminal. So, switching to a terminal window.
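The script itself isn't shown on screen in the recording; a minimal sketch of what a file like speed_random_sum.py could contain (the use of NumPy's default_rng generator and the variable names are my assumptions) is:

```python
import numpy as np

# Generate 100 million random numbers and add them all together,
# with no Python-level for loop: both steps are vectorized in NumPy.
rng = np.random.default_rng()
a = rng.random(100_000_000)
total = a.sum()
print(total)
```

Since the numbers are uniform on [0, 1), the printed total should land near 50 million.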
We have a few timings here. Yeah, sometimes we were faster already. That's right. So I run it again: time python speed_random_sum.py. Okay, 1.3 seconds.

Let's continue; we could play with this all day. The point is, and this is a clear example, that even taking into account the time Python needs to start up, NumPy is faster than that C code. Why is that? Because NumPy doesn't use naive, poorly written C code; NumPy uses highly optimized C code for this. As Johan already told you, NumPy is compiled against libraries such as MKL or BLAS. These are software libraries that have been optimized through and through: BLAS stems from 1979 and has been optimized ever since, which is over 40 years, and MKL is developed by Intel, who optimized it especially for Intel processors. These libraries implement things like matrix multiplication and vector math.

Well, let's have another demonstration of this. For example, BLAS has a dedicated function to compute a vector norm. The point is: whenever NumPy has a dedicated function for something, chances are very good that it outsources the work to a dedicated BLAS function, or a dedicated function in some underlying library. So even NumPy can be faster than NumPy, so to speak. Johan, could you demonstrate this? Let's make a manual version of computing the vector norm.
The vector norm is the length of a vector in coordinate space, which you can compute using the Pythagorean theorem. And let's also demonstrate how it works if we just do it through NumPy's dedicated function.

Yes. First I take this code snippet here and execute it; this simply stores away the set of random numbers in the variable a. Now we implement the Pythagorean theorem manually. Start with the %timeit command, then a variable l that will be equal to the NumPy square root of the summation of all the elements of the array, np.sum; and not the elements themselves, but the square of each element, so a ** 2. Let's see. As before, it will execute a few times so that you get an average, and it clocks in at 560 milliseconds on average.

And what if you use the dedicated function? Yes, exactly. So again the timer, using the same variable name: l equal to numpy.linalg, which is a module, and then the norm function. norm here is the standard 2-norm, as you have it in the Pythagorean theorem. Okay, let's see. Oh, this is faster; one can see with the naked eye that it's executing many times. (Yes, the %timeit command will dynamically tune how many times it runs, depending on how long each run takes.) 116 milliseconds. So that's a vast improvement: a factor of five, five times faster.

Okay, so let that sink in.
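The two computations dictated above, written out in full (the array size here is a stand-in; any large random vector shows the same effect):

```python
import numpy as np

a = np.random.rand(10_000_000)

# Manual Pythagoras: square every element, sum them, take the square root
l_manual = np.sqrt(np.sum(a ** 2))

# Dedicated function: numpy.linalg.norm computes the same 2-norm,
# delegating the work to optimized low-level routines
l_norm = np.linalg.norm(a)

print(np.isclose(l_manual, l_norm))  # True: same result either way
```

In IPython or Jupyter, prefix each of the two computations with %timeit to reproduce the speed comparison.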
That's the first big takeaway from this advanced NumPy course. If you want to write better, faster NumPy code, one of the best things you can do is study the NumPy documentation and try to learn the functions that are there. The more NumPy functions you know and use, the faster your code will be, because if there is a dedicated function to do what you want to do, chances are it is much faster than if you implement it yourself. And that is, of course, because it outsources the work to BLAS and MKL and the like.

Okay, that's one way NumPy is fast. Let's go to a second way NumPy is fast, and that has to do with how it manages data. Let's also start with an example, Johan: the transpose example. Let's make a nice big matrix, say 10,000 rows by 20,000 columns, so we have a nice big chunk of data to work with.

Yes, so I do that in Jupyter. Or maybe this one you can just copy and paste. Yeah, okay, I'll paste the rest as well. So this creates the matrix a, which is 10,000 by 20,000, and we don't care about the elements here, so we just fill it with random numbers.

But we do care that it's nice and big: this matrix is 1.6 gigabytes, quite a substantial amount of data. It's not a small matrix. And then the transpose. Before you run it: what does the transpose function actually do, Johan? What are we doing here?

Yes, so for the case of a two-dimensional array, transpose is nothing else than swapping what are rows and what are columns.
Yes, and all the time you are playing with the same elements of the array. It transforms the array so that rows become columns and columns become rows. So we might expect that this is an awful lot of data to shuffle about, because almost every element in this matrix is going to change places with some other element. Let's time it. How long does it take NumPy to actually do this?

It feels like it's taking a long time, but that is actually because the %timeit command ran this piece 10 million times. And it took 104 nanoseconds. On the order of nanoseconds! Even though the matrix is now completely shuffled around: nanoseconds. What's going on here? What kind of deep magic is NumPy doing that a matrix transpose is basically free? It doesn't matter how large your arrays or matrices are; transpose is always extremely fast. What's happening?

If you want to know, you need to understand how NumPy actually manages memory; this has everything to do with memory. So let me grab the screen again. This is transposing, and this is an important image: look at it for a little while. Even when you create a 2D array in NumPy, say a matrix, we index this array with two numbers. If we want an element from this array, we say to NumPy: okay, I want a row, give me row number two, and then give me column number three. So when we want elements,
we have two numbers there. But of course computer memory doesn't really work that way: the operating system exposes memory as a flat list, one long 1D list of values, and this is also how NumPy always sees your data. NumPy always sees your data as a flat list. So if you make a matrix, a 2D array, NumPy will actually store it row by row: it concatenates all the rows together into one big long string of values, and that is what NumPy actually operates on. NumPy is, in a sense, faking this second dimension for you. Whenever you print an array or do some math on it, NumPy will pretend that it is a 2D thing, but in the background all your data is always one big long 1D array.

So here is a problem that NumPy needs to solve, and this is where the magic comes from, as will become apparent later. Say the user wants the element at row number two, column number three. NumPy needs to internally translate this into the correct element in its long 1D list. The element at (2, 3) here is actually element number 11 in the flattened list. Why? Because we first have to skip through the rows before it to get into the correct row, and then we need the correct column, column number three, so we go all the way there: that's element number 11. That's what NumPy needs to do constantly.

In order to understand that properly, I'm calling for a second exercise, and I want you to at least give this a try. It's okay if you don't completely succeed; we will continue, but at least try it for a little while. Let's give the students 10 minutes again to do this. Your exercise is this: write a function, you can call it ravel,
and that function will translate a row index and a column index into the proper element in the 1D array. So if I give that function (2, 3), the function should give, well, element number 11 in my list. If I give it (0, 0), remember that Python is zero-indexed, not one-indexed, so (0, 0) is the very first one. And (1, 0) is this one, so that is item number 4. Take 10 minutes and see if you can solve this problem yourself. The exercise description is here; we will be back in 10 minutes with the answer and pick up from there.

Okay, welcome back, everybody. It was interesting reading the HackMD: for some people this was really easy, for others it was more complicated. The reason we gave you this exercise is that we wanted you to think about what it takes to actually do this computation, because then you are in a good headspace to understand how NumPy solves it.

To recap, I've opened the solution to this problem here. If we want, for example, row number two and column number three, we must move through the 1D array in steps. To go to the next row, we take a step that is the number of columns wide, and we take a certain number of those steps; then, for the columns, we take smaller steps. That's how you solve this.

Now NumPy, of course, doesn't only solve this for the two-dimensional case; it needs to solve this for an arbitrary number of dimensions. You can make an array with 200 dimensions and NumPy still needs to be able to solve this. So this is how NumPy does it: every NumPy array has a strides property.
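The two-dimensional solution just recapped can be written as a small function. This is a sketch; the name ravel and the (2, 3) → 11 examples come from the exercise, while the n_cols parameter name is my own:

```python
def ravel(row, col, n_cols):
    """Map a (row, col) index to the position in the flat 1D buffer,
    assuming rows are stored one after the other (row-major order)."""
    return row * n_cols + col

# With 4 columns per row, as in the lesson's figure:
print(ravel(2, 3, n_cols=4))  # 11
print(ravel(0, 0, n_cols=4))  # 0
print(ravel(1, 0, n_cols=4))  # 4
```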
For example, this four-by-eight array here has a strides property, and strides lists, for each dimension, how many steps you need to take in the big long 1D array to get to the next element along that dimension. For this array it is a tuple of two values, because it's a two-dimensional array. The first dimension is the rows: to get to the next row in this array, I need to take 64 steps in the 1D array. And to get to the next column, I need to take 8 steps.

Now, this is a four-by-eight matrix, so if you're clever you might wonder: why 64? That doesn't add up; to move down one row, I only need to skip over eight columns. That is because strides are measured in bytes. By default, NumPy operates on double-precision floating-point numbers, and each of those bad boys takes up eight bytes. So each element is eight bytes, and to move to the next row we need to skip eight elements of eight bytes each: 64 bytes. That's how NumPy solves this.

Okay, why am I telling you all this? Why do you need to know it? Because now we can solve the mystery of the transpose. Let's go back to the question: why was transpose insanely fast? Johan, maybe you can show what happens to the strides property of an array when we transpose it. And for the quick ones among you: you can already try to think of the answer. Why do you think transpose is fast? Think about it while Johan copies the code over.

Yeah, so here we have the strides example; you can run it and see.
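A standalone version of that strides check, for the default float64 dtype in row-major order:

```python
import numpy as np

a = np.zeros((4, 8))   # float64 by default: 8 bytes per element
print(a.strides)       # (64, 8): 8 columns * 8 bytes to reach the next row,
                       # 8 bytes to reach the next column
```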
Yeah, and actually we had the answer here already, in the comments in the code itself. For a four-by-eight matrix the strides were (64, 8), and for the five-dimensional array filled with zeros, the strides are as mentioned here. Okay, but what happens when we transpose an array? That's the next example, so maybe we can bring that in. Scroll down here; yeah, this piece of code just creates a nice big random array, actually the same one we had before, and transposes it. But now we print the strides of the original matrix and the strides of the transposed matrix, and see what happens. Well, the answer is also shown there, but look at that.

Yes, we have two variable names: a is the original matrix and b is the transpose. So now you see how NumPy is actually doing this, and it's the same for MATLAB, the same for R, the same for Julia. MATLAB or NumPy can do a transpose instantly because all it needs to do is flip the values of the strides. You see, now rows become columns and columns become rows. It's just a trick of messing around with these stride values: how many elements in the big 1D array do I need to traverse to get to the next row, and how many to get to the next column? That is why transpose is really, really fast.

And now that you know this, you might also be able to guess why the reshape operation is really fast. Reshaping changes the dimensions of a matrix: say we first have 10,000 rows and 20,000 columns, and we reshape it to be 40,000 by 5,000. You can reshape arrays in NumPy as long as the total number of elements stays the same. And as you can see, this too is implemented by just messing around with the strides. The big 1D memory buffer is completely untouched.
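Both tricks can be seen directly in a small sketch (a tiny array stands in for the big one; nothing changes with size):

```python
import numpy as np

a = np.zeros((4, 8))
b = a.T                 # transpose: the strides are simply swapped
c = a.reshape(2, 16)    # reshape of a contiguous array: new shape and strides

print(a.strides)  # (64, 8)
print(b.strides)  # (8, 64)

# In both cases the underlying memory buffer is shared, not copied
print(np.shares_memory(a, b))  # True
print(np.shares_memory(a, c))  # True
```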
The data is not moved at all; the only thing that changes is the strides property and the shape property of the array.

Okay, so that brings us to the final point I want to instill in you. We've seen that transposing is super fast, and we've seen that reshaping is very fast, because both can be done just by messing around with the strides. So if we have an array and we first transpose it and then reshape it, that should be fast too, right? Johan, execute this bit of code; let's try it. This should run instantly.

Let's see; yeah, we have a big example there. Ah sorry, first run the one above this. Yes, exactly, that one. So we create the large array, and then first transpose it and then reshape it, in one composed statement: transpose and then reshape. It actually uses a shortcut for transpose here: before, we wrote out .transpose() all the way as a method call, but transposing is so fast that NumPy included a shortcut (MATLAB has this as well): you can just write your matrix, dot, capital letter T. That's short for the transpose, and it's free.

So this should take no time at all to compute. Why is it taking so long now, Johan? Transpose should be fast, reshaping should be fast, and somehow this is taking an awful lot of time. Yeah, NumPy is actually copying over all of our data. It took 39 seconds to do this, and this is one of those things that are not obvious about NumPy.

All right, why is that? Well, could you run the second example there? Let's take a look at what actually happens when we transpose a matrix and then reshape it. In this example code we create a small matrix, so we can print it out and see what's happening. Yes, exactly.
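That small demo, reconstructed as runnable code (the element values one through six match the walkthrough; np.shares_memory is added here to show which results are views):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])   # two rows, three columns

b = a.T                      # a view: only the strides are flipped
print(b)                     # [[1 4]
                             #  [2 5]
                             #  [3 6]]

c = b.reshape(2, 3)          # this element order can't be expressed
print(c)                     # with strides alone:
                             # [[1 4 2]
                             #  [5 3 6]]

print(np.shares_memory(a, b))  # True: the transpose shares a's buffer
print(np.shares_memory(a, c))  # False: the reshape was forced to copy
```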
Here we have the elements one, two, three and four, five, six in a matrix. Sorry, the matrix is two by three: two rows and three columns. When we transpose it, you can see you have one, two, three in the first column and four, five, six in the second column. But when we then also apply the reshaping, look at the elements: the rows are now one, four, two and five, three, six. So even though these are the same six elements we had from the very start, they have been reshuffled, and there is no way this reshuffling can be done without NumPy copying data in the background.

Yes, that's the important part. Unlike with transposing or plain reshaping, the elements have now been reshuffled in such a way that we cannot obtain this ordering by being clever with the strides. It's shuffled too badly; there is no shortcut of just modifying the strides that produces this, and that is why, in this case, NumPy had no choice but to actually go over the data and copy everything.

So that's something you need to realize, now that you know this memory model. When you know that every array is stored in memory as one big 1D list, you can understand why transposing and reshaping are fast sometimes; but when you stack them together, or when your matrix was transposed and saved earlier and is now in an alternative layout, and you then try to reshape it, suddenly it is super, super slow.

That's one thing. There is one more point I want to give you before we go into the break, one other big consequence of this. Another thing that's really fast in NumPy is selecting a subset.
Say we create a huge array and then select only the first row, or only some snippet in the middle. This is also super fast in NumPy: NumPy does not need to copy data, and you can see why. What NumPy will do is just tweak the shape and tweak the strides, and it will also apply an offset; that's another property an array has that we haven't covered yet. With the offset, the strides and the shape, NumPy can create what is called a view of the data, without touching the original big data array.

But this comes with a caveat. Say you want to save memory: you first load in all of your data, and now your memory is almost full, so in order to free up memory and speed things up, you select only the first row of your array and continue your analysis on that. Be mindful: even if you select only the first row, the big data array is still in memory. You will not free up any memory by doing this. The situation becomes this: you still have your big memory buffer, and you now have multiple views into that buffer. Each array is basically a view into a memory buffer, and different views can share the same memory buffer. Memory will not get freed if you do that.

Another thing to take note of is modifying data in place. Because the memory buffer can be shared between various arrays, and it will be shared, what happens if you change the data in one array? That data will also have changed in your second array, because they share the same memory buffer. So be aware of this: when does NumPy make a copy, and when does NumPy just create a view into the same data? It's very relevant.
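The view behaviour described above can be demonstrated in a few lines (the variable names are my own):

```python
import numpy as np

a = np.zeros(10)
first_half = a[:5]      # slicing gives a view, not a copy

first_half[0] = 42.0    # modifying the view ...
print(a[0])             # 42.0 ... also changes the original array

print(np.shares_memory(a, first_half))  # True: same buffer

# For an independent array with its own buffer, use .copy():
b = a[:5].copy()
print(np.shares_memory(a, b))           # False
```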
Okay, the final thing. I should of course tell you that arrays have a .copy() method attached to them. So if at some point you really do want a copy, and sometimes you do, you can call array.copy() and it will create a new memory buffer for you, a new copy, so there are no surprises there.

All right, I think you have all deserved a break, and after the break we go on to pandas. That's it from us for now, and we'll see you in a bit.