All right, so Numba. Numba is a fairly big library that provides plenty of optimization routines, but the thing it is best known and most useful for is just-in-time (JIT) compilation. What it does is fairly smart: it goes into your code, tries to understand it, and from that compiles it down to fast machine code, comparable to compiled C or C++, and executes that for you. But of course it can only be so smart, so it works much better if your code looks like what other programmers typically write, because the Numba developers have essentially taught it to recognize classical-looking code. So without further ado, let's try it. I execute this and import the library. Now jit, like the profiling tools we saw, is a decorator: you put @jit on the function, and I also add the option nopython=True. That tells Numba it should try to convert everything to non-Python, i.e. fully compiled code, and if there is anything it cannot convert because it doesn't understand my code well enough, it should fail rather than fall back. That way I ensure the optimization, when it succeeds, is not a half measure but genuinely top-level. So here you see my pairwise distance function for NumPy. I did not change a single thing; I kept the same function and just added @jit. Now I create some data and execute it, and you see it takes one second. If you remember the timing from a bit earlier, I think it was in the other notebook, then for data of this size we have actually lost something. But the thing is that JIT means just-in-time compilation: before actually executing the function, Numba has to understand the code, translate it, and compile it, and that in itself is a fairly costly operation.
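To make this concrete, here is a sketch of what such a decorated function could look like. The exact function from the notebook is not reproduced in the recording, so the body below is an illustrative assumption, and the try/except fallback just lets the snippet run as plain Python where Numba is not installed:

```python
import numpy as np

try:
    from numba import jit
except ImportError:                      # assumption: fall back to plain Python if Numba is absent
    def jit(*args, **kwargs):
        return lambda f: f

@jit(nopython=True)                      # nopython=True: fail instead of falling back to object mode
def pairwise_distance(X):
    """Euclidean distance between every pair of rows of X (illustrative body)."""
    n = X.shape[0]
    D = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sqrt(np.sum((X[i] - X[j]) ** 2))
    return D

X = np.random.rand(100, 3)
D = pairwise_distance(X)                 # first call triggers compilation, so it is slow
```

The first call pays the compilation price described above; only subsequent calls run at compiled speed.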
So that's why it's slower the first time. But if we run it a second time, those steps have already been done and won't be repeated. And indeed, executing it a second time, you see it now takes 11 milliseconds. So an important thing about Numba: if you use it to execute a function only once, it will not be very useful, but if you execute a function many times, you pay the compilation price only once. And if I now compare the NumPy version and the Numba version, remembering that the NumPy version is already much faster than the native Python version, you see that we get a very, very nice gain, and the cost was just one line of code. If you ask me, that is a superb ratio, and one of the easiest tricks I know of to gain a lot of performance with minimal code changes. It does rely on the fact that our code is simple enough for Numba to apply its tricks. If your code is a bit weirder, with a lot of external functions and so on, it can actually be harder to apply Numba to it. So keep that in mind, but it is a very, very nice tool to know. Another way of doing this, rather than rewriting your function to add @jit, is the functional form: I can define something called pairwise_distance_numba as the result of passing the function I want to compile to numba.jit. The result is the same. In general it is actually quite stunning how well this works. That said, there are some catches: most external libraries are completely missing from Numba.
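The functional form and the warm-up effect can be sketched like this (the function body is again an illustrative assumption, the timings will vary by machine, and the fallback only matters where Numba is missing):

```python
import time
import numpy as np

try:
    from numba import jit
except ImportError:                      # assumption: no-op decorator if Numba is absent
    def jit(*args, **kwargs):
        return lambda f: f

def pairwise_distance(X):
    n = X.shape[0]
    D = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sqrt(np.sum((X[i] - X[j]) ** 2))
    return D

# Functional form: compile an existing function without editing its source.
pairwise_distance_numba = jit(nopython=True)(pairwise_distance)

X = np.random.rand(200, 3)

t0 = time.perf_counter()
pairwise_distance_numba(X)               # first call: pays the compilation price
t1 = time.perf_counter()
pairwise_distance_numba(X)               # second call: compiled code is reused
t2 = time.perf_counter()
print(f"first call:  {t1 - t0:.4f} s")
print(f"second call: {t2 - t1:.4f} s")
```

With Numba installed, the second call is typically orders of magnitude faster than the first, which is exactly the warm-up behavior discussed above.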
That is, if you want a Numba function that internally calls, say, your favorite function from scipy or from sklearn or whatever, it will fail, because those functions have not been ported to compiled code; Numba only has the Python interface to them, and the compilation will not work well because you would be mixing Python and compiled elements. It works fairly well with NumPy, but not all of NumPy's code has been ported either. There is a little link here to their documentation listing what is supported and what is not. Most of the core, simple functionality is supported, so for simple things you don't have to worry too much. In general (I give you a couple of links there), the simpler the operation you want to do, the better it will work. And one thing even: sometimes it pays not to give Numba the NumPy code at all but just super-simple native Python code, because Numba is generally a bit better at understanding that. The simpler your code, the easier it is to understand, and the better Numba can optimize it. And here it is demonstrated: I have taken the code for the native Python solution and just added @jit. If we compare the jitted version of the NumPy function with the jitted version of the native function... let me check again, I think there was some compilation overhead. Yeah, there we go. So now, without the compilation artifact, the jitted NumPy one is actually slower than the one derived from native Python. So it can pay to come back and dumb down your implementation, if you will, and keep it as simple as possible, because then Numba's compiler can understand it much more easily.
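A sketch of that "dumbed-down" version: explicit loops over scalars, which Numba's type inference handles particularly well. The exact notebook code is not shown in the recording, so this body is an assumption:

```python
import numpy as np

try:
    from numba import jit
except ImportError:                      # assumption: plain Python where Numba is absent
    def jit(*args, **kwargs):
        return lambda f: f

@jit(nopython=True)
def pairwise_distance_plain(X):
    """Same computation as before, written as scalar loops that compile to tight machine code."""
    n, d = X.shape
    D = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(d):
                diff = X[i, k] - X[j, k]
                s += diff * diff
            D[i, j] = s ** 0.5
    return D

X = np.random.rand(50, 4)
D = pairwise_distance_plain(X)
```

The design point is that nothing here requires Numba to recognize NumPy's vectorized idioms: every operation is a scalar one, which is the easiest case for the compiler to optimize.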
All right, so that's what I wanted to show with this tool. It is really quite impressive and very simple to use, so I recommend it wholeheartedly. But for some usages, as I think you will see during the exercises, it doesn't shine as much: it works very well on simple things and on numbers, but not so much on other data types. Okay, any questions? Ah, there's a question by Victor: if I run a Snakemake pipeline, will the code be recompiled on each rule execution? That's a great question. Normally, no: as long as the same Python process stays alive, it will not recompile. Now, with that being said, if you want to be sure, or if you want to spread this across different executables as one may want to, there are ways (I give a link here) to compile some Numba functions ahead of time. You compile them once, ahead of time, and then you are sure they will never be recompiled while you run or develop your code. That can be a trick that helps you out here.
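A related, lighter-weight trick worth knowing in this context (this is an addition, not something shown in the session): @jit also accepts cache=True, which stores the compiled machine code on disk next to the source file, so a fresh Python process, such as a new pipeline rule execution, can reload it instead of recompiling. A minimal sketch:

```python
import numpy as np

try:
    from numba import jit
except ImportError:                      # assumption: plain Python where Numba is absent
    def jit(*args, **kwargs):
        return lambda f: f

# cache=True persists the compiled result to a __pycache__ directory on disk,
# so later processes skip recompilation as long as the source file is unchanged.
# (Caching only takes effect for functions defined in a real source file, not
# in an interactive session.)
@jit(nopython=True, cache=True)
def row_norms(X):
    n = X.shape[0]
    out = np.empty(n)
    for i in range(n):
        s = 0.0
        for k in range(X.shape[1]):
            s += X[i, k] * X[i, k]
        out[i] = s ** 0.5
    return out

X = np.random.rand(10, 3)
norms = row_norms(X)
```

Ahead-of-time compilation, as linked above, goes further by producing a standalone extension module, but for a pipeline that repeatedly launches the same scripts, on-disk caching is often enough.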