 Good without further ado, I'd like to present our next speaker Sebastian Sebastian will be talking about writing faster code Give him a big hand. Hi everyone. Can you hear me? Okay, so I would like to talk with you about writing faster code and Last time I was giving a short talk short lightning talk about writing faster code I remember someone pointed out that well basically you can take your Python code rewrite it in C or C plus Plus and it will be faster And I mean the guy was right So take any piece of Python code rewrite it in C or C plus plus and probably it will be out on automatically faster So I was thinking hmm if I say just writing faster code It might not be clear if I mean only Python or not So I had to fix the title of my presentation and I was very happy with a new title I mean it makes it clear. We are not gonna talk about C or Java today, but then I realized the title is too long I mean even though it's very clear it barely fits on the slide. So I had to change it again So I got the version that means exactly the same thing, but it's shorter So this is how I ended up with writing faster Python as a title for my today's talk So let's put aside the flame or about which programming language is better. We all know the answer. You're that's why you're here Python was not created to be a fast language that you would use for some computations where every nanosecond counts and That's fine with me. I mean Python is a great programming language that is very easy and fun to use Python is very easy to learn. The fact that it's so easy to read and write code in Python It's very encouraging for people to new for people new to software development I see that it's getting more and more popular in schools or at the universities as the first programming language And I am not surprised. I mean Imagine you're completely new to programming and someone tells you hey Let me show you how much fun programming is and let's start with something super simple Let's just print some text to the screen and then he shows you those two examples. I Mean one of them is clearly not something that you want to show to a beginner to encourage him or her to start programming But also Python is not only useful useful for learning There are many big companies that are using Python companies with millions of users and billions of requests per month So your website is going to be fun using Python So Python is usually fast enough But what if you decide that it's not fast enough anymore for example your website starts giving you timeouts or Maybe a faster code will bring more money to your company. So it's time for optimization But how do you optimize the code? Well, probably you need to follow some rules So let's try to Google for that and if we open the first thing we see that there are only three rules Wow, it's gonna be easier than we expected. So let's take a closer look at those rules First rule of optimization done don't okay. That was easy any questions Well, actually now there is more to that so what does it mean don't well nine out of ten times when you think that You need optimization. You don't especially in the early stage of your products life You might think it would be nice to optimize your code a bit But well first of all you will waste time doing something that is probably not needed If you want your code to run faster, you can start with getting a faster hardware in the first place and Second of all optimization comes with a cost Most often the only cost is the time that you spend optimizing your code Well, sometimes you it's also the time that you need to fix what you just broke with your optimization But also optimize it the optimized code might not be as readable as it was in the first place And maybe your code is running faster, but it's now using more memory So unless you have really good reasons to optimize don't do this And if you know that you have reasons to optimize then we can move to the second rule of optimization Don't do this yet This is how I understand this rule. So first make sure your code works Then make sure you have a good test suit and only then you're ready for optimization I love this role. It always reminds me how many time I broke it I mean so many times I was working on a piece of software and I started thinking Maybe I can change this piece of code and it will be faster and maybe I will save like two lines of code Was it a good idea? No, and not only because I ended up breaking things But most often I did end up breaking things But also I started jumping around the code and at some point I got confused what I was writing in the first place Did it make my code faster? I have no idea because I had nothing to compare it to in the first place If I would finish writing the code and then try to improve it I could measure both solutions and compare them but in that scenario I could only guess and that brings me to the last rule of optimization Don't guess always profile your code human are terrible in predicting the bottlenecks of your code. So If you think that your code is slow first profile it and then see what part of it takes most of the time Otherwise you made up you may end up spending time Rewriting one piece of your code to just get like 1% of speed improvement Well, there are other parts of your software where you can gain way more improvement with less effort So there are plenty of profiling tools There were already quite a few talks during EuroPython about profiling So I will not go into the details But if you don't know where to start you can take a look at the C profile module and it will show you a clear overview of How many times each function is called and when the code and where the code is spent and if you want to have Some more advanced formatting you can add the PSTADS module. Also, if you prefer the graphic interface You can take a look at the Run Snake Run or SnakeVis libraries So once we're ready for optimization We have to decide on which area we want to focus. So there are different levels of optimization Starting from the highest level you have the design level optimization So depending on the constraints and priorities of your system, you can optimize it by redesigning it. It might require Rewriting your software in a different programming language or changing the database type or changing the architecture to perform less database queries So this kind of optimization will usually give you the best improvement, but it also takes most time to do this So I don't encourage you to rewrite your software from the scratch But if you have some critical parts of your code that are run often, you can optimize them by rewriting them in C or C++ Because C is faster, you will have some good speed improvement for free. Well, not really for free now You will have Python and C code in the same project So one level lower we have algorithms and data structures So knowing good algorithms together with their complexity is Definitely helps you creating a good and efficient software For example, if you want to get the sum of numbers from 1 to n The first idea might be to get a loop that goes through all the elements and ask them. It will work, but it won't be fast Instead you can use the algorithm for the arithmetic arithmetic sum which will give you the same results and it will it will be more efficient So the next level is the source code optimization And this is something that I will talk about in the second part of the presentation Now we're moving to the build level which involves setting up some specific build flags So in your daily work, it's not something that you will do often You can optimize Python for a specific architecture, but if you're a web developer like me This is either something that you will do once per machine or you won't bother at all Next we have the compile level So we can make some optimizations if your programming language has an ahead-of-time compiler And since I'm talking about Cpython today, which doesn't really have ahead-of-time compiler We're gonna skip that part as well And last but not least we have the runtime level So it's related with a specific compiler that you're using. Some compilers are faster than the others So for example, if you replace Cpython with PyPy, you can get some improvements depending on the use case of your software But it really depends on what kind of piece of code you're writing So most of the time once you set up on a specific language implementation There is nothing you have to do to benefit from this kind of optimization It's usually up to the creators of the compilers to optimize them So simply updating to the new version of the programming language you're using can make your code run a bit faster So when you optimize you probably want your code to run faster and Also use less memory and Basically less of everything The bad news is you can't have all of it Optimization in one area will usually cause deterioration in other areas so you always have to decide which resources are crucial and You have to optimize in that direction So it's possible that optimization will have nothing to do with the speed because there are other resources more important than their own speed for example Who cares that your Program is now ten times faster when it's crashing half of the time because it's running out of memory Also another important resource that people are often forgetting is the sanity a Sanity of a person that will be maintaining your code. So please be nice to that person. You'd never know who that might be Yeah, so unless you're really writing a throwaway code if you're making your code Harder to read and maintain then you're probably doing it wrong So having those things clear let's jump straight to how you can write faster Python also known as source code optimization My examples. I'm using the version 3.5 point one of Python Together with my Python. Although the example should work in both Python 2 and Python 3 So for measuring the execution time of my code, I will be using the magic time it function It has some overhead comparing to the standard time in library But it doesn't really matter because as long as we use the same method to measure execution time of different functions We only need to know which method is faster than and by how much? So for each of my examples, I will write different versions of code measure the execution time and compare them So let's start with something simple Let's say you want to count the number of elements in a list You can easily write a simple loop that will increment the counter and well, this will work. It will be very so You can achieve the same results using the built-in LAN function and as you can see for only one million of results The difference is insanely huge. So my first advice is not to reinvent the wheel But first check if there is a function that you can use Python 3.5 has 68 building functions So it's nice to take a look at them and keep them in the back of your head because they might be handy at some point Also before you start writing your own version of order dictionary or a dictionary with default values Take a look at the collection module from the standard library even though it contains only like 10 Different data types. Those are probably the data types you are looking for if the standard ones are not enough So let's say you have a list of one million elements and you want to select only the odd numbers So the naive version would be to use the for loop So for each element of the list you check if it's odd and if it is you add it to another list But I already show you in the previous example that in most cases for loops can be replaced with something better in this case you could use the built-in filter function instead and In Python 2 filter was returning a list directly in Python 3 is returning an iterator So I have to call the list function to get the same results as in case of the for loop And even though the list function has some impact on the performance It's negligible comparing to the time spent in the filter function yet. You can see that filter performs even slower than the for loop Why does this happen? Well the fact that filter is returning now an iterator is a clear sign that it's a wrong use case for this kind of function So if you want to get the whole list as a result, it's better to use the list comprehension It's around 75 percent times faster than the for loop and at least for me it looks more clear When you want to execute a piece of code, but you are not sure if it will be successful Maybe some variables are not set or like in this case the class might be missing some attribute So you want to protect yourself somehow the first way you can do this is called look before you leave or ask for permissions What it means is that you first check if the class has a specific attribute and then you perform the operations Usually this checking is done with the if statement However, there's different approach that you could use and it's called back for forgiveness So in this scenario you perform the operation without checking the conditions first But in case you're you expect that something might break you wrap your code in a try except block and you catch the exceptions that were raised And as you can see in the simple example begging for forgiveness is like three times faster But it gets even better if you're checking for more conditions So here we are checking if three up tributes are present and asking for permission is still slower And now it's also getting more difficult to read So following the back for forgiveness approach will result in a faster and more readable code So we could say that asking for forgiveness instead of checking the permissions. It's always a better way But we won't say that because it's not true exception handling is still quite expensive So if the attribute is actually missing then begging for forgiveness will be slower than asking for permissions So as a rule of thumb you can use the ask for permissions way If you know that it's very likely that that the attribute will be missing or there will be some other problems that you can predict Otherwise, if you're expecting that your code will work in most of the times Try using using try except will result in a faster and quite often more readable code So for example, if you're fetching some files from the internet and you expect that everything will be fine Unless there is no internet connection So instead of checking if there is internet connection if it's fast enough if there are no timeouts just go for the try except But then again, I strongly advise you to measure both solutions and see maybe in your case it will be different So let's tackle another problem the membership testing So if you have a list and you want to check if it contains a specific element you can use a for loop But the problem is you are iterating over the whole list even though you're not really doing anything with all those elements So you can replace the for loop with the in statement It will check if a specific element belongs to a given set of data and it will do this twice as fast But there is still one big problem with this approach The lookup time depends on where your element is located in that list If it's at the beginning of the list, you're lucky and you will get it fast If it's at the end of the list you have to wait So what would be really nice here if we had the data structure that would have a constant lookup time and actually in Python we have we have both sets and Dictionary that have constant lookup time. So if we replace the list with a set Then the lookup time becomes faster from just few times faster to hundreds of thousand times faster So where's the catch? Well, you pay some time to convert the list to a set and in this scenario Converting this list to a set takes more time than any of the lookups in that list. So it doesn't really make sense However, if you're checking membership of different elements, quite often it makes sense to first convert it to a set So speaking of sets, they have another interesting feature. They don't contain duplicates So basically if you have a list of elements and you want to remove the duplicates The fastest way to do this is to convert this list to a set But be aware that sets are not not ordered So if you need to preserve the order take a look at the order dictionary from the collection module so if you want to sort a list you can either do this in place using the list.sort function or You can call the sorted function that will create a new list and Unless you really need to have a new list sorting in place will be like six times faster in this scenario This is for one million of random numbers If you want to perform the same operation on a large set of data, then you have two options You can write a function that performs the operation and call this function 1000 times or you can call a function that takes this set of data and performs the operation inside and The second approach will be faster So if you can in an easy way replace multiple calls to one function with just one function Then quite often it's a good idea So what's the best way to check if a variable expression is true? Well, you can explicitly compare this variable to true But in most cases you're adding additional redundancy so you can simplify your condition to just if variable and It will return true unless the variable is false non zero empty string empty list or other falsely expression And by doing that your comparison get faster by like 70% And the same rule applies when checking for false So the fastest way to do this is to use if not variable unless you really need to distinguish Falls from that's a non or zero or other falsely values It also applies to an empty data structure. So Simply doing if not a list is will be almost three times faster than explicitly checking the length of a list So let's take a look at different ways of defining functions in Python The most common one is to create a function with deaf keyword The other way is to declare an anonymous function with lambda if you assign this lambda to a variable it will act in the same way as a function created with a deaf keyword and As you can see they are both equally fast. Why? Because both versions do exactly the same thing We can disassemble the code of both versions with the disk library and we'll see that inside is the same code So is there any difference? Well, if your function has more than one line, you can use lambda and you can't really put documentation inside of lambda Also, you could have Pepe it enabled in your code editor it will complain each time you try to assign lambda to a variable and in his right Lambdas work really nice when you need a simple one-liner callback for your functions especially for functions like filter mob or reduce and There are also some quite few narrow use cases where it might be necessary to use lambda as a callback So if you want to read more you can check the link at the bottom but however In any other case, I would definitely recommend to use deaf. It's much cleaner You can document it properly and the performance is exactly the same So there are two ways how you can create an empty list So you can either call a list function or you can just use the list literal syntax and as you can see The literal syntax is faster. Why is it faster because if you call a function Python first need to resolve this function And with the literal syntax, there is no overhead for that And exact same thing happens for creating a dictionary Okay, I have two more examples that should be treated with quotient Even though the code can run faster. I would not advise you to do this kind of optimization So sometimes even though you can squeeze some additional performance from your code It doesn't mean that you should do this. So one thing is a variable assignment If you have a bunch of variables that you need to assign you can do this the normal sequential way or you can go for this crazy assignment and I mean you can gain some speed But with this speed comes the hate of your colleagues that will be reading this code later So in my opinion isn't worth it Okay, and another interesting property of Python is that the lookup for local variables is faster than the lookup for Globals or buildings so you can say save some time if you store the reference to a building function or a global function in a local variable So in this example, the only difference is on the line 3 where I'm storing the reference to the global up in in a local append variable and Thanks to that this function is like 35% faster But then again if you see this code for the first time it's not very clear what this is what it is supposed to do It might be confusing to see this kind of append function because we are most more used to see the least dot append version To sum up there are different kind of optimization. It's quite often about the speed But not always and there are also different levels of optimization So sometimes if you cannot rewrite your whole application, maybe you can use a different approach And even though the source code optimization, it's not the fastest way to optimize your code Those small improvements will add up and the main advantage of it is cheap. So you can write You can optimize the code at the moment of writing. You don't really need to rewrite something and As long as you're writing idiomatic code and you don't reinvent the wheel, but already use the existing Functions and data structures in Python, then you're already doing it correctly So be curious when you're coding if you think that the different code structure can be faster You can quickly check it with the time it and then you can improve it All right, my name is Sebastian Witowski and I work at CERN So if you guys want to talk about physics, then you're probably on the wrong conference But if you want to talk about Python, you can catch me somewhere on the corridor. Thank you Two minutes for questions especially if you're happy to take one or two questions. So we have them Fantastic, who's got a question you saw Awesome talk. I have a quick question. You you showed us some Profilers cool profilers. Do you have any preference every any favorites? It really depends what you want to profile because if you care about the speed and the basic ones are fine But if you want to profile the memory usage, then you might need to use different profiler So it really depends. What do you want to profile? Any other questions? Yep, any good recommendation on books or source where we can find the best practices regarding this idiomatic Python Not from the top of my head, but Well, definitely there is some guides on the official Python documentation But also for me, it's a lot of googling for best practices also reading a lot of stag over flow There are some books, but I can't give you the names right now more questions Yes, oh, was that you sticking your hand up sir or just explaining something Excitedly presumably not a question any more questions. No in that case. Let's thank our speaker for fantastic talking