My name is Paul Ross and I work for AHL, which is a systematic hedge fund in London. Here's a little bit about us. We've been around 30 years this year, which is quite long in hedge fund terms. We're systematic, so it's algorithms that do the trades, not discretionary traders. We're active in about 400 markets all around the world, and we gather about 3 billion market data points every day. We've open sourced our tick store; you can find it as Arctic on the Man AHL GitHub account, and it's been very successful for us. We're quite a small organisation really for what we do, and we all speak Python. So this talk is really based on our experience of using Python as a systematic hedge fund, and the things we've had to do when performance isn't what we would like.

There are three sections to this talk. One is just to introduce concepts and to scope out the talk, because performance is such a big subject that it can't all be covered in 45 minutes. Then, there are a lot of technical solutions out there to make Python, or your code, run faster, and it's somewhat confusing looking at all the different options, so I try to create a kind of technology taxonomy to make sense of them, so that you can reason about them intelligently. And finally I want to talk about evaluation criteria: how you evaluate the different solutions on display here.

So first up, a little introduction. Firstly, what this talk is: it's really a tour through some of the alternatives to standard Python. This is for general purpose computing, not specialised computing, so all the solutions I offer you are general purpose solutions for compute power. It's also a way of evaluating alternatives, so you can decide which one is good for you. And it's a reflection on the trade-offs you're inevitably going to make between things like performance, maintainability, cost and so on.

I'd better say what this is not about. I will mention NumPy a bit, but this is not about NumPy and pandas. If your problem domain is one that can be solved by vectorised operations on fixed-size arrays, then you might as well just head off over to NumPy or pandas, because that's where you're going to get your best performance. If your problem is solved by concurrency, well, then you know what to do, and I'm not going to talk about that — or, similarly, caching. I'm not going to make any definite recommendations to you. I might have some advice, but I can't make recommendations, because I don't know what your particular problem is, and all problems have different solutions. And this is not going to be a benchmark comparison of the different solutions: all the benchmarks you see in this presentation are fake. That is to say, they're just like real benchmarks.

OK. It might be worth reflecting on why this talk even exists. Why do we have to talk about improving performance at all? This is perhaps just why Python is how it is. The reasons Python is allegedly slow, or perceived as slow: it's obviously an interpreted language, so every line has to be evaluated — every time you go around a loop, the interpreter has to evaluate the same line again. It's dynamically typed as well, so the real type has to be assessed each time an object is accessed. It's running on a virtual machine, which is a great way of abstracting away from the platform, but that abstraction comes at a runtime cost.
Python doesn't have a JIT yet, although Python 3.6 implemented PEP 523, which opens the door to possibly putting a JIT in, and we'll talk about JIT solutions as well. And Python itself takes very few optimisation opportunities with your code, compared with, say, aggressive C++ compilers. It's interesting to reflect that the first three items on this list are among the reasons Python is slow, but they're also the reasons Python is so productive, why it's so quick to write Python code, and why Python is so valuable. And the last two are also perhaps why Python is slow, but they're famously hard problems to solve in computer science.

So, if this hasn't happened to you already — it's certainly happened to me many times — your boss comes to you and says: this Python code you've written is far too slow, you've got to find a way of speeding it up, it's not fast enough. Well, then you've got to ask yourself a number of questions. Is that really true? Isn't it possible that Python is fast enough for the job at hand? We're a hedge fund; we trade billions of dollars each day, and we can do that with Python because of the way we trade. Python is fast enough, and we can take advantage of all the other aspects of Python that make it very quick and cheap to develop new ideas, which is more important to us than speed of operation.

If you do want to make it go faster, you're not going to try to make it faster overall; you're going to profile it, of course, and decide which bit to attack. Profiling is a bit of a dark art, and profilers often lie. There was a very good tutorial earlier in the week on profiling with a lot of good information in it; if you missed it, head over to the video. But get your profiling skills out, because the whole point of doing that is so that you spend the right amount of money in the right place — we're engineers, we do things cheaply.

Then, of course, have a look at your algorithms. This is a well-trodden path: use the right algorithms and use the right data structures, and people overlook that. I was involved in a problem recently where people were building very large lists of Python objects, adding to the list, and they failed to appreciate that a list has to be contiguous in memory. As you keep adding to it, there comes a point where it has to be copied and new memory found for the whole contiguous list, which is a very expensive operation. Using something like a deque, which is like a linked list, made it much more efficient; just changing that data structure made a big difference in the runtime.
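To illustrate that point, here's a minimal sketch — my illustration, not the speaker's code — of what that data-structure swap looks like. The point is that `collections.deque` stores its elements in a linked chain of fixed-size blocks, so it never has to copy the whole collection in order to grow:

```python
from collections import deque

# A list is one contiguous block of pointers; when it outgrows its
# spare capacity, it is reallocated and every element is copied across.
items = []
for i in range(1_000_000):
    items.append(i)

# A deque grows by linking in new fixed-size blocks, so appending never
# triggers a whole-collection copy -- and the change is a one-line swap.
items = deque()
for i in range(1_000_000):
    items.append(i)
```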
I put up that famous quote at the bottom of the slide; do remember it when you're changing your code — have it burnt into your memory.

So much for the introduction. Now I want to talk about the options you have, the technologies out there for speeding up your Python code, and how we can categorise them to make sense of them, because there is a huge amount of choice. Here are some of the projects aimed particularly at making Python code — or code that starts out as Python — run much faster. I'm sorry if your favourite project isn't up there, but you can tell me about it later.

When you're considering any of these things, it might be worth starting with: what would the perfect solution be? What would we really, really like as a solution to making Python go faster? Here's my wish list; this is perfection. It can run my Python code directly — I don't have to change the code at all, so there's no effort by me, because I'm quite a lazy programmer. There would be no maintenance overhead on my part. It would work with all Python versions, all library code — the standard library, all the third-party code I'm using, my own libraries and so on. It would be free, of course, fully supported, and wouldn't have any bugs. It would have a perfect debug story, and it would be 100 times faster. You'll notice I've not been greedy with this shopping list: I could have said a thousand times faster, but I'm going to stick with 100.

So this is what I wish for. The thing is, this doesn't exist. There are solutions out there that satisfy parts of this list, and this is where you're going to make trade-offs. If you go for maximum performance, you're probably going to incur some cost, or your debug story might go to hell, or something like that. So it's worth reflecting on what would be perfect before you make those trade-offs, rather than just blindly choosing one solution. You're implicitly making those trade-offs anyway, and if you don't realise it, you might get in trouble later.

The taxonomy I've chosen is basically about how much code you have to change before using a particular technology. I divide it into: little or no code change; some small amount of code change; or really rewriting your code in a completely different language. So let's have a look at what exists for the first one, little or no code change. There are five projects I've picked out here, with a benchmark line at the top; typically these projects will give you a one-times to eight-times speed-up, though it varies enormously depending on the problem. First we have CPython itself. Then we have Cython, a very well-known project — we're going to give Cython unoptimised code, and I'll show you what I mean by that in a moment. Then we've got PyPy; there was a very interesting PyPy talk yesterday and I think there's another one tomorrow, and I urge you to look at those — it's a very interesting solution. And we've got another couple of projects, Shed Skin and Pyston, which we'll look at briefly.

First, Cython. I'm going to take some code, give it to Cython, and see what happens. This code computes the standard deviation of an array of floats. I run it in Python; it's slow. I put it into Cython and expect Cython to work its magic. Cython goes about 1.3 times faster, which is really no performance improvement at all. Cython is rubbish, isn't it? It's not a solution at all. Well, it's interesting to see why Cython struggles here, and it's actually because we're not helping Cython at all. Take just the one line where we're computing the mean and look at what Cython does with it. The way Cython works is that it takes your Python code, generates C code, compiles that, and off you go. If you look at the generated C for that one line — here's the Cython-generated C, and here's the line we're talking about — and dig down through this rather complicated code, we see that this line, sum divided by length, is expressed as three calls into the CPython API. These calls are highly generalised; they're not specific to any type. The reason Cython can't make much of an improvement on our code is that we haven't given Cython any type information.
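For reference, the kind of function on the slide would be roughly this — a reconstruction, not the actual slide code:

```python
import math

def std_dev(values):
    """Standard deviation of a sequence of floats, in plain Python."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
```

Fed to Cython exactly as it stands, this is the sort of code that only gets about 1.3 times faster.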
We'll revisit Cython in a moment to see what happens when we fix that, because it involves changing our code quite a bit. The next one is PyPy. Typically its benchmarks come out seven times or so faster than the CPython implementation. It's a just-in-time compiler, and it's a fairly drop-in replacement for Python 2.7 and 3.5 code. It doesn't really have the CPython API, but it supports CFFI, the C Foreign Function Interface, which I'll mention in a moment — that's your way into C and C++. It's not completely compatible with certain libraries — I think Flask and Pillow are a couple it can't manage — but there's a very interesting quote at the bottom of the slide that should make you pause for thought, although it does have the word "probably" in it. I guess I'd change that to something like "you should certainly try this", because it has been very successful in some areas. So much for PyPy.

Shed Skin is a project that basically does automatic type inference and translates your Python code to C++. It's quite elderly — it only supports up to Python 2.6 — and there's been very little activity over the past year, so I'm not going to delve more into it. There's also Pyston, an LLVM-based compiler. This is backed by Dropbox — or I should say, was backed by Dropbox. It's Python 2.7 only, and the project was suspended in January this year. I offer these last two as an example of one of the problems you're going to have: when you choose a particular technology to go for performance, you're making quite a committed move to that technology, and if the project just runs into the sand and stops being maintained, you've suddenly got a load of technical debt. This will be important when we look at the evaluation criteria for choosing a project for your code.

So, so much for little code change. What happens if we accept there'll be some code change — if we're willing to change our code to accommodate these projects? What options do we have there? Well, here are some, and now we get speed-ups of 10 to 100, in that kind of range. We've got optimised Cython, Numba, Parakeet and Pythran, and I'll go through them. With optimised Cython, basically you do this kind of thing. On the left is the standard deviation code, the plain Python code that I just lobbed into Cython without getting very far. On the right is much more heavily optimised Cython code. What I've done in the function at the bottom is declare that I'm going to use NumPy arrays, so I import NumPy, and that gives Cython a whole load of opportunities. I'm declaring the local types as size_t and so on. I'm putting a decorator on there saying don't check the bounds on this array, which obviously otherwise has a cost. And right at the top I'm importing the math library from libc, and I'm going to use its square root rather than the Python math square root, which is probably much slower. If I do this in Cython, I get it running 62 times faster, which is a really good improvement — a fantastic improvement. But consider this: how maintainable is the code on the right versus the code on the left? We're talking about a really simple operation here, a standard deviation over an array. This is what I mean about trade-offs: if you really want that 62-fold performance improvement, implicitly you're going to put up with a maintenance problem, and probably your code is much more complicated than a standard deviation.
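For concreteness, the optimised version described here would look something like this — a sketch reconstructed from the description of the slide, so treat the exact details as illustrative:

```cython
# std_dev.pyx -- optimised Cython: typed argument, typed locals, C sqrt
cimport cython
cimport numpy as np
import numpy as np
from libc.math cimport sqrt   # the C library sqrt, not Python's math.sqrt

@cython.boundscheck(False)    # don't bounds-check each array access
def std_dev(np.ndarray[np.float64_t, ndim=1] values):
    cdef size_t i
    cdef size_t n = values.shape[0]
    cdef double total = 0.0
    cdef double acc = 0.0
    cdef double mean
    for i in range(n):
        total += values[i]
    mean = total / n
    for i in range(n):
        acc += (values[i] - mean) ** 2
    return sqrt(acc / n)
```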
So, optimised Cython can rapidly become fairly unwieldy, because it's this really weird hybrid code — it's not C or C++, it's not Python, it's this in-between thing. There's quite an art to tuning Cython to get the maximum performance out of it; it's a bit of a black art, and that knowledge often isn't shared within organisations.

On to Numba. This is backed by Continuum Analytics, and it's always good to have a good backer. It's a JIT compiler, pretty much aimed at Python and NumPy. Basically, you just annotate your functions with @jit, which brings in a whole lot of JIT technology; you run your code, the JIT gets to work, and it tries to optimise each function based on what it has seen in the past. There are a number of these JIT compilers. One of the early ones was Parakeet, which is still around — the same kind of idea, you decorate your code and it provides a JIT for it. Parakeet is quite old, only supports Python 2.7, and is very much aimed at NumPy. It's quite effective at that, but there's been little activity over the last four years. So if, four years ago, you thought Parakeet was the best thing and started writing all your code around it, I'm sorry for you now. Pythran is another one, and it uses a slightly different technique: you annotate the function with a comment — you can see the types in the comment there — then you run it through Pythran, which generates C++ code, and you run that. It's really aimed at scientific computing, and it's a subset of Python that it accepts, but it can produce really high performance, at some cost.
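To show the decorator style concretely before moving on, here's roughly what the Numba flavour of the earlier standard-deviation function looks like — my sketch, not a slide from the talk:

```python
import numpy as np
from numba import jit

@jit(nopython=True)   # compiled to machine code on first call, from the observed types
def std_dev(values):
    total = 0.0
    for v in values:
        total += v
    mean = total / values.size
    acc = 0.0
    for v in values:
        acc += (v - mean) ** 2
    return np.sqrt(acc / values.size)

print(std_dev(np.random.rand(1_000_000)))   # the first call pays the compilation cost
```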
So much for the some-code-change section. The third alternative, if you want really high performance, is basically to write in a different language, like C or C++. What kind of projects are out there for that? This is where you really get the 100x performance improvement, typically — maybe even better. Here are some of the C and C++-based ones. We have the original CPython C extensions, which we'll talk about in a moment, and various others — too many for me to go through, so I'm going to cut it down a bit. There are also Rust and Fortran and Go and that kind of thing; if that's your thing, absolutely go for it, but I'm not going to talk about any of those. Let's just look at three of the C and C++-based ones: writing a C extension, CFFI, and PyBind11. They all take different approaches to basically the same problem, which is: how do you get from Python into C?

So, firstly, the classic C extension, and these give you a mixture of joy and agony. What's the joy? Well, it's written in C, and C is really easy — there are only 32 keywords in C, so what could possibly go wrong? You can mix it with C++; that all works nicely. You have really precise control over your performance at this point, and there are a lot of good libraries. At AHL we use this a lot, because C extensions can call into NumPy and address its data directly as C, and that is incredibly efficient and fast. And if you're writing for the standard library, you have to be here, because — not so strangely, really — the standard library requires you to write in C. In fact C89, with some small additions from C99, I think. So here we are in 2017 and we're stuck in C89, which is very interesting for someone of my background. Anyway, that's the joy of writing C extensions. What's the agony?

Well, here's a little class — probably the first class you wrote in Python was something like this. It's taken from the Python documentation, from the tutorial on writing your first C extension: a little class that contains a first name and a last name, and has a method that returns the combined names. That's it in Python. What's it like as a C extension? Well, it's this: 190 lines of C code addressing a very complex, sophisticated and well-documented API. Apart from the sheer number of lines of code, where is the real agony in this? It's in C. You have to do reference counting and get your head around that. You have to do manual memory management, and you have to understand how Python does its memory management. Writing to this C API is a very specialised skill, and it's quite expensive to write 190 lines versus a handful. Testing is really problematic with C extension modules — I'll return to that in a moment. Debugging a C extension is a bit of a black art as well. GDB works fine, and if you want to hook up an IDE, that can be done; it's a bit tricky to set up, but just to show it can be done, here's a screenshot of debugging a C extension in Xcode on a Mac, and there's a link there to one of my projects that shows you how to do it.

If we don't like doing C extensions — if we find that far too painful — what are the alternatives? Let's move away from Python C extensions and have a look at CFFI, which is a really interesting project; there was a very good talk about it yesterday, and I'll give you a list of the talks I thought were valuable later on. With CFFI, you're writing in Python. It allows you to call C code from within Python. It's C-based, but you can also hook it up to C++ with a little bit of work. It abstracts away much of the boilerplate — those 190 lines of the C extension, which are basically interface code — and it also abstracts away the build system. Remember, this is what we were trying to do, and this is what we had to do as a C extension. This would be a fairly crude CFFI equivalent, sketched below: I import cffi, I create this cdef with a string in it, which is actually C code, and that gets compiled up; I create a new instance, I can assign those first and last names, and I can extract them back out. That code now fits on one slide, and the work happens in C-land when it's executed.

A different approach is PyBind11, which I think is also a fascinating project. I think there's a talk tomorrow about PyBind11, which I'm definitely going to see. How does PyBind11 work? It's a header-only C++ library, so you're writing C++, and it makes it very easy to write these C extensions. It's similar to Boost.Python, if any of you have used that. Wonderfully, it's C++11, so it's a modern, easy-to-write version of C++. Here we go: this is the class we're trying to create, and this is what we had to do as a C extension. This is how PyBind11 works. We're in C++. We create a struct which has a constructor, a single method, name, which concatenates the names, and the first and last names as members. You include pybind11, and that gives you access to a whole load of functionality — templates and macros — such that creating the module is just those four lines at the bottom, as in the sketch below. Now you've got a shared library that you can import. A very interesting project indeed.
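Going back to CFFI for a moment, the crude equivalent mentioned above would look roughly like this — my sketch in the same spirit, not the speaker's slide:

```python
import cffi

ffi = cffi.FFI()
# The cdef string is C code: declare the struct we want to work with.
ffi.cdef("""
    typedef struct {
        char first[64];
        char last[64];
    } names_t;
""")

names = ffi.new("names_t *")   # allocated in C-land
names.first = b"Paul"
names.last = b"Ross"
print(ffi.string(names.first) + b" " + ffi.string(names.last))
```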
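And the PyBind11 version just described looks roughly like this — reconstructed from the description, with the class and module names being mine:

```cpp
#include <pybind11/pybind11.h>
#include <string>

namespace py = pybind11;

// The same little class: first and last names, plus a method combining them.
struct Names {
    Names(std::string first, std::string last)
        : first(std::move(first)), last(std::move(last)) {}
    std::string name() const { return first + " " + last; }
    std::string first;
    std::string last;
};

// Creating the module really is just a few lines of template and macro magic.
PYBIND11_MODULE(names, m) {
    py::class_<Names>(m, "Names")
        .def(py::init<std::string, std::string>())
        .def("name", &Names::name);
}
```

Compile that into a shared library and `from names import Names` just works.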
That's all I've got to say about those. There are other C and C++ options, so choose the one that suits your shop. I'd just say that if you are working in a separate language like C++, try to arrange your code to look something like this. On the right is all your C++ code, for example, which doesn't include Python.h — it might include other things, but it's pure C++ code, if you like. On the left is some Python code that wraps it. In the middle is the glue code: the C extension code, or the PyBind11 code, or the CFFI code. The problem is that the middle is the part that's really hard to test. If you can physically and logically separate the code on the left and the right as far as possible, you can then test them independently for correctness and performance — because performance is what we're after. If you put too much code in the middle, it effectively becomes untestable. So I'd recommend a physical and logical layout something like this.

That's basically the taxonomy, which was the second section of this talk. Now I want to talk about the evaluation criteria you might use to judge which of these solutions is suitable for you. Again, here's the choice you face. We can divide the options up with this taxonomy and understand which would give us the right trade-offs, but how do we evaluate, even within the taxonomy, which is better? I suggest there are three areas you need to consider. First, who you are: what company you work for, what organisation it is, what team you're in, what skills you have, and so on. Then there are the technical criteria — we're probably mostly engineers here, and we like technical criteria because we can put numbers to them, and numbers are our lifeblood. But there are also non-technical criteria, which I'd argue are equally important when you make a choice.

First of all, who are you? I'm guessing a bit here, but I'm guessing you're not Facebook or Google or Microsoft or anything like that. These companies do a lot of interesting projects; they publish white papers and open source stuff, and they say we're using this particular project and it's really fantastic for us, and so on. The temptation is to think that because they're using it, you should use it. But you're not them. They have their own problems, which may not be your problems, and their own culture and constraints, and you have yours. Understand that what they're trying to do and what you're trying to do are different. Don't be too impressed just because one of the big companies is doing something. On the other hand, a big company backing something is also a plus for a project, because it's likely to increase its longevity and quality. Consider also what skills you have to maintain this code, what skills you can get, and what skills you might lose as an organisation. If you've gone down the route, for example, of highly optimised Cython, and you've got a couple of people who are really skilful at it and your whole codebase is running smoothly — if those two people leave, you've now got a whole load of technical debt.

Let's have a look at some of the technical criteria. We're fascinated by this stuff as engineers. What might we look at?
These are things we can put numbers to, so we can compare two different options against each other. What technical criteria might we look at? What do these projects depend on — do they depend on things we don't really want to depend on, or on versions we don't want to use? We can put numbers to that. What versions of Python do they support? As we've seen, some of these projects only support older versions of Python, and some have struggled to make the transition to Python 3. Some of them only support part of core Python or part of the standard library, and with some of them, the third-party libraries you depend on may not be supported at all. You can go and find all that out and compare alternatives quite easily.

Then, finally, we get to benchmarks. People are obsessed with benchmarks — or perhaps not obsessed, but benchmarks carry a lot more weight than I think they deserve, mainly because they're usually kind of wrong. As I said, all the benchmarks you'll see here are lies, just like real benchmarks. There are several obstacles to benchmarking accurately enough to make rational decisions about the options you have. There are simple measurement errors: measuring the wrong thing, or measuring it in the wrong way. There are bad statistics — I won't go through the whole of benchmarking; there's loads of good material out there on how to benchmark properly. Bad statistics is an interesting one, because in benchmarking you take a small set of results and then have to reason about the wider world with statistics based on that small set; if you do bad statistics — we'll look at a couple of examples in a moment — you're going to mislead yourself. And then there are human cognitive biases, such as confirmation bias: you might be overly fond of one library and therefore implicitly start looking for positive benchmarks for it. Or you might get fixated on one small part of the problem, ignoring the wider aspects. These are all obstacles to objective benchmarks.

Here are some benchmark pitfalls. Imagine we're running a test eight times, and we're comparing library C and library D. Lo and behold, we take the average of the timings — and, being very good statisticians, the standard deviation too — and they come out the same, so it looks like library C and library D are exactly the same. But if you look, library D has an interesting pattern to it: the first time you run the test it takes 18, and on the subsequent runs it's a very consistent 8. This is maybe characteristic of some kind of JIT warm-up behaviour, and so I'd say that if you're running this many times in production, library D is definitely the better choice; if you're only going to call that function once, it's probably the worse choice. So you've got to look at the patterns as well as the raw numbers.

Here's another common fallacy. I've got a number of independent tests here of two libraries, G and H, and I'm combining them by taking the average, and the averages look the same. But notice that the tests take widely different times, and it's actually quite misleading to take the arithmetic average. If I add another column where I divide H's time by G's, you'll see that in the first test H takes slightly longer proportionately, but in all the other tests it's twice as fast as G. So how come the average is so misleading, saying they're the same? Because when combining widely different numbers, you need to take the geometric mean, not the arithmetic mean — the geometric mean being the nth root of the product. Now, if I take the geometric mean, we see that library H is much faster than library G.
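As a minimal sketch of that calculation (the numbers are mine, made up just like every other benchmark in this talk):

```python
import math
import statistics

# Per-test ratios of library H's time to library G's: slightly slower
# once, then twice as fast everywhere else.
ratios = [1.1, 0.5, 0.5, 0.5]

arithmetic = statistics.mean(ratios)                # misleading for ratios
geometric = math.prod(ratios) ** (1 / len(ratios))  # nth root of the product

print(arithmetic)  # 0.65
print(geometric)   # ~0.61 -- H is clearly faster overall
```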
In fact, tables are a really bad way to present benchmarks anyway, because they're just numbers. Here's a much better way of presenting a benchmark. This is a real-world one, though not from the company I currently work for. Basically, this is a graph: the x-axis is logarithmic file size, going up to 100 megabytes. What we're doing here is taking a sequential file that we want to access randomly, so we create an index by reading and seeking through it to find the interesting points, and then we can access it much faster than a sequential file. The y-axis is how many milliseconds per megabyte of file it takes to create the index we want. The red dots are the original Python implementation, and the green dots are the same thing written in C. Now, there's a wealth of data in this graph. For one thing, you can see there's a big range of inputs: the file sizes we test run from very small to very large. There's about a two-decade — roughly hundredfold — improvement in general with the C code. The fact that it flattens out means it's O(n). And then there's some really interesting stuff happening with some outliers: if you look at the bottom right-hand side, some files are being indexed much, much faster than the others, even with the Python code, and that's reflected in the C code as well. When we investigated those files, we found another technique, because they had a particular property that we could exploit for all files, and that gave us a further improvement. Presenting data like this gives you a whole wealth of information from which you can make reasoned decisions; compare that with a table, and ask how much information you'd get out of that.

OK, if you're benchmarking, don't just measure speed. Here's a great long list of things you should look at, because memory might be more important, or I/O. What are the trends in your benchmarks? And once you've benchmarked, do you just put the code into production and forget about it? You want to be benchmarking your production code as well.

The last thing in evaluation is the non-technical criteria. These are things you can't put numbers on, and sometimes engineers shy away from them because, without a numeric value, it's a harder kind of decision; I'd argue they're equally important. Here are some non-technical criteria you probably want to consider: ease of installation and deployment, especially if it's a large deployment; again, dependencies; ease of writing and maintaining the code; what your debug story is; what your tooling story is for analysing your code or reporting things out of production; and, for whichever project you've chosen — be it PyPy or Cython or whatever — how future-proof it is. That's a really difficult judgement call, because of course the past is no guide to the future, but it's all we've got. So how can you try to predict whether a project will keep going? We've seen several projects that have just suddenly halted; let's hope your solution doesn't do that, leaving you with all that technical debt.
These are the kinds of things you want to look at. What Python versions does it support — is it moving up versions in a timely fashion? What's its development status? If it's old, that's maybe a good thing, because it's been around a long time and it's mature; on the other hand, it might be using very old technology, and a newer project might use newer language features and be better. Is it maintained? Does it have good backers, like a large company or a determined community? Are fixes quick, do they take PRs, that kind of thing. Also, who's using it? And, interestingly enough, is there a consultancy around it? Because if there's money to be made — if there's money swimming around a project — it's more likely to last than one that's just been given away, with everyone taking it for free.

So those are the three sections I had for you, and I'll just finish up with a summary. The takeaway, I'd say, is this: out of these solutions, choose the one that's appropriate for your organisation, your skill set, your product. It's your choice; just recognise that you're going to be making trade-offs, and if you don't realise that, there will be implicit trade-offs that can come back and bite you. So try to be explicit about what you regard as important. Benchmark if you must, benchmark wisely when you can, and I'd argue that the non-technical criteria are equally as important for the longevity of your code as the technical ones.

I'll just make a shout-out here for this book. I have nothing to do with it at all, except that I know one of the authors slightly, but it's High Performance Python; I got quite a lot out of it, and it covers a lot more territory than I'm covering in this talk, so if you're interested in high-performance Python, this is definitely one to go and get. And also, don't just listen to one person — particularly me — consider other opinions. Here are some of the talks at EuroPython: on Monday and Wednesday we had talks I went to and got a lot out of, about CFFI, profiling, Cython, things like that, and tomorrow we've got two talks I'm definitely going to, one on C++11 and the other on PyPy. So listen to a lot of opinions when it comes to performance, not just one.

And that's about it. If you want to stalk me on social media, this is about as far as I get, which is GitHub. Also, AHL — we open source quite a lot of our code now, and there are some very interesting projects there. I hope you're going to have a poke at those, and you'd make my bosses very happy if you had a look at our work Twitter feed. And that's it from me, so I'll try to answer your questions as best I can.

Thank you for the talk. So, questions?

Any tips on how to get around confirmation bias when testing? Because it's very difficult, within a culture and a team — how do you get around confirmation bias?

I suspect humans have been struggling with this problem for millennia, really. I don't know. I think when you want to present material to persuade other people, you have to make that material falsifiable: there must be enough information in there that someone can form a contrary opinion. If you just say "this is faster", you're not giving anyone any purchase — they can't challenge you, because you're not giving them the information. If you say "I believe this is faster, and here's the data for it", you're giving them some purchase for falsifiability, because they can look at that
data and say: ah, you're taking the arithmetic mean, not the geometric mean, and therefore you're drawing the wrong conclusion. So I think some of these cognitive biases can be helped by presenting your information to a large number of people, who might be able to spot something you can't spot yourself, because it's very difficult to avoid confirmation bias on your own. But perhaps if you're aware of it, and you ask yourself the question, that may help as well.

Hey, thanks for the talk. Considering the two main technologies for bindings, namely PyBind11 and Cython, what is your view regarding the advantages and disadvantages of each of them?

OK, well, I've used Cython quite a lot, both on my own projects and at work, where we depend on it quite heavily. It is one of the most mature projects out there and it's used all over the place — the standard example would be pandas, which depends heavily on Cython — so that's got to be a plus for it. PyBind11 is a much newer project, but I think it's very interesting, because it perhaps addresses some of the issues I've come across from time to time with Cython. Debugging in Cython can be quite a challenge, and optimised Cython code is in a strange hybrid form — you can understand why that would be hard to read and reason about. With PyBind11, apart from the bindings, you're pretty much operating in C++, which should be easier to reason about. So I think it really depends on who you are and what your capabilities are. If you're a strong C++ shop, for example, then PyBind11 would be a very interesting route to take, although Cython can also get you into C++. Performance-wise, there's probably not a big difference between them, but I have noticed sometimes with Cython-generated code that you can make small changes in the PYX file — the Cython file — and get radically different changes in performance, which are not necessarily predictable; that's just because it uses a lot of heuristics to generate its C code, and it's often noticeable between versions as well. So I think they're both very fine projects; there's probably not much to choose between them on performance, and it very much depends on what kind of shop you are and what your personal preferences are. I don't think you'd be wrong to choose either of them.

Going back to your graph on the file-read performance: you mentioned that you identified something that resulted in a tenfold speed increase for, I believe you said, the native Python code as well. Can you comment on going down the rabbit hole of optimisation? As programmers we love new and challenging things, and maybe we see a problem and think we should implement this or that. Do you have any tips for how we can avoid going down a rabbit hole, where maybe a 10x improvement from keeping it in native Python would have been enough?

I guess — I think there are 300 data points on each of those plots, and it turned out there were something like eight or so data points which were these outliers, which at first sight you might think are just statistically insignificant. But we did actually go and have a look at them, and we found interesting things. This is basically a sequential file where a series of records are written one after the other, and to read to any point in the file you have to read everything up to it, because they're variable-length records, so you can't just go
into some arbitrary place. The reason for constructing an index is to record the start of each of these variable-length records. Well, it turned out that for most of the files, the variable-length records had a maximum size of 1k, so there were lots of small records in there; and it turned out the outliers had a record size of up to 64k, so there were far fewer records. Creating the index for those, you're doing much more seeking and much less reading, because you only have to read the head of each record, and that accounts for the roughly tenfold difference. What we realised was that we could actually rewrite all the files from 1k records to 64k records, and that was how we got the extra 10x improvement. But without it being presented in a graph like this, and without the curiosity to say "ooh, those outliers might be interesting" rather than dismissing them, we wouldn't have found that 10x improvement. So I guess we could have gone down a rabbit hole, because they could have just been statistical noise, but we did start off by doing a little bit more statistics — a few more runs and that kind of thing — and discovered these were actually real outliers rather than statistical anomalies. Rabbit holes are something I seem to merrily dive into, I'm afraid, so perhaps my advice on backing out of a rabbit hole is more one of experience than of good decision-making.

More questions? We still have a couple of minutes. Nobody? Everybody is happy with their code's performance?

I've been using Boost.Python, and it has two downsides, which are that the compiles take a long time and it sometimes produces very big object files. Is PyBind11 improving on these metrics?

I'm probably a bit outside my zone of expertise here, so this is probably some speculation, but I haven't used Boost.Python for quite a long time, and I'm really out of touch with the Boost project. It's getting on quite a bit now, and it's got all sorts of stuff in it that probably reflects history rather than modernity. PyBind11 was inspired by Boost.Python, but it was a complete refresh — the whole thing rewritten from scratch, in C++11 — which would make me think that, given a modern compiler, you would get less variability with PyBind11 than with Boost. So it might be worth it; I'd certainly recommend you go and look at PyBind11 and see if you get the same kind of problem. I'm not going to predict — my guess is that you wouldn't get that problem with PyBind11, but I'm speculating a bit there — but I'd recommend you go and try.

OK, so thanks for the talk and the questions. Give another warm hand to Paul Ross.