Hey everyone, I'm Stefan Behnel, one of the Cython core developers. During this talk, I'm going to show you a lot of examples of Cython and how to use it in different fields, so I'd like to know a bit about what you're most interested in. Who's never used Cython before? That is some... it's true. The little guy over there, right? Youngest non-Cython user — we'll introduce him to Cython today. Okay, so that's about one-third of you. Who is interested in accelerating NumPy code, computational code — accelerating computational code with NumPy and all that? Okay, that seems a bit less than the group we just had. Who is interested in integration, so talking to external libraries, integrating native code into Python? Raise your hands. A couple of people. Anyone using C++? Yeah, surprisingly not so few. Okay, so I'll try to focus a bit on what you just said. I'll start with myself. So, Stefan Behnel, as I said — I'm a software developer, data engineer, whatever you want to name it these days. I've been using Python since 2002, and I'm one of the founders of the Cython project — not one of the original inventors, but when we forked the project from Pyrex in 2007, I was already on board. I'm also a CPython core dev since this year. I do training and consulting, also in-house: if you want to know anything about Python, or want it taught in your company, you can contact me and I might come over and teach you something — I'll definitely have something to teach you, or I can look at your code and improve it. I've been working for TrustYou since 2017, and I'll use that as a little introductory example, because TrustYou is a data company and that fits very well in this field. What we do is: we have terabytes of hotel guest reviews that we collect from all over the internet. And when I say guest reviews, that's literally brain dumps of arbitrary people writing stuff on the internet.
So we collect them, analyze them in 24 languages — so there's a lot of NLP involved — have surveys that we send out for the hotels, collect reviews from portals and from partners that we have, do text and data analysis on them, analyze sentiment, find out what people are talking about, what kinds of categories they're talking about, analyze trends, find out this kind of information. And we feed that back to the hotels to tell them how they can improve. So we can tell a hotel: you know, if you renovate your pool or hire someone new for the reception, you'll probably improve your score by 10%. That's what we tell them. This might also be interesting for you: next time you're looking for a hotel somewhere, go to TrustYou.com and type in the name of the hotel, and we'll tell you exactly what people actually think about it. Booking.com wants you to like the hotel in order to book it; we're independent, so we can give you the actual opinions based on actual reviews, actual data. So go there, remember this, and you'll see it's all cut down into different categories and all that — really relevant information that you want to know before you book that hotel. Why do I tell you all this? Well, these are some of the tools that we use at TrustYou as a data company. We use NumPy, SciPy, scikit-learn for things; pandas for data analysis; NLTK and spaCy for text analysis; lxml whenever we have to deal with XML; PyYAML for configuration; and it's all based on Hadoop processing and Spark — some MapReduce, whatever Hadoop provides, Hive and all these technologies. This is mostly what we do. And now, most of these tools are actually partly or even entirely written in Cython. Right? Who didn't know this? Raise your hand. Interesting. For NLTK it's no longer entirely true that it isn't, because we are working on a patch for them.
NLTK is a pure Python toolkit, and so I've written a PR for them to start compiling certain of their modules, so that they just run faster if you install the binary instead of the pure Python package. So we're working on that. Hadoop is straight out because it's mostly written in Java, but the rest is on our list. Who's seen this? Has anyone not seen it yet? Okay. So that's the image that the EHT project took of a black hole a couple of months ago, and the interesting thing here is that these are the tools they used for calculating this image out of the raw data. As you may notice, they're all Python tools, right? They used pandas, NumPy, Astropy, scikit-image over there, SciPy — all these Python tools for aggregating, I think, petabytes of data coming from eight different telescopes, taken across 10 days in a row — huge telescopes all around the world. They collected the data, cut it down into manageable sizes, and then calculated this image out of it, and that was all done in Python. And now, if you look at the tools they used, all of them actually use Cython inside. So that was a huge Cython-based project, although they didn't largely advertise that — but it was. Okay. So what is Cython actually used for? It's used for integrating native code with Python. It's used to speed up Python code in CPython. And some people even use it to write C, you know, without having to write C — so basically, we write C so you don't have to. Okay, skip that. Gradual typing: does anyone know what gradual typing is? Heard the term before? Maybe in the context of PEP 484, the typing annotations in Python. The basic idea is that there are two main ways to type languages. One is statically typed, which tends to give you fast languages, but it tends to become cumbersome because you have to type everything — it's very annoying.
And then there are dynamically typed languages like Python, which are easy to use but tend to be slowish, kind of. It's not that simple, but more or less those are the two fields. And gradual typing comes in somewhere between and says: where typing helps you, documents something, or makes things faster — that's where you should use typing. And for everything else, it's fine if you don't use it; you don't need it there. The term was coined in 2006 in a blog post — just search for "what is gradual typing" and you'll find it — and it's the basis for the PEP 484 type annotations in Python. It's really the best mix of static and dynamic typing: you can use dynamic typing for ease of development, and optional static typing for safety, speed and documentation. And you should really only use static types where they help, right? That's important — so don't type all your things; there are limits to what you should use types in your code for. And this is especially important when it comes to Cython, because what Cython gives you is types in Python, and it uses them to accelerate your code. Okay. So Cython is a pragmatic programming language. It's gradually typed Python, and it's an optimizing compiler. It takes your Python code, or your type-annotated Python code, translates it to C, and compiles that into a native extension module for Python that you can just import like any other binary module. It's production proven; it's widely used, as you've seen — lots of tools using it. And it's really all about getting things done, in the same way that Python is. It helps you keep your focus on functionality, removes the boilerplate for writing native modules, and allows you to move freely between Python and C — you'll see that in a minute. So it gives you a language that is as Pythonic as you want and as low-level as you need. Okay. Time for a demo. I'll make that bigger. Okay.
Who's never used a Jupyter notebook? Again, the kid over there. See? Teach him. So, yeah — pretty much everyone else did, and it's a very nice and interactive way to use Python, to jump around in your code, to show stuff. So what I have here is a Jupyter notebook. And as you probably also know, Jupyter supports lots of different languages — I remember it was 14 different languages years ago; it's probably 20 or 30, something in that order, now. And it supports Cython: if you just say %load_ext cython — and have Cython installed, obviously — then that teaches it how to compile and run Cython code in a Jupyter cell. So I'll just show you what I'm using here: Python 3.7, a fairly new version; NumPy; GCC 7. Oh, yeah — and that's something I should mention. Since Cython translates your code to C, what you need now is a C compiler, right? Python is nice and easy: I ask Python to run my code, and it runs it. With Cython, there's a compilation step before that, because in the end you get a natively compiled module, native code, and that requires a C compiler. But that's it; that's all you pay. Okay. So, a very simple example: I'm going to take the Python math module, import the sin function, and calculate sin(5). Okay, it's a bit boring. I can do the same thing in Cython. And one difference you notice here — so the first thing is, this is how I declare a Cython cell. I just tell Jupyter: this cell is no longer run by Python, it needs to be passed through Cython, please compile a binary module for me. And what Jupyter then does in the back is import that module, which executes it along the way. And that's also the reason why there's a print statement down here, rather than just sin(5) as you would know it from Jupyter. So, in normal Jupyter cells, the last result of an expression kind of falls out of the notebook and gets displayed.
In a Cython cell, since the cell is compiled to native code, that doesn't work anymore. Jupyter can't look into native code; it just executes some module, and there's nothing falling out of it. So I have to explicitly say print. But that's fine, I think. Okay, gives the same result. Now, as I said, since Cython translates to C code here, I can start using C functions, C functionality. And what I do here is take the sin function from the libc math header — so this is the C sin function — and for now, I'll just assign it to a variable. And then I can call that from Jupyter. Who's surprised? Anyone? No one? A couple of people over there. What's surprising here? Yeah — no compilation. Well, nothing you can see: the %%cython magic up here makes it a Cython cell; it compiles it for me. But the funny thing is, just by assigning the C sin function here, I can actually call it from Python. So now Python knows how to call a C function. That's interesting. The reason for that is that Cython is a typed language, since C is a typed language. So the sin function is something that Cython knows is a C function that takes a double floating point value in and returns a double floating point value. It knows the input and output types, and that allows Cython to directly wrap this function as a Python object. And when I assign it to a global Python name here, it does that for me — because that's an obvious assignment, right? I assign a C function to a Python variable; what else should it do than wrap it for me? Obvious, right? And it does it. So this is kind of the quickest way to wrap a simple C function for Python. Here's the same thing spelled out. So, again, I'm asking Cython to pull in the declarations from the C math header, and then I write a Python function. As you can see here, Cython allows me to declare C argument types as I would in a C function. And then I can just call the sin function.
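For readers following along without the notebook, the cells just described might look roughly like this — a sketch, assuming the `%%cython` cell magic has been enabled with `%load_ext cython`:

```cython
%%cython
from libc.math cimport sin   # Cython ships declarations for the C math header

# Quickest wrapping: assigning the C function to a module-level Python
# name makes Cython wrap it as a Python object automatically, because
# it knows the signature (double in, double out).
py_sin = sin

# The same wrapping spelled out by hand, with a C-typed argument:
def call_sin(double x):
    # x is a C double here, so this body is plain C -- no Python
    # objects are involved until the return value is created.
    return sin(x)
```

From a normal Python cell, `py_sin(5)` and `call_sin(5)` then both end up calling the C `sin` function directly.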
And this is me manually wrapping this C function call in a Python function — which essentially does the same as above, just non-automatically. So when I run this, down here I can call the Python function, and it calls the C function internally. Okay. That's still a bit boring, because, as you've seen, Cython can happily do this on its own; it doesn't need my interaction, my code writing, for this. The point where it becomes more interesting is when you move more functionality into the Cython layer. I can easily call this C sin wrapper up there that I just wrote with x squared as input. But that would calculate the x squared in Python, then take the result of that, pass it into a C function, calculate that in C space, and return the result as a Python object again. If I do the squaring also in Cython, then that gets translated into C code as well, and now the whole expression, the whole function body, runs in C, at C speed, rather than me doing something in Python, then in C, and then passing it back to Python. Okay. So the nice thing about Cython is that it allows me to move functionality freely between Python space, Cython-generated C space, and, in the end, also low-level hand-written C code. And it's up to me as a developer to decide where I want to put my functionality, how I want to implement this. In some cases I have an existing library, a C library for example, that I want to call. I can just do that: I can declare it in Cython, call it from Cython code, do some more stuff in Cython space to keep it below the Python level, and when I'm done with it, I return to Python. So I have all three levels available in one programming language, and that gives me a lot of freedom for moving code around and optimizing across these three levels of performance. Okay. So here's an example I'll go through somewhat quickly. It's about speeding up NumPy code. Okay.
Any questions so far on what I presented? Anything you would like to know or clarify, anything you didn't understand? I'm not always as clear as I want to be — I hope I was. Apparently I was, so thanks. Next example: NumPy. The idea is we're going to calculate the average tax rate for Germany. I only have the numbers from 2016, so they'll have to do. At the time, there were 44 million people working — so, paying income tax, let's put it that way. And the average income at the time was 3,703 euros per month, so about 44,000 euros per year. Okay, that was 2016; it probably hasn't changed that much — it's still an average. From this, I then tried to find official data for the income distribution, and I couldn't find anything, probably due to data protection or something — there are probably a couple of people who earn so much that you could identify them by their income. So there's no official data for this, and what I did instead was take a log-normal distribution, fit it to the one data point that I have, and make it look reasonable, like this. It's not too far from what you would expect for an income distribution. Okay, so just assume this is actual data. So, let's calculate everyone's taxes. When you look up the tax rate calculation on Wikipedia, what it gives you is a beautiful Excel formula. This is actually German Excel, right? So it says WENN, and that's IF. And then it gives you lots of formulas. There are apparently different income ranges that are checked one after the other and that have different formulas to calculate the income tax for that income range. Okay. You can easily translate that into Python, and it becomes this, which I think is a bit more readable, especially because it allows you to ignore the right-hand sides of the formulas and just see: ah, they look at different income ranges, and then each does something. Just ignore the formulas, right? So this is how the income tax is calculated.
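The range-checking structure described here can be sketched as follows — note that these brackets and rates are made up for illustration and are not the real German formula from the slide:

```python
# Illustrative only: hypothetical income ranges and rates, mimicking the
# structure of the real calculation (check the range, apply its formula).
def income_tax(income: float) -> float:
    if income <= 9000:
        return 0.0                        # tax-free allowance
    elif income <= 14000:
        return (income - 9000) * 0.14     # entry zone
    elif income <= 55000:
        return 700 + (income - 14000) * 0.24
    else:
        return 10540 + (income - 55000) * 0.42

# Average tax rate: sum of taxes divided by sum of incomes.
incomes = [8000.0, 14000.0, 44000.0]
average_rate = sum(map(income_tax, incomes)) / sum(incomes)
print(average_rate)
```

The real version applies this to the whole simulated income distribution instead of three values.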
And then, in order to get the average tax rate, we take the sum of the taxes divided by the sum of the incomes. I'll set up my kind-of-fake income list — I think I cut it down a bit; I'm not taking the full 44 million, more like a fourth of that. And now — yeah, whenever you do timings, don't forget to disable the energy management on your laptop, because otherwise you get funny numbers. Okay, this is going to take a while, so I can already show this next bit. Since we're optimizing this code, I have a little function that just collects the timings from the different implementations. I remember the timing here, and then we'll use this function to show the differences. So it took about three seconds to calculate this whole thing in Python — 3.2 seconds. Okay, that's our baseline: Python is factor one. Now, you can implement the same thing in NumPy, and that gives you something like this: we're slicing the income array, doing some computations on the slices, and then building the sums. That's kind of heavy NumPy code, but that's how you can do it in NumPy. And now we can calculate the whole thing in NumPy — it should be faster. We're down to 62.1 milliseconds, and that is 50 times faster than Python. Okay. There's a different way to do this, still with NumPy: you can take the Python function I've written and wrap it for NumPy so that it applies it to the whole array, one item at a time. Then you use the same formula: sum up the taxes and the incomes and divide one by the other. That, again, is going to take a while. It's slower than the slicing version, but it should still be faster than the plain Python version we had. And it is faster, by quite a bit: that gives us 849 milliseconds, which is four times faster than Python — but still completely dwarfed by the sliced NumPy version. Okay. Enter Cython. Here's a plain copy of the Python code I had above; it's doing the same thing.
And now what I change is that I compile the whole thing with Cython, and then do the calculation again. Maybe I shouldn't show this yet. Yeah — as you can see, it takes a while to calculate this; actually, it already takes a while to compile. All right, it's running here. Okay: it takes 2.74 seconds, and as you can see, that is 17% faster than the Python version, which is nice, right? I mean, all we really added is this little line up here. And how much is that — 8 characters, for a 17% speedup? There are certainly areas where that's acceptable, right? Okay. But we can certainly do better. Now, where Cython shines is static typing — and I told you about gradual typing. So what we're going to do now is gradually start typing the Cython code, and I'm going to use the PEP 484 notation for it. So the income that the tax function here is calculating is definitely some double, right? A C double value is perfectly fine to represent it — it's money amounts, and doubles are just fine for these calculations. Okay. And the return value? Well, it does some calculation and returns the result, so it definitely also returns a double, safely. Now, the next thing I notice is that this function is only used internally inside of my module. I'm not exposing it anywhere; I'm really just using it down here. So what I can do is convert it to a C function, which is faster to call than a Python function because it has different call semantics. In C code, you can call C functions pretty much without overhead. Python code calling a Python function is a lot of overhead — and Cython code calling a Python function is also a lot of overhead, because Python functions are called with argument tuples, maybe even keyword arguments, and creating these objects just for calling the function takes a lot of time. In C, arguments are simply passed on the stack or in registers, so that's as fast as it gets. So one thing I can do is declare this function as a C function. Okay.
And now when I compile this, Cython compiles it for me, and I can time it again. We're down to 205 milliseconds. When I compare this to what we had before, that is now quite a bit faster: 15 times faster than the Python version, and the compiled version we had before is still 13 times slower than this. So this was a speedup of 13 times compared to the version we had before. Why is that? One thing is the call overhead that I completely removed. This is kind of the inner function of my loop, where I'm doing a lot of work, right? And the call overhead for that function was removed, so calling that function, going into that function, takes basically no time now. But a second thing happened: by typing the input and output arguments, Cython understood this function better and managed to generate plain C code for it. Because it now knew that income is a C double, it didn't have to do any object operations anymore; it can calculate this whole thing in C space now. So by typing some variables, some arguments, Cython takes decisions about my code and adapts it to the variable types. And you can see that with a feature called annotation: the -a option to the %%cython magic gives me HTML output. It outputs an HTML snippet into my notebook that replicates my code, and in here, when I click on a line, I can see that this is just plain C code, right? income greater than something — same thing here, take a formula, assign it to something. In Python, that would look a lot more involved; it would do lots of Python C API operations, lots of object operations along the way. And this now really generates plain C code, and it gives me a speedup of 13 times. Okay. I would then continue this — there's a lot more I can do on this front — and I think the final speedup that I usually get in the end is... do I have it here? Yeah.
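Put together, the gradually typed version described above might look like this in Cython's pure-Python notation — a sketch, with a placeholder standing in for the real multi-range tax formula:

```cython
%%cython
import cython

@cython.cfunc                   # internal C function: cheap C calling convention
def income_tax(income: cython.double) -> cython.double:
    # placeholder for the real range-by-range formula
    if income <= 9000:
        return 0.0
    return (income - 9000) * 0.2

def average_tax_rate(incomes) -> cython.double:
    total_tax: cython.double = 0.0
    total_income: cython.double = 0.0
    for income in incomes:
        total_tax += income_tax(income)   # no call overhead, plain C math
        total_income += income
    return total_tax / total_income
```

Appending `-a` to the `%%cython` line shows the annotated HTML view: more deeply shaded lines mean more Python C API interaction, unshaded lines are plain C.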
So in the end, I usually manage to get it down to something like 11 milliseconds from the current 200 — that is another factor of almost 20. Okay. I'm going to switch to a different example here. Any questions regarding this topic so far? Okay. Yeah, I think you were first over there. Yes — well, this is actually explicitly saying "this is a C function", which is the same as declaring it as a cdef function; cdef has more meanings than that, this is more specific. For the others, the question was: there's a different syntax in Cython which is not Python compatible, and there's a special file type, .pyx instead of .py, which allows you to use that syntax — and I've only been using Python syntax here. You can do the same thing in the special Cython syntax, which becomes more relevant when you start talking to external C code, because that is ground where you cannot cover everything with Python-compatible functionality anymore, and that's where we use the second syntax. But as you can see, you can get very far by just adding decorators, Python type annotations, and so on and so forth — maybe making your code a little more C-ish than it used to be; that also helps in a lot of cases. Okay, question over there? "If you were to actually have that in a code base, would you put this part of the code in a .pyx file, or would you have it in a Python file and only annotate it? Do you cythonize the file, compile it and then ship it — or do you ship it as-is, and compile it on the fly if Cython is available?" So the question was: this is in a notebook, where it's nice — you can compile a cell or use a Cython cell and mix them as you wish. How does that work when you have code in a Python module? How do you optimize that? So, Cython compiles one module at a time, meaning you would normally pass the whole module through Cython and compile it. It rarely hurts to also speed up a couple of other places in your code without actually type-annotating them.
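One common way to ship such optionally compiled code is the accelerator-module pattern: import the compiled implementation if it is available, and otherwise keep the pure-Python one. A minimal sketch, where `_fast_tax` is a hypothetical compiled module name:

```python
# Pure-Python fallback implementation (illustrative formula).
def income_tax(income):
    return 0.0 if income <= 9000 else (income - 9000) * 0.2

# If a compiled accelerator module exists, its version replaces the
# Python one; if it cannot be imported, the fallback above stays in use.
try:
    from _fast_tax import income_tax  # hypothetical compiled module
except ImportError:
    pass

print(income_tax(19000))
```

CPython itself uses this pattern in its standard library, which is where the name comes from.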
So if you just want to optimize one function, you can still compile the whole module — it might not even be a bad idea. There are different choices that you have. One is: CPython itself often uses this concept of an accelerator module, where you have a Python implementation of something, and then you have a separate native implementation of it, which you import at the end, and if the import works, you replace some function in the Python module with the faster function — something like this. You can do the same thing with Cython: you can externalize one function into a separate module, compile it with Cython or not, and then import it from there. If it's compiled, then it's probably faster; if it's not compiled, then it's as fast as CPython will run it — or as PyPy will run it, in that case, for example. So that is fairly nice: this setup actually allows you to optionally compile code. You can have it as fast as PyPy can run it, and in CPython you can compile it and get it as fast as CPython can run native code. Also nice. Okay. I am currently looking for a little example that I had... closely related. That answers your question, right? Okay. Anything else? Then the next example that I have is C++ — just a very quick example. As I said, Cython generates C code by default, but it can also generate C++ code instead, and that allows you to use C++ code along the way from your Cython code. And that is very nice, because C++ is an object-oriented language and Python is an object-oriented language, so you can mix two object-oriented languages in the same Cython module. And they look totally like Python when you use them — even C++ looks a lot like Python when you use it from Cython. But it gives you a nice additional standard library: fast data types, for example fast container data types through the STL, or anything you might write in C++ yourself and want to wrap for Python. So that is a very nice addition to Cython, actually. And here is a very quick example.
One thing you have to change is that you have to tell Cython explicitly that the language it should use is C++. There's more than one way to do this, but this is a common one, especially from a notebook. That enables access to the C++ STL declarations that we ship with Cython, because they're so commonly used. And then you can use this syntax here to say: I have a variable v, that's a C++ vector of ints; I just push a value in there and return the vector. What does this do — returning a C++ vector from a Python function? What do you think happens? Yeah, it converts it to a list. It copies it into a list, right? Because the obvious representation of a C++ vector in Python space would be something like that, and you would use a Python list for it. So simply returning something that Cython understands as list-like will copy it into a Python list automatically: it will create a Python list for you, copy all the values from the C++ vector into it, and return that. Okay? All automatic. If you don't want that, you can implement your own little thing in whatever way: you can use a list comprehension to get something out of the C++ vector, or you can return a generator expression that iterates over the C++ vector. You can do all kinds of things by mixing Python features with C++ features. But this is the simplest and most straightforward way to do it. Okay. So when I call this function, I get 10 back, which is not very surprising. Here's a very common way to wrap C++ objects for Python: I can use a so-called cdef class. A cdef class is an extension type in Python, right? So it's native — a low-level implemented class, like a Python class but implemented in C. And in Cython, these extension types allow me to directly use C++ objects as instance attributes; here, I'm using a vector of int as the values attribute.
And then the lifetime of this C++ object is tied to the lifetime of the Python object: it's automatically created when I create the Python object, and automatically deallocated for me when my Python object goes out of scope — so I don't have to care about any memory management; it's all automatic in this case. Very nice. So this is just a very tiny example that implements an integer-list-like wrapper object in Python. I can add values to it, which uses the push_back C++ method to push them into the C++ vector. And as you can see from the usage here, that totally looks like I'm using Python code, right? I'm calling a method on something — I don't care if it's a C++ object or a Python object, I just call it. I also like the __repr__ implementation here. What do you think this does? It runs repr on a list, right? So it does the same thing as we had before in the function: it creates a Python list from the values. That is actually very expensive, right? But hey — how often do I actually need the repr? So what it does is copy all the values into a Python list, call repr on that, and give the string back to Python. If it's only something I occasionally use, why not? It doesn't cost much. Okay. So much for C++. Any more questions on that field? "Last time I checked, we didn't have a bitset." You mean in the C++ declarations we ship? Might be — please provide a pull request and we'll happily add it. Anything else? GIL handling — I actually have another example for that, as the next and last example that I'm showing. Okay. So here's an example of wrapping an external library. That's the last use case that I wanted to show here. So far you have seen this "cimport something" — it's a bit of magic, and this is unpacking the magic: what we ship is Cython declarations of most of the C standard library headers and of the C++ STL — well, C++ is huge, but much of the STL. And this is how it's spelled out in the end.
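The cdef class just described might look roughly like this — a sketch of the notebook cell, with `IntList` as the wrapper class name:

```cython
%%cython
# distutils: language = c++
from libcpp.vector cimport vector

cdef class IntList:
    cdef vector[int] values           # C++ object stored inside the Python
                                      # object; created and freed with it

    def add(self, int value):
        self.values.push_back(value)  # direct C++ method call

    def __repr__(self):
        # copies the vector into a Python list just to build the string
        return repr(list(self.values))
```

From Python it then behaves like an ordinary object: `l = IntList(); l.add(10); repr(l)`.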
There's a special syntax for that, and so what I want to do now is write a little wrapper for the Lua runtime — so I'll execute Lua code from Python. For this, I just need to declare a couple of functions that the Lua API defines, and then, up here, declare how to link against it. This is a bit of a heavy way to do this in a notebook — there are nicer ways to do it from the setup file — but it works for now. Once these are declared, I can use them in my code. So I define a Python function; it takes a code string and encodes it to UTF-8 if it needs to, so it accepts Unicode strings and byte strings — you can just use a normal Python string from Python 3 and pass it in there. It then creates a new Lua state — that's the Lua runtime representation — and if that fails, it raises MemoryError. That is also fairly nice, right? Do you know what you would have to do in C in order to do this? Happily, there's a C API function for it. Here I've just been able to say: do some C stuff, and if that fails — if I get a null pointer back from some memory allocation — raise MemoryError. That is nice; that is what you want in your code. It's totally the Python way of doing it. Then I have a try/finally, and in the finally clause I delete the Lua runtime again. So if anything goes wrong along the way, I make sure I clean up the memory — because C requires manual memory management, right? I'm responsible for cleaning everything up when something goes wrong, and in Cython I can use try/finally and it's just going to work. So then there's the Lua function to load the code into Lua and execute it there, and then a tiny bit of return value adaptation to return whatever number Lua wants to return here. Okay? Let me just quickly run this — and notice that I didn't run the beginning. And then run it again. And it fails, because I didn't set up my Lua properly. Just believe me that this works. Okay, it's not the first time I show this example.
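The shape of the wrapper just described might look roughly like this — a heavily abbreviated sketch: only a few Lua C API functions are declared, the result-value handling is left out, and the build/link setup for Lua is omitted:

```cython
%%cython
cdef extern from "lua.h":
    ctypedef struct lua_State
    void lua_close(lua_State *L)

cdef extern from "lauxlib.h":
    lua_State *luaL_newstate()
    int luaL_dostring(lua_State *L, const char *code)

def run_lua(code):
    cdef lua_State *state
    if isinstance(code, str):
        code = code.encode('utf-8')    # accept both str and bytes
    state = luaL_newstate()
    if state is NULL:
        raise MemoryError()            # NULL pointer becomes a Python exception
    try:
        if luaL_dostring(state, code):
            raise RuntimeError("error executing Lua code")
    finally:
        lua_close(state)               # manual C cleanup, the try/finally way
```

The point of the example is exactly this mix: raw C API calls, but Python-style error handling and cleanup around them.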
It's just the first time it fails so heavily. So I'm not going to fix it right now in the last two minutes. Okay. The title of this talk was Cython 3. Those of you who already use Cython will probably know that Cython is always this 0.something version kind of thing, right? 0.29 currently. So what is this Cython 3 — where does that come from? Well, the last Cython version that we have is 0.29, and if you push the dot one digit to the right, it becomes 2.9 — and then the next obvious version is 3.0, right? Okay. So that's kind of frightening, right? You jump from 0.29-something to 3.0. Well, what's special about it? The first thing is: Cython 3 is Python. Now you're going to say, but it was always Python — so what's special about it? Well, it's Python 3. And now some of you might say, I could always put my Python 3 code in there; I just had to say language_level=3 and then it compiled it. Well, now it's Python 3 by default — the language level is 3 by default. So there are a couple of things that we are changing in Cython 3. It's Python 3 by default, and we're going to change mostly the default configuration of the compiler to make it more modern, to remove stuff that isn't appropriate anymore these days, and to adapt it to today's Python 3 world. These are the main changes in there. We're also going to adapt to several PEPs that came along the way. And as I said: Python 3 by default, yes. Okay. You can look up the Cython 3 milestone — there's still quite a bit to do along the way, and we'll try to get there. Okay. That's it from my side. Thanks for listening.