And today is the last day, so let's hope it goes well: this is my first EuroPython presentation. This talk is about scientific computing using Cython, so I'm going to give a basic overview of what Cython is, how you can use it, and how you can use it specifically in the scientific computing world. I am Simmi Mourya. I work at Predible Health, a startup in India. We work on cancer reporting, basically cancer detection in the brain, liver, and lung. The idea is to connect deep learning with cloud computing, so you can get all the metrics and dashboards directly through your web browser. Okay, you can't hear me? Okay, I'll stay close to the mic. Am I audible now? Okay, perfect. So, what this talk is about: we'll go step by step through what Cython is, we'll unleash the Cython superpowers, we'll build our first Cython module, we'll see how Cython provides support for NumPy, we'll unleash more superpowers, and there's a project demo at the end, cyvlfeat. I worked on this project last summer for Portland State University. I wrote a wrapper for VLFeat, a library for computer vision. It has all the major vision algorithms, like SIFT and others similar to that. So, what is Cython? It's a static compiler for both the Python programming language and the extended Cython programming language. It makes writing C extensions for Python as easy as Python itself. So let's unleash the superpowers. The first superpower: when you use Cython, your Python code can call C and C++ code back and forth; it talks to it natively. And you can easily tune Python code, optimizing its performance, by adding static type declarations. Let's see how we do that. So this is your plain Python code.
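The code on the slide isn't reproduced in this transcript. As a stand-in, here is the classic integration example from the Cython documentation, which matches the description that follows: the plain Python version first, then the typed variant. Treat it as a reconstruction, not the slide's exact code.

```cython
# Plain Python version: copy-pasted into a .pyx file, it compiles as-is.
def f(x):
    return x ** 2 - x

def integrate_f(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

# Typed variant: cdef declares C-level types, so the loop runs in C.
cdef double f_typed(double x):
    return x ** 2 - x

def integrate_f_typed(double a, double b, int N):
    cdef int i
    cdef double s = 0
    cdef double dx = (b - a) / N
    for i in range(N):
        s += f_typed(a + i * dx)
    return s * dx
```

Compiled with Cython, the plain version already runs somewhat faster than under CPython, and the typed version lets the whole loop run in C.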
It's a simple integration function, pretty easy. Now spot the differences: what have we added? We've added types to i, s and dx. We have pre-declared the types, and we've done that using cdef, not a plain def. There are actually three similar keywords: def, cdef, and cpdef. I'll explain them in detail later; for now, let's look at cdef. In a nutshell, you use cdef when you're dealing with C stuff and you don't have to deal with Python objects. That's when you use cdef, and that's when you bypass the CPython runtime and harness actual C performance. So we have declared the types of i, s and dx. How does it help? Whenever you have performance-critical code, with lots of loops and lots of computation, you add static type declarations. That allows Cython to bypass the CPython API calls and run everything in C, so the code Cython generates is more optimized and performs better. Typically, in Python you write everything in a .py file; with Cython, you write it in a .pyx file. You can just copy-paste pure Python code into a Cython .pyx file and it will run. That alone won't give you a really good speedup, but you can do it. If you want more optimization, you add types, just as we did here. Just copying this particular code into a .pyx file gives about a 35% speed improvement, and by adding some static types it runs about four times as fast as the pure Python version. So, I mentioned def, cdef and cpdef; let's look at them in detail. def is basically Python: it's called directly from Python, it takes Python objects as arguments, and it returns a Python object.
So you don't have to deal with C anywhere. cdef is basically C: when you intend a Cython function to be a pure C function, when you want actual C performance, you use cdef. But you have to be very careful: a cdef function should not deal with any Python objects, or you might run into problems. As for the generated code: when you compile a Cython file, it generates a .c file, which then gets compiled into a .so, a shared object, on Linux, and you can directly import that .so in a Python session with a simple import. We'll come to that too. cpdef is basically both def and cdef. There's a trick here: it generates both a C entry point and a Python wrapper. When it's called from C code, it can do early binding; when you're dealing with Python objects, it uses dynamic binding, which is slower. So it sits in between. You can also directly import C/C++ libraries using cimport: you can just do cimport on something from libc, for example. I'll show you an example later. And you can integrate with existing C/C++ code: say you have a library in C and you want to create a Python interface to it, you can do that with Cython. We'll look into that too, just unleashing all the superpowers. Okay, so where is the magic? How does Cython do it, what's so special about it? The main feature of Cython is the optional static type declarations. You don't have to declare them every time; they are optional. The source code gets compiled into optimized C/C++ code.
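To make the three keywords concrete, here is a minimal sketch; the function names are mine, not from the slides.

```cython
def py_add(x, y):                # def: Python semantics, Python objects in and out
    return x + y

cdef int c_add(int x, int y):    # cdef: pure C function, only visible to C/Cython code
    return x + y

cpdef int cp_add(int x, int y):  # cpdef: C fast path plus an automatic Python wrapper
    return c_add(x, y)
```

c_add is invisible from Python; cp_add is callable from both sides, paying the Python-object overhead only when called from Python.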
And this allows very fast program execution together with programmer productivity, since you are using a Python-like syntax. So what's the motivation for using Cython? This is a really nice diagram I borrowed from the Advanced Python summer school that happened in 2012. Ease of use is on the y-axis and speed is on the x-axis. Low-level languages like Fortran give you speed; Python gives you ease of use. You want to find a middle way, and that's where Cython sits. Okay, let's get to the code. The first one is pure Python, the second one is the C version, and the third one is the Cython version. There are just a few differences between the Python and Cython code, basically one cdef, and that gives around an 80% speedup compared to the native Python code. This is a really sloppy function that I wrote; you can add a few more optimizations, and we'll come to those slowly. So where is Cython used? There are many big projects which use Cython. One is NumPy, which you might have heard of, and there is AstroPy as well. It really is like coding in C and Python at the same time. Okay, we'll come to the use cases slowly. Am I going too fast? Just let me know if I am. Okay, thank you so much. So, library wrapping. I have said again and again that you can wrap libraries, so let me give you an example. In C, the structure is that you have a .c and a .h file. In Cython, the analogous files are .pyx and .pxd. Let's see how we can do that. Okay, so the second use case is when you have performance-critical code; I have explained this already again and again.
So the common procedure, when you really need speed, is to write the core in a compiled language like C or C++ and then wrap the whole thing in Cython, instead of doing it all directly in Cython; that gives you really good speed. And the third use case is... okay, let's do use case one first, I'm sorry. So, who knows about the GIL? I think everybody does. Okay, that's great. The GIL is the global interpreter lock. What it does is, while the lock is held, it limits thread performance: even if you're running your program on a multi-core CPU, you won't be able to harness the total power. Cython provides a solution to break out of the GIL wherever required. So let's see. These are the unwritten rules of Python: you do not talk about the GIL. You really do not. And seriously, don't even mention the GIL. So let's get into an example. This is a small function which keeps our CPU busy: it does a lot of computation again and again. It takes 6.7 seconds to run on my laptop. Now, this is the sequential version; there are no threads involved, it's a really simple function. Now consider the threaded version: there are two threads, and it takes 11.1 seconds. That's not what you expect, right? So what happened here?
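The setup can be reproduced in pure Python. Here is a minimal sketch of the kind of benchmark shown; the function name and iteration count are mine, and absolute times depend on the machine (the talk's 6.7 s and 11.1 s came from the speaker's laptop).

```python
import threading
import time

def busy(n):
    """CPU-bound loop: holds the GIL for the whole computation."""
    total = 0
    for i in range(n):
        total += i
    return total

N = 2_000_000

# Sequential: two calls, one after the other.
start = time.perf_counter()
busy(N)
busy(N)
seq_time = time.perf_counter() - start

# Threaded: two threads contend for the GIL, so this is typically
# no faster (and often slower) than the sequential version.
threads = [threading.Thread(target=busy, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
thr_time = time.perf_counter() - start

print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```
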
Okay, so what happens is this: running it on a multi-core processor, you expect the script to finish in nearly half the time, or at least in the same time as the sequential version. But that's not happening, because the threaded version suffers from a very bad behavior: the operating system desperately tries to schedule the two threads on different cores, but because of the GIL, the lock, only one runs at a time, while the other is moved across the cores endlessly. The OS keeps trying to allocate it to a core, but it just can't run. That's how it ends up taking so much time. So what's the solution? Cython provides a really nice context manager known as `with nogil`. The catch is that you can only use it where the code does not touch any Python object. So let's see how we can use it. This is an implementation of the same function, with a somewhat different, more time-consuming computation. We have this `with nogil`, and you can see it in `busy_sleep_nogil`: that's where we use the `with nogil` context manager. Now, you can see that the last function is implemented in essentially pure C, because we're using cdef everywhere and declaring all the types. It's not Python anymore; it's almost C. And the performance increases a lot. Let's see. You can use Jupyter notebooks to get these yellow annotations: when you compile with the `-a` flag, Cython creates an annotated view with yellow lines. The more yellow a line is, the more Python interaction is happening. You can see that on line number 10 there is almost no Python interaction, because everything is declared, so it's super fast. Lines number 1 and 2 are equally yellow, so they're dealing with Python objects somewhere.
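The nogil pattern from the slide, as a reconstruction: the function body and names here are my assumption, keeping only the busy-loop shape and the `busy_sleep_nogil` name mentioned in the talk.

```cython
# A cdef function marked `nogil` may not touch Python objects,
# so it compiles down to a plain C loop.
cdef double busy_work(long n) nogil:
    cdef double total = 0
    cdef long i
    for i in range(n):
        total += i * 0.5
    return total

def busy_sleep_nogil(long n):
    cdef double result
    with nogil:          # release the GIL: other threads can now run in parallel
        result = busy_work(n)
    return result
```

While the GIL is released inside the `with nogil` block, two such calls on two threads can genuinely use two cores.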
Yeah, they actually are; that's why. So one tip you can take away: you can decrease the yellowness of these lines if you want to optimize your code. If you want to optimize, just use the `-a` flag, look at the annotated output, and see where there's an actual need for optimization. So now let's build a really small module. As I told you, there's a .pyx file, it gets converted to .c, and then the .c is compiled into a .so, or a .pyd on Windows, and you can import it directly into a Python session. You can build it using different methods: you can use distutils; you can use pyximport, which is a really cool tool, though you probably don't want to use it in production, we don't; and third, a Jupyter notebook, which is the most common way and the way I use it. So this is a simple hello-world script. The module itself is actually plain Python; as I told you, you can copy-paste Python code into a .pyx file and it will run. Then there's the line `from Cython.Build import cythonize`: if you want to import the module into a Python session, cythonize does all the work for you. You set `ext_modules = cythonize(...)`, and that's where you pass your Cython .pyx file. And when you build with `--inplace`, it creates the .so file in your current directory, so you can import it directly from there and it will work. So, congratulations, you have built your first Cython module. Next, you can use pyximport as well: if you don't want to go through the whole procedure of writing a distutils script and everything, you can just have a .pyx file somewhere, use pyximport, and `import hello` will work. It's a really easy way to do it. Okay, so what conclusions do we draw?
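Before the conclusions, here is the build script just described, as one concrete sketch; it assumes the module lives in a file called `hello.pyx` next to it.

```python
# setup.py -- build with: python setup.py build_ext --inplace
from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("hello.pyx"))
```

After `--inplace` drops the .so into the current directory, a plain `import hello` works; the pyximport route (`import pyximport; pyximport.install(); import hello`) skips the script entirely by compiling the .pyx on the fly.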
Naive Cython is when you just paste Python code into the Cython file. It does speed things up, but you probably don't opt for Cython just to get a 1.8x increase, so you'll probably want to tweak it more. Optimized Cython is where you have type declarations, where you are actually dealing with cdef and other modifications. cpdef gives a really good improvement over def, but cdef is where the actual power is. It's really valuable, but you have to be very careful when you're using it. Cython with cdef is almost equal to the C version of that Python code; it's the best attempt at increasing performance. Okay, now let's move slowly into the scientific computing world: NumPy and Cython. Cython provides really fast access to NumPy arrays through a C-level type known as the typed memoryview. How many of you know what a memoryview is? Okay, just two; anybody else? Okay, great. A memoryview is similar to the buffer protocol: it's just an interface that lets you read and access the contents of, say, an array without copying it anywhere. The typed memoryview is similar in kind to the NumPy array's buffer support, and Cython lets you work with buffers without even getting into the details of them, which is pretty cool. Because a typed memoryview is designed to work with the buffer protocol, it supports every buffer-producing object efficiently, and it allows sharing buffer data without copying. So, now that you have typed memoryviews, the cat gets a unicycle. Let's see how we improve on this unicycle, this mode of transport.
So, let's suppose we want to work with a one-dimensional buffer in Cython. We do not care how the data is created at the Python level; we just want to access it in an efficient way. So let's create a def function in Cython that takes a typed memoryview argument. There we go: the brackets with a colon, as in `double[:]`, denote a typed memoryview. You're actually passing a memoryview as an argument to a function. You can also see a few optimizations, like static type declarations: we are defining the types of i and total. Why do we do that? Well, i is the loop variable; we'll come to that later, I have a full explanation. Here the memoryview attempts to access the object's underlying data buffer. If the passed object cannot provide a buffer, meaning it doesn't support the protocol, then a ValueError is raised; and if it can, it supports the protocol. The next one is the same code with a few more optimizations; you have to tweak more. When you iterate through a typed memoryview naively, Cython treats it as a general Python iterator, so it calls the Python C API for every single access. We can do better. That's what we were doing here: it was treated as a Python iterator, and we were calling the API every time. So how can we do better? Typed memoryviews are designed so that you don't pay the Python overhead on every access; you can index them C-style. This version has much better performance. Try making an array with a million numbers and summing them in a loop: this function sums its argument.
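Such a function might look like this; a sketch, with `summer` as my name for it, using the C-style indexing just described.

```cython
def summer(double[:] mv):
    """Sum a 1-D typed memoryview; mv can be any buffer of doubles,
    e.g. a NumPy array, accessed without copying."""
    cdef double total = 0
    cdef int i
    for i in range(mv.shape[0]):   # indexing compiles to direct buffer access
        total += mv[i]
    return total
```

Called as, say, `summer(np.ones(1_000_000))`, the loop indexes straight into the NumPy array's buffer.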
So if foo is an array of, say, a thousand numbers, and the first version takes, say, two milliseconds, this one takes less; I tried it. I didn't include the example, so I thought I would explain it like this. There's one more thing: in this case, Cython generates code that bypasses the Python C API calls and indexes directly into the underlying buffer. That's the source of our large speedup, but we can still do better, always. Optimization never stops; it's a never-ending process. I learned that from Andrew; I attended his workshop the day before yesterday. Okay, so with this kind of speedup our cat gets a good bicycle, and it can do gymnastics. It's a decent improvement now. You can increase the performance even more, but you have to trade away safety. Let's see how. Every time we access our memoryview, Cython checks that the index is in bounds; if it's out of bounds, Cython raises an IndexError. Cython also allows us to index into memoryviews with negative indices, which wrap around just like in a Python list. In our foo function here, we iterate through the memoryview once and we do not do anything fancy; it's a rather simple thing. So we know ahead of time that we're never going to index out of bounds or with a negative index. We can therefore instruct Cython to turn off these checks for better performance. To do so, we use the special cython module with the boundscheck and wraparound compiler directives. I have modified the code here: it's the same foo function with the original definition, but with a new block, a context manager, which turns off the bounds and wraparound checking while we access our memoryview.
The result is a really small performance improvement, but more efficient code generation. It's up to us to ensure the index doesn't go out of bounds; if it does, or if you use negative indices, you could end up with a segmentation fault. There are actually several ways to turn off bounds checking. One way is the context manager: `with boundscheck(False), wraparound(False)` placed just above the loop means you turn off the checks for that loop. If you want to do it for the whole function, you can use a decorator, another form of the directive that replaces the context manager. And there's one more way, if you want to remove bounds checking globally across the module: you can use a compiler directive added as a comment line, `boundscheck=False`, and the same for wraparound. So Cython provides different scope levels for these directives: the context manager, the decorator, and the global module level. You have really precise control over where these directives are in effect, so they can easily be disabled for debugging and enabled for production runs. So now your cat can do this: it's on a big rocket and has laser beams in its eyes. And it can also do this. It's up to you how much you want to optimize. Okay. So, I promised I'd show you a project demo; first let's wrap up what we have learned. We saw how to declare a simple typed memoryview, and we saw how indexing a typed memoryview with an integral argument can efficiently access the memoryview's underlying buffer.
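The three directive scopes side by side, as one sketch; it reuses the hypothetical `summer` shape from before, and the module-level comment must sit at the very top of the .pyx file.

```cython
# cython: boundscheck=False, wraparound=False
# ^ module-level directive: applies to the whole file

cimport cython

@cython.boundscheck(False)   # function-level decorator form
@cython.wraparound(False)
def summer(double[:] mv):
    cdef double total = 0
    cdef int i
    # block-level context-manager form, just around the hot loop:
    with cython.boundscheck(False), cython.wraparound(False):
        for i in range(mv.shape[0]):
            total += mv[i]
    return total
```

In practice you would pick one scope, not all three at once; the nesting here is only to show the syntax of each.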
We saw how to use the boundscheck and wraparound directives, in three different ways. So, when do we optimize? We don't just go and optimize everything. I borrowed this from Andrew's tutorial again: first you have to profile and set benchmarks, and decide where you want to optimize. If you're writing a really small program, you probably don't need optimization at all; you have to know beforehand whether you need it or not. Once you know, you can use the `-a` tool I told you about earlier, so you know where optimization is needed, and then go ahead and add static type declarations or any of the other methods I described. Next: the project demo. Last year I worked with Portland State University in the Google Summer of Code program. The project is cyvlfeat, a wrapper for a library named VLFeat. So what is VLFeat? It's a popular computer vision algorithms library, specializing in image understanding and local feature extraction and matching. It has a couple of algorithms you might know: SIFT, the scale-invariant feature transform; k-means; hierarchical k-means; SLIC superpixels; and quick shift superpixels. These are very famous image algorithms. So, I had worked for a company in India that was building a video platform. What they were trying to do: say you watch a video where someone is driving a car, and you love that car and want to know the model, but the video is really short and you can't spot the brand of that car.
So they were trying to make a video advertising platform that provides you an interface to run the video where you can click on any pixel. If that pixel belongs to a car, the click gets saved somewhere, and you can see it when the video finishes, or perhaps in the middle. Each and every pixel was labeled and mapped to advertisement info: if that car was a Ford Figo, you could see it and go directly to the advertiser's website or something. There was a lot of object detection involved in that project, so I had to use VLFeat. It only had a MATLAB interface, and, you know, companies don't prefer MATLAB, so they asked me whether I could wrap a few functions for Python. I proposed this project for Google Summer of Code last year and it got accepted. I didn't realize it at first, but the project was already in progress by five researchers from Imperial College London; they were already working on it, so I contributed along with them. We worked on 14 or 15 features and they got completed, so yay. You can find it here: cyvlfeat is part of the Menpo project, and Menpo was actually in the demo section of last year's CVPR, the Computer Vision and Pattern Recognition conference, so you can check it out. It's a small part of the Menpo project. Contributions are welcome for cyvlfeat; you can fork it and create a new feature branch. I'll also tell you a few limitations of Cython that I faced while working on the project. By the way, these are the two modules that need more work, and these branches are still really active; you can go through them and have a look. So, this came up while I was wrapping this particular library, VLFeat: there's the binsum module that you can see.
We were taking our reference from the corresponding MATLAB interface; they had one. In MATLAB, the structure is quite simple: you have a .h file and a MEX file, and the MEX file provides the interface between MATLAB (or Octave) and the C functions. The corresponding Cython implementation would be: the .h translates to a .pxd, and the MEX file translates to a .pyx. That's the analogy. There were no deviations until we had finished 12 or 13 features, but when we came to binsum, there was a .def file, a module-definition file. Let me give you a brief introduction to what a .def file is. Does anybody here know? Okay, that's fine. On Windows, a .def file describes which functions are going to be exported from the DLL. Unlike GCC on Linux, where every symbol gets exported by default, on Windows you have to tell the compiler which functions to export, and the standard way of doing it is to write .def files. At the time, I didn't know how to wrap this .def file in Cython, what the analogue would be; it wasn't documented anywhere. So I asked on Stack Overflow; that's the normal way of doing things. I still can't figure it out, so if somebody knows, please talk to me later. That's one limitation I faced with Cython. And there's one more: in Python 3 they removed nested tuple argument unpacking, and Cython doesn't support it either; it's a recent removal. The slides are here if you want to download them or anything. Thank you. Yeah, I finished really early. Okay, any questions? [Host:] We have time for a lot of questions. Okay, thank you so much; it was a great talk. Personally, I'm going to rate it highly. Okay, I have a question.
Do you have any examples of importing or referencing external libraries from Cython? Yes, here we have one. You see this one? Can you see it? `cimport libc.math`: you can directly import the C library. Okay, and there's an example of the `-a` output as well. Here it generates the annotation, and you can actually click on a line to see the C code that it generates. You should be able to read it if you want to optimize; it takes some time. Line number 4 is a bit less yellow, since it deals with the library imported directly from C, so the generated code is really small. And line number 3 deals with Python, so the generated code there is a bit longer. Is that clear? Okay. Any more questions? Anything else? Okay. Thank you so much.