Next up is Benoît Chesneau. He's... did I pronounce that correctly? Okay. He's best known for being the author of Gunicorn, but today he's going to talk about how he ported the Go concurrency model to Python. So please give a warm round of applause to Benoît Chesneau.

Thank you everybody. I'm pleased to meet you and to be here today. I will present my little journey in concurrent programming with Python, how I ended up experimenting with porting the Go concurrency model to Python, and the library I wrote for that. So, to quickly introduce myself: I'm the author of Gunicorn, a WSGI web server in Python. I'm also an Apache CouchDB committer and a member of the PMC, which is the project management committee of CouchDB. I'm a fellow member of the Python Software Foundation, and I'm working part-time for a living.

So first, what is concurrent programming? Basically, concurrent programming is a way to run functions simultaneously, but it doesn't necessarily mean in parallel; it doesn't have to be in parallel. It can be just running your functions simultaneously by sharing your process across multiple time slices, in which you run, from time to time, a part of the code you want to execute. Or you can run them on multiple CPUs or machines, and in that case they can run in parallel.

You have mostly two models of concurrency. The first one is shared memory, which is easier for the programmer: you don't have to care about your variables, you can use global variables and variables passed to functions, and the virtual machine, the runtime, manages them for you. It can be problematic because it's not the most efficient way to run your program concurrently: the runtime has to take care of locks. When you want to access, for example, a global variable, the system has to ask for the latest state of the variable, lock the runtime, check the variable, pass it to the function, and so on, back and forth. And Python is not really efficient at that.

The other way is message passing. There you have two models, the actor model and the CSP model; I won't detail them today. Mostly, it's a way to say: my function, my code, runs independently in one part of my process, or in a different process, possibly on a different machine, and I pass it data via messages, so the pieces of code can exchange data and actions between themselves. Erlang and Scala are very well known for this model.

So what do you have nowadays in Python to run all that? You have three ways in the standard library. The old way is asyncore, which is basically just an improved select(). select() allows you to wait for read and write events on your sockets or file descriptors; asyncore allows you to do the same, but you can register a function that will run some code when an event happens.

You have a more primitive way, which is threads. I will come back to that later, but you have many problems with threads in Python: they allow you to run code concurrently, but not in parallel; they effectively run consecutively. You have multiprocessing, which allows you to launch code in different OS processes and has the same API as threads. And on top of them you can use futures, which are now in the standard library in Python 3 and available as an extension for Python 2. They allow you to execute code using a thread pool or a process pool, built on top of threads or multiprocessing, so you can execute your code on a thread or a process, and when you come back later, you recover the result, or the exception you had. You can also put callbacks on the results.
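A minimal sketch of that futures API, just to illustrate (this example is mine, not from the talk): you submit a function to a pool, you can attach a callback, and you block on the result.

    from concurrent.futures import ThreadPoolExecutor

    def work(n):
        # stands in for some blocking computation or IO
        return n * 2

    with ThreadPoolExecutor(max_workers=2) as pool:
        fut = pool.submit(work, 21)  # runs in a worker thread
        fut.add_done_callback(lambda f: print("callback got:", f.result()))
        print(fut.result())  # blocks until the result (or exception) is ready

ProcessPoolExecutor has exactly the same interface, which is the point: you choose threads or processes without changing your code.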
Then you have asyncio, which is the modern way in Python 3 and allows you to do evented, asynchronous programming. It basically allows you to yield to or yield from a generator. I don't really like asyncio, because it looks like you are doing message passing, since you can send into a generator or get something back from a generator, but it is not message passing, and it runs on only one frame in your thread. It just interrupts your code while you get data from or into a generator, and then comes back to your code. So you can't really run multiple pieces of code concurrently; you just interrupt from time to time, and it runs on one frame only.

You can also use external libraries. There are two main groups of external libraries. The first are the ones that yield implicitly, running asynchronous, blocking code in the background. You don't have to care about how the code runs; it's hidden. You develop your Python code like you usually do: each function blocks, you call a function, get the result, et cetera, and the code runs in an event loop. gevent, eventlet and evergreen are good libraries for that, and they are all based on greenlets; they just use different event loops. For gevent this is libev today, and evergreen uses libuv through pyuv.

The other group is the event libraries like Twisted, which mostly works like asyncio, or at least it inspired a lot of asyncio. For me, this is like having Node.js on Python somehow, with spaghetti code: waiting for callbacks, passing callbacks, eventually deferring some code for later. This is not what I really wanted. And all these libraries, asyncio, gevent, eventlet, evergreen, Twisted, are based on event loops, mostly targeting IO, input-output: file events and socket events, with some options to use the time while your event loop is idle to handle some other code or a timer.

So I decided to write a new library to experiment with some new code, and basically to bring the Go runtime to Python. An offset, in botany, is a small but virtually complete daughter plant: it grows on another plant, and they share the same base, but the offset has its own life. And this is basically what offset is: it puts the Go scheduler on top of the Python VM.

I had some goals, of course. I wanted a simple blocking API like gevent and eventlet, but one that does not try to patch the standard library behind my back. I prefer something explicit: I prefer to know when my library is patched, and I prefer to proxy the standard library for that. It used to work on Python 2.7 and it still does, but I'm mostly focusing today on Python 3.4 and PyPy, because I don't have the time to handle every Python. And you can find it in my repository.

The concurrency model is mostly the Go memory model. It's based on goroutines: a bunch of code running on one thread. It's basically a way to split your thread, to put functions, pieces of executing code, as frames in the stack of your thread. When you want to switch from one piece of code to another, the system chooses one frame in the thread to execute, then goes back to your code, et cetera. It's basically just adding metadata to your thread.
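offset builds its coroutines on python-fibers, which I'll come back to, but the switching idea is easiest to see with the better-known greenlet API. This little sketch is mine, not offset's code:

    from greenlet import greenlet

    def ping():
        print("ping")
        gr2.switch()         # suspend this frame, run the other coroutine
        print("ping again")  # resumed later, exactly where we left off

    def pong():
        print("pong")
        gr1.switch()         # hand control back to the first coroutine

    gr1 = greenlet(ping)
    gr2 = greenlet(pong)
    gr1.switch()             # prints: ping, pong, ping again

Everything happens on one OS thread; the "concurrency" is just jumping between saved frames. offset hides these explicit switch() calls behind a scheduler, so you never call them yourself.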
Each goroutine is isolated; they run independently from each other. You can't share global variables. Obviously, in Go you have global variables, but it's not advised to use them, and this is the same in offset: you can use them, because you are in Python, but I don't advise it, because you will get concurrency issues. Goroutines don't share anything, and channels, pipes, are the only way to communicate between them. Channels are also a way to unblock your code, because when you are waiting on a channel, the scheduler is able to run again, to see if someone will send to the channel, or to execute other code, et cetera. And you can wait on multiple channels.

So, yeah, I decided to build that on top of Python because it was easier, but I had some nightmares; really, I didn't sleep some nights while finding the way. Because the main drawback of Python is well known: the GIL. The main point of the GIL is to allow you to write easy code, to have global variables, to easily pass variables to your functions. Like I said, to do that, the system, the runtime, needs to lock, and that takes time. And also because of that, only one thread is executed at a time. You may have multiple threads running in your system, but only one thread at a time can execute Python code, and you need to wait until the thread finishes executing to get the results. During this time, your system is locked. But it works well for things like accepting on a socket, waiting for connections, reading an event on a socket, or reading a file, writing to a file, because all of that is handled in the background by the OS. So using threads for that, and I do use them for that, is fine.

And Python has no coroutine implementation in the runtime, and I didn't want to try to build that on yield; I don't think it would work anyway. So I decided to create the coroutine system myself. A coroutine will always be executed in the same thread, the main thread, and blocking operations will be executed in their own thread, a Python thread. To implement coroutines, I'm using python-fibers. python-fibers is basically a port of PyPy's continulets to CPython, by saghul, and I contributed to python-fibers so it also has the same API on top of PyPy; you can use python-fibers on PyPy with the same API. I also created a module to do atomic locking: instead of using mutexes, et cetera, I'm using the atomic operations that you can find in GCC and other compilers, and I created a CFFI binding for that. Everything is abstracted in a base class, so if you want to use greenlets, you can write your own abstraction to build the coroutines.

So, scheduling. This is the main part of the offset library. There are three entities in the scheduler. The threads, called F, because they run the futures, in thread pools. The scheduler context, called P, as in process: you have one scheduler context, which maintains a run queue in the main thread, and this is the process of your Python VM. And the goroutines, which are called G. How the scheduler works is pretty simple, in fact, in the end. You have a process context maintaining a run queue, a run queue of coroutines. The coroutines are stacked in a queue, and the first one in will be the first one out to be executed.
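To make that concrete, here is a toy model of such a FIFO run queue, with plain generators standing in for the fiber-based coroutines. This is an illustration of the idea, not offset's actual scheduler:

    from collections import deque

    class Scheduler:
        # toy model of the P context: a FIFO run queue of coroutines
        def __init__(self):
            self.runq = deque()

        def go(self, gen):
            self.runq.append(gen)        # a new goroutine enters at the back

        def run(self):
            while self.runq:
                g = self.runq.popleft()  # first in, first out
                try:
                    next(g)              # run until the coroutine yields
                    self.runq.append(g)  # then put it back at the end
                except StopIteration:
                    pass                 # this goroutine is finished

    def task(name):
        for i in range(3):
            print(name, i)
            yield                        # cooperative switch point

    sched = Scheduler()
    sched.go(task("A"))
    sched.go(task("B"))
    sched.run()                          # interleaves A and B

offset does the same dance, but with fibers instead of generators, and with the syscall handling described next.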
At any one time, only one coroutine is running: the one that was taken out of the run queue. When that coroutine stops running, it can be put back in the run queue, or another coroutine from the run queue is executed. Of course, you still have syscalls to handle: any blocking call, like a timer, waiting for a connection, or reading a file; any blocking operation, mostly IO operations. So you are still in your run queue, and a coroutine can be detected as blocking. This is done by proxying the standard library in the syscall module of offset: I tag any blocking call from the standard library in the syscall module, and when a function is detected as blocking, the scheduler puts the goroutine out of the run queue and creates a call in the thread pool that will handle it. So the call, accepting a connection or reading a file, is handled in a thread, and when it comes back, the scheduler recreates the goroutine and puts it back in the run queue, at the top of the run queue, so it will be executed in priority, and you get the result in your code later. By doing that, all your syscalls are able to run in parallel, and you are still able to handle the rest of the code in your runtime.

Some examples. Here is a goroutine example: on the right you have the Python code, on the left you have the Go code, and they are pretty similar. What this code does is pretty simple: it prints "hello" and "world" five times. The main function is executed first. You're seeing some slightly old code here, because the run function is about to be removed, I just need to commit the code; but right now, this is how it works. You have a goroutine that says "world", and the main function says "hello". What the say function does is, five times, print the string that is passed to it, sleeping for 100 milliseconds each time. Sleeping is a syscall, so when a goroutine sleeps, it is put back in the run queue and the next code is executed: the main function prints "hello" and is put back in the run queue, then the next goroutine, the "world" one, is executed, prints "world" and is put back in the run queue, then the next goroutine runs and says "hello", et cetera, et cetera, five times. And yeah, it's pretty similar to the Go code, modulo the syntax.

Channels. Channels are fully implemented in offset, like in Go: all the features you have in the Go language, you have in offset. Channels are mostly pipes that connect concurrent goroutines. You can send values into a channel from one goroutine and receive those values in another goroutine. When you send a value, the sending goroutine is put back in the run queue, to give another goroutine the chance to run and receive the data, and the next goroutine becomes available and can read. So by doing that, we are mostly driving the scheduler and unblocking your code, by passing messages from one goroutine to another. Channels can be buffered, which means that you don't have to wait until someone receives the first value you send: you can send several values before your code blocks.

Here is a simple example. On the right you have Python; on the left you have Go again, and we are making a channel. The goal is to sum a list, and to sum the list more efficiently, we split it in two. We create two goroutines that each sum half of the list. To these goroutines we pass a channel, which is created by the makechan function. When the sum is done in a goroutine, it sends the result back on the channel. And the main function waits for the results of these two goroutines, receiving the two partial sums into x and y, and then we add them to print the final sum. And the Go code, again, is quite similar.
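In Python, that example looks roughly like this. I've reconstructed it from offset's documented API (go, makechan, run, send, recv) and from the classic Go tour version of this example, so the exact slide may differ a little:

    from offset import go, makechan, run

    def sum(s, c):          # shadows the builtin, matching the Go tour example
        total = 0
        for v in s:
            total += v
        c.send(total)       # send the partial sum back on the channel

    def main():
        s = [7, 2, 8, -9, 4, 0]
        c = makechan()               # unbuffered channel
        go(sum, s[:len(s) // 2], c)  # first half in one goroutine
        go(sum, s[len(s) // 2:], c)  # second half in another
        x, y = c.recv(), c.recv()    # block until both results arrive
        print(x, y, x + y)

    run(main)

Each recv blocks the main goroutine, and that is exactly what lets the scheduler run the two summing goroutines in the meantime.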
The next example is with buffering. And there is an example with select: you can select on multiple channels. The select function allows you to wait on several channels at once: you can wait until someone is able to receive the data you want to send to them, or wait until someone is able to send you the data you want to receive. It's mostly like a select() where you are waiting for reads and writes on your sockets. This is done by the if_send functions here: you register events in the select function, so here we are registering a send and a receive.

Et voilà. That's the foundation as it stands today. You have all the modules implemented: you have the sync module, atomic locking that you can use in your code in addition to channels; you have timers; and you have net and io modules to handle IO, connections to your files and your sockets, in a non-blocking manner.

What I want to do next is to rewrite the channels, which is mostly done, and then make them span different processes or machines, which needs that first part to be done. Then I want to make the runtime switchable, like in Rust, so you can switch to another runtime if you want, a multiprocessing runtime or any other runtime. Any help is appreciated, and you can contribute to the repository. Voilà. If you have some questions?

If you have questions, please line up at the microphones, or raise your hand if you have one. No? Okay. Thanks again, Benoît.