Okay, now we have a presentation about asynchronous programming with coroutines from E-Road from Krainast. Thank you. Good afternoon. My name is E-Road. I work for Intelligent Systems Outrun, where I'm a software engineer and where I have the privilege of also being a trainer for our internal people and our customers. I got into asyncio not too long ago, and I wanted to do something for my colleagues to get them on board as well, and I thought it might also be interesting to do here at FOSDEM today. I have a small introduction for you today: something small on Python 3.4, which is in the past; then Python 3.5 (I haven't updated to 3.6 yet, but not a lot has changed, so I was told); a small summary; and, if we have time, some extra slides. I do have a fairly large slide deck, so I don't know where timing is going to get me; I might or might not include the extra slides. So, async programming for us today. What do I mean by that, and in what context do I want to talk to you? It's about writing concurrent applications without the use of threads or multiprocessing, in a cooperative multitasking fashion. Python is definitely not the only language to employ techniques like this; many others did it before and more will do it after, and there's a fair list up on the slide already. I'm also very into C++; we were hoping to get some coroutines into the next release of the standard, but we didn't. It's been postponed until later, but at least in the meantime I get to do some Python coroutines. For me, it's also a style of writing code a little bit differently. Not all that much, but it's about avoiding the use of blocking calls and having an event loop do all the heavy work for you, all the I/O stuff and so on. And in that context, it's fairly important that we avoid blocking in our code.
You might ask me, then, what's the issue with blocking APIs, and why do we only now start disliking them? Well, when you block, your thread is busy, and if you're waiting on I/O or on some other event to happen, it's not doing anything useful while you still pay the overhead for it. As far as the "now" part goes: I grew up in the one-thread-per-connection style of programming, and except for callback things, there wasn't really an alternative for me to use. So I was maybe not okay with that, but at least at ease with it for some time. Now that we do have an alternative, I'm more into the non-blocking stuff than the things that do block. And then, in previous installments of this talk, people asked me: isn't that why threads exist, so you can block? Why are you now frowning upon threads? Well, yes, indeed, they were kind of designed for multitasking, but at the operating system level, and the OS knows nothing about your application. So how can it do a good job for you in that particular area, even if you are the only process running on the system? Context switches are quite expensive, and threads also occupy some memory; you hear numbers like 8 to 32 megabytes per thread. And for the people in that kind of area, there's the C10K problem: with a large number of concurrent open connections that linger for a while and only occasionally carry some data, you might run into the limitations of scaling a regular system up to those kinds of numbers. And also, who here hasn't written a threading bug? I know that I've written many. In the kind of code that I see as a consultant, there are many deadlocks, many race conditions, and so on. It generally seems to be something that we are pretty unable to get right. We do get it right most of the time, but we often make mistakes nonetheless. So, not my quote.
I was browsing some presentations from other conferences in the past, and I saw that one and decided to steal it: threads are the goto of our generation. You might all remember Dijkstra's paper, "Go To Statement Considered Harmful." Gotos are still being used; think about error handling in C, where using gotos is pretty useful, but we'd rather not use them anywhere else. And maybe it's a bit the same thing for threads. We do need them; definitely in this age of multicore, on the many-core machines probably all of you are carrying here today, threading will still be a must-have. But maybe we can just use fewer of them. Maybe one thread to handle our connections, one thread for all your video, one thread for all your screen I/O, or other purposes inherent to your own application. So, to do this new style of programming, we don't want to block a thread, so that we're able to do things concurrently. But you could also ask yourself: isn't all code blocking in a way? And of course it is. We have things like executing code, crunching some data, doing calculations large or small, or waiting for I/O operations to finish. About the first there's not much we can do, except parallelization, if that is our blocker. But the second one is the subject of our attention here today, because using asyncio coroutines, we can actually do something about the overhead of blocking while waiting on stuff. To do that, we kind of need new APIs, because the regular blocking APIs of course wouldn't do for us here. And when I say new APIs: it turns out, because I was kind of late to this async party, or to the asynchronous programming style in general, that those APIs aren't that new at all. Other languages have a very rich history of doing stuff asynchronously. Now, you might know one or more of these libraries: gevent; Tulip, which turned into asyncio a little bit later; Twisted, still a big async framework.
I think they are going to reinvent themselves. There's an awesome talk about the death of Twisted somewhere on YouTube, from other conferences. They do offer more at the moment, but since asyncio is gaining momentum, they will probably need to do something about their marketing and reposition themselves in that area. And about a year ago, I got the privilege of looking into a project which used, let's say, a Tulip style of asyncio. Not that I didn't like it, but it was kind of heavy on the callbacks. So I was kind of happy to learn about the new way of doing asyncio in Python 3.4. Now, beneath the hood, the same mechanisms might actually be used; there's some form of select, poll, epoll, or maybe a task queue being used underneath. But nowadays the same underlying constructs can be used in a nicer form, with support from the language itself and the core library. So, asyncio in Python 3.4: an example, or at least an example on the next slide. There was a disclaimer in Python 3.4 about this thing being in there on a provisional basis, which meant that they were allowed to change things about it, and they did. In a minute we'll talk about Python 3.5, where the core keywords for doing coroutines have indeed changed. An example: printing some fairly unique string to the screen, kind of obligatory in these things. What can we observe? There's a decorator on top of our print_hello function that turns it into a coroutine, and on the very last line there's a yield from statement, yielding from another coroutine, asyncio.sleep. Now, these coroutines, just to get in the right frame of mind: we can think of them as special functions which can be suspended and resumed. If you keep that in the back of your head during the presentation, that will work just fine.
I consider these things to be new-style async, because of the coroutine nature, but already old-style syntax, since things have changed a bit in Python 3.5. I'm not going to linger a lot on these particular slides. Another example, which will come back in a minute: making a connection to something on the other side is inherently an asynchronous operation; you don't know when it is going to finish. There, too, we can make use of those new coroutine APIs, which return objects that have coroutines of their own and allow for all those yield froms and awaits we'll learn about in a minute. So, 3.5 was the version I was using when I made the slides. Of course, we're at 3.6 right now, but not a lot has changed, so I was told. They changed the keywords: no more @asyncio.coroutine decorator, no more yield from statement. They have nicer keywords now: async def and await. There's the same disclaimer, but it doesn't really matter; it was removed in the latest version. The same example of print_hello looks kind of neat to me, neater than the previous one: async def print_hello, and then on the last line we await asyncio.sleep. And in fact, sleeping is not going to block; it's just going to suspend this print_hello coroutine for three seconds or so, until that other coroutine, asyncio.sleep, has actually finished on the event loop. So in the meantime, we can do other tasks, other I/O, other sleeps if you will, or just data calculations. Again, we think of coroutines as special functions which can be suspended and resumed. But if you have things like suspension and resumption, you need something to do that for you, and that's where the loop comes in. You ask asyncio for an event loop, you give it a kick, either to run forever or to start with a top-level coroutine and then stop, or unblock, after that has finished. So a loop is a pretty important thing in asyncio. So: async is programming with coroutines here in Python.
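As a sketch of what that slide roughly looks like in the 3.5 syntax (the function name follows the talk; the short delay instead of the talk's roughly three seconds is my own choice):

```python
import asyncio

async def print_hello():
    # A coroutine: a special function that can be suspended and resumed.
    print("Hello ...")
    # await suspends print_hello; the loop is free to run other tasks meanwhile.
    await asyncio.sleep(0.1)  # shortened from the talk's ~3 seconds
    print("... World!")
    return "done"

# Ask asyncio for an event loop and give it a kick with a top-level coroutine;
# run_until_complete unblocks once that coroutine has finished.
loop = asyncio.new_event_loop()
result = loop.run_until_complete(print_hello())
loop.close()
```

While print_hello is suspended inside asyncio.sleep, the loop could be running any number of other coroutines; here there is only one, so the loop simply waits.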
The loops also support old-style async. In this case, it's an example of adding a reader to keep an eye on a file-like object, in this case standard input, and registering a callback to be called when data has come in on that particular file object. Not something I particularly like, but it's there; you can still do it. Now, this callback style of async, I think that's mainly the reason why I was never all that enthusiastic about asynchronous programming myself, which is a bad thing in retrospect. But back then it didn't look all that appealing to me. All those callbacks, the anti-pattern of callback hell, made it often pretty hard on consulting jobs, coming into a very large, new codebase, to follow along with what was actually happening. All those state machines, all those callbacks: it didn't really feel right to do proper maintenance on that, or to feel confident that you weren't going to break anything, because of course there aren't any tests and so on. I didn't particularly like it, but for those who were doing it, it did get them out of doing threading and having those issues. So what else can we do? Schedule just regular functions. In this case, call_soon: execute this function pretty soon on your event loop. Kind of nice, if you're into that kind of stuff. You can delay having your function called with call_later, or have it called at a certain time point with call_at, but that's all stuff that hasn't anything to do with coroutines, so we skip it fairly easily. The same example as before, the TCP echo client. We open up a connection, which is an asynchronous thing; some stuff needs to go on the wire, and there's no telling when your connection will actually be established. We get a reader and a writer object back, or rather we get a tuple, and we explode that and bind it to the names reader and writer.
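A minimal sketch of those plain-function scheduling calls; the callback, its tags, and the timings are my own invention:

```python
import asyncio

calls = []  # records the order in which the callbacks fire

def hello(tag):
    calls.append(tag)

loop = asyncio.new_event_loop()
loop.call_soon(hello, "soon")                  # on the next loop iteration
loop.call_later(0.05, hello, "later")          # roughly 50 ms from now
loop.call_at(loop.time() + 0.15, hello, "at")  # at an absolute loop time
loop.call_later(0.3, loop.stop)                # stop once all three have fired
loop.run_forever()
loop.close()
```

Note that call_at works with the loop's own clock, loop.time(), not with wall-clock time.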
What else can we observe in this particular function? I have a kind of dissection in the next couple of lines. Basically, we're calling another coroutine and waiting on the result of that one happening asynchronously, at the same time suspending our own coroutine. Then we have some writing and some reading. And we can observe that there's no await on the write, but there is one on the read. For the read, we don't know when packets are going to come in over the wire, so we kind of need to suspend until that has happened. But the write doesn't have an await, and that's because it is documented not to block. With the small remark from earlier: doesn't all code block? Of course it does. But this is about enqueuing a packet, doing the least amount of necessary work, I presume, to get that packet enqueued, and then returning. We don't consider that blocking the thread on which the I/O loop is running. That's why we can mix awaits and just regular function calls from the same API, in fact. Why do I like this? And I keep repeating it: I like it because I write it as if it were synchronous code, the way you would with blocking calls in multithreaded code, with no callbacks and no keeping state, but at the same time allowing for concurrency, multitasking within the event loop, suspending until some event happens, not wasting your threads by doing nothing but in fact yielding to other actors, other tasks in the loop in your application. Now, to do asyncio, we need those non-blocking APIs. We need alternatives, and Python wouldn't be Python if it didn't come with batteries, or at least some batteries, included. And as it always goes, there are additional batteries on PyPI. I keep being astonished by the large number of packages on there, and that thing just keeps growing and growing. It's awesome, although I often ask myself who's using all those packages. But still: awesome.
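A self-contained sketch of that echo example: I've added a tiny in-process server so the client has something to talk to, and the port is picked automatically; all names here are my own:

```python
import asyncio

async def handle_echo(reader, writer):
    # Server side: read one chunk and echo it straight back.
    data = await reader.read(100)
    writer.write(data)        # write() only enqueues; documented not to block
    await writer.drain()
    writer.close()

async def tcp_echo_client(message, port):
    # Opening a connection is inherently asynchronous: suspend until established.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(message.encode())   # no await: just enqueue the packet
    data = await reader.read(100)    # await: no telling when data arrives
    writer.close()
    return data.decode()

async def main():
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]  # port 0 means: pick a free one
    reply = await tcp_echo_client("Hello World!", port)
    server.close()
    await server.wait_closed()
    return reply

loop = asyncio.new_event_loop()
reply = loop.run_until_complete(main())
loop.close()
```

The mix the talk describes is visible on the two middle lines of the client: a plain write() call next to an awaited read() from the same streams API.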
What do we get from standard Python? Low-level socket operations, making connections and doing stream operations and so on. Sleeping, or at least not running for a while; it's just not exactly the same thing as the blocking sleep we had before. Then subprocesses: waiting on them, or at least non-blocking waiting on them, suspension, let's say. And then synchronization primitives. If you want to deadlock, you can still deadlock with coroutines too; I hope that's none of your intention, but anyway, it can be done. And if you consult the asyncio.org website, there's a pretty nifty list of additional libraries to work with asyncio. I assume they're all on PyPI, because why wouldn't they be? Your favorite stuff might be on there; if it's not listed here, check the website. I might have missed it, or I might have just omitted it from including it over here. I come from the embedded world myself, where we either use Python for testing and validation, or write a slower-running embedded system with Python on embedded Linux, for example. Sockets, serial ports, and so on are the things that get used in our domain, and they lend themselves pretty well to asynchronous programming. An example for a serial port: just open one up using an async serial package, asynchronously write something to that particular port, asynchronously read, and then potentially also asynchronously handle the response. No callbacks, no state, all pretty straightforward, easy-to-read code. On one of the projects I was on last summer, we needed an RPC mechanism fairly quickly, and it needed to run on asyncio. We got on the internet, browsed a little bit, and came back with aiozmq.rpc. It literally ran in under five minutes; it was no trouble to set it up at all. It just worked out of the box, and I thought it was pretty nice. Everything about it can be async and can be awaited, let's say.
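To illustrate the synchronization primitives, here's a small sketch with asyncio.Lock (names are my own). Two coroutines take turns; and just as with threads, careless lock ordering could still deadlock here:

```python
import asyncio

log = []

async def worker(name, lock):
    # async with acquires the lock, suspending if another coroutine holds it.
    async with lock:
        log.append(name + " acquired")
        await asyncio.sleep(0.01)   # a suspension point while holding the lock
        log.append(name + " released")

async def main():
    lock = asyncio.Lock()
    # gather schedules both workers concurrently on the same loop;
    # the lock guarantees their critical sections never interleave.
    await asyncio.gather(worker("a", lock), worker("b", lock))

loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
```

Without the lock, the sleep inside the critical section would let the two workers interleave freely.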
HTTP, of course, I had to include. Not that I'm doing a lot of HTTP myself, but that seems to be pretty popular these days. One thing I did want to point out, though, is on line number three: async with session.get. That's a context manager, right? The thing that will manage an object for you, regardless of whether an exception occurred or not. That thing, too, can be run asynchronously; there as well, this coroutine fetch can be suspended. I also think those, and the async for loops, are a pretty nice addition to the language. SSH: also there, you can await on reads, you can await on closing a connection, you can await on opening a connection. You're probably starting to sense a pattern here. All pretty easy-to-read code, no callbacks, nothing like that, just await statements and async def coroutines. If you do have a blocking function, there are some provisions for that. You can run your blocking function on an executor and have asyncio do all the synchronization for you and suspend until that block of work, let's say, has been done on another thread, through a thread pool executor. That might come in handy with legacy APIs or things that aren't easily made async. For example, file handling: I haven't seen any Python implementation that actually talks to an asynchronous file API. (Audience: it does.) So is that really blocking? Let's take that question offline; I don't know if I honestly get it, but maybe there's time to answer it afterwards. So, I haven't seen any implementation in a Python library that does actual async file interaction. I've looked at aiofiles, a library for doing so, and also at another framework by David Beazley, and both actually offload the work of doing file I/O to this executor. So I don't know which async API in C we should be using; that might help if we had it, but I haven't seen any implementation of that yet. But also, again, note the async context manager right there, doing the file management automatically for you.
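A sketch of handing a blocking call to an executor; the blocking function here is a stand-in I made up, and asyncio.get_running_loop needs Python 3.7 or later:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_work(n):
    # Stands in for a legacy blocking API, e.g. plain file I/O.
    time.sleep(0.05)
    return n * 2

async def main():
    loop = asyncio.get_running_loop()   # Python 3.7+
    with ThreadPoolExecutor() as pool:
        # The call runs on a pool thread; this coroutine suspends until it is
        # done, so the thread running the event loop is never blocked.
        return await loop.run_in_executor(pool, blocking_work, 21)

loop = asyncio.new_event_loop()
result = loop.run_until_complete(main())
loop.close()
```

This is essentially what libraries like aiofiles do under the hood for file I/O.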
Of course, it's not because we use asyncio and the loop that our stuff doesn't need to be tested. We need a test framework that knows about the loop: hey, I need to run coroutines on the loop, I can't just be executing them left and right. An async setUp, in this case. Don't write your unit tests to connect to an RPC service and so on, of course; we are just often in a situation where "unit test" has a bit of a vocabulary clash, where the unit is the embedded system you ship and you do unit tests on the unit. It's not always straight-up unit testing like we would like it to be, but those frameworks are aware of asyncio loops and so on. Ideally, or theoretically, running those things on the loop would allow us to run multiple tests on the same loop at the same time. I don't know whether or not that's a good idea, but there's a guy doing another presentation on this testing framework later today, if I'm not mistaken, so I hope to learn more about that over there. Another thing I like is the clocked test case. From time to time you might find yourself in a situation where you have a long-lived process that does some events or some actions at a pretty slow pace. You might not want your test to run for hours and hours before getting a result. Accelerating tests by taking control of the clock of the loop might help you out there; I think at least in our company we will be using those in the future. There are also some mocks available for us to use. I haven't tried them out myself, but they seem pretty interesting. There's also pytest-asyncio, an extension to pytest for those who don't use the standard unit-testing module built into Python. They too, as you could probably expect, have support for doing asyncio and coroutines and testing them. Now, from time to time your application might require you to stop the loop or to stop individual tasks. That's pretty easy, and also pretty awesome.
Basically, any await statement in any coroutine is a chance for you to stop the loop. And none of that nonsense of not being able to stop a thread because you didn't build in stopping mechanisms: the loop will take care of that for you as soon as it gets control back at an await statement. The same goes for cancelling a task; of course, you need to have a handle to the task to be able to cancel it later on. Now, if you were to do these things, stopping a task or stopping a loop, from another thread, you would get into some trouble. The loop API isn't thread-safe. Why would it be thread-safe, since we actively try to avoid using threads? But there's an API, call_soon_threadsafe, that allows you to schedule these stops and cancels from another thread. Exceptions get raised as usual, except in the topmost coroutine: either the exception escapes and crashes your loop, or the loop does something useful with it, like catching it, logging it, and carrying on with other tasks. I kind of appreciate it doing that and not crashing my whole system. It uses the standard Python logging module to send its data to, and as you probably know, through configuration you can send those log records anywhere; it doesn't have to be standard out, just send them anywhere through the logging framework. Then, of course, as is to be expected, not everyone completely agrees with the standard implementation, so people set out to write alternatives. For example, there's uvloop, built on libuv, promising better performance for the event loop itself. Go-like performance; I don't know what that actually is, but that was kind of the promise.
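A sketch of cancelling a task (names are my own). Cancellation is delivered at the task's next await, which is exactly the "any await is a chance to stop" point above:

```python
import asyncio

log = []

async def long_runner():
    try:
        # Any await is an opportunity for the loop to deliver cancellation.
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        log.append("cancelled cleanly")   # a chance to clean up resources
        raise   # re-raise so the task really ends up in the cancelled state

async def main():
    task = asyncio.ensure_future(long_runner())  # keep a handle to the task
    await asyncio.sleep(0.05)   # let the task start and suspend
    task.cancel()               # request cancellation via the loop
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()

loop = asyncio.new_event_loop()
cancelled = loop.run_until_complete(main())
loop.close()
```

From another thread you would not call task.cancel() directly; you would hand it to the loop with loop.call_soon_threadsafe(task.cancel).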
And then, because uvloop just plugs into asyncio, people are also implementing complete alternatives to the whole thing, like David Beazley, probably known to you from other Python conferences and the pretty good talks he's delivering. He's got some pretty nice numbers up there, speed-wise. At the same time, I must admit, I like standard stuff in Python. But if I see the numbers from the previous slide (thanks, five minutes, I will wrap up), if I see the numbers from this particular slide, they are very enticing to me to at least try those out and go beyond standard Python. And of course, PyPI is already beyond standard Python, but still, I like it. Summarized: asynchronous programming, concurrency without threading, writing suspendable functions just like you would write synchronous code in your hello-world program from the days of yore, let's say. It's possible in any version using callbacks, or callback hell; with asyncio in Python 3.4 using that particular coroutine decorator; and then with the newest and greatest syntax, async def and await, in Python 3.5. We need non-blocking APIs, and as you saw in my smallish overview of asyncio.org, there is some stuff out there already. I would expect to see more of those, and why not replace all blocking APIs by non-blocking ones? You can always block waiting on non-blocking stuff, but making blocking stuff non-blocking on the fly doesn't really work. So, Python 3.6: as far as I'm aware, not a lot has changed. It came out around last Christmas, but for asyncio we can still stick to the slides I had. And this is something to end on; you know this pretty well already, I think. Thank you for joining. Keep calm and write coroutines. You'll see us in Cairo. Thank you. It seems that I have two minutes left for questions. There was one right here, but I don't know if I will be able to answer it.
Question: it's my understanding that in CPython, when you open a file or read a file, that's handled over in C, and it releases the GIL. And it was my understanding that when you release the GIL, it becomes, in a way, asynchronous, because you're not really blocking. Is that correct? Well, I think those C APIs are executed on the actual same thread, GIL or not, and they would block. So you would have to hand those over to a different thread, as far as I'm aware. Or use asynchronous C APIs, like the ones for sockets and so on. And I believe they are there in POSIX, but I don't know them, and apparently neither do the implementers. Thank you.