Okay, thank you very much. Time to welcome our next speaker, Jonathan, who's going to tell us more about, I think, asyncio.

Okay, thank you. All right. So, good afternoon, I hope you're having a good time. So I'm going to talk... you cannot hear me? Speak louder? Okay, can you hear me? Yeah, good enough. Okay, I'll try to speak up.

So, we are going to talk a bit about asyncio in Python. It's about the async and await keywords that were added a couple of years ago, how we can use them within Python, and a bit about the difference between writing asyncio code and writing threaded code, because that's another way of doing concurrency. All of this is about how we can speed up our programs: how can we do things in parallel in order to improve the execution speed?

We have quite a few different ways of doing concurrency in Python. First there was multi-threading, which is the easiest way to do concurrency. Then there was multiprocessing, which solves an issue if you need more CPU: one Python process can only consume one CPU, so if you need multiple CPUs, because your problem is very CPU intensive, then you need multiprocessing. Then there's gevent; I'm not going to talk a lot about that. And then you have a whole lot of event-loop-based implementations: there's Twisted, which is probably the oldest, then there are a few others, and in 2012 we got asyncio.

So let's first start with threading. Threading is pretty easy. You define two functions, in this case alice and bob, and then you create two threads, where the first time you say the entry point, the target, is alice, and the second time you say the target is bob, and you start them. If we execute this, then you see these two things are running concurrently: you see the print statements from Alice and Bob.
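The threading example just described can be sketched like this. A minimal reconstruction, since the exact code from the slides isn't in the transcript; the function bodies are assumed:

```python
import threading
import time

def alice():
    for i in range(3):
        print(f"Alice: {i}")
        time.sleep(0.1)

def bob():
    for i in range(3):
        print(f"Bob: {i}")
        time.sleep(0.1)

# One thread per function; `target` is the entry point of the thread.
t1 = threading.Thread(target=alice)
t2 = threading.Thread(target=bob)
t1.start()
t2.start()
t1.join()
t2.join()
```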
They're nicely interleaved.

Now, things become more complex if these threads have to coordinate with each other, if they have to exchange data somehow. For instance, in this example you see that there's a global variable counter, and both threads are trying to manipulate that variable. First we are doing an increment and then we are doing a decrement, and both threads, you see, are doing exactly the same thing.

Now we can try to execute this. What you would expect is that for every increment we do a decrement, so if we would print the value of counter at regular points in time, we would always print the value zero, one, or two, depending on when exactly you print the value. So let's try that; that's this script. You see it's not exactly the case: right now we are printing negative values, then it becomes positive again, and as we go on we diverge more and more from the zero value.

So how is this possible? Let's take this function, a very simple function with an assignment and an increment. We're going to disassemble this function. You can do that in Python by importing the dis library and then calling dis.dis and passing f, so that you see the bytecode for this function. Then you see that the increment actually corresponds to these four instructions. It's not really one instruction: an increment like += 1 consists of these four instructions. It has to do with how the Python interpreter operates on the stack. But it's very interesting: it means that if we have two threads that operate on that same variable, at any point in time between these instructions we can go from one thread to the other thread. Which means that these two threads can possibly interfere with each other, and that's exactly what's happening in this example.

The way to solve that is by using a lock. You can create a lock and surround the blocks where you modify that shared variable with that "with lock" context manager, and then you prevent these two threads from manipulating the same variable at the same time.

So what is wrong, or what is the disadvantage, of using threads? As soon as you have to manipulate shared data structures, you have to think about locking, and that's something which is very hard to get right. Either you choose to use one global lock, but then you have the risk that you lock too often: things that could go in parallel become sequential. Or you use many small fine-grained locks, like for each data structure you create one lock to protect it from being modified by other threads, and then you have the risk of deadlocking. There are ways to prevent that, like some guidelines and so on, but it's very hard to get right. And then threads have some overhead. Not a lot, so it's not really a main reason not to use threads, but still, they have a bit of overhead.

I should also mention that instead of using shared data structures, you can also not share data between threads, but use some kind of message passing. The threads can communicate over a queue, so if one thread has to pass data to another thread, you can serialize it and send it over the queue. That's a way to avoid using locks and prevent these issues, but it's also a whole different way of programming.

So let's come to asyncio. We're going to have a look at how we can do these things in asyncio. Here we also have two functions, function1 and function2, and we are going to run these in parallel and do the same thing, the same increments and decrements. Now, you see that these are not normal functions: there's async in front of the def keyword, which means it's an asynchronous function, and then there is an await as well. What the await keyword here actually means is that that is a place where it's fine to go from one function to the other. So basically we are in control of when the context switching between these two functions happens.

If we now execute this, these two will run concurrently, but the context switches between these two functions will only happen at the places where we have an await keyword. There is a sleep; that's not important right now, it's just to make this example work. In this example you see we always print zero; that's what you get when you execute this script.

So the await keyword means that it's a place where it's safe for one coroutine to move to another coroutine. That's it, basically. In practice it means a bit more than just that: it also means that you're probably waiting for some I/O to complete, because the problem that we typically try to solve with asyncio is a very I/O-intensive application. When you have many things going on in parallel, a good point to go from one coroutine to another coroutine is when you're idle, when you're not doing anything. For instance, you do a network request and you wait for the response to arrive; in between, you're not doing anything, so that is exactly the right place where it makes sense to go to another coroutine and resume the execution over there. So you will actually go back and forth between all these coroutines at the points where you have an await. You have total control over the context switching.

So basically the difference between threads and coroutines is that threads are pre-emptive, the operating system decides when to context switch, while with coroutines, like what we have in asyncio, we are in control of the context switching. And that means that most of the time we don't have to use locks, because we know which pieces of code run atomically and won't be interrupted.
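The counter examples described here can be sketched as follows. This is a reconstruction, not the speaker's slides: it shows the threaded version made correct with a lock, and the asyncio version, which needs no lock because a context switch can only happen at an await:

```python
import asyncio
import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100_000):
        with lock:            # without the lock, the four bytecode
            counter += 1      # instructions of += can interleave

def decrement():
    global counter
    for _ in range(100_000):
        with lock:
            counter -= 1

# To see the four instructions yourself:
#   import dis; dis.dis(increment)

threads = [threading.Thread(target=increment),
           threading.Thread(target=decrement)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("threads:", counter)    # 0, but only thanks to the lock

# The asyncio version: context switches happen only at an await,
# and there is none inside the loop bodies, so no lock is needed.
async def function1():
    global counter
    for _ in range(100_000):
        counter += 1
    await asyncio.sleep(0)    # a safe point to switch coroutines

async def function2():
    global counter
    for _ in range(100_000):
        counter -= 1
    await asyncio.sleep(0)

async def main():
    await asyncio.gather(function1(), function2())
    print("asyncio:", counter)  # always 0

asyncio.run(main())
```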
That's exactly where we don't have an await, and so it's much easier to get your code right. With threading there's a lot of chance that you get things wrong, that in production at some point things will start breaking. With asyncio it's easier to get things right.

Now, important to know is that all of this actually runs on top of an event loop. The coroutines are an abstraction on top of an event loop. An event loop is a very simple mapping, where you map I/O completion events to callbacks. For instance, when a network socket becomes ready for reading or writing, or when you receive mouse or keyboard events: these are events, and you specify a callback that will execute when that event happens. And that's literally this while-True loop: you wait for a file descriptor to become ready, the file descriptor maps to an event, and then you call the corresponding callback, and you keep doing this in a loop. If you keep this in mind, then you see that we only run one callback at a time, and that's really the advantage of event loops.
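The dispatch loop just described can be sketched with the standard selectors module. This is a simplified illustration, not asyncio's real implementation; a socketpair stands in for a real network connection, and a single pass stands in for the while-True loop:

```python
import selectors
import socket

selector = selectors.DefaultSelector()
received = []

def on_readable(conn):
    # Callback: runs when the socket has data ready to read.
    received.append(conn.recv(1024))

# A connected pair of sockets stands in for a real connection.
a, b = socket.socketpair()
b.setblocking(False)
# Map the file descriptor to the callback that handles its events.
selector.register(b, selectors.EVENT_READ, on_readable)

a.sendall(b"hello")

# The event loop itself: wait for file descriptors to become ready,
# then call the corresponding callback, one callback at a time.
# (A real loop would run this under `while True:`.)
for key, _events in selector.select(timeout=1):
    callback = key.data
    callback(key.fileobj)

print(received)  # [b'hello']
```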
These callbacks won't interfere with each other: we run everything in one thread, one callback at a time. Like I said, there is no complicated synchronization, no locking and so on. It's also pretty easy to debug, because if you put a breakpoint in your code, your whole event loop will freeze and you can inspect the state of all coroutines.

And this is great for handling many connections in parallel. If you have thousands of connections, like WebSocket connections, and they're idle most of the time, then an event loop is perfect, because you can wait for that many file descriptors at the same time (your operating system can do that for you), and then you execute the callback that corresponds to the incoming network connection from which you received a message. That is of course much cheaper than having one thread for every single connection.

One important thing to know, though, is that you should not mix asyncio code with traditional blocking code, the kind you find in other applications. For instance, if you use the requests library for doing network requests, which is what many people do in Python, it will block: you do a request, and your requests.get statement will wait for the response to arrive. If you try to do that in an event loop, your whole event loop will freeze, so you cannot do that. There are workarounds to still execute that kind of code, but it's best to avoid it if you can. So we should not do blocking I/O; instead we should do non-blocking I/O, by registering a callback in the event loop which will then execute when, for instance, a response arrives.

Now, these coroutines are an abstraction on top of event loops, which is pretty nice, so that you don't have to think about all these callbacks, because otherwise you would end up with very ugly code. This is an example of what a database query would look like in asyncio code. It's really hypothetical, because this library doesn't exist. We do a query, and we wait for the response, because there's networking in between, and when the response arrives, the result will be assigned to users at this point.

So the await keyword here means that we wait for the response to arrive. We return to the event loop, the event loop can then do other things in between, and when the response arrives, the event loop will resume the execution of this function. It's kind of a state machine: the await will suspend it, and later on we resume it. So we actually do two things with await: we say we can go to another function, but we're also waiting for the response.

We can also use await to call another coroutine. For instance, here we have a main function and a get_users function, and the main function is calling get_users. But get_users is an async function, which means that if you call it, you don't get the response; you actually get a coroutine object, and you have to use await in front of it in order to get the actual result. So the await keyword is very often used to await the outcome of another coroutine. You see at the very bottom there's asyncio.run: that is how you start an asyncio program. You need an event loop that gets started and that will operate these coroutines; that's what the very last line is doing. So async and await very often go together, like you see in this example.

Something else which is important to know is how to run coroutines in parallel, because that's the whole point of using asyncio: you want to parallelize stuff. One way of doing it is by using asyncio.gather. asyncio.gather is a way to wait for the outcome of multiple coroutines at the same time. In this case we spawn the get_users coroutine twice, and when both are done, the execution of the main function will go on. So actually you can think of asyncio.gather as a join in threads.
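The get_users example described here can be sketched as follows. The database library from the slides doesn't exist, so a sleep stands in for the network round trip:

```python
import asyncio

# Hypothetical stand-in for a database query; the real library
# from the talk's example doesn't exist.
async def get_users():
    await asyncio.sleep(0.1)   # pretend we're waiting on the network
    return ["alice", "bob"]

async def main():
    # Awaiting a single coroutine: calling get_users() only creates
    # a coroutine object; await actually runs it and gets the result.
    users = await get_users()
    print(users)

    # Running it twice concurrently: gather waits ("joins")
    # until both coroutines are done.
    result1, result2 = await asyncio.gather(get_users(), get_users())
    print(result1, result2)

# This is how an asyncio program is started: asyncio.run creates
# the event loop and runs main() on it.
asyncio.run(main())
```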
You're joining multiple things together. And I actually like this way of doing concurrency, because you have some kind of symmetry: you have a place where you start things in parallel, and things come back nicely together. That's what you actually want; you don't want to fire something and then forget about it. It's very important to do the join in order to get that kind of symmetry.

If you cannot do that, then you can use asyncio.create_task. That will spawn the coroutine, and execution in the main function still goes on. But if later on you decide you need the response, then you can still await the task: you can do users = await task to capture the outcome of that coroutine. I think it's pretty important not to just spawn tasks without waiting for the response, but to really wait for the response, because that way you can do proper exception handling. If there is an exception raised in get_users, you can capture it with try/except around the await statement.

Then there are also async with and async for, which are pretty nice.
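Before moving on to async with: a sketch of the create_task pattern just described, including try/except around the await. get_users is a made-up stand-in, and the failure here is simulated:

```python
import asyncio

async def get_users():
    await asyncio.sleep(0.1)             # simulated network delay
    raise ConnectionError("db is down")  # pretend the query failed

async def main():
    # Spawn the coroutine; execution of main() continues immediately.
    task = asyncio.create_task(get_users())
    print("doing other work while get_users runs...")

    # Awaiting the task later is where exception handling belongs:
    try:
        users = await task
        print(users)
    except ConnectionError as exc:
        print(f"query failed: {exc}")

asyncio.run(main())
```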
The with block you probably know as a context manager: that's code that you execute before and after a block, very typically used for establishing a connection and closing the connection, or allocating resources and releasing resources. That's what a with block does. async with is what you use if either of those two steps, the enter or the release, involves async code, and here that's the case: establishing a connection has to be asynchronous, because it takes some time to establish the connection, and you need an async with in order to do that.

Then async for is something you typically see when you're consuming an asynchronous data stream. For instance, you select the users from a database table; the entries are all transmitted over the wire, over the network, and you receive them in chunks, not all at once. You don't want to wait until the whole response arrives before you start processing; you want to start processing things as soon as possible, as they arrive. That means that every iteration of the for loop possibly involves waiting for a network response to arrive, and so it needs to be asynchronous.
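Both constructs can be sketched with a simulated connection. The Connection class, its methods, and the row data are all made up; sleeps stand in for the real network waits:

```python
import asyncio

class Connection:
    # Hypothetical connection object: `async with` calls these two
    # methods, which may themselves await (e.g. a TCP handshake).
    async def __aenter__(self):
        await asyncio.sleep(0.1)      # simulate connecting
        return self

    async def __aexit__(self, *exc):
        await asyncio.sleep(0.1)      # simulate closing

    async def select_users(self):
        # An async generator: each chunk "arrives" after an await,
        # so the consumer processes rows as they come in.
        for row in ["alice", "bob", "carol"]:
            await asyncio.sleep(0.1)  # simulate waiting on the wire
            yield row

async def main():
    async with Connection() as conn:
        async for user in conn.select_users():
            print(user)

asyncio.run(main())
```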
That's when you have an async for. These two are typically things you have to use with many asyncio libraries.

Then there are executors, which is also something good to know about. This is something you use whenever you have to run traditional blocking code. It could be a situation where you have a blocking I/O library, like requests, or a situation where you have very CPU-intensive code; in that case you would also block the event loop, because it's something very computationally expensive. You can run it in another thread and have the main thread, the main event loop, go on and respond to incoming connections. So in this case the executor will run the code on another thread or in another process, and await can still be used to wait for the outcome of that piece of code.

Then one warning: don't turn every call into an async call. That's what people often get wrong when they start with asyncio. Some people think that at some point every function will become async, because at some point every function will involve doing I/O. Often that is the case, but doing I/O doesn't mean that you have to wait for the I/O. For instance, take the case of logging to a remote server. You don't have to wait for these logging messages to actually be transmitted: you want the logging to happen in your server, but you want your execution to go on as soon as possible. So instead of using an await in front of your logging calls, it's best to implement your logging framework in a way that pushes all your logging messages onto an async queue, and somewhere else you have a coroutine that consumes the queue and flushes the messages over the network. That way your functions don't have to be async, not necessarily.

Something else here which I think is important: we should try to separate the code that is doing I/O from the code that is doing computational stuff, if that's possible. For instance, if you are writing a parsing library, where you are parsing data, that can be written completely independently from the I/O layer. If you can separate these things, then you can have code that is completely synchronous, and asynchronous code around it that calls the synchronous code, and that's totally fine. So try to avoid ending up in a situation where every function becomes async.

This is the next example. I'm not going to say a lot about this, because later today Tom Christie will talk about HTTPX, which is an asyncio library for doing HTTP calls. So if you're interested in that, stay for a few more talks and we will discuss this. But you see that here as well we have the async with block for establishing the connection, and then the await to wait for the response. That is something you very often see with these async libraries.

Then something else which is good to know: if you're experimenting with asyncio, you cannot just use an await keyword in your interactive shell; if you try that, you get a syntax error, because you can only use await in an asynchronous function. That's very cumbersome for trying out asyncio code, but there's a solution: IPython supports asyncio integration, which means that in IPython you can use the await keyword at the top level, and IPython will ensure that the event loop runs. So it's very great for experimenting. Or you can do python -m asyncio; it's basically the same thing, but with the normal Python shell.

So, to conclude on asyncio.
It's a pretty great concurrency pattern for I/O-heavy applications. Not so much for CPU-intensive applications, but if you're dealing with a lot of I/O and many connections, then asyncio is great. It's not the easiest to begin with, honestly, but very often, when things become more complex, it is much easier than threading to get right. And then the important pitfalls I would like you to remember: try not to mix asyncio code with blocking I/O, unless you use an executor, and try not to turn every call into an async call. So that's what I have. I don't know whether there are any questions. You can find me on Twitter and on GitHub.

(Audience question, inaudible.)

Sorry? Yeah, this one. gather can take as many coroutines as you want; it takes them as positional arguments. You can pass as many coroutines as you want to gather.
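To illustrate that answer, a minimal sketch (the work function is made up):

```python
import asyncio

async def work(n):
    await asyncio.sleep(0)
    return n * 2

async def main():
    # Any number of coroutines can be passed positionally;
    # the results come back in the same order.
    results = await asyncio.gather(work(1), work(2), work(3))
    print(results)  # [2, 4, 6]

asyncio.run(main())
```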