Trio: structured concurrency for Python.

Before we talk about concurrency and Trio and all of that, let's go into a little bit of history. Back at the dawn of time, when dinosaurs roamed the earth and I was very, very young, there was a control flow mechanism called goto, and goto was the way you went from one place to another in your software. Here is a language called BASIC. It's the first programming language I ever used. It's not widely used today, but it was very popular in the 70s and early 80s. And here is a BASIC program, a very simple, straightforward one. It just counts from one to ten and prints out a line with each of those numbers on it, and then it prints "all done" when it's finished. To prove to you that it works, here is the output. I apologize for the quality of the Commodore 64 screen; it's an old cathode ray tube, and downscaling it on the slides doesn't seem to work all that well. But you can see it, and if you want to, you can check that the program that's running is in fact the one I have nicely syntax highlighted above. The way this works is that on line 40, if your loop variable i is less than or equal to 10, you go to line 20. But other than the number next to the goto, there's no actual link. The only time you know that line 20 is a place your program can jump to is when you get to line 40 and read the goto. Still, this is a very simple program. It's fairly straightforward.
It's only five lines long. So let's look at a more complicated one. This program has three gotos in it, so it's not actually all that complex by programming control flow standards, but I'd be willing to bet that very few people here can actually read and understand it without drawing diagrams or stepping through it or executing it. I know that I certainly can't, and I'm the person who wrote it. What it does, as you can tell from the input prompt at the top, is print out some number of primes. So let's run it, printing out the first ten primes, and there you go; somebody may want to confirm that those are in fact prime numbers. This program has three gotos in it, they all jump to different places, and it's a mess. It's a very simple algorithm, basically three nested loops, but nobody is going to follow it easily.

Fortunately, we were rescued from goto by the invention of structured programming, which I'm a little too young to have lived through; it happened around the time I was busy learning what the buttons on a keyboard did. In modern languages we don't use goto; we use for and while and if, and we have nice blocks. Python has particularly nicely visible blocks. So, the first program I showed you in BASIC printed out ten lines; here's the equivalent in Python. Much easier to follow, no gotos. You can tell that the indented print is the thing that happens in the loop, the "all done" happens at the end, and you can tell immediately where the bit that repeats starts and ends. And here is the equivalent of the prime numbers program. It's not complicated.
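The slide's Python primes code isn't reproduced in the transcript, but a minimal stand-in (my own sketch, not the slide's exact program) using a for loop with an else clause might look like this:

```python
def is_prime(n):
    # Try every candidate divisor; the else block runs only if
    # the loop completes without hitting `break`.
    for d in range(2, n):
        if n % d == 0:
            break          # found a divisor: n is not prime
    else:
        return True        # no break: no divisor found, so n is prime
    return False

print([n for n in range(2, 20) if is_prime(n)])
# → [2, 3, 5, 7, 11, 13, 17, 19]
```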
It's just not that easy to follow. But at least you can tell where the loops are, and you know that the stuff at the beginning, before the while, is stuff that isn't going to repeat, and the stuff inside the while is the outer loop, and then there's an inner loop, etc.

As an aside, one of my favorite little bits about this program is the else block on the end of a for loop. You can do that with while loops as well. Anything in that else block runs after the loop, but only if you don't break out of the loop. So here it runs for prime numbers, because we haven't broken out of the loop; it doesn't run if you break. It's a neat way of knowing whether your for loop ended successfully or whether it broke. "else" is a really, really horrible word to use there, though, which is probably why this construct isn't used more often. If anyone wants, I can switch from sharing the presentation to sharing a terminal window and run these programs to prove that they really are equivalent, but we can maybe do that later, because switching the presentation around is a bit tedious and annoying.

Getting rid of goto didn't just give us execution flow that humans can reason about; it did a whole lot more than that. It let us have call stacks and local variables in our functions. The reason for that is that goto doesn't just jump around inside a function; it can jump to anywhere in the code. You can jump from the middle of one function to the middle of a completely different function. And if you can do that, then you can't rely on local variables, because they may not have been set; you've got the ones from the previous function instead of the new function. Your call stack doesn't make sense anymore, because you're now in a function that was never called. So getting rid of goto lets you have these things. You can have exceptions and nice error handling, because now you've got call stacks.
You can unwind them, you can jump to your caller, you can wrap a try/except around some nested piece of code and catch any error that happens inside it. Something that Python does particularly well is that you can have context managers and with blocks. With "with open(filename) as file" you can open a file for the duration of a block, and then no matter how you leave that block, whether by reaching the end of it or raising an exception or anything else, you're guaranteed that when your program flow leaves the block, the file will be closed. And you can do this with anything. So those are all the wonderful things you get just by not having goto anymore.

That's some history, and that's the "structured" part of the name of this talk; now let's talk about concurrency. So what is concurrency? Here are some words that spring to mind when somebody mentions concurrency, and I can throw in a few more: race conditions, philosophers with strange habits and cutlery, etc. All of these things are associated with concurrency, but they're not actually what concurrency is. At its core, concurrency is having multiple independent flows of execution. Instead of having one flow of execution through your program, where you go from line 10 to line 20 to line 30 to line 40 and then jump back to line 20, and so on, you can have multiple of those happening, maybe at the same time, maybe switching between them, maybe interleaved. The ordering of those things doesn't matter, because they're independent.
They can happen in any order. The thing that makes it interesting and difficult is having communication between those flows of execution, because if they don't talk to each other or know about each other, they may as well be different programs running on different computers in different cities, doing different things that don't know or care about each other at all. Technically, my program running and your program running are concurrent systems, but they're completely disconnected from each other, so it doesn't matter. When you've got communication, it gets interesting and useful, and also difficult.

Concurrency has its own goto, which we use a lot. I'm going to call it spawn. Different languages call it different things; the name that's closest to goto is in the language Go, which calls its spawn primitive "go". But with "go", "goto", and Go the language, and "go" being a very common word, it gets confusing to talk about, so I'll stick with spawn. This is Erlang, which is a language from the 80s, so about 10, 15, maybe 20 years after BASIC with its goto. Don't worry too much about the syntax.
You're not expected to read and understand this program at a glance, but there are two main parts to it. The first bit, which is now highlighted, is a recursive function which, every time through the loop, prints out a message and then sleeps for 200 milliseconds, then prints out the next message. It counts downwards rather than upwards like the BASIC program did, just because that's easier to do with recursion and a little less code. By adding a few lines I could make it count upwards instead, but that's just extra complexity. The big thing that makes it concurrent is in the main function: we spawn it once with "task one" as the message, then spawn it a second time with "task two" as the message, and then we just wait for them to finish. So I will actually switch away from the presentation and run this, if I can find the right window. There it is: here is that Erlang program running. Maybe it's running a little fast, so let's make that a thousand milliseconds instead of 200, and let's make that sleep six seconds. There we have it: both tasks running, counting down simultaneously, and at the end we finish.

asyncio, which is the standard library async concurrency framework for Python, calls it create_task instead of spawn. So here's the equivalent of the Erlang program written using asyncio. We have the function that prints the message every 200 milliseconds, sleeping in between, and we have the main function which spawns two tasks and runs them concurrently. But it doesn't just spawn the two tasks: it first runs the function once, synchronously in a sense, and waits for that to finish, and then it spawns two tasks in parallel.
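The slide's asyncio code isn't in the transcript; a minimal sketch of that pattern (my own names and shortened timings, recording messages in a list rather than printing so the interleaving is easy to inspect) might be:

```python
import asyncio

log = []

async def countdown(name, n, delay=0.01):
    # Record a message and count downwards, sleeping between messages.
    while n > 0:
        log.append((name, n))
        n -= 1
        await asyncio.sleep(delay)

async def main():
    # Spawn two tasks; the event loop interleaves them.
    t1 = asyncio.create_task(countdown("task one", 4))
    t2 = asyncio.create_task(countdown("task two", 2))
    await t1
    await t2

asyncio.run(main())
print(log)
```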
So those two things will happen at the same time, but only after the first one has finished. The important thing here is that a task is just an object that can be passed around, and there's no link between the place the task is created, where it starts running, and where you wait for it. So you can have things that get lost and go missing, and tasks that get orphaned. Here is one of the big problems with async programming, in Python especially, but in pretty much any async system: where do errors go? We have an asynchronous function that waits for a little bit and then throws an exception, and we call it in three different ways. For the first one, we call create_task, assign the task to a variable, and then we just never wait for it. So the error from this task: nothing ever catches it. There's nowhere else in the program for it to go, because we never wait for it to finish. The error for this one will get lost. What Python does is that when the task object gets garbage collected, which can happen at some arbitrary time in the future, a runtime warning gets logged. And if you're not looking at your logs, well, you don't know. It's not even an error; it's a warning.
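A sketch of how an error can sit unnoticed inside a task object until the task is finally awaited (the function names here are my own, not the slide's):

```python
import asyncio

async def raise_error():
    # Wait a little, then blow up.
    await asyncio.sleep(0.01)
    raise ValueError("boom")

async def main():
    # The task is created here...
    task = asyncio.create_task(raise_error())
    await asyncio.sleep(0.05)   # the error has already happened by now
    try:
        await task              # ...but it only surfaces when we await
    except ValueError as e:
        return str(e)

result = asyncio.run(main())
print(result)
```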
With that first one, we just don't know what happens to the error. For the second one, we create a task and we do await it, but only later. So it raises its error, the error just sits there waiting in the task object, and then when we await it right at the end of the program, we get the error and it gets raised. But again, there's no link between where the task is created and where we wait for it. So if we create this task in some deeply nested, complicated piece of code, none of that shows up in the stack trace. The only things that show up in the stack trace are the function that raised the error and the main that called the await. It's a little more complicated than that, but basically stack traces are very confusing and often have the wrong things in them when you're dealing with async exceptions. The only one that actually makes any sense is the third one, where we create the task and immediately await it. But if we're going to do that, we may as well not create the task at all; we may as well just await raise_error() directly, because we're not doing anything concurrently there, we're just sequentially running stuff in another task.

So this is one of the big problems with async stuff. Not the only problem, but one of the big ones, and it's kind of equivalent to some of the problems we saw with goto: you can't do proper error handling, and there's no way to know where your control flow came from at any point in the code. This brings us to structured concurrency, which I'm not going to show an example of here, because that's what the next section of the talk is for. But the core idea in structured concurrency is that any task spawned in a block of code must finish in that same block of code. There's no way to spawn a task and then move on to do something else, so that your program is somewhere completely hidden away, in a completely different part of the program, when your errors happen or when your tasks finish. It's a very simple idea, but it's very powerful.
It's the same kind of idea that you get from goto not being able to jump from the middle of one function to the middle of another function. The advantages of structured concurrency, as we'll see: first, you get execution flow that humans can reason about, the same as with removing goto. We can now see what our code is doing; we know where all the concurrent tasks are happening. We can have nested contexts and task-local state, which is kind of like having a call stack and local variables, only with concurrency. I can spawn a task, and that task can spawn its own tasks, and my spawned task will wait until its tasks are all finished before it finishes. Everything is nicely ordered and structured, there is always somewhere for errors to go, and we always know what state our program is in. Which brings us to well-defined cancellation and error handling: if one of your subtasks raises an exception which isn't caught, that bubbles up to the block that spawned it, and then everything inside that block gets cancelled, because that's what happens when you get an error: your block stops, and then you get that exception raised. More detail on that in a bit. But most importantly: no orphan tasks or missing results. You never have code that just goes on forever after everything has lost track of it, where nobody knows what it's doing and it could be doing anything.

The third part of the title of this talk is Trio, which is the library we'll be using. Trio is built on the ideas of structured concurrency. The best way to learn about Trio is to read the documentation, so I think we're done here; you can all go off and read the documentation and everything will be great. The docs are actually excellent, so you could just go and do that. That's what I did. But since we still have a few minutes left of this talk, I will actually talk about Trio some more. So, a brief intro to async programming.
You've got async code that can call sync code, as long as you don't call anything that blocks. And sync code can't call async code. "Sync" and "async" sound way too similar when I'm saying them, but anyway: synchronous code can't call asynchronous code, unless you're writing a framework like Trio or asyncio, and then there are some special mechanisms to make that work. Here are some ways you can get async code wrong. The first is to try to await something in a sync function; you'll know immediately, because the compiler gives you a syntax error. You're not going to miss that in your code. The other thing you can get wrong is to forget to await an async call. Here we're calling an async function, but we're not using await. For this you get a runtime warning logged, and what actually gets printed there is the representation of a coroutine object, which is kind of an implementation detail. This is unfortunately a symptom of the fact that the concurrency primitive in Python is basically spawn, and we're building the structure on top of spawn. It's the same as building your for loops and while loops and if blocks on top of goto: if goto is still there under the hood, you can get some of the benefits of not having goto, but you can't get all of them. You can't trust that there won't be some code somewhere that uses goto and breaks everything. But Trio does provide some tools for debugging this kind of thing. I'm not going to talk about trio.abc.Instrument any more in this talk, but it is incredibly useful, and the docs explain how to use it. So if you find yourself with code that seems to be doing weird stuff, it might be a good starting point for debugging.

So how does Trio actually do structured concurrency? The main concept here is called the nursery.
Instead of having the equivalent of asyncio's create_task, which you can call anywhere, the only way to create a background task is to use a nursery. In this code, as before, we have our task function that just prints a message n times, and in our main function, instead of using create_task, we use nursery.start_soon, inside an async block with the nursery: "async with trio.open_nursery() as nursery". What that does is that any tasks started by this nursery will cause the block to block; the block doesn't end until all the subtasks are finished. You'll notice we never await anything in this block, but both tasks, task one printing its four lines and task two printing its two lines, will finish before we leave the block and get to the "all done".

And to answer Bruce's question, because this is a good time for it: if you want separate, sort of overlapping lifespans for tasks, the way you can get that is that a nursery is an object, and you can pass a nursery into a subtask. So if you want a task that's allowed to create sibling tasks, you can pass the nursery to that task, and it can create tasks that outlive itself, but they don't outlive the nursery. And it's explicit: you can't do that without explicitly passing the nursery to your subtask. So there is a mechanism for it, but the lifetime of your tasks is still limited; they can't outlive the parent nursery. It's just that you can pass the parent around and have your subtasks create siblings instead of their own children. I haven't prepared an example for that, but it's straightforward enough to figure out if you play with it.

A really nice consequence of this is timeouts. If anyone has ever tried to put a timeout on an HTTP request, you will understand the pain. There are multiple different kinds of timeouts.
Are you talking about a connection timeout, so that you stop trying if you haven't made a network connection within some amount of time? But once the connection is made, well, the server can take three hours to send your response, and you just have to sit and wait. Or maybe you have a timeout on the whole request, or on the amount of time between chunks of data being received. It gets really complicated. And what if you want to make three requests sequentially and have one timeout for all of them? Well, now you have to figure out how long the first request took and subtract that from your timeout, and so on. Really, really nasty.

With Trio, you use a with block. The first bit of code here is "with trio.move_on_after(3)", so the entire body of that block will take at most three seconds. Here we've got four seconds' worth of stuff: we go to sleep, we wait a little bit, we print "still sleeping", we wait a little bit, and then we'd wake up on our own. After three seconds, that block will be cancelled, and because it says move_on_after, we don't care that it didn't finish; we just care that it didn't take too long. So no exception will be raised; we'll just move on. The next one is fail_after, which is the same as move_on_after except that instead of just moving on, it actually fails. This example also has a nursery, and because the nursery is inside the timeout block, the timeout applies to all the subtasks as well as the body of the nursery. And because we use fail_after, we get an exception if we take too long. This simplifies timeouts immensely: all you need to worry about is how long you want this block of code to take at most, and then you can do whatever you like inside it and be guaranteed that your entire block won't take longer than that.

And I mentioned things being cancelled: the timeout is kind of a special case of errors and cancellation. Cancellation is basically killing a task.
Cancellation is a little more complicated than that under the hood: there are checkpoints, places where Trio, or whatever framework you're using, gets to interrupt your code and raise exceptions. Basically, any time you call a function in the trio namespace, or an async function in the trio namespace, there's a checkpoint inside it where things can be cancelled. If you're just writing code that uses Trio, all you need to worry about is that, as long as you're not CPU-bound, and as long as you're awaiting something in the trio namespace somewhere in your call chain, your stuff can be cancelled. Any time you've got an exception in any of your subtasks, that exception sort of bubbles up, and whenever it reaches a nursery, all tasks inside that nursery are also cancelled, and then you can catch exceptions as normal around the nursery. Trio also handles the case where multiple tasks raise exceptions simultaneously, which is also a little more complicated than will fit on one slide, but the docs do a really good job of explaining it, and in general you don't need to worry about it too much unless you're doing multiple different things concurrently in a nursery.

Trio also comes with a whole lot of other features. It's got task-local storage, like thread-local storage except for tasks. It's got mechanisms for communicating between tasks: events and channels are the main ways, but there are also lower-level things, locks, semaphores, etc., if you need them. You can use asynchronous generators, but because of the way they interact with the low-level, sort of spawn-level stuff in Python, you have to be really careful with those.
So it's probably best to avoid them unless you really need them, and then be careful. You can use threads if you must, and that's actually the title of the section of the documentation that talks about them: "threads (if you must)". You can have async filesystem operations, which are really difficult to do properly because kernels generally don't provide good tools for them; Trio cheats and basically just wraps synchronous filesystem operations in threads under the hood. But it's a good idea to use those, to avoid maybe blocking for 30 seconds if you have a very busy disk or something. You can have subprocesses; all the good things that any async framework provides.

So, in conclusion, what have we learned? Firstly, we've learned that goto is bad, but hopefully we already knew this, and in fact if Python is our language of choice, we don't even get goto: it has never been in the language. Just like goto, spawn is bad, no matter how you spell it. And structured concurrency is great. Hopefully I've given you enough of a taste of it that you're interested in trying it out, and if so, you should definitely use Trio for your next async project.

A personal anecdote: we have a system that talks to Let's Encrypt to manage certificates for a cluster. It was built with Twisted, because we built it about five years ago, and it worked great until Let's Encrypt retired their version 1 API, and the library we were using didn't support version 2. It was quicker and easier to learn Trio and reimplement the whole system in Trio from scratch than it was to try to update the library to use the slightly different v2 API and deploy a new version of the thing we already had. So: Trio is great. And that's the end.