Okay, so before I introduce myself, I want to tell you a quick story. Recently I interviewed an engineer for a position who had a Node.js background. And Node.js is really great, but this isn't a Node.js conference, so don't worry. He had a lot of Node.js background, and I asked him, "So what do you like about Node.js?" And he said, "Well, it's single threaded, it's awesome." And I said, "Okay, what does that mean?" And he couldn't explain it. So he didn't get the job, but the important part of the story is that it made me realize that maybe there are developers out there who are either using concurrency and async technologies or want to use them, and they don't really know what they mean, and they're kind of scared of this whole topic. It's telling that this room alone hosted three talks about asyncio just today, which is pretty significant and shows that a lot of people are interested in this. So in this talk, I'm really going to try to demystify what this is and why it's relevant within the context of web development. So hi, everybody. My name is Amit Nabarro. I come from Israel. This is a photo of Tel Aviv; it kind of looks like Rimini, I suppose. It's really hot right now, but it's a lot of fun. I work for 475 Cumulus, which is a consulting agency, and I rant on my Twitter feed over there. Okay, so how many people here are doing web development with Python? Okay, that's a good number. So you probably know at least one of the frameworks now on the screen. Django is probably the most prominent one, or at least the most widely used, but there are all kinds of others, like Pyramid and Flask; it's actually a really long list. And first of all, those are frameworks, not web servers. If anyone has ever made the mistake of confusing the two: those are libraries.
But the important thing is that they all have one thing in common: they all implement WSGI. WSGI is a standard which was first established in 2003, and basically it's a glorified CGI, if anyone is old enough to know what that is. The whole point of WSGI was to create a specification so that Python web frameworks could easily work under production web servers. The way WSGI works is that you write your app in a framework which supports WSGI, and then pretty much any production web server, like Apache or Nginx, can hook up to it and serve your application. You don't really need to know how that works, and you don't really need to care; you can just focus on the application logic. So what's wrong with WSGI? Well, actually nothing is wrong, so thank you all and have a great conference. I'm kidding. There are two things that are wrong with WSGI. First of all, it's synchronous, in the sense that it can't handle multiple requests "at the same time". I'm putting that in quotes because it's not entirely true, but we'll talk about it in a minute. And the second thing, which is probably more important, is that it only supports the HTTP protocol. It doesn't support any other protocol. So let's dive into that a little bit. If you are writing a WSGI-based application, this is what your flow looks like. You have a WSGI web server. For every request that comes in, the WSGI server creates a thread, an operating system thread. Then this thread processes your request: your code runs in this thread, you handle it, you go to the database, you do whatever your application logic does. And when the response is ready, it's sent back to the client. During that time, that thread is blocked.
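To make that concrete, here is a minimal sketch of what the spec (PEP 3333) asks a framework to produce; the greeting and the path handling are made up for illustration:

```python
# A minimal WSGI application: a callable that receives the request
# environment dict and a start_response callback, and returns an
# iterable of bytes. Every framework on that slide ultimately boils
# down to one of these.
def application(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

# Any WSGI server (Gunicorn, uWSGI, Apache's mod_wsgi) can host this
# callable; the stdlib wsgiref server is enough for a local check:
# from wsgiref.simple_server import make_server
# make_server("", 8000, application).serve_forever()
```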
So if you were to run a WSGI server with a single thread, and that's what you're doing when you use Django's runserver, which is a single-threaded web server, you can basically only process one request at a time. If you're running a production web server like Gunicorn or uWSGI, then potentially you can create as many threads as you want. But if you only have a few cores on your machine, it doesn't really matter how many threads you have; you are limited by the number of CPU cores. I said earlier it's not exactly synchronous; well, all your code runs synchronously, one request after another, and you really don't get a lot of options there. So how do we solve this? I mean, there are some huge systems out there: Instagram runs on Django, the Washington Post runs on Django. How are they processing so many requests? Well, what they do is they scale. They just run more servers and more servers, and they optimize their code, and they run more and more servers, and that's a pretty good approach. It solves your problem. The only disappointing thing about this approach is that it's very linear. Depending on your code and what you're doing, let's say a single thread or a single core can handle 100 requests per second. If you need to handle a thousand, then you need 10 of those. And if you need to handle more than that, then you need to duplicate your server. So the scaling is very linear, and you cannot actually target your pain points and bottlenecks. That's one of the biggest problems of WSGI. The second problem is that it only supports HTTP. HTTP, for anyone who doesn't remember, is a stateless protocol. The client sends a request, the request is processed, a response goes back to the client, and that's it. There's no connection anymore.
And if the protocol is stateless, then it's very difficult to create stateful communication between clients and servers. Stateful communication, or bidirectional communication, or whatever you want to call it, is a pretty hot item; everybody wants to do it today. And HTTP just doesn't support it. So there are all kinds of workarounds: you can do long polling, you can do server-sent events, I don't even know what that is; really nasty stuff. So we've reached kind of a glass ceiling. WSGI is wonderful; it allows a huge community of engineers all over the world to quickly build web applications with Python. But for some apps, it's just not really relevant anymore. So there is a solution, and that solution is concurrency. That's a hell of a word; let's see what it means. According to Wikipedia, concurrent programming is a form of computing in which several computations are executed during overlapping time periods, concurrently, instead of sequentially. That's what Wikipedia says, and we all know that Wikipedia is never wrong, so we're going to take that for granted. But what does it actually mean? Let's see if we can use this diagram here. Before I talk about the diagram, I want to make a statement, and my statement is that most of what web applications do is perform I/O operations. They go to the database to fetch data. They go to the cache to fetch data. They go to the file system. They send HTTP requests to other servers or microservices or whatnot. These are all I/O operations, and they are I/O intensive, but not necessarily CPU intensive. So what ends up happening is that those web servers mostly just wait for things to happen. We have these really powerful CPUs.
And most of the time you just wait for I/O to go out to or come back from some other place. So looking at the diagram I'm showing here: if we had a single-threaded WSGI server and it got four requests from four different clients, it would process them sequentially. It would first do the blue one, then the orange one, then the green one and, you guessed it, the purple one. Concurrency is all about doing it differently. Instead of doing this, we want to do this. What does that mean? It means: I am sending an I/O request to my Postgres database, and while I am waiting on a reply from my Postgres database, instead of just waiting, I can do something else. I can handle another request, which maybe wants another Postgres query, or maybe wants me to get a file from the file system. So instead of handling all those requests sequentially, we sort of mix them together and hand over control from one request to another, in order to optimize what the computer is doing. Here's another way of thinking about it. In the bottom video, you see an eight-year-old request and a six-year-old request happening sequentially: first the peach one and then the white one. And in the top video, you see them happening concurrently, right? They're both processed at the same time. By the way, they're very proud to be here today, just to let you know. So, okay, good: you're saying we can process one request while waiting on another request? Sounds good, but how do you do that? Well, there's only one secret ingredient to this; don't let anyone tell you differently. You have to explicitly give up control. I know people don't like the term "give up control", but I think that within our context, it's a very good thing to have. And the emphasis here is really on the word explicit.
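That interleaving can be sketched in a few lines; the request names here are taken from the diagram, and asyncio.sleep stands in for the real database or file-system waits:

```python
import asyncio

# Four "requests" whose I/O waits are simulated with asyncio.sleep.
# Run sequentially they would take ~4 x 0.1s; run concurrently the
# waits overlap and the whole batch finishes in ~0.1s.
async def handle_request(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.1)   # stand-in for a Postgres query / file read
    log.append(f"{name} done")

async def main():
    log = []
    await asyncio.gather(*(handle_request(c, log)
                           for c in ("blue", "orange", "green", "purple")))
    return log

log = asyncio.run(main())
# All four starts are logged before any completion: the requests
# really were in flight at the same time.
print(log[:4])  # → ['blue start', 'orange start', 'green start', 'purple start']
```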
While you run a query on your database, in your code you will explicitly hand over control to some other request that comes in to do its thing, knowing full well that someone else will eventually relinquish control and give it back to you once your query returns from the database. I know that's kind of scary for first-timers, but let's see if we can clear it up. If you've done some JavaScript in your professional life, then you probably know this pattern here on the left, where I use jQuery to get some data from a web server. This is pseudocode, by the way. As soon as the data comes in, I have a callback, an anonymous function in this case, and I do something with the data once it comes back. Everyone who's done JavaScript before knows that, looking at the code on the left, doSomething will execute after doSomethingElse, because doSomethingElse is called immediately after get is invoked, while doSomething is called as a callback once the request finishes. In Python, we can do something very similar. It depends on which version you're running, but in Python 3.5, and I'm sure you've heard this before, we have those new cool keywords: async and await. Essentially this is kind of similar; not exactly the same, but kind of similar. I call a function called fetch_data, and fetch_data isn't a regular function: it's what's called a coroutine. And if you remember, earlier I said we relinquish control. We say: okay, I sent that request, now I'm going to wait for it; somebody else here, go ahead, use the CPU. And once my data is fetched, I can do something and then do something else. In this case, do_something is executed before do_something_else, because in Python, unlike ES5 JavaScript, we have an easy way to write asynchronous code in a structured way.
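A runnable version of the Python side might look like this; fetch_data and the two "do something" steps are illustrative stand-ins, with asyncio.sleep playing the part of the network call:

```python
import asyncio

order = []

async def fetch_data():
    # Stand-in for a real network call; the await below is exactly
    # where this coroutine hands control back to the event loop.
    await asyncio.sleep(0.01)
    return {"user": "amit"}

async def main():
    data = await fetch_data()          # suspend here until data is ready
    order.append("do_something")       # runs first, with the data in hand
    order.append("do_something_else")  # runs after, unlike callback style

asyncio.run(main())
print(order)  # → ['do_something', 'do_something_else']
```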
ES6 JavaScript kind of copied every cool thing from Python, so now they have await and yield and whatnot. But those concepts have been around for a while in Python. Before 3.5, we used yield from rather than await, and we used the @asyncio.coroutine decorator rather than the word async. So how does this work? Earlier, I said that a WSGI server operates under the assumption that every request creates a new thread and the request is processed by that thread, meaning that if you get a lot of requests, you get one thread per request. There are two problems here. Problem number one is that thread management is done by the OS, not by you, which I guess is a good idea most of the time. The second problem is that thread creation is expensive and limited. Creating and destroying a thread is an expensive OS operation; it takes time. And a single instance or server is limited in the number of threads it can create. Therefore, if your server has to handle 50,000 requests a second, it's going to run out of threads at some point. The way concurrency works is that everything is handled on a single thread. And unlike that interviewee, I'm going to actually try to explain it to you. What concurrency uses is a concept called an event loop. An event loop is driven by a mechanism in the operating system kernel; in Unix it's called epoll (I forget what it's called on Windows). Essentially, it means that we can create functions, but we don't actually call them. We just declare them, stick them into the event loop, and at some point they're going to be pulled out, executed, and shoved back into the event loop for our code to receive the result. Here on the right, we see all the different kinds of operations which we can do asynchronously: file system access, data store access. A request comes in, it creates a coroutine, and it shoves it into the event loop.
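That declare-enqueue-execute-re-enqueue cycle can be sketched with a toy loop built on plain generators. This is an illustration of the idea only, not how asyncio is actually implemented (the real loop sits on epoll/kqueue and separate ready/scheduled queues):

```python
from collections import deque

# Toy event loop: a queue of generator-based "coroutines". Each yield
# plays the role of await -- the task gives up control and goes back
# to the end of the queue until it finishes.
def run(tasks):
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            step = next(task)      # run until the next explicit yield
            trace.append(step)
            queue.append(task)     # not done yet: re-enqueue it
        except StopIteration:
            pass                   # coroutine finished, drop it
    return trace

def request(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"        # explicitly hand control back

trace = run([request("A", 2), request("B", 2)])
print(trace)  # → ['A:0', 'B:0', 'A:1', 'B:1'] -- the two requests interleave
```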
And the event loop has a queue, right? It has a queue of handles to coroutines which it processes, one at a time, until a coroutine explicitly gives up control. Let's go back to that giving-up-control thing. As soon as you call await, or yield from, or whatever the mechanism is in your language, you basically tell the event loop: okay, I'm going to wait now. Go ahead, run those other coroutines in your queue, and come back to me when my data is ready. To me, that's kind of like good citizenship: you say, I'm not going to waste those shared CPU resources; I'm going to wait on my thing, knowing that I'll get the CPU back once my data comes back. So essentially, that is how event loops work in a nutshell. It's a little bit more complicated than that, but we don't have time for that today. And that really solves the problem of processing a lot of requests at the same time. The second thing which is really nifty and cool is this thing called WebSockets. We're going to go back to Wikipedia here: WebSocket is a computer communication protocol providing full-duplex communication channels over a single TCP connection. First of all, do not mistake WebSockets for regular sockets; they're not the same thing. WebSocket is a protocol on top of HTTP which was created for a single purpose, and that is having bidirectional communication with a browser. That's it. And how does that actually work? You have a client. The client makes an HTTP request to the server with a request to upgrade: I want to upgrade my relationship with you from single-direction to bidirectional communication. From that point on, both server and client can send messages back and forth to one another, and each side can also terminate the connection if needed. That's really the shortest explanation of WebSockets ever.
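One concrete, verifiable piece of that upgrade is the handshake itself. Per RFC 6455, the server proves it speaks WebSocket by hashing the client's Sec-WebSocket-Key together with a fixed GUID and returning the result in the Sec-WebSocket-Accept header:

```python
import base64
import hashlib

# RFC 6455 handshake: the client sends an HTTP GET with
#   Connection: Upgrade
#   Upgrade: websocket
#   Sec-WebSocket-Key: <random base64 nonce>
# and the server answers "101 Switching Protocols" plus the header
# computed below.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed by the RFC

def accept_key(client_key: str) -> str:
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# The worked example from RFC 6455 itself:
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```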
Okay, so let's talk a little bit about what's available for us. When it comes to libraries that do concurrency, we have Twisted and Tornado, which are pretty old. By the way, Tornado is just absolutely fantastic, and if you are forced to use Python 2, then Tornado is probably your option. Otherwise, I would suggest you use asyncio. It's part of the standard library, it has a growing ecosystem around it, and it's very promising. If you need a web framework (let's not confuse frameworks with libraries), then on top of asyncio we have Sanic, which I was exposed to for the first time here at this conference, and I was really impressed. If you come from Flask, Sanic would be really nice. aiohttp is also a fantastic option; I've been using it for a while now, and I'm very happy with the way it performs and the way it moves forward. And if you are on Django and going non-WSGI is really not an option for you, then Django Channels seems to be a good compromise, and I suggest you look into it. Okay. The advantages: efficiency, which is kind of self-evident from what I just explained; instead of waiting for something to happen, you can do something else in the meantime. It helps with the C10K problem; you can look it up on Wikipedia, but essentially it describes what happens when your server has to handle 10,000 concurrent connections. You can spawn tasks, and you can do it easily without a system like Celery. Not that Celery isn't amazing, but if you just need simple stuff, you can do it without Celery. And obviously it gives you bidirectional communication; if you want to improve the user experience of your apps, bidirectional communication is kind of a must-have. Pitfalls: very hard to debug, even worse when you have to test; if I had more time, I'd show you.
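As a sketch of that "spawn a task without Celery" point, a handler can kick off background work on the same event loop with asyncio.create_task; the signup/email names here are invented for illustration, with asyncio.sleep standing in for the real SMTP I/O:

```python
import asyncio

results = []

async def send_welcome_email(user):      # hypothetical background job
    await asyncio.sleep(0.01)            # stand-in for SMTP I/O
    results.append(f"emailed {user}")

async def handle_signup(user):
    # Fire off the job and return immediately; the task runs on the
    # same event loop whenever the loop gets a chance.
    task = asyncio.create_task(send_welcome_email(user))
    results.append(f"responded to {user}")
    return task

async def main():
    task = await handle_signup("amit")
    await task   # in a real server the loop keeps running; here we wait

asyncio.run(main())
print(results)  # → ['responded to amit', 'emailed amit']
```

The response is recorded before the email work finishes, which is the whole point: simple fire-and-forget jobs without a separate task queue.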
You really have to watch for locks and race conditions; once you get your hands dirty, you're going to run into those. And the most important thing, and I tell this to everyone: if you don't write concurrent code all the way, then you're wasting your time. So if you have a concurrent web server but your database access isn't concurrent, then you've done nothing; nothing is concurrent. Your code has to be concurrent all the way. Thank you very much for listening. We have time for questions.

Awesome. So you showed us the process with the keyword await, which handles giving up control. What does the other keyword do? The async keyword, I mean.

Async essentially acts kind of like a decorator. It takes a normal Python function and says: instead of just executing it, return something called a future. Unfortunately we didn't have time to delve into that, but you put that future into the event loop, and it gets executed on the event loop's schedule rather than immediately. Once you dive into this and get your hands dirty, you'll figure it out really quickly.

You said that the main thing is that you should give up control explicitly. Can you compare that with, or just say a few words about, solutions like gevent and implicit context switching?

Thanks for bringing that up. gevent is an implicit way of handing over control. By the way, Django Channels uses gevent or libevent underneath; I'm not really sure which. I think the main difference is that with explicit concurrency you have absolute control over when you are handing over control, rather than letting the library figure it out based on what's going on. I think that's the main difference.

Over here. Great talk. So you said async returns a future, so coroutines return futures, and the await keyword gives control to the event loop, and after this is ready we continue with the rest of the code.
Is there an API if we want to do something more complex, like working with the future object, making a promise chain, calling a callback when everything's resolved, something like in the JavaScript world right now?

Absolutely. The example I showed is the simplest one. You can launch hundreds of coroutines, finish calling all of them, collect all your future objects, and then wait for them. You can say they're going to happen one after another, or that if one of them fails, you want out. You have a lot of flexibility. If you're coming from the JavaScript world and you're doing jQuery with promises and then and when and all that, it's kind of similar. It's not the same, but you have a lot of flexibility.

There's still time for one final question. If not, then there will be a coffee break. Please give Amit a hand. Thank you very much.
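The kind of composition described in that answer can be sketched with asyncio.gather; the job coroutine and its failure mode are made up for illustration:

```python
import asyncio

async def job(i):
    await asyncio.sleep(0.01)           # stand-in for real I/O
    if i == 2:
        raise ValueError("job 2 failed")
    return i

async def main():
    # Launch all the jobs, then wait for the whole batch. With
    # return_exceptions=True a failure becomes a result value; without
    # it, the first exception propagates ("if one fails, I want out").
    return await asyncio.gather(*(job(i) for i in range(4)),
                                return_exceptions=True)

results = asyncio.run(main())
# job 2's failure shows up as the exception object in the results list.
print(results)
```

For the "one of them fails, I want out" pattern specifically, asyncio.wait with return_when=asyncio.FIRST_EXCEPTION gives you the done and still-pending sets so you can cancel the rest, much like chaining and racing promises in JavaScript.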