Hi, good morning, everyone. First of all, I'm really excited to be here at EuroPython. It's my first time, and it's also amazing to be a speaker here. So thanks to everybody who was involved in selecting talks and giving me this opportunity. Today we are going to be talking about non-blocking IO, specifically how it works with Python. So, a very high-level overview: we are going to look at what non-blocking IO is and try to understand it by example, and, essentially, why we need it at all. These are not going to be very practical examples, but rather an exploration of what really goes on when you talk about non-blocking IO and whenever you try to use it in practice. And it is a rather beginner-level talk, so expect the content to be like that. But first, a little introduction about myself. I'm Vedic. I've been working with Python for about four years. As of now, I work with a startup based out of New Delhi, India, where I'm an infrastructure engineer. And just in case you want to connect with me, you have some of my social network links here. But some background as to why I'm doing this talk: long back, when I was in college, I started out as a web developer. And just out of curiosity, you start moving down the stack, exploring more things. And somewhere there was this project, a web application, which required a little bit of scaling to handle more connections on just one box that was very limited resource-wise. And along the way, when I was checking out material on how to scale web applications, I encountered gevent. And, well, I used it. But at that time I didn't really understand how it worked. So I always wondered: how does this thing really work? Other than that, what I'm going to talk about today is something that I have not really seen anybody talk about, especially at Python conferences.
You would see just about every other Python conference have a talk on Twisted or Tornado or asyncio these days. In fact, I think there's a workshop going on right now, parallel to this talk. But nobody really talks about the underlying infrastructure that these libraries and frameworks make use of, and I plan to shed some light on that. So, non-blocking IO. Just before we start understanding what it is, let's look at what blocking is. A very simple definition: a function or a code block is blocking if it has to wait for something to complete. That's probably the simplest definition I could come up with. What that means is that if you have a function that makes an HTTP request to another API and does something with whatever response it gets, the script cannot really progress until it gets a response. Extend the same example to any kind of network interaction or network call, for example, talking to your MySQL or Postgres database: your application cannot progress or do the next thing until you get a response back. Another example could be a function that does some statistics or some mathematical computation that just takes time, for example, some complex integration. It may take a while before the next thing can be done in the program. Yet another example could be waiting for the user to input something, for example, on the console. All these things are blocking. The problem with blocking code is that it is capable of delaying execution. As long as tasks are related to each other, that's fine, because you cannot do one thing if it depends on another. But what if there are independent tasks in your application which could actually progress alongside each other, if not at the same time? So, for example, you have a single-threaded web server, which is usually the case in the Python world.
We don't really write many multi-threaded web applications, at least with CPython. And you get a request. Your request handler is running, and it makes a call to your database. And at the same time, you get another request. The first handler is still running, and since your web server is single-threaded, it cannot really serve the other request; the first request is blocking it. Another simple example could be that you have workers consuming tasks from some queue. Usually, you offload heavy processing to workers, which asynchronously process your tasks. If your worker is blocking, well, it could be blocking for any reason. But if it is blocking because of IO, then you're basically not doing anything at that time, because when your code gets blocked on IO, you're not really making use of any CPU. What it really comes down to is that the overall system is not able to progress. I mean, blocking is fine. But blocking other things that are independent, things that could be done while one thing is blocked and not doing anything useful, that is not good. And as engineers who want to write systems or applications that need to serve multitudes of users or consumers, whether those are people or other applications, we don't really want that. So now let's talk about IO. At least for modern applications, the applications that we write today, things like dealing with a network, reading or writing on your file system, or doing operations on pipes are the kinds of things that would fall under IO. That may not be exhaustive, but in general, any kind of IO operation comes down to dealing with file descriptors, or doing operations on file descriptors. Today, at least for the scope of this talk, we are going to talk about dealing with a network, and how to implement non-blocking IO while doing networking in Python.
So non-blocking IO is essentially dealing with IO so that it does not block execution, or so that execution does not get delayed because of IO. To understand that, let's look at an example. I don't know if that is good enough for people to read. No? Yes? Okay, it's small, yes. That doesn't seem to be helping. Okay, what I can try to do is... Better? Yes, all right. So we'll look at some code in Vim instead. This is a very hello-world kind of thing that we see when we start out with network programming in Python. It's a very simple server, written in Python, and all it does is accept a connection, wait for some data from the client that has connected, get the data, print that data, then try to get more data, and keep doing that as long as the client keeps sending. So this is a very simple script. Probably most of us in the room who have done any kind of network programming in Python have written something like this. For the scope of this talk, we are not really going to be making use of this anymore; this is as simple as it gets, and we won't look at it again. And here's an almost equally simple client. There's, again, nothing to it: it just tries to connect to that same server and send some data. To be precise, it tries to send about 70 MB of data or so. We should, in fact, have the server open as well. I was counting on my slides, so I hope that content was readable, but we'll probably have to make do with this. So here's the server again. If we look at this code, on line 11 the server blocks because it waits to receive a connection. It won't proceed further unless a client tries to connect to the server. And after that, on line 13, it blocks until the client sends some data, so that the server can actually read the data and do whatever it wants with it.
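The simple blocking server and client described above might look something like this. This is a hypothetical sketch, since the talk's actual code isn't reproduced here; the function names, buffer sizes, and the use of a thread to run both ends in one script are all illustrative:

```python
import socket
import threading

def serve_once(srv, received):
    """Blocking 'hello world' server: accept one client, read until EOF."""
    conn, _addr = srv.accept()      # blocks until a client connects
    while True:
        data = conn.recv(4096)      # blocks until data (or EOF) arrives
        if not data:
            break
        received.append(data)       # the talk's version prints it instead
    conn.close()

# Listen on an ephemeral port on localhost.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

received = []
t = threading.Thread(target=serve_once, args=(srv, received))
t.start()

# The equally simple client: connect to the server and send some bytes.
cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"hello, world")
cli.close()
t.join()
srv.close()
print(b"".join(received))
```

Both `accept` and `recv` here are blocking calls: the server thread does nothing at all between a client connecting and data arriving.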
Then it prints out the data and blocks again when it tries to get more data from the client, and keeps doing this until the client stops sending. On the other hand, our client script blocks at this call, where the client is actually trying to send all of that data, that is, 70 MB. I mean, it's understandable; that will take a while. So both the client and the server, as of now at least, block. I'll just rush through the slides, because this is what I was describing on them. The problem with this is that when we run it, assuming the server is running, the client takes about 45 seconds to finish on the machine I'm using right now. And while the client script was not really trying to do anything other than sending the data, if it had wanted to do something else, it couldn't have, because the script would just block on socket.send. So let's see how we can improve that. For non-blocking network IO in Python, at the most basic level, it comes down to this: you make a socket non-blocking by calling the setblocking method on it and telling it to not be blocking anymore. So you pass it False, or a zero, and that makes the socket non-blocking. How does that work, and what does it really look like in the real world? Again, I'll switch back to Vim here. This is the same client that I was showing earlier, but with one minor change: there's one additional line. On line five, we have set the socket to be non-blocking, and everything else is exactly the same. But when we run this client, the result is not exactly the same. We see an AssertionError, because we put an assert statement on the last line of the script, and it says the number of bytes sent to the server is not equal to the number of bytes we wanted to send. Now, this happened as soon as we changed our client script.
What setblocking did was cause send to not send all of the data. The subtlety here is this: in the previous script, when the socket was blocking, how that send really works is that it makes whatever system call it needs to make to send the data to the process on the other end of the connection. What really happens is that the process copies the bytes it wants to send into the kernel: the amount of data that can be accommodated in the kernel's write buffer for that write call, or send call, gets passed into kernel space. The kernel takes that and then basically puts the process to sleep, because the write buffer is full. And that's the reason why our call blocks. So when the call was blocking, the process passed as many bytes as it could to the write buffer; the kernel took care of those and sent them across to the other side of the connection, and in the meantime it put the process to sleep. Then it wakes the process up when the write buffer empties, gets more bytes to send, and keeps doing that until all the data has been transferred. And while the process is sleeping, it's not really making use of any CPU, so we could potentially do something else in that time. When we made the socket non-blocking, socket.send returned immediately. Basically, it transferred whatever amount of data it could hand to the kernel to send to the other side of the connection, and returned immediately, saying: this is the number of bytes I could transfer so far. So it sent the number of bytes it could, immediately, and did not block at all. But what it gave us back in return was the number of bytes that were transferred. And that is useful, because, although as of now our client fails and does not send all the data, we can use that return value to send all the data in another way.
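You can see this partial-send behaviour without a separate server process by using a connected socket pair. This is a sketch; the exact number of bytes the kernel accepts before `send` returns depends on the platform's socket buffer sizes:

```python
import socket

# A pair of connected sockets standing in for client and server.
a, b = socket.socketpair()
a.setblocking(False)                   # make the sending side non-blocking

payload = b"x" * (10 * 1024 * 1024)    # far more than one socket buffer
sent = a.send(payload)                 # returns immediately, partial count
print("queued", sent, "of", len(payload), "bytes")

a.close()
b.close()
```

Because nobody is reading from the other end, only the bytes that fit in the kernel's write buffer are accepted; `sent` comes back much smaller than `len(payload)`, which is exactly the failed assertion the client hit.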
So let's look at an improvement to that script. Here's a slightly different client. We're trying to do essentially the same thing: our socket is non-blocking, but we put the send in a while loop that runs as long as we have not sent all the data. It just tries to send the remaining data again and again. socket.send returns immediately, telling us the number of bytes that were transferred, and now that we know how many bytes went through, we can try to send the remaining bytes in the next iteration. So this is essentially how we use our non-blocking socket: we just keep trying to send more data. The problem with this script is that, well, the good thing is that we have achieved non-blocking IO here, but we are now wasting a lot of CPU cycles running that while loop, because most of the time we will not be able to send any data, since the write buffer will not be empty. If we actually run this, we will probably end up spending most of our time in the exception block instead of actually sending anything, and we will reach the send and succeed only on those rare occasions when the write buffer has room. Still, this is a good improvement; we can build on it to do something more useful instead of only spinning on this send. So I'll show you another improvement to this very client, with some minor changes again. This is pretty much the same client as we just saw, but with one extra line, which is this. And this is something new which, if you've never tried to look into how this kind of infrastructure works, not just non-blocking IO, might look unfamiliar: select. We are doing exactly the same thing, but our while loop will now block at this line as long as our socket is not available for writing again.
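A self-contained sketch of that busy-retry loop, using a socket pair plus a reader thread in place of the talk's separate server; the function names and buffer sizes are illustrative:

```python
import socket
import threading

def drain(sock, sizes):
    """Reader on the far end: consume everything so the write buffer empties."""
    while True:
        data = sock.recv(65536)
        if not data:
            break
        sizes.append(len(data))

a, b = socket.socketpair()
a.setblocking(False)            # only the sending side is non-blocking

sizes = []
reader = threading.Thread(target=drain, args=(b, sizes))
reader.start()

payload = b"x" * (5 * 1024 * 1024)
total = 0
while total < len(payload):     # spin until everything has been handed over
    try:
        total += a.send(payload[total:])   # send whatever fits right now
    except BlockingIOError:
        pass                    # write buffer full: wasteful busy-wait
a.close()                       # EOF lets the reader thread finish
reader.join()
print(total, sum(sizes))
```

All the data gets through, but the `except BlockingIOError: pass` branch runs far more often than the successful send, which is the wasted CPU the talk is pointing at.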
So as long as the write buffer is full, this call will block here. What select does is stop us from wasting the CPU cycles we were wasting earlier, calling sock.send again and again in every iteration of the while loop. What select exactly does is rather interesting. Select is nothing but a system call, an infrastructure provided by the operating system for monitoring file descriptors for events: events like, is this file descriptor ready for writing, is this file descriptor ready for reading, or does this file descriptor have some kind of exceptional condition to handle? The select API that we just used is a direct wrapper over the syscall, but it also makes using select rather simple compared to the C API. And if you understand how to use it with Python, you will probably be able to make sense of it when you look at some C code too. The signature is like this: you pass select three lists of file descriptors, which you want monitored for read events, write events, or exceptional conditions respectively. There's a fourth optional argument, timeout, which tells select how long to block waiting for any of the file descriptors passed in to become ready. In our previous example, we did not use that fourth argument, so select would block indefinitely, but you can change and adjust that according to your needs. And it returns subsets of the file descriptors passed in, telling you which ones are available for performing whatever operation you were monitoring them for. And when we talk about file descriptors, at least in the Python world, what select accepts is any object that implements the fileno method.
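A small sketch of that signature and return value, including the fileno point: `select.select` accepts any object exposing a `fileno()` method, sockets included, and hands the same objects back. The wrapper class here is invented for illustration:

```python
import select
import socket

class Waitable:
    """Any object with a fileno() method can be handed to select()."""
    def __init__(self, sock):
        self.sock = sock
    def fileno(self):
        return self.sock.fileno()

a, b = socket.socketpair()
w = Waitable(a)

# Nothing has been sent yet: the socket is writable but not readable.
r1, w1, _ = select.select([w], [w], [], 0)   # timeout=0 means "just poll"
print("readable:", r1, "writable:", w1)

b.sendall(b"ping")
# Now select reports our wrapper object back as readable.
r2, _, _ = select.select([w], [], [], 1.0)   # block up to 1 second
msg = r2[0].sock.recv(16)
print(msg)

a.close()
b.close()
```

Note that select returns the very objects you passed in, not raw integer file descriptors, which is what makes the fileno protocol convenient in Python.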
Usually sockets have, I mean, not usually, in fact all sockets have a fileno method on them, because sockets are essentially file descriptors, and in Python, if you call fileno on a socket, you get the corresponding file descriptor for it. So the improvement we made here is that, while we are still doing essentially the same thing, trying to send data again and again in a while loop, we now let our process sleep as long as our socket is not available for performing that operation. And here we look at one last example. So far we have only been trying to send data in a non-blocking fashion, without doing anything alongside it. But essentially the idea of non-blocking is that even if you are single-threaded, while you are not doing anything constructive, you should be able to do something else if you can. So here's another example, continuing from our previous one, but changed a little bit. We have created two tasks. One task is the same as before: send some data to our server. The other is a task which just makes use of the CPU, trying to increment a counter. And we have put a sleep there just so that, when we run this, the output is easy to read. What we are going to do is cooperatively let both of these tasks proceed, with the help of generator functions. So this is one task, which is essentially a generator function: it increments a counter in a while loop and yields after every iteration. And the other task is the same thing we were trying to achieve in our client, the same code trying to send the same amount of data. The only difference is that where we were calling select before, we instead yield, and we yield the socket object that we were using for sending the data. And we also yield the operation that we want to monitor this socket for.
Finally, we have our main block, where we implement another, slightly different version of our previous ugly while loop. Probably this one is uglier. Here, what we do is try to execute both of these tasks one by one, then monitor their file descriptors and execute the corresponding tasks whenever their file descriptors are ready to take more data. You don't have to read through the entire code, but the essential thing to understand is that we have a big while loop, which runs as long as we have some task to execute or some file descriptor to monitor. Whenever there is a file descriptor to monitor, it is because some task wants to do something once that file descriptor becomes available for that operation. So all we do is run every task one by one, and if the task, that is, the generator function for the task, yields a socket and asks us to monitor it, we keep a mapping from those file descriptors to the corresponding tasks, and further down we call select to monitor those file descriptors, or socket objects, and do whatever needs to be done accordingly. The difference in this select call is that we use the timeout argument here, and we set the timeout to zero, because we don't really want to block. Why don't we want to block? Well, if a file descriptor is not ready, we can at least let our other task, the one incrementing counters, proceed while the file descriptors are not ready for any operation. So we call select, and on every iteration we check whether any monitored file descriptor is ready, and we keep a pending-tasks list where we push the tasks that need to be executed in the next iteration of the while loop. And this is pretty much it.
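Putting the pieces together, a cooperative scheduler along those lines might be sketched like this. This is my own reconstruction, not the speaker's code: the task names, the `yield sock, event` protocol, and the use of a socket pair with a receiver task (instead of a separate server) are all invented for illustration.

```python
import select
import socket

def counter_task(counts):
    """CPU-bound task: bump a counter, yielding control after every step."""
    n = 0
    while n < 5:
        n += 1
        counts.append(n)
        yield None, None            # nothing to wait on, just cooperate

def sender_task(sock, payload, result):
    """IO task: push the payload through a non-blocking socket."""
    total = 0
    while total < len(payload):
        try:
            total += sock.send(payload[total:])
        except BlockingIOError:
            pass                    # write buffer full right now
        yield sock, "write"         # resume me once the socket is writable
    result.append(total)

def receiver_task(sock, expected, result):
    """IO task: drain the other end so the sender's buffer can empty."""
    got = 0
    while got < expected:
        try:
            got += len(sock.recv(65536))
        except BlockingIOError:
            pass                    # nothing to read right now
        yield sock, "read"          # resume me once the socket is readable
    result.append(got)

def run(tasks):
    """A poor man's event loop: round-robin the tasks, select() on waiters."""
    waiting = {}                    # socket -> (event, parked task)
    while tasks or waiting:
        pending = []
        for task in tasks:
            try:
                sock, event = next(task)
            except StopIteration:
                continue            # task finished, drop it
            if sock is None:
                pending.append(task)           # runnable again immediately
            else:
                waiting[sock] = (event, task)
        if waiting:
            rlist = [s for s, (e, _) in waiting.items() if e == "read"]
            wlist = [s for s, (e, _) in waiting.items() if e == "write"]
            # Poll (timeout 0) while other tasks are runnable; block otherwise.
            timeout = 0 if pending else None
            readable, writable, _ = select.select(rlist, wlist, [], timeout)
            for sock in readable + writable:
                pending.append(waiting.pop(sock)[1])
        tasks = pending

a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

payload = b"x" * (2 * 1024 * 1024)
counts, sent, got = [], [], []
run([counter_task(counts),
     sender_task(a, payload, sent),
     receiver_task(b, len(payload), got)])
print(counts, sent, got)
```

The counter makes progress whenever the sockets aren't ready, which is exactly the interleaving the talk's example demonstrates: two independent tasks progressing in a single thread.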
So these changes, although not pretty, again, probably I'll share this code somewhere so that you can read it when you have more time at hand. But what we really achieved here was cooperative scheduling using generator functions and select, and we let two independent tasks proceed alongside each other in a single-threaded script. You could say this is probably our first network event loop, implemented right here, with a poor man's scheduler to go with it. So I think I have just one minute left, and I'm going to rush through the rest of the things as quickly as possible. We just looked at select, but our operating system actually provides more infrastructure for monitoring file descriptors. There's something called poll. Poll came soon after select, I don't know exactly how soon, but the technical details are pretty much the same other than the API itself; its time complexity is probably as bad as select's, so select and poll are about the same there. epoll and kqueue are pretty much the de facto standards today. Most web servers like NGINX make use of epoll and kqueue, and if you're using Twisted, Tornado, or gevent on Linux or BSD systems, chances are they have an epoll or kqueue implementation and you're using it. And I think I don't have more time, but yes, I'd be happy to take questions. The other slides were essentially about what other libraries you have in the Python world, but we can catch up if you want to know more about that. So, any questions? [Audience] Well, this is maybe more related to the system, but are you aware of the limit of asynchronous connections you might have? Any top limit? [Speaker] A limit on asynchronous connections? I'm sorry, I'm not aware of that, but if there is something like that, I'd be happy to know about it. [Audience] Yeah, well, in Linux, I was testing one tunneling server and it had a limit of around 4,000, because of select in the C API. [Speaker] About select, yes, yes.
Select is indeed limited by the number of file descriptors, but I think you can change that if you are compiling the operating system yourself. But, well, we don't usually do that. So yes, most of the time you'll be limited in the number of file descriptors you can monitor using select. Having said that, I mentioned something about select's time complexity: that select, compared to epoll or kqueue, tends to be slower, and that nobody really uses it anymore. But it really depends on your use case. If you don't have many file descriptors to monitor, select can actually be a better choice than epoll; epoll only really starts to shine with larger numbers of file descriptors. Thank you so much.
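As a closing aside not covered in the talk itself: Python's standard selectors module wraps these mechanisms and automatically picks the best one available on the platform, epoll on Linux, kqueue on BSD and macOS, falling back to poll or select elsewhere. A minimal sketch:

```python
import selectors
import socket

sel = selectors.DefaultSelector()       # EpollSelector, KqueueSelector, ...
print(type(sel).__name__)

a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

# Register interest in read-readiness, attaching arbitrary data to the key.
sel.register(a, selectors.EVENT_READ, data="my connection")

b.sendall(b"ping")
events = sel.select(timeout=1.0)        # [(SelectorKey, event_mask), ...]
key, mask = events[0]
msg = key.fileobj.recv(16)
print(key.data, msg)

sel.unregister(a)
sel.close()
a.close()
b.close()
```

The API shape is the same idea as select, register file objects, ask which are ready, but without select's file-descriptor-count limit on platforms where epoll or kqueue is available.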