Python contributor and creator of Wallpapers. Hello everyone. I'm super excited to be here. Today I'm going to talk about asyncio, one of the most popular library modules, which comes with Python 3.4. I know we already had two talks yesterday covering the same theme, but I'm going to focus on the asyncio library itself; I'm not going to talk about Tornado, Twisted, or gevent. I'm going to talk about how to program asynchronously, how to deal with many IO requests in a short time, with the asyncio library module. I just said that asyncio comes with Python 3.4, and I know most people don't use Python 3 in the real world yet. Yesterday I talked to some people, and when I asked whether they were still stuck with Python 2, they all laughed at me. But bear with me, because asyncio can be used with Python 2 as well; you just use it differently, without the special syntax yield from. I'm going to show you how to do that with Python 2. Before that, though, I'm going to explain something more important than the technical how-to: the fundamental idea behind asyncio. And I'm going to explain it in a different way than they did yesterday. Let me convince you that learning asyncio will be a good investment for you: if you attended yesterday's talk about the pipelines at DreamWorks, that's cool software, and you can achieve the same thing with asyncio. But first, let me introduce myself. My name is Fajr Skycock, and I know that by the end of this talk you won't remember it, so just call me Sky, please. Don't be shy, you can approach me; I'm a friendly person. I live in Jakarta, and I flew from Jakarta to come here, so I'm very excited to be here. I work at the company responsible for Liputan 6, one of the biggest news websites in Indonesia. In my spare time I'm a regular contributor to Python core, and I find it a rewarding experience, though I'm not a core developer yet.
I also used to contribute to Django core, but time has become expensive for me now. Other than that, I'm the sole developer of Wallpapers. It's my first open source project built with Python, and it's used by many people around the world. Basically it enables you to have different wallpapers for different workspaces on Unix desktops. Many years ago I used GNOME, and I kind of envied the KDE desktop, which can have a different wallpaper on each workspace, so I tried to enable that on GNOME by writing Wallpapers. Okay, so let's get down to business. But first, if you want to contact me: use the first email if you want to work with me in Jakarta (don't worry, it's a nice place), and use the second email for anything else, even if you just want to discuss the weather with me. Okay, so let's talk about the problem asyncio is designed to solve. The problem is that IO is slow. I know everyone knows that. But the thing is, you cannot escape IO; IO is everywhere. As a Python programmer you will have to deal with IO in one way or another. You may have to download images from an Amazon S3 bucket, request data from a database, or communicate with another Python process. Unless, of course, you're the kind of programmer who writes an application whose only purpose is to find the largest prime number under two billion; then you don't need to care about IO. The thing is, as a Python programmer you have to deal with many IO requests in a short time, otherwise you will not get an efficient program. So let's go to a real-world application. This is the basis of a web crawler. Recently I watched an advertisement on YouTube about a website that helps you find the cheapest flight tickets and hotel rooms, and they definitely use a web crawler. People build businesses by writing web crawlers. So here we build something similar.
We create an application to get the web pages of three news websites: Spiegel, The Guardian, and Le Monde. As you can see, the line highlighted in blue is an IO blocking function: it goes to get the web page and it faithfully blocks until it gets the page, or until it's killed by a signal, a timeout, or an exception. And that is the problem. As you can see, I download the web pages sequentially. If you're going to build a business by writing this kind of code, you're going to go bankrupt. If you're an employee, you're going to get fired for writing this code. But before I explain how to improve it, let's model this way of solving the problem, and we get this graph. The arrow represents linear time. The colored boxes represent the time when we receive real data, real bytes, from the server: the blue boxes for data from Spiegel, the yellow for The Guardian, and the red for Le Monde. The gaps between the colored boxes represent the time when we sit idle, waiting for the next chunk of data. As you can see, the server doesn't send you the web page in one block of data; it sends the data chunk by chunk, and between those chunks you have to wait. And we have long gaps here. Those gaps represent the time we waste. In business, time is money, so when you look at a gap, you can see it as money you lose. That's why I said that if you write this kind of code, you're going to lose money. asyncio can help you reduce the money you lose. But before we solve this with asyncio, I'm going to explain how to solve it with threading. The reason I'm explaining the threading solution first is that when we compare both approaches, you will understand asyncio better and faster.
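To make the sequential model concrete, here is a minimal sketch of it. The site names match the talk, but the fetch is simulated with time.sleep standing in for the blocking urllib call, so the timing behaviour is visible without a network connection:

```python
import time

SITES = {"spiegel": 0.2, "guardian": 0.2, "lemonde": 0.2}  # simulated fetch times

def fetch(site, delay):
    # Stand-in for the blocking urllib call: it faithfully waits until the
    # whole "page" has arrived before returning.
    time.sleep(delay)
    return "<html>%s</html>" % site

start = time.time()
pages = [fetch(site, delay) for site, delay in SITES.items()]  # one after another
elapsed = time.time() - start
print("%d pages in %.1fs" % (len(pages), elapsed))  # ~0.6s: the waits add up
```

The total time is the sum of all the waits; those are exactly the gaps in the graph.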
And besides, there are times when asyncio falls short and threading wins. If you attended yesterday's talk about the pipelines at DreamWorks, there are situations where asyncio cannot help you, such as listing directories or getting the mtime attributes of files. So, when we want to deal with many IO requests in a short time, the common pattern is to create one thread per IO operation. Here, because we have to download the web pages of three news websites, we have three IO operations, so we create three separate threads, one thread per website. Each blue box here represents a thread, so we have four threads; the upper box represents the main thread. Now, if we model this way of solving the problem, we get this graph. Each IO operation gets its own dedicated line, so the operation downloading the page from Spiegel gets its own dedicated line. When we get blocked while downloading the page from Spiegel, while we are waiting for the next chunk of data, the other IO operations can still run; they don't have to wait for us. As you can see, while we are blocked for the first time waiting for the next chunk from Spiegel, the other threads can still get data from The Guardian and, in another thread, from Le Monde. Now, how are we going to solve this problem with asyncio? In asyncio, you only need one thread. Here we have one blue box: that is the only thread you need. The coroutines you see here are comparable to the threads you saw just now. Because we have three IO operations, the common pattern is to create three coroutines, each responsible for downloading the web page of one news website. But we have something different here: an event loop.
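The one-thread-per-IO-operation pattern can be sketched like this, again simulating the blocking fetch with time.sleep so it runs anywhere:

```python
import threading
import time

SITES = {"spiegel": 0.2, "guardian": 0.2, "lemonde": 0.2}  # simulated fetch times
pages = {}

def fetch(site, delay):
    time.sleep(delay)                       # blocks, but only this thread waits
    pages[site] = "<html>%s</html>" % site

start = time.time()
threads = [threading.Thread(target=fetch, args=item) for item in SITES.items()]
for t in threads:
    t.start()                               # one dedicated thread per website
for t in threads:
    t.join()                                # the main thread waits for all three
elapsed = time.time() - start
print("%d pages in %.1fs" % (len(pages), elapsed))  # ~0.2s: the waits overlap
```

While one thread is blocked waiting for its next chunk, the operating system schedules the others, so the total time is roughly the slowest single fetch instead of the sum.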
The event loop is like an intelligent machine that runs and executes the coroutines. In asyncio code you don't execute the coroutines directly the way you do with threading; you put the coroutines inside the event loop and you start the event loop. In this sense, the event loop plays the role the operating system plays in the threading case: it schedules the tasks and the priority between them. But please don't take that analogy too far, because they are quite different. If we model this way of solving the problem, we get this graph. It's the same as the naive model in that all the IO operations share one line, but something is different here: the colored boxes are interleaved with each other. We get the next chunk of data from Spiegel, then we jump to the IO operation downloading the page from The Guardian, then we jump back to Spiegel, then to Le Monde, then to Spiegel again, and so on. We do this to fight the IO blocking time: when we sense that one IO operation is about to block, we switch to another IO operation. That way we can reduce the gaps. We still get gaps between the colored boxes in asyncio code; they are just shorter gaps, which in business means we lose less money. If you can only remember one slide from this talk, let it be this one; everything else is derived from here. So let's get down to the technical details. The common pattern for dealing with many IO requests with threading: you create as many threads as you need, and you create a function that executes the IO blocking functions inside it. Then you feed this function to the threads, and later you start the threads, and so on.
The lines highlighted in green are where you create the threads, and the lines highlighted in blue are the IO blocking functions. Technically speaking, print is an IO function, but for now you can ignore it. The time.sleep delay here stands in for the IO blocking functions you need in your daily life as a Python programmer: things like os.system, or, from the previous example, urllib.request, or subprocess.Popen and later communicating with the Popen object. Anything that makes an IO request and blocks: this is it. You put the IO blocking function inside a function, then you feed it to the threads. The technical details for asyncio code are a bit different. Instead of creating a plain function that executes the IO blocking functions, you create a coroutine. Basically it's just a function that you decorate with the asyncio.coroutine decorator, and inside this coroutine you execute the IO functions with the special syntax yield from. But yield from is only valid if you use asyncio with Python 3.3 or Python 3.4. It is true that asyncio comes with Python 3.4 in the standard library. If you use Python 3.3, you can download asyncio from a third-party provider, but there it isn't called asyncio anymore; it's called Tulip. If you use Python 2.7, since Python 2.7 doesn't support yield from, they changed the way you interact with asyncio; I'll show you later. You can download it from a third-party provider for Python 2.7 too, but it's not called asyncio there, it's called Trollius. Okay, back to the technical details. You execute these IO functions inside the coroutine with the special syntax yield from. Then you wrap these coroutines into tasks, and then you run the event loop.
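Here is the asyncio counterpart of the sleep example. One caveat: the talk predates modern Python, and the @asyncio.coroutine decorator and generator-based yield from coroutines it uses were removed in Python 3.11, so this sketch uses the equivalent async def/await spelling; the structure (coroutines wrapped in tasks, run by one event loop) is the same:

```python
import asyncio
import time

async def visit(name, delay):
    # asyncio.sleep suspends this coroutine and hands control back to the
    # event loop; time.sleep here would freeze the whole loop instead.
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.time()
    # gather wraps the coroutines in tasks and runs them on the event loop.
    names = await asyncio.gather(
        visit("spiegel", 0.2), visit("guardian", 0.2), visit("lemonde", 0.2)
    )
    return names, time.time() - start

names, elapsed = asyncio.run(main())
print(names, "%.1fs" % elapsed)  # all three overlap: ~0.2s, not 0.6s
```

All three coroutines share one thread, yet the total time matches the threaded version, because none of them ever blocks that thread.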
Remember, I said everything is executed inside the event loop; here the loop object is the event loop. Now, if you look closely, there's something different here: we use asyncio.sleep, but in the previous example we used time.sleep. Some of you may ask why we don't execute time.sleep inside the coroutine; why must we use asyncio.sleep? So there are two questions: why do we need to use yield from, and why do we need asyncio.sleep instead of time.sleep? I'm going to answer the second question first. Let's get back to the fundamental idea behind asyncio. Remember when I showed you this graph? In asyncio code, everything happens in one thread. That means if you execute a function that blocks, everything gets blocked: every other coroutine blocks, and the event loop freezes. That's why you have to avoid the normal IO blocking functions when you write asyncio code. It means that if you're going to solve the earlier problem with asyncio, you cannot use urllib.request.urlopen to get the web pages from the news websites, because that would defeat the purpose of asyncio. You need something different here: the asyncio-compatible functions. Such a function performs the IO operation but does not block. It communicates with the event loop: when it senses that it is about to block, it tells the event loop, hey, I'm going to block, why don't you switch your attention to some other coroutine? For all the IO blocking functions you need in your daily life as a Python programmer, the asyncio library modules already provide replacements. Where they don't, most likely a third-party provider already does, as in the case of aiohttp. aiohttp is comparable to urllib.request.
So the basic idea, when you write asyncio code with the asyncio library modules, is that you execute these IO functions, whether it's getting a web page from a server, getting data from a database, or getting results from an external process, inside a coroutine. But you don't use the blocking functions anymore; you use the asyncio-compatible functions, and you execute them with the special syntax yield from. Then you get the result, and everything else is the same. No more blocking functions in your asyncio code. When you're about to execute an IO function, ask yourself: is this going to block? If yes, find the replacement; most likely a replacement already exists. If we solve the earlier problem with asyncio, here's how we do it. First you get the response object with the special syntax yield from; this response object is the one on which you can inspect the final URL and so on. Then you get the web page itself by calling yield from on response.read(). This way, if downloading the web page would block, say, five times, this yield from response.read() will tell the event loop to switch its attention five times. It will not block, so it will interleave with the other coroutines. Now I'm going to answer the first question: why do we need to use yield from? Let me go back to the story of yield without from, and the idea behind generators. In Python land there are two kinds of functions. The first kind is all or nothing: you get the whole result or nothing at all. It's either zero or one, black or white, no gray. That's the normal, traditional IO blocking function: either you get nothing at all or you get the full web page. But there's another kind of function which is a little bit special. We call it a generator.
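The talk's version of this fetch uses aiohttp, which is a third-party package. As a stdlib stand-in that shows the same request/read pattern, here is a sketch using asyncio's own stream API against a tiny local server; the server, handler, and fetch names are invented for the example, and the async/await spelling replaces the talk's yield from:

```python
import asyncio

async def handle(reader, writer):
    # Toy "news site": send the page chunk by chunk, then close the socket.
    for chunk in (b"<html>", b"hello", b"</html>"):
        writer.write(chunk)
        await writer.drain()
    writer.close()

async def fetch(port):
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    body = await reader.read()   # suspends at each chunk instead of blocking
    writer.close()
    return body

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    body = await fetch(port)
    server.close()
    await server.wait_closed()
    return body

body = asyncio.run(main())
print(body)  # b'<html>hello</html>'
```

Each time fetch waits for the next chunk, the event loop is free to run other coroutines, which is exactly the interleaving the graph shows.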
These functions let you consume part of the result; you don't have to consume all of it at once. You can consume half of the result and stop, do other things first, maybe print hello world 500 times, and then resume the remaining part of the result. Here we have such a special function, a generator called gen. The full result is 1 and 2, but we let the caller consume it a part at a time. The caller consumes this generator by calling next, and gets a part of the result: 1. After that we can stop, or do other things first, before we resume the other half of the result by calling next again. Now, remember when I told you that when you write asyncio code, you use the asyncio-compatible functions. Imagine that an asyncio-compatible function is defined like this: it gets some data, a chunk of, say, 10 kilobytes, then gives this part of the result to the caller; then it gets more data, another part of the result, and sends that back too. So the caller can do other things between the parts of the result, before the full result of this asyncio-compatible function arrives. So who is the caller of this asyncio-compatible function? Our coroutines. Technically speaking, an asyncio-compatible function is a coroutine itself, but for now let's keep them separate to make it easier to understand. So inside our coroutine that consumes this asyncio-compatible function, asyncio.sleep for instance, we get a part of the result by calling next. But inside this coroutine, there's nothing useful we can do between the parts of the result, between the parts of the web page; only once I have the full web page can I continue to do something useful. So inside the coroutine, we want the full result. We don't care about the parts.
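The gen example from the slide can be written out like this, showing how the caller consumes the result a part at a time:

```python
def gen():
    # The full result is 1 and 2, but the caller may take it a part at a time.
    yield 1
    yield 2

g = gen()
first = next(g)       # consume part of the result; gen() is now paused
# ... the caller is free to do other work here ...
second = next(g)      # resume gen() right where it left off
print(first, second)  # 1 2
```

Calling next a third time would raise StopIteration, the signal that the generator has no more parts to give.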
But while the coroutine doesn't care about the parts of the result, the event loop needs them. Why does the event loop need the parts? Because the event loop wants to do something useful between the parts of the result, between the chunks of the web page, and that something useful is executing other coroutines. That is why we get these interleaved colored boxes. This is the fundamental idea behind yield from and the asyncio-compatible functions: we don't send the full result, we send the result part by part, so that between the parts we can do something else, like executing another coroutine that itself sends its result part by part. But that doesn't yet explain why we must use yield from instead of plain yield. To understand that, look at this coroutine: we get this boilerplate, a waste of space inside our coroutine. The non-blocking function sends a part of the result to our coroutine, we pass it back to the event loop, and the event loop may want to execute another coroutine; that coroutine consumes another asyncio-compatible function, which sends its own part of a result, and so on. There's too much bureaucracy here; our coroutine becomes an obstacle to efficiency. So instead of acting as a middleman, why don't we let the non-blocking function communicate directly with the event loop? That's what yield from does: I don't care about the parts of the result, so communicate directly with the event loop, and once you have the full result, tell me. If we use yield from, we get this graph: no more middleman, the non-blocking function communicates directly with the event loop, and the event loop does something useful between the parts of the result. Now, here is the thing: there are times when blocking code cannot be made asynchronous.
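The middleman boilerplate and its yield from replacement can be shown side by side with plain generators; the names chunks, middleman, and delegating are invented for this sketch:

```python
def chunks():
    # Stand-in for an asyncio-compatible function: it hands back partial
    # results, and its return value is the full result (PEP 380 semantics).
    yield "chunk-1"
    yield "chunk-2"
    return "full page"

def middleman():
    # Without yield from, our coroutine must forward every part by hand.
    gen = chunks()
    while True:
        try:
            part = next(gen)
        except StopIteration as stop:
            full = stop.value      # the final result rides on StopIteration
            break
        yield part                 # boilerplate: pass the part upward
    yield "got: " + full

def delegating():
    # yield from wires chunks() straight through to the outer consumer
    # (the "event loop") and hands this coroutine only the final value.
    full = yield from chunks()
    yield "got: " + full

print(list(middleman()))   # ['chunk-1', 'chunk-2', 'got: full page']
print(list(delegating()))  # same output, none of the plumbing
```

Both produce the same stream, but delegating never touches the individual chunks: they flow directly to whoever is iterating, which in asyncio is the event loop.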
There are times when asyncio cannot help you: you may have an IO blocking function that has no asyncio-compatible version. In that case, you can execute it in a separate thread. The asyncio module provides a convenient function to execute such an IO blocking function so it will not freeze the event loop. Here we use the traditional IO blocking function, urllib.request.urlopen, to get the response object, and later we yield from it; then you can do something useful. Now I'm going to show you the final code of the downloader; you can see it right at the back. In the threading version, for each news website you create a separate thread, and you feed this function to the thread. Inside the function you execute the traditional IO blocking function. It will block, but that's okay, because it blocks inside its own thread, and that's not our concern anymore; it's the operating system's concern. We start the threads, and later we wait for them to finish. In the asyncio version, you have this coroutine, and inside it you execute the IO functions. These are special: they send the result part by part. But we don't care; we just want the full result. That's why we use yield from: we tell the aiohttp request to send the parts of the result to the event loop, and once we have the full result, we go down one line. Here we need the full result before we can assign it to body, but response.read() sends the result part by part. Inside the coroutine we don't care, so we use yield from to tell response.read(): why don't you communicate with the event loop directly? After we get the full result, we can do something interesting with body, such as printing it to stdout. Then you get the event loop object, the intelligent machine I was talking about; it's the one that is going to execute the coroutines.
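The escape hatch for blocking functions without an asyncio-compatible version is the loop's run_in_executor, which runs the call in a thread pool and gives you a future to await. This sketch simulates the blocking call with time.sleep rather than a real urlopen, and uses the modern async/await spelling in place of the talk's yield from:

```python
import asyncio
import time

def blocking_fetch():
    # A traditional blocking function with no asyncio-compatible
    # replacement; the sleep stands in for something like urlopen.
    time.sleep(0.2)
    return "<html>page</html>"

async def main():
    loop = asyncio.get_running_loop()
    start = time.time()
    # Run the blocking call in the default thread pool executor; the event
    # loop stays free, so the second coroutine runs at the same time.
    page, _ = await asyncio.gather(
        loop.run_in_executor(None, blocking_fetch),
        asyncio.sleep(0.2),
    )
    return page, time.time() - start

page, elapsed = asyncio.run(main())
print(page, "%.1fs" % elapsed)  # ~0.2s: the two waits overlapped
```

The blocking happens in a worker thread, so the event loop never freezes; awaiting the returned future is what lets the loop run other coroutines meanwhile.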
But before the coroutines get executed, we need to wrap them into tasks. So here we wrap this coroutine into a task. Once we have the task, we run the event loop and tell it to execute the task. This is something like Thread.join: you wait for the task to finish. Then you close the event loop. Now, this is the client type of asyncio code, like a web crawler. It's interesting, but I also want to show you how to write the server type of code using asyncio. But first, why don't we run it? I saw a talk yesterday where they ran the code live. So if we run it... oh, I don't have an internet connection. Never mind, let's continue to the server. Here. Imagine you have a process: here we just print Mahatma Gandhi quotes to stdout, and we simulate the IO blocking time with a three-second sleep. We are going to write a caching server that serves the output of this Mahatma Gandhi quote program. There's nothing special here; yes, it just prints a random Gandhi quote to stdout after three seconds. If we solve this problem with threading, this is how we do it. I'm going to create a Unix server using a Unix file socket: a UnixServer class, created as a subclass of ThreadingMixIn and socketserver.UnixStreamServer. Then you tell this class which socket file to use. And there's the handler; you can think of it as the protocol. This is our traditional IO blocking function, the one that blocks when you execute it. This is the handle method: every client connection gets handled here. We call the IO blocking function, then we send the result. Now, let's exercise the server: these many clients just simulate simultaneous access to the server.
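A runnable sketch of the threaded quote server, under two stated simplifications: a TCP socket on localhost instead of the talk's Unix file socket (so it runs anywhere), and a 0.2-second sleep instead of the external three-second quote process:

```python
import socket
import socketserver
import threading
import time

def gandhi_quote():
    # Stand-in for the slow quote process: blocks ~0.2s, then answers.
    time.sleep(0.2)
    return b"Be the change you wish to see.\n"

class QuoteHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # The blocking call runs in this connection's own thread, so other
        # clients are served concurrently.
        self.wfile.write(gandhi_quote())

server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), QuoteHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

quotes = []
def client():
    with socket.create_connection(("127.0.0.1", port)) as sock:
        quotes.append(sock.makefile("rb").readline())

start = time.time()
clients = [threading.Thread(target=client) for _ in range(4)]
for t in clients:
    t.start()
for t in clients:
    t.join()
elapsed = time.time() - start
server.shutdown()
print("%d quotes in %.1fs" % (len(quotes), elapsed))  # ~0.2s, not 0.8s
```

The ThreadingMixIn (which ThreadingTCPServer bundles in) gives every connection its own thread, so four simultaneous clients wait roughly one quote's worth of time, not four.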
There are four clients accessing the server; they all want their Mahatma Gandhi quote. Okay, so with threading we can solve this problem: we don't need 12 seconds to serve all the quotes, we only need about 3 or 4 seconds, because the IO blocking function gets executed inside a separate thread. It's okay to execute a traditional IO blocking function inside a thread. That's the idea behind handling many IO requests in a short time with threading. Now let's take a look at how we handle this with asyncio, and we get something like this. First, you get the event loop. The event loop has convenient methods for creating the many kinds of servers you may need: for example a Unix server, a TCP server, a datagram server. Besides the necessary details the server needs, such as the file to use as the socket, we pass in a protocol. The protocol defines how you handle a connection with a client. Here we have the connection_made method; it gets executed whenever there is a new connection from a client. And here we have this coroutine, where we use yield from: asyncio.create_subprocess_exec starts the process, then we use yield from to get the status of the process. Inside connection_made we wrap this coroutine into a task, and it gets executed whenever a client accesses our server. Now we have something different here: the transport. The transport is the thing that represents the IO connection between our asyncio server and the client. With this transport you can send and read, and for certain protocols, for instance when communicating with an external process, the transport can give you the PID.
For TCP, of course, it cannot give you a PID, because the PID is specific to the process communication protocol. So if we run it, here are our four clients accessing the server: one, two, three. We do not get blocked. The idea is: if you write a server using asyncio, you create the kind of server you want, such as a Unix server or a TCP server, and you tell it which protocol to use. But this is not the only way: you can also pass a plain coroutine that accepts two arguments, a reader and a writer. For this case, though, we use the full-blown protocol, which has connection_made, data_received, and connection_lost. I'm not interested in the data_received and connection_lost events, only in connection_made. Here we communicate with the external process, the one producing the Mahatma Gandhi quote, and we wrap it into a task, which is then executed with the server. And handling errors in asyncio is a bit different. With threading you can use normal exception handling, just try/except. But in asyncio, when we create the task, it is like a future. When we get this task object, we cannot read the result yet; there's nothing useful in it. It's a future, a promise: it will give you the result once the task has finished. So you must use a callback. Here I add a callback; in it, you can check whether we got a meaningful result, and if not, you can say: sorry, I cannot give you the Gandhi quote. That's how you handle errors in asyncio code. So that's it. Thank you for listening to me. I want to thank one of the Python core developers, Senthil Kumaran, who helped me polish this talk, and I hope you got something useful from it. Thank you. Thank you, Sky. Questions? Thank you for the interesting talk.
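The protocol-plus-callback pattern can be sketched end to end. Assumptions are marked in the code: the quote coroutine simulates the external process with a short sleep instead of create_subprocess_exec, the server uses TCP on localhost rather than a Unix socket, and async/await replaces the talk's yield from:

```python
import asyncio

async def gandhi_quote():
    # Stand-in for awaiting an external process via create_subprocess_exec.
    await asyncio.sleep(0.1)
    return b"Live as if you were to die tomorrow.\n"

class QuoteProtocol(asyncio.Protocol):
    # A full-blown protocol would also define data_received and
    # connection_lost; like the talk, we only care about connection_made.
    def connection_made(self, transport):
        task = asyncio.ensure_future(gandhi_quote())
        # The task is a future, a promise: attach a callback that fires
        # once the result (or an error) is ready.
        task.add_done_callback(lambda t: self.reply(t, transport))

    def reply(self, task, transport):
        if task.exception() is None:
            transport.write(task.result())
        else:
            transport.write(b"sorry, I cannot give you the quote\n")
        transport.close()

async def main():
    loop = asyncio.get_running_loop()
    server = await loop.create_server(QuoteProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    quote = await reader.readline()
    writer.close()
    server.close()
    await server.wait_closed()
    return quote

quote = asyncio.run(main())
print(quote)
```

The done-callback inspects task.exception() before touching task.result(), which is the asyncio equivalent of the try/except you would write around a blocking call in the threaded server.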
Q: I want to check: can we add tasks while the tasks are being executed? Say we start with three tasks; can we add one more while they are running, if we have a reference to the object?
A: You mean, can you add a task while another task is executing?
Q: Yes. We start with three tasks and then I add a couple more. Is it possible?
A: So you want to create a task while the event loop is running? Of course you can.
Q: Thank you. Is there a similar limitation in asyncio, that once the loop you started stops, you cannot start it again?
A: Well, you can restart the event loop if you stop it, but you cannot restart it if you close it. There's a stop method and a close method; close is final.
Q: In your server example, you were using both yield and return inside a function. Your getOutput function had a yield from and a return. Where does it return to?
A: Oh, that depends on the code itself. It's much the same as the plain yield case: you just return whatever you want to return.
Q: Okay, I don't quite get it, but let me talk to you after the talk.
A: Okay. One more question.
Q: Is there a context manager that monkey-patches all the default functions to use the asyncio functions?
A: No.
Q: Okay. Thank you.