Okay, I guess we'll get started. Thanks for coming, everybody. My name is Daniel Pope. I'm a web developer and a DevOps enthusiast, and for the past few years I've been doing contract work: web development, network systems, backend systems, that kind of thing. I'm here to talk about gevent. gevent is a framework for asynchronous I/O with a fully synchronous programming model. You may be familiar with things like Tornado and Twisted and the new asyncio in Python 3.3 and 3.4; gevent is a direct competitor to those, and I hope to demonstrate that it's easier to use and more flexible than any of them.

Here's where this talk is going. First we'll meet gevent and see some examples. Then I'll discuss the theory behind gevent and behind Tornado, Twisted and asyncio: how they compare, and the programming models involved. Then I might talk briefly about my own experiences with gevent.

Asynchronous (or evented, or non-blocking) I/O is any strategy for writing network programs where, instead of your program blocking and suspending while it waits for the I/O operation it has requested, it goes away and does something else, and resumes executing your code once the I/O operation has completed.

So, diving into gevent. This is a very simple gevent program. To talk through the code: the only gevent component we're using here is the StreamServer. We pass in a connection handler, and for any connection received on that port, the connection handler is called. makefile() is a feature of sockets: it turns the socket into a file-like object, so we can iterate over it line by line and send each line back over the same socket; we're just echoing back lines. That's very similar to the code you might have written with plain Python before 3.3, before picking up asyncio, except that some magic happens to make it highly scalable. We'll talk about the magic later.
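A minimal sketch of the echo server just described; the handler name and the choice of port are my own assumptions, not the slide's actual code:

```python
from gevent.server import StreamServer

def handle(sock, address):
    # makefile() gives a file-like view of the socket; iterating it
    # yields one line at a time, and each line is echoed straight back.
    for line in sock.makefile(mode='rb'):
        sock.sendall(line)

echo_server = StreamServer(('127.0.0.1', 0), handle)  # port 0 = any free port
echo_server.start()
# In a real program you would bind a fixed port and call serve_forever().
```

Each incoming connection is handled in its own greenlet, which is what makes this scale to many simultaneous clients.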
This is a client that uses gevent. What we're doing here is using urllib2 to read 100 URLs on a pool of 20 thread-like objects: greenlets. Those fetches happen more or less in parallel, so somewhere in the middle of urllib2 we access a URL and it blocks to read data; but while that's happening, and while data is being received, other greenlets are scheduled and run. The clever thing here, apart from the Pool object, is that we're able to use urllib2 unmodified, because of the first two lines, where we do some clever monkey patching. People keep saying "oh, monkey patching, that's a bit nasty", and I'll explain later why I think it's actually rather elegant rather than rather ugly.

Another example. I could have run these examples live, but we probably don't have time and the network might not be working, and anyway an echo server is not particularly fascinating to look at. A chat server is more so, but you probably wouldn't be able to connect to my machine. So what's happening here? We've got reader and writer functions, and we're going to run each of them in its own greenlet, so they'll be running effectively concurrently. The reader just reads lines from a file-like socket and rebroadcasts them, and we've got a system of queues so that a broadcast can go out to all of the subscribed users. The second part of this is the code that hooks it all together.
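The broadcast machinery just described might look roughly like this; it's a reconstruction, and all the names (subscribers, broadcast and so on) are my own assumptions rather than the slide's code:

```python
from gevent.queue import Queue

subscribers = {}              # username -> Queue of outgoing lines

def broadcast(line):
    # Fan the line out to every subscribed user's queue.
    for q in subscribers.values():
        q.put(line)

def reader(f):
    # Runs in its own greenlet: read lines from one client and rebroadcast.
    for line in f:
        broadcast(line)

def writer(sock, q):
    # Runs in its own greenlet: drain this user's queue onto their socket.
    for line in q:
        sock.sendall(line)
```

Because each connection runs one reader greenlet and one writer greenlet, a slow reader never blocks writes to the other users.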
After receiving a connection (again this is hosted with the StreamServer), we turn the socket into a file. readname is a function I've not included here, which just reads one line, your username, and loops until it's a valid username. Then we create a queue for the user and split out into two greenlets, each handling one direction of the communication. The joinall and the try/finally ensure cleanup: because a greenlet raises an exception if the connection is lost, the user is simply removed when that happens.

Moving on to some theory. To talk about async in Python, the first thing we need to talk about is synchronous. This is a diagram of a call stack; call stacks can obviously be arbitrarily deep, but I've got a simple example where we want to get some data, we're going to process that data in the green function, and we want to do one I/O operation, which is to read one line from something: a socket probably, maybe a pipe. With synchronous I/O, execution follows the arrows until we get to the point where we block on I/O, and then nothing happens. From our point of view the process just stops there, and the kernel waits for the I/O and resumes us when it's ready. In fact it may block more than once, but everything stays completely intact and execution continues when the I/O is complete; the line can then be returned to the caller, which does some processing, and the data is got.

The problem with this kind of model in Python is that the performance is not good, and neither is the memory usage. There's an excellent article on this which, unfortunately, I haven't given you a working link to, so you'll have to Google it.
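That synchronous call stack can be sketched in a few lines; the function names are taken loosely from the diagram and the details are assumed:

```python
def read_line(f):
    return f.readline()    # the whole process blocks here until a line arrives

def process(f):
    line = read_line(f)    # the stack stays intact while we are blocked
    return line.upper()    # ...then processing simply continues

def get_data(f):
    return process(f)      # results flow back as plain return values
```

The appeal of this model is that blocking is invisible: every frame keeps its local state, and results come back as ordinary return values.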
It turns out that threads in Python are much, much worse than threads in other languages: the GIL is not exactly a kernel-level object, so threads fight for the CPU's attention. There's also stack memory to consider. The kernel knows about threads, and for each one it reserves a chunk of stack (you can control the size with ulimit -s) ready to do work in C, which isn't particularly useful for Python. You can turn that limit down quite low, but you still won't get high scalability. Similarly with processes, you've immediately lost the shared memory space, which is the useful thing about threads; a thread is basically a lightweight process, and they're very similar kernel-level objects.

What we're trying to do in all of these asynchronous I/O systems is this: when we're doing some I/O, we want to jump out of what we're doing to let other things run. That usually means (in fact it always means) there is one central place doing all of the waiting, for everything that's blocked on I/O at that moment, and resuming the right thing when something happens. This is what it might look like. People were saying last night "don't use select, it doesn't scale", but effectively all of the alternatives to select are just API improvements; the fundamental code will look a bit like this. Something registers that it wants to wait on a file descriptor, for read or for write, and then an event loop is started which "resumes" the waiter, for some definition of resume, when that file descriptor is ready for that operation. select blocks, and when it returns, it returns three lists (read, write, error), and each item in those lists is a file descriptor. For every file descriptor in the read list, you're guaranteed to be able to read some number of bytes, which I think is 512 or something; there are at least that many bytes in the buffer at that point. Error handling is actually very important in network programming, but I've omitted it for brevity; a similar approach applies.

This is the same thing, slightly modified to show how timeouts interact with it. You may be blocking on I/O, but you might also want to block for an arbitrary amount of time. The last argument to select is how long we're prepared to stay suspended, and if nothing shows up on the read, write or error lists within that time, select returns anyway, and we can process all of the waiters whose timeouts have expired.

Right, so let's get into the different models of what "resume" might mean.

Callbacks are the simplest approach. They're used in JavaScript; Tornado's IOStream, Twisted's Deferreds and asyncio all have this idea of callbacks from the event loop. Here's what this does to the stack. Whereas before we had a nice simple definition of get_data as one thing, now we've had to break it in two: the read_line function has been split into the part that sets up getting the line and the part that receives the line (or perhaps waits for a whole line). This is a lot messier, and you'll notice I've drawn the return arrows as light arrows, because you can't do anything useful with the return values of callbacks: they don't go anywhere useful and won't contain any useful data. So there is actually no way to ever break out of the cycle of callbacks and get back to nice code where we can return values, which is a really convenient way of programming.

I've included some examples of callbacks. This is an imaginary HTTP library where I'm making a request to an endpoint and passing in the function I want to be called, and the library can load JSON for me. Because I can't return, I've got to do something with the response, so I call yet another callback with it. And if I want to link the request I'm making to some particular piece of state, to pass arguments effectively, one way to do it is closures: handle_response is closed over beer_id, so it knows what beer_id is. For something more practical, here's something I encountered a couple of years ago. This is Pika, an AMQP library, and here we're four levels of callbacks deep. And this is a simple example: this just declares a queue, but you might want to declare an exchange and bind the queue to the exchange, so you're six layers of callbacks deep before you've actually received a message. I don't like that at all; that's really ugly to me.

Somebody once said (I don't have an attribution for this quote) that callbacks are the new goto. We've discussed the untidy code structure, where everything is split into tiny component parts, and the fact that you can't use return values. And then there's error handling. Notice that the previous example has absolutely no error handling in it. If I wanted to handle errors for all of those operations I'd have to register error handlers for each; in fact with Pika you register the error handler once, and it fires when some arbitrary error happens somewhere in the program. Error handling with callbacks doubles the amount of work you're doing, so people don't do it, and the examples don't do it, and then people copy the examples, and error handling just gets left on the floor.

A simple approach to dealing with the complexity of callbacks is binding them into a class, so that rather than the closures I showed earlier, you have a class with members and methods, and certain methods are pre-registered as callbacks for certain operations. This is something I wrote once, a bit truncated perhaps: a Twisted application wrapping a subprocess. outReceived and errReceived get called whenever there's a chunk of data; for some reason ProcessProtocol doesn't easily let you turn that into lines, so I had to write that myself. Then, how do we break out of this handler, which is having methods called on it, and turn the results into something useful in another part of the program? I had to use something called a DeferredQueue. I don't really remember its exact semantics, but again, I think you register callbacks with it. So this is just a simplification of some of the difficulties of callbacks; it doesn't really deal with the problem. You end up using callbacks anyway, and you still have to split your processing into multiple chunks. I'm doing self.q.put here, but if I were to decode the lines at that point and try to do another asynchronous operation, how would I link that asynchronous operation, which is another class, to this class? Basically, I'd do it with callbacks.

And here's an example from asyncio: exactly the same kind of thing. Underneath there's a system of callbacks, but in asyncio you have protocols and transports that are paired together. The transport wraps a kind of thing (a subprocess, a network connection, an SSL connection), and the protocol is the processing for it. It's still callback-based, but you're wrapping it in a class, and I think it's a slightly nicer API: in Twisted I had to use a ProcessProtocol, which is different from a protocol that goes over the wire, whereas in asyncio the protocols and the transports are kept separate. But the same problems apply.

Then we get to a more modern technique: generator-based coroutines. These are present in Tornado (there's a tornado.gen module), and they're, I suppose, the key feature of the new asyncio in Python 3.4. When generators were introduced, it was noted that they provide coroutine-like features, and a coroutine is really what gevent is built on. What we're doing here is this: at the point where, in the earlier blocking-I/O example, we would have let the operating system suspend us, we instead use yield and yield from to break out of the stack while preserving those stack frames. Because the event loop is the ultimate parent, it knows we're waiting for something to happen, and it will reassemble the stack and return us to the right point when that operation completes. So there's literally a division in the middle, where the stack is torn down, preserved as generator objects, and resumed using send().

One advantage of this method is that you can actually return data. Python 3.3, which added yield from, also allowed a generator to return a value; with generators in earlier Pythons you could not have both a yield and a return of a value, so that feature had to be added to make this work. On the other hand, the semantics of yield are now coupled to breaking out of the stack, rather than generators being available as a plain looping tool.

This is an example with asyncio, using those coroutines, and it's a very similar example to the stack I showed before (for some reason Sphinx wasn't able to highlight the code with yield from in it). asyncio.sleep returns a generator that yields an object all the way up through print_sum to the event loop; the event loop then resumes that generator chain by sending back down through print_sum's yield from line into the asyncio.sleep line, effectively letting asyncio.sleep return. And this is what it looks like in Tornado. Tornado works on Python 2, so there is no yield from and no return value, but it's usable on Python 2 as well. Notice, though, that we've had to put yields into code where yielding doesn't really make any sense.

So what is an actual coroutine? Generators have been described as "semi-coroutines". A full coroutine can yield not just to the calling frame but to anywhere (any other coroutine), and it doesn't require any collaboration from the other stack frames to do so. The top frame of the stack can just say "hold on, park me", and control passes to the event loop. Here's roughly what that looks like. Without having to suspend each stack frame or change the calling conventions to use yield from, we run until the point at which we'd block, and we simply switch to the event loop; when the event loop is ready to resume us, it switches back to the point at which this greenlet was suspended. So it's just like the blocking-I/O example, except that instead of blocking at the kernel level, we suspend the greenlet and switch to the event loop, and the event loop does what the kernel would have done: wait for I/O, then resume us.

Going back to the asyncio example, this is the same piece of code written with gevent. The only difference in terms of what you call is that it's gevent.sleep; you don't need yield from, and you write the code exactly as you would normally write it. Somewhere inside gevent.sleep the magic happens and it switches to the event loop. Here I'm not even kicking it off from an event loop: gevent.sleep will create the event loop if it doesn't exist, so nothing needs to be spawned by an event loop. Much simpler. In gevent, by the way, the event loop is called the hub.

Let me get back to the monkey patching. It's possible to use gevent.monkey to replace the existing time.sleep function. You have to do that before you import anything else, in case anything keeps references to the old versions, but it means that existing code can run without modification. You probably have code somewhere that uses time.sleep; you can make it run under gevent just by starting your program with "from gevent import monkey; monkey.patch_all()", however you want to express that. Or you can have a launcher that starts your program under gevent, which is a common approach with something like gevent: it's available in gunicorn, where you can just ask for the gevent worker and it will do all of that before your program starts.

To tackle the supposed nastiness of monkey patching: I don't think it's that ugly in this case, because we're not arbitrarily monkey patching bits of the standard library at random times. We're effectively starting Python with a completely different distribution of the standard library, one that happens to do cooperative multitasking with gevent. It's bundled as a library, rather than being a separate python-gevent interpreter, so you can choose whether or not to use it for each way of invoking your code. For example, you might have a batch job that runs without asynchronous code because it doesn't need it, or some CPU-bound work, or (an even better example) tests: you don't want to do asynchronous networking in tests, so you can call your business logic as normal code and have it run synchronously, and then in production run the same code under gevent for asynchronous networking. Obviously you'd still run integration tests through gevent, that kind of thing. And it's optional: you can use the full power of gevent without monkey patching; it just means you can't reuse existing pure-Python libraries, which I think is the massive advantage of patching. As I say here, if you're writing a gevent library, you should not rely on monkey patching being present, because you don't know whether the caller of your library will want it.

The monkey patching also works with async code that uses select, which immediately means you can use existing libraries that do their own networking with their own event loop, like Pika. You could use Pika, if you really wanted to deal with all those callbacks, but you need to ensure it's using the select function rather than epoll or kqueue or any of the other, more platform-specific alternatives that are usually better.

Let me quickly run through the features of gevent. You obviously need to be able to spawn a greenlet, to allow concurrent operations that don't block each other; the fundamental unit of processing I/O with gevent is the greenlet, and you spawn one for each side of a connection, a reading side and a writing side. You can kill greenlets by throwing an exception into them: the greenlet is signalled as immediately resumable, and when it resumes, that exception is raised inside it. That's actually an advantage over threads, because you can't easily do that with threads in Python.
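Killing a greenlet by raising an exception into it looks something like this; a sketch, with the worker and its cleanup being my own invention:

```python
import gevent

log = []

def worker():
    try:
        while True:
            gevent.sleep(1)        # a stand-in for blocking on I/O
    except gevent.GreenletExit:
        log.append('cleaned up')   # runs when the greenlet is killed
        raise

g = gevent.spawn(worker)
gevent.sleep(0)    # let the worker run up to its blocking point
g.kill()           # raises GreenletExit inside the worker and waits for it
```

Because the exception is raised at the greenlet's suspension point, ordinary try/finally and except blocks get a chance to clean up, which is exactly what you can't easily do to a blocked thread.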
And then there's a greenlet Pool, equivalent to a multiprocessing Pool or other kinds of pool, so if you want to do parallelised network operations, that's an easy way to do it. There are synchronisation primitives to ensure synchronisation between your greenlets, though it's worth noting that greenlets, unlike threads, never actually run at the same time, so these are slightly less important: you know you're never going to give up control of the CPU until you hit a blocking operation.

For message passing, AsyncResult is pretty neat: it gives you a single operation to block on, which is a useful way of turning callbacks back into synchronous programming. (You want a synchronous programming model because it makes things easier.) You can say "when this callback fires, set a value into this AsyncResult", and that gives you something to block on: you call asyncresult.get() and it returns the result; or, if an exception is set instead, the waiter will have that exception raised in it, so you get synchronous error handling as well.

This is an example of the greenlet-killing mechanism in use: you can use a Timeout context manager like this and you've automatically got a timeout on the contained section, so any blocking operation inside it, time.sleep or whatever, is limited by the same timeout. And we've already met things like the StreamServer, but there are WSGI servers and that kind of thing too.

I've covered this already in a way, but even without the monkey patching you can have business logic that's completely unaware of gevent and of asynchronousness. You pass in, say, a file-like object which has been made "green", and the business logic will hit it and stop, without you having to change your whole call stack to collaborate in yield from shenanigans to get back to the event loop and be resumed eventually. I think that's a huge advantage: I don't really want my business logic to contain the idea that asynchronous backends are potentially part of it, and also to have to deal with the occasional synchronous backend.

Greenlets have all of the advantages of the generator approach, but to sum up: these things are very light, since the stack frames in Python live on the heap. I need to hurry through some stuff. There's no generator machinery involved, and it works on Windows as well, which I hadn't mentioned. A disadvantage is that it doesn't work on Python 3 at this time, and that's a big downside for some people. There is a Python 3 branch, a Python 3 fork; I've not tried it, and it's apparently usable for some things, but it's not finished. But we're talking about networking operations here, so if you want to write a network server in Python 2 that shunts bytes around, and use Python 3 with asyncio for your user-facing stuff, that split might work for you. And wherever we have locks we have the possibility of deadlocks, though that may exist in other async frameworks as well.

The biggest pitfall is doing something that actually blocks, instead of this fake blocking where we switch to the event loop. If you're using any C libraries, they probably really block, and you'd have to modify the library, or use whatever async support it has and wrap it into the gevent programming model, to avoid blocking. Likewise, if you keep the CPU busy, you'll never yield to other greenlets, so this isn't at all for CPU-bound activities; although you can use gevent's networking features to delegate to synchronous backends that do the heavy lifting, and return the results through your network-plumbing application written in gevent.

I mentioned using one greenlet per direction. You don't want to try to merge the reading and the writing into one greenlet; you want writing in one greenlet and reading in another, because you only want to block in one place at a time. The writer blocks waiting for a message and then sends it, so you're never blocked on anything you shouldn't be at any particular moment. You can also combine this with multiprocessing, which is the kind of approach used in Java and Go and Rust: those systems have green threads, like greenlets, but they run them on top of multiple OS threads. In Python we'd have to use processes underneath instead, but you can still get more scalability by running this approach across multiple processes.

A couple of years ago, when I was doing this really heavily, I wrote a micro-framework. I think I gave a lightning talk a couple of years ago saying "never write a micro-framework"; it was a really stupid idea, but there it is. So if you want to do something RESTful with gevent, with a green PostgreSQL driver so that your database operations also do this fake blocking, that's built into Nucleon. And in revulsion at Pika and all of that callback stuff, I wrote an AMQP library, actually forked from Pika. Whereas most AMQP drivers try to be asynchronous through callbacks, this gives you a completely synchronous programming model: remote queues are exposed as local queues, and you just iterate over the messages of a queue, rather than having a callback called every time a message is available. The same goes for all of the other AMQP operations.

So, we have time for questions.

Q: This is more a comment than a question, and we skipped over that part real fast, but gevent only uses the select system call on some operating systems, which gives suboptimal performance on platforms like Windows, for example.

A: Somewhere in the heart of gevent it will use the most appropriate mechanism for each particular operating system, though probably not on Windows. So it's a poor choice for cross-platform scalability, but it might be good if you know your target. I think it probably is possible to adapt gevent to be more Windows-compatible, because the coroutine approach is completely abstract.

Q: Yeah, absolutely, but as of now it uses a suboptimal system call on some platforms, so it's worth knowing if you're going to dabble in gevent. In fairness, I've never used it on Windows, but I hear it works there.

A: Me neither, but it's worth knowing.

Q: Is there an optimal number of greenlets per process, some range, how many workers you should start?

A: Oh, greenlets: start as many as you want. Greenlets are very, very cheap, so create as many as you need. You can start 1,000, 10,000 greenlets, as long as you've got the memory.

Q: I have a question regarding generators and coroutines. If you use a generator, you yield when you have an asynchronous call to do, so you know exactly that between the yield and the resume, the global or shared state may change. With gevent you don't see that: for example, time.sleep may suspend your greenlet, and the state may be different when you wake up again.

A: Yes, that's something to be careful of in all of the async frameworks: you can't really rely on global state. Something that's useful about gevent is that it has its own thread-local object, and the monkey patching will install it as threading.local. That means you can keep your local state in a thread-local object. I think Flask uses thread locals or something like that, so with monkey patching the Flask globals should work, although I don't think I've ever tried it. But obviously it's something to be careful about: using global state when you're doing anything concurrent.
Q: Thanks. If gevent were ported to Python 3, could it be compatible with asyncio? For example, could we use a library in that scenario that's written in the asyncio style?

A: I suspect you probably could. asyncio is implemented completely within Python, without any special tricks, whereas there's a C trick in gevent and similar coroutine systems that lets you jump to a different stack. So anything implemented completely within Python could potentially run on gevent. But I think the big hope for asyncio is that its event loop will come to be the standard event loop for everything, including something like gevent: you could run an asyncio event loop and have gevent use it as its hub, meaning that Twisted and Tornado and gevent and asyncio could all be running on exactly the same loop. As it is at the moment, because select is one of the things that gets patched, you could probably run Tornado's and Twisted's own event loops within gevent. But actually bringing everything together is an open problem.

Q: Okay, thanks. Hello. Do you have any idea when gevent will be ported to Python 3? I think it's a very important dependency for a lot of projects, hundreds of projects; like Twisted, it's one of those dependencies which needs to be ported right now. Do you have any estimate?

A: gevent has only just reached version 1; it was November last year that it reached 1.0, and I don't know how much effort is going into a Python 3 compatible gevent. So, no answer for that, but if you're interested, get involved. Why not?

Q: Okay, thanks. Thank you, it was a very insightful presentation. I'd like to ask: where I work, we use Tornado as our asynchronous Python, and it works really well, just like gevent; I've used gevent myself in the past. The problem is that we found a really hard bottleneck, and that's the database, because we couldn't find an asynchronous ORM. It's a huge problem for us; we don't want to use inline queries. Do you have any idea of an asynchronous ORM, or one that's being developed?

A: There are ways around it. If you have to do synchronous operations, you can use an actual thread pool, or a multiprocessing pool, to wrap up the synchronous code, so that you're passing the requests and receiving the responses using gevent, but you have, say, four blocking workers doing the database calls. That's one way around it. In Nucleon I found another way: somebody had already written example code for making the PostgreSQL psycopg2 driver coroutine-compatible, by having the library itself tell you when it wants to block, leaving you responsible for doing that kind of blocking. So it's possible for some drivers, and there are workarounds for the others.

Q: Okay, thank you.

A: Thank you very much.
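(A rough sketch of the thread-pool workaround from that last answer; the query function and the pool size are assumptions, not code from the talk:)

```python
import time
from gevent.threadpool import ThreadPool

def blocking_query(n):
    # Stands in for a synchronous database driver call that really blocks.
    time.sleep(0.01)
    return n * 2

pool = ThreadPool(4)   # say, four blocking workers for database calls

# Each spawn returns a greenlet-like object; .get() blocks only the
# calling greenlet, while the hub keeps serving everything else.
results = [pool.spawn(blocking_query, i) for i in range(8)]
values = [r.get() for r in results]
```

The queries run on real OS threads, so a blocking C-level driver can't stall the event loop; the greenlets just wait on the results.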