As mentioned, I'm going to be talking about what I call "MIO" — though I've been hearing a lot of people say "mio," and that rolls off the tongue too. Originally, I think, the name stood for mini-IO, but Dave Herman coined "metal-IO" and I ran with that. So: MIO for short, metal-IO, whatever you like.

I started working on it about a year ago. I had some free time on my hands and I wanted a Rust project to really get more familiar with the language. I'd already been using it for a bit, but I still hadn't reached the comfort zone I was looking for. I work at Tilde, and we have a product, Skylight, that has been using Rust in production for over a year — almost a year and a half now. I believe we were the first production users of Rust, which, if you remember the churn of a year and a half ago, definitely took some dedication. At some point, around October 2014, we said: we need to get things done, so we're going to freeze our version of Rust, and once 1.0 hits we'll worry about updating. It took until just about now to get back onto the nightlies after that October freeze. But I'm getting off on a tangent.

Originally I was going to work on a web framework, because that was something that interested me. Unfortunately, there were no suitable HTTP libraries I wanted to build on. There was Hyper, which is a great library, but it's based on synchronous IO and the one-thread-per-connection model. That's not what I was looking for — it's not really the best way to build a server that scales out to as many connections as possible. So I thought: maybe an HTTP server is a good place to start instead. I could do it the way I wanted — one dedicated thread handling all the open sockets, maybe a thread pool to run the request handling on, and of course the ability to respond to requests asynchronously.

Okay, so I'm going to work on an HTTP server. Now what? I had to figure out how to write the IO code underpinning it. Rust had TCP support, but again, only synchronous sockets — not what I was going for. So I started looking at C libraries, and there was libuv, among others. libuv is a great library and very portable. One problem is that it's not exactly idiomatic Rust. I don't know if any of you have tried binding C libraries: simple libraries tend to be okay, but as you get into more complex, involved C libraries, the mismatch shows. libuv is written in idiomatic C, Rust has its own idioms, and Rust has a way of shoving ownership problems right in your face — which is a good thing, it's why I use Rust, but it ended up being painful here. Another thing I wasn't a huge fan of: libuv's primary abstraction model is really optimized for Windows, and I'll talk about that later — consider it foreshadowing. So I decided not to use libuv. The last real option was writing directly against the OS APIs: for Linux that means epoll; for OS X, FreeBSD, and the other BSDs, kqueue; for Windows, IOCP — plus, I'm discovering, something called Registered I/O, which I only found out about two weeks ago. I'm not a Windows developer; that's also something I'll come back to.
Writing an IO system that's really portable is actually a huge amount of work, and at that point libuv probably would have been the best option — but I wasn't doing this to ship code, I was doing it for fun, so it was an opportunity to learn something new. Okay: I'm going to try to write an IO abstraction for Rust. Nothing had been done yet at that level — an asynchronous IO abstraction — so it seemed like a good opportunity to just try something.

The first decision was which IO model to base it on. There are two primary models — the readiness model and the completion model — and they're vastly different. The readiness model is what epoll, kqueue, et cetera use; the completion model is what Windows and libuv use, and I think Solaris as well, but I've not looked at Solaris. Maybe I will when Rust supports it.

In the readiness model, the kernel notifies you when a socket is ready to be operated on. The way this lets you multiplex many sockets on a single thread is that you say, "hey kernel, here are all the sockets I care about — tell me when any of them are ready." The kernel watches them for you, and when data is received or a socket becomes writable, it comes back and says, "these sockets are ready to be operated on." At that point you can call read, or whatever operation you want, and it can complete immediately, so it never needs to block the thread. To illustrate with a TCP socket: if there's no pending data, a call to read returns EWOULDBLOCK — this is a socket in non-blocking mode, and the error tells you that a blocking socket would have blocked here. The next step is to poll for readiness — this is epoll, kqueue, et cetera — blocking your thread while you wait for the kernel to say the socket is ready. The socket becomes ready, and finally you call read again, and this time the buffer you supply gets filled with data.

The completion model is almost the reverse. Instead of waiting to be notified that a socket is ready and then operating on it, you operate on the socket first — "I want to read" — that read gets fired off in the background, and you're notified when it completes. To illustrate again with a TCP socket: you issue the read and supply a buffer. The first thing that happens is that ownership of the buffer passes to the kernel. In C-land you still hold the pointer to that buffer, but you can't free the memory, you can't read from it, you can't write to it — obviously bad things would happen. This is exactly what Rust protects you from. Once you issue the read, the call unblocks and you're free to do other work — operate on other sockets, or something completely different, run a busy loop, whatever. At some point the socket receives data, and the kernel fills the buffer it owns with that data.
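To make that readiness sequence concrete, here's a minimal sketch using today's standard-library API for illustration (MIO's own socket types behave the same way). `wait_for_readiness` is a hypothetical stand-in for what the event loop does — blocking in epoll/kqueue until the kernel reports readiness:

```rust
use std::io::{self, Read};
use std::net::TcpStream;

// Hypothetical stand-in: block in epoll/kqueue until the kernel says
// the socket is ready. In MIO this is the event loop's job.
fn wait_for_readiness(_stream: &TcpStream) -> io::Result<()> {
    unimplemented!()
}

// The readiness sequence from the slide: a non-blocking read that
// would have blocked, a wait for readiness, then a read that succeeds.
fn read_when_ready(stream: &mut TcpStream, buf: &mut [u8]) -> io::Result<usize> {
    stream.set_nonblocking(true)?;
    loop {
        match stream.read(buf) {
            // EWOULDBLOCK: no pending data. A blocking socket would
            // have parked the thread right here.
            Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => {
                wait_for_readiness(stream)?;
            }
            // Data arrived (or a real error occurred): return it.
            other => return other,
        }
    }
}
```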
Back in user space, we're finally ready to check whether the read completed, so we poll for completions. On Windows this is a completion port, but it's basically a queue of completion notifications that you pop off. Once you pop the completion for that read, you get ownership of the buffer back, and now you can read the data straight out of it. Some things to note: you cannot use stack-allocated buffers — or at least not without great difficulty — because once you pass ownership of the buffer to the kernel, you have to make sure it stays alive for the entire lifetime of the operation. Also, because you're passing ownership to the kernel, every single read operation currently in flight requires its own buffer. What that means is that the completion model forces you to bring the value — in this case the result of the read — into existence even if you aren't ready for it or don't want it.

Contrast that with epoll and the readiness model, where a really powerful trick is available: the kernel notifies you that a socket is ready, but you can decide, "I don't want to operate on this socket quite yet" — you just track that it's ready and deal with it later. Here's some simple pseudocode, simplified so it could fit on the slide — it's basically a proxy (there's a fleshed-out sketch just below). You have a source socket and a destination socket, and you want to copy data from one to the other. The trick is that when either socket becomes ready, all you do is record that fact; once both are ready, you read into the buffer and write to the destination in one go. Because the read and the write happen all at once, we can use a global buffer — in this case a single 4 KiB buffer in use for the entire program. I'm skipping some other details here, but this technique can be used to implement proxies, for example, with very low memory requirements, and it's a really powerful feature.

Now that I've introduced the two models, you may be able to tell from the way I talk about them that I'm a little biased — I have my personal opinion about which one is better. But at the end of the day it doesn't really matter: the operating system provides the model, and if you want to write code that runs on Linux or Windows or OS X, you have to use the model that operating system gives you. So if a library that wants to be portable decides to provide the readiness model on Windows, it has to implement readiness on top of completion — and the reverse is true as well: a library that provides a completion model has to implement completion on top of Linux's readiness. When you bridge these two models, there's going to be a bit of overhead. It's not a huge amount, but one goal with MIO was to be close to the metal, and at that level those little things matter. No matter what, you have to pick one, and I went with the readiness model. I'll get more into this, but I wanted to build something that was extremely cheap — essentially a zero-cost abstraction — on the platforms that I, and most people, care about. The reality is two-fold. One, Linux servers are the majority when it comes to production servers. And — are there any real Windows fans here? I'll be a proxy for Windows.
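Here's a hedged reconstruction of that slide's proxy pseudocode in Rust. The struct, the token constants, and the error handling are my own fill-ins, not the slide's — the point is only the pattern: defer work until both ends are ready, then move the data through one shared buffer.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

// Hypothetical token values for the two registered sockets.
const SRC: usize = 0;
const DST: usize = 1;

struct ProxyState {
    src: TcpStream,
    dst: TcpStream,
    src_ready: bool,
    dst_ready: bool,
}

impl ProxyState {
    // Called by the event-loop handler whenever either socket is ready.
    // `buf` is one global 4 KiB buffer reused for every transfer,
    // because a read is always immediately followed by a write.
    fn on_ready(&mut self, token: usize, buf: &mut [u8; 4096]) {
        match token {
            SRC => self.src_ready = true,
            DST => self.dst_ready = true,
            _ => unreachable!(),
        }
        // Defer until *both* ends are ready, then read and write in
        // one go. The buffer is only borrowed for the duration of
        // this call, so the whole program needs just the one buffer.
        if self.src_ready && self.dst_ready {
            if let Ok(n) = self.src.read(&mut buf[..]) {
                let _ = self.dst.write_all(&buf[..n]);
            }
            self.src_ready = false;
            self.dst_ready = false;
        }
    }
}
```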
Sorry — we'll talk about Windows later. My humble opinion is that on Windows, people care about raw IO performance a bit less. So if we can do something that's really close to the metal on Linux, and as close as possible on Windows, that's a huge win. And with Rust itself: I came from Java and Ruby, and one thing that really drew me in was that Rust, right as it was finding itself, was starting to focus on zero-cost abstractions. That really resonated with me, and it's what I wanted to bring to an IO library.

So that's the intro. Now let's get into how one uses MIO. It's based on epoll, so if you have any experience there at all, some of this will seem very familiar. There are three basic steps. Step one: wait for sockets to be ready. Step two: do something with the sockets. Step three: repeat. That's most of everything. Because this loop is a pattern that keeps coming up, MIO starts by just giving you an event loop. This could be in a main function — I'll fill in the example a bit later — but you start by creating an event loop, then you define a handler. The handler's job is step two: do something with the sockets. You start the event loop with the handler, and the event loop does step one — wait for sockets to be ready. For step two, it calls the handler's ready function with the information about the socket that's ready. In your handler implementation you do something, you return, and the event loop repeats. Everything just runs in a loop.

That's the most important bit, but there are some more details. I haven't quite explained what socket events mean. When a socket receives data, it becomes readable. When you're establishing a TCP connection, once the connection is established the socket becomes writable, because you can now write to it. Also, if you have a very heavy write load on a socket, you'll eventually fill its buffer, and once the buffer is full you can't write anymore — the socket is not writable at that point — and once the buffer is flushed, it becomes writable again. Those are the socket events you'll get notified about in the ready handler.

There are a couple of different ways those notifications can be delivered. The first is edge-triggered, and edge-triggered notifications are what most people will really expect from a library like MIO. The events, once they happen, are fired only once and delivered only once to the handler. What I mean is: if a socket receives data, it becomes readable, and the handler receives the readable event. If the handler doesn't read from the socket and the data stays there, then on the next event loop iteration there will be no more notifications for that socket — no more readable notifications until new data is received. So when using edge-triggered events, keep in mind that you either have to read all available data from the socket, or track that the socket is still readable and read the rest later. Level-triggered is pretty different.
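Here's roughly what that event-loop-plus-handler skeleton looks like. This is a sketch based on the mio API of that era (around 0.5); the Handler trait's exact items shifted between early releases, so treat the signatures as approximate:

```rust
extern crate mio;

use mio::{EventLoop, EventSet, Handler, Token};

struct Server;

impl Handler for Server {
    type Timeout = ();
    type Message = ();

    // Step two: the event loop calls this with the token identifying
    // which socket is ready and the events it is ready for.
    fn ready(&mut self, _event_loop: &mut EventLoop<Server>,
             _token: Token, _events: EventSet) {
        // ... do something with the socket, then return ...
    }
}

fn main() {
    // Steps one and three: wait for readiness, dispatch to the
    // handler, repeat -- run() loops until the event loop is shut down.
    let mut event_loop = EventLoop::new().unwrap();
    let mut handler = Server;
    event_loop.run(&mut handler).unwrap();
}
```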
With level-triggered — some say it's easier to use — on every single event loop iteration, every socket that's readable or writable, or whatever interest you asked for, will notify the handler again. There are use cases for this, and I just want to put it out there, but unless you know you want it, I'd recommend edge-triggered notifications. To illustrate the difference exactly: at the start you register the socket; it receives four kilobytes; you get the notification and read two kilobytes; then you loop. Everything up to there is the same, but after the loop, only level-triggered gets the notification again — with edge-triggered you will no longer get it.

Continuing with the previous example, let's set up a little mini server — it's really not going to do anything — that accepts connections on port 6567. The first step is to create a TCP listener. Note that this is a MIO TcpListener: MIO implements all of its socket types itself. They're very, very similar to the socket types in the Rust standard library; the main difference is that the MIO types are non-blocking, whereas the standard library types are blocking. Once you have the listener, you register it with the event loop. Since it's a server socket, we ask for readable notifications — with a TCP listener, when a connection is pending, you're notified with a readable notification. We ask for edge-triggered delivery, and finally we pass in this magic token, which I'll talk about a bit later; the token is how you identify, in the handler, which socket triggered the notification.

Now let's update the handler. First, in a loop, we accept the new sockets from our server socket — it's the same function name as on the TcpListener in the standard library. However, the return type is different, and this is the main difference between MIO sockets and standard-library sockets. In MIO you almost always get a Result of an Option of T — in this case, the new TcpStream. That's because if there's ever an error, it returns the error directly, but there's also the case where you try to perform the operation and the socket is not actually ready yet. Because it's non-blocking, the call returns directly, and in that case you get Ok(None). The socket not being ready is not an error case; it's normal during the runtime of a well-written app, because MIO may sometimes fire spurious notifications. Even if a socket is not ready, MIO is still permitted to send off a readiness notification — this is due to the underlying system APIs. Because of that, you always need to handle Ok(None) gracefully, or your server will crash for no reason.

So, tokens. So far we've associated sockets with the event loop and gotten notified when they're ready. In our handler we're only dealing with one socket right now, so it's easy: we get a ready notification and we know it's for the server socket. But MIO is built for dealing with many sockets — hopefully thousands and thousands — so we need to be able to identify which socket triggered the notification. That's what tokens are for.
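Filling the earlier skeleton in for this listener gives something like the following. Again a sketch with hedged signatures — `register`'s argument list and `accept`'s return type both changed across early mio releases, so this follows the shape the talk describes rather than any one exact version:

```rust
extern crate mio;

use mio::{EventLoop, EventSet, Handler, PollOpt, Token};
use mio::tcp::TcpListener;

const SERVER: Token = Token(0);

struct Server { listener: TcpListener }

impl Handler for Server {
    type Timeout = ();
    type Message = ();

    fn ready(&mut self, _event_loop: &mut EventLoop<Server>,
             token: Token, _events: EventSet) {
        if token == SERVER {
            // Accept in a loop until the listener runs dry.
            loop {
                match self.listener.accept() {
                    // A pending connection: register it under its own
                    // token and stash any per-connection state.
                    Ok(Some(_stream)) => { /* register + store */ }
                    // Ok(None): not actually ready (possibly a spurious
                    // wakeup). Not an error -- wait for the next event.
                    Ok(None) => break,
                    Err(e) => { println!("accept error: {:?}", e); break; }
                }
            }
        }
    }
}

fn main() {
    let addr = "0.0.0.0:6567".parse().unwrap();
    let listener = TcpListener::bind(&addr).unwrap();
    let mut event_loop = EventLoop::new().unwrap();
    // Readable on a listener means "a connection is pending"; ask for
    // edge-triggered delivery and tag the registration with SERVER.
    event_loop.register(&listener, SERVER,
                        EventSet::readable(), PollOpt::edge()).unwrap();
    event_loop.run(&mut Server { listener: listener }).unwrap();
}
```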
When you register a socket with the event loop, you pass it a unique token, and that token is handed back to you in the handler. The token is simply a tuple struct around a usize — a very simple type. The reason this works is that every OS IO system allows at least a pointer's worth of data to be associated with a socket, so MIO just drops the token in there and returns it when the socket is ready. In the handler, the general pattern is to keep some sort of map structure where the token is the key, mapping to the socket in question and any associated socket state.

One question I get a lot: why didn't I just provide callbacks as the API? That's pretty common — a lot of asynchronous IO libraries go with callbacks. The short version is that my initial goal for MIO was to provide a zero-cost abstraction, and tokens were the only way of doing that. My guideline for designing MIO, and for which features to add, was basically: how far can I push the envelope without adding any overhead? (With a Windows-sized asterisk — we'll get to that later.) Hopefully, if other people want callbacks, or a future- or stream-based model, they can build that on top of MIO. But at the same time, if you're working on something like a TCP proxy, where being as close to the metal as possible really matters, you don't have to throw everything away. That's something I've run into in the past: I wanted to play around with writing what was really just a TCP router with a little bit of business logic in it, and I found I either had to use libraries and systems with more overhead than I wanted, or dive down to the system APIs directly. That's what I wanted to avoid with MIO. And especially with Cargo and crates.io and the ability to build these small packages, I'm hoping people will experiment. Maybe a callback style — it's not my favorite, but maybe it's what everybody wants — and there can be some other library that builds callbacks on top of MIO. Because the token strategy is zero-cost, you can build callbacks on top of it just as efficiently as if you went directly to the system APIs. The reverse would not be true.

So, to recap how MIO is used so far. Step one: create the sockets, register them with the event loop, specify the token, and decide whether you want edge- or level-triggered notifications. Then wait for socket readiness, which just means running the event loop; the event loop calls the handler. Once a notification is received, you get the token, look up the state, operate on the socket, and repeat. And — here's a little tip — because the pattern of looking up socket state by token is so common, as is having to generate unique tokens, I went ahead and provided the Slab utility in MIO. The slab is just a map from tokens to whatever — you can use it to map tokens to sockets and socket state — and it handles generating the tokens for you, so you don't have to keep track of that yourself. Also, it's pre-allocated, so at runtime there are no allocations, and insertion, access, and removal are all extremely cheap operations.
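A sketch of that usage, with method names per the Slab that shipped in mio's util module at the time — treat them as approximate:

```rust
extern crate mio;

use mio::Token;
use mio::util::Slab;

fn demo() {
    // Pre-allocated: capacity is fixed up front, so nothing allocates
    // at runtime.
    let mut slab = Slab::new(1024);

    // Insertion hands you back the Token to register a socket under;
    // the slab generates and tracks the tokens for you.
    let token: Token = slab.insert("foo").unwrap();

    // Access and removal by token are cheap -- array indexing underneath.
    assert_eq!(slab[token], "foo");
    slab.remove(token);
}
```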
Because it's pre-allocated, the one catch is that you can't resize it at runtime — you have to decide up front what the capacity of your system is, which I kind of think is a feature, but anyway. The sketch above shows how you might use it; it's really simple. We make a new slab with a capacity of 1,024 elements, we insert "foo", and we get back a token — a token we can then use when registering a socket. Access and removal work about the same.

Initially I was going to put more examples in, but they couldn't fit on slides. What I did instead — and I'll provide links at the end — was write up a well-commented example of a full echo server, so you can read it on your own time rather than me trying to fit all the code up here, which gets a little tedious. So I just want to wrap up with a few thoughts — things to keep in mind when working with MIO.

First, one interesting thing: while I've spent the entire talk so far focusing on non-blocking sockets, MIO's event loop is fully decoupled from its sockets — whereas with libuv, I believe, the two are combined and the sockets are part of the library. That means you can actually use Rust standard library sockets, which are blocking, with MIO. While you don't want to read on the event loop itself, what you could do is have a bunch of blocking sockets, use an event loop to track when those sockets are ready, and then farm them out to a thread pool (there's a sketch of that hand-off below). At that point it's a middle ground between one thread per connection and fully non-blocking IO. The basic strategy: one event loop just watches the sockets, and once a socket becomes ready, you hand it to a thread pool and do the blocking read there.

Another tip: if you can, really minimize the work that happens on the event loop — it's important. Any bit of code that takes time prevents other sockets from being handled, and if that backs up, you get a backlog of sockets to process, which causes availability problems. I'd recommend keeping the event loop exclusively for non-blocking IO work; anything that isn't that kind of work, move off the event loop to a thread pool.

Finally, a question I also get a bunch: how do you share an event loop across multiple libraries or components? As you may discover when working with the event loop and tokens, you have to be in control of all the tokens, so sharing one event loop across multiple components doesn't work well. In my opinion — and again, if this is something people want, it can be built on top of MIO — I'm not really a fan of sharing. I think every component, whether it's an HTTP server, an HTTP client, or even DNS lookup, should encapsulate its own event loop, with all the code for just that component inside it, and communicate with the others across threads. An event loop is just a bit of code that needs to run on a single thread — it's not that heavyweight. And the OS thread scheduler is really, really good: decades of work have gone into optimizing it and isolating the bad behavior of any one thread. So you can have one component using its event loop, and if something weird happens and it doesn't do the right thing, it won't affect the other components.
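Here's a minimal sketch of that blocking-socket hand-off, using only the standard library. The channel-based pool is my own illustration of the pattern, not MIO API; in real usage the send would happen inside the handler's ready():

```rust
use std::net::TcpStream;
use std::sync::{Arc, Mutex};
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Handle one connection with ordinary blocking reads and writes.
fn handle(_stream: TcpStream) {
    // ... blocking request/response work goes here ...
}

// Spawn `n` workers that pull ready sockets off a shared channel.
// The event-loop thread stays free: when a tracked socket becomes
// ready, it just sends the socket down the channel and moves on.
fn spawn_workers(n: usize) -> Sender<TcpStream> {
    let (tx, rx) = channel::<TcpStream>();
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..n {
        let rx = rx.clone();
        thread::spawn(move || {
            loop {
                // Take the next ready socket; block here, not on the
                // event loop. Exit when the sending side goes away.
                let stream = match rx.lock().unwrap().recv() {
                    Ok(s) => s,
                    Err(_) => return,
                };
                handle(stream);
            }
        });
    }
    tx
}
```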
All right, really quick — I saved this for last. The short version: as of today, MIO doesn't work on Windows. It's something I've been wanting to fix, and aside from improving the getting-started experience and the docs, Windows is my next priority. The problem is that I work on MIO on nights and weekends and whatever spare time I have, which really isn't much anymore, so progress is slow. But I think Alex Crichton — we only talk online, so I never pronounce his name — is going to get it done, right? You're going to get it done. Okay, so I don't have to do it. Woohoo.

The first target for Windows support is just getting the current API working there. As I said, that means taking MIO's readiness-model API and implementing it on top of the completion-based system, which is going to add a little overhead. We won't know until it's done, but I spent a bunch of time trying to figure this out, and I think the actual overhead will be pretty minimal — even compared to something like libuv. After that, step two is: how can we expose Windows-specific, non-portable APIs that lower the level and get closer to the Windows model, while diverging as little as possible? That way, if you really, really care about raw IO performance on Windows, the amount of non-portable code you have to write is minimized.

So that's it — this is the end. Like I said, I'll just point you to the GitHub repo. There's a README, and I've started working on the guide. If you've tried to use MIO in the past, you probably noticed there was basically no documentation. Now there's some — still not great, but I'm chipping away at it little by little — and I've linked the work-in-progress guide, which includes the example server. So yeah, get started. I'm around for a little bit more today, probably another 30 minutes or so, because I've got to catch a flight home — but if you want to talk, talk to me before I leave. Thank you.