Hi, everyone, thanks so much for coming today. I just wanted to give a little bit of background on why I put together this talk. Asynchronous programming is something that's emerging in Rust, and when I was learning how to do it, I bumped into a lot of corner cases where it became less apparent how to proceed. So while the ergonomics are quite good, I wanted to put together a talk about how to get started from ground zero with asynchronous programming in Rust. The overview of the talk today: we're going to start with the state of asynchronous programming in Rust, so language-level support, what's in external libraries, that sort of thing. Then we'll talk about asynchronous programming fundamentals: a little bit about the architecture, the syntax, and why that matters. And then we're going to do a quick case study of a modified read-write lock that we implemented in storage for our use case. So, the state of asynchronous programming in Rust. It's designed to handle use cases like asynchronous IO, network operations, long-running blocking tasks in the background, that sort of thing. I wanted to give a couple of definitions, because in asynchronous Rust we talk a lot about tasks and futures. Tasks are kind of like green threads, so if you're familiar with Java or Go, you're familiar with this concept: multiple green threads can be served concurrently on a single OS thread. And futures are the handle that you use to poll and access the final computation, similar to JavaScript promises for those of you who are familiar with JavaScript. One note is that Rust doesn't really have the same level of runtime that Go does. 
So it can't preempt tasks the way Go can in its asynchronous implementation. It requires an API design that builds in yield points anywhere a task can't make progress. Futures also require an executor to drive them to completion, so in general you need some sort of executor to drive the asynchronous workflow. That's a lot of information, but we'll dive into some of this in future slides. The language-level support is really just async and await at this point. async is a keyword that you attach to a function, and it turns that function into a future. await allows you to wait on a future to complete, and while it has the same semantics as blocking, it waits in a non-blocking way. In terms of standard library support, there are a number of data structures that have been added to the standard library, and we're going to talk about some of these in more detail. The one I really want to call out is the Future trait. This is the trait that is the building block of asynchronous Rust, and you implement it for all of your future needs. In terms of external library support, we have futures-rs, which provides utilities for working with futures: combinators for handling multiple futures in parallel, that sort of thing. It also has an executor, which, as we mentioned, is required for actually executing the futures. Tokio is a library with more functionality built into it, and it has an executor compatible with futures-rs, so the two can be used together. It provides a runtime for asynchronous programming, and that runtime includes a bunch of async utilities like an IO driver, a scheduler, network APIs, and synchronization locks. 
So it's really aimed at covering a lot of the use cases that you might bump into with asynchronous programming. I figured I should also call out async-std. It does not have an executor compatible with futures-rs and Tokio, so the two generally bump into problems when used together. It aims to provide a similar API to the standard library, but all asynchronous, so it's becoming popular with people who want asynchronous versions of standard library operations. With all of that out of the way, we can talk a little bit about the fundamentals of asynchronous programming. First, we're going to start with implementing a future, and consistently this seems to be the thing that people struggle with the most in asynchronous programming. I put up the definition of the Future trait so that we could go over it in a little more detail. The basic idea is that you define what output the future should give, and then you have this poll method. What poll essentially does is get called over and over again as the future is woken up, and you return either Poll::Ready with the result, or Poll::Pending, in which case the future goes back to sleep until it's woken back up and can make more progress. One note is that you should never block in poll. That can cause problems, and we'll get into exactly why in future slides. So this is an example of what happens when you're not ready. When you're not ready, you have to register this thing called a waker. The waker was listed on a previous slide as something that's in the standard library, and what it does is provide a way to signal to a sleeping future that it's ready to progress, ready for another poll invocation. 
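As a rough, std-only sketch of the poll contract just described: a future that returns Poll::Pending on its first poll and Poll::Ready on its second, driven by a deliberately naive executor. The names TwoPolls and spin_block_on are illustrative, not from the talk; a real executor sleeps between polls instead of spinning.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A toy future: Pending on the first poll, Ready(42) on the second.
struct TwoPolls {
    polled_once: bool,
}

impl Future for TwoPolls {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.polled_once {
            Poll::Ready(42)
        } else {
            self.polled_once = true;
            // Not ready: request another poll via the waker, then sleep.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

// A minimal executor that just re-polls in a loop. Its waker does nothing
// because the loop never actually sleeps.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn spin_block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn run() -> u32 {
    spin_block_on(TwoPolls { polled_once: false })
}

fn main() {
    assert_eq!(run(), 42);
    println!("future resolved to {}", run());
}
```

The point of the sketch is the division of labor: the future only reports Ready or Pending, and it's the executor's job to keep polling until completion.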
So when you put the future to sleep, you have to register the waker somewhere, to be called later to wake the future up, and then you return Poll::Pending. Here is an example of when you would wake a future: we have waker.wake(), and it's pretty simple, you call that when the future is ready to make more progress. Something always needs to call wake when the future has reached a state where it can progress, because otherwise the future is put to sleep indefinitely and will never wake back up. One of the common difficulties in learning asynchronous programming is where to call wake, so I've put up a few examples. It could be in a spawned thread in the background. It could be in the drop method of the data type returned from poll: say you return a handle indicating resource acquisition; in that handle's drop method you could call wake to signal to other sleeping futures that they can progress now. The waker could also be sent over a channel to another thread, where you have a thread in the background listening for events and determining whether sleeping tasks are ready to make progress. So there are a lot of options, and it's really up to the design of the developer. Some important notes on futures: creating a future does not start its execution. A lot of people expect it to, but it doesn't; it must be either polled or awaited to start executing. However, using tokio::spawn will immediately begin execution in the background, so that's one case where it starts immediately. Poll also needs to handle wakeups where no progress can be made. You may expect that poll only handles cases where waker.wake() has been called explicitly, but there are scenarios where the future can be woken due to side effects even though no progress can be made. 
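The "call wake from a spawned background thread" option can be sketched with std alone. This is a hypothetical TimerFuture (the name and the park-based block_on are my own, assumed for illustration): poll stores a clone of the waker in shared state, and a background thread calls wake() when the timer fires.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

// An executor waker that unparks the thread running block_on.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// A tiny executor: poll once, park until woken, poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => return v,
            Poll::Pending => thread::park(),
        }
    }
}

// State shared between the future and its background timer thread.
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

struct TimerFuture {
    shared: Arc<Mutex<Shared>>,
}

impl TimerFuture {
    fn new(dur: Duration) -> Self {
        let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
        let bg = shared.clone();
        thread::spawn(move || {
            thread::sleep(dur);
            let mut s = bg.lock().unwrap();
            s.done = true;
            // Wake the sleeping future from the background thread.
            if let Some(w) = s.waker.take() {
                w.wake();
            }
        });
        TimerFuture { shared }
    }
}

impl Future for TimerFuture {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut s = self.shared.lock().unwrap();
        if s.done {
            Poll::Ready(())
        } else {
            // Register the waker so the timer thread can wake us later.
            s.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

fn run() -> bool {
    block_on(TimerFuture::new(Duration::from_millis(20)));
    true
}

fn main() {
    assert!(run());
    println!("timer fired");
}
```

Note that if the background thread forgot to call wake(), block_on would park forever: exactly the "sleeps indefinitely" failure mode described above.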
So your poll implementation always needs to be able to handle that. We can now move on to the Tokio runtime basics. The runtime contains an executor and a scheduler; that's where a lot of the IO code, like the epoll functionality that you get asynchronously in Tokio, is built in. It has multiple runtime backends: you can choose either a single-threaded backend, where all of the tasks are interleaved with each other on a single thread, or a multi-threaded backend for true parallelism. Futures are spawned as tasks, multiple futures can make up a task, and multiple tasks can be served on a single thread. The end result is that blocking in a future can stop other futures from progressing, and it probably makes a little more sense now why you should not block in either poll or an async function: that could stop another future that is ready to progress from progressing. There are two notes on blocking that I want to call out. First, async functions can use blocking mutexes as long as they are not held across awaits. You cannot acquire a mutex, do an await, and then release the mutex; but as long as you're only acquiring a mutex for a single statement that does not cross an await, blocking mutexes are okay to use. Second, blocking mutexes are actually common in poll implementations, but they can only be held for the execution time of poll. The thought process is that if you acquire a mutex and drop it as soon as poll completes, and poll is supposed to be very short-running, then holding the lock only for that scope is equivalent to non-blocking from the standpoint of asynchronous programming. There are also two types of threads: blocking threads and core threads. Core threads are the OS threads we've been talking about, which can serve multiple asynchronous tasks on a single OS thread. 
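The "not held across awaits" rule can be made concrete with a small std-only sketch (bump_then_yield and spin_block_on are assumed names; a real program would use an executor like Tokio's): the std::sync::Mutex guard is confined to its own block, so it's dropped before the yield point.

```rust
use std::future::{self, Future};
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// OK: the blocking-mutex guard never lives across an await.
async fn bump_then_yield(counter: &Mutex<u64>) {
    {
        let mut n = counter.lock().unwrap();
        *n += 1;
    } // guard dropped here, before the yield point below
    future::ready(()).await; // no blocking lock is held across this await
}

// Minimal spinning executor so the example is self-contained.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn spin_block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn run() -> u64 {
    let counter = Mutex::new(0);
    spin_block_on(bump_then_yield(&counter));
    let n = *counter.lock().unwrap();
    n
}

fn main() {
    assert_eq!(run(), 1);
}
```

If the guard were still alive at the .await, the task could be suspended while holding the lock, and any other task on the same thread that tries to lock it would block the whole thread.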
Blocking threads are a special case: they're used where you cannot get around blocking, and they assign a single task to the blocking thread so that it can block without stopping any other task on that thread from progressing. We'll get into blocking threads a little more, but that's one way you can block in asynchronous programming. I also want to call out that the Tokio documentation is very thorough, so please take a look at the glossary and the more advanced topics in the tutorial, because they'll give you more of a working knowledge of the terminology and the kind of code you can write with asynchronous programming. So we're just going to quickly go through syntax and common usage. We've already covered async in a way, but that's one thing I wanted to mention here. We also have spawn, which we've referenced. The way to think of spawn is: we have this long network operation, we call spawn on that asynchronous task, it spawns in the background and starts executing immediately, we do all the other work that we have to do before we can handle the response, and then at the end we await the handle and handle the response. That's a great example of a basic asynchronous workflow. spawn_blocking is very similar, but it's for blocking operations: it spawns the task on a blocking thread so you can do all of your blocking there. It's a similar API: you spawn the blocking network operation, you get the handle, do other work, and then await and handle the response. block_on is kind of the inverse of spawn_blocking: spawn_blocking is for calling a blocking function from an asynchronous context, while block_on allows you to call asynchronous code from a blocking context. 
So in general, here we have an example where we want to make a blocking call, but after that we want to evaluate a future. This assumes we're calling my sync function from a context where it is safe to block, so we can make the blocking call, then get the future from the asynchronous API and block_on it so that we block until it's ready. Because you can't use await outside of an asynchronous function, you need block_on to evaluate the asynchronous code, blocking until it's complete. join_all is a great example of spinning up a bunch of futures in parallel and waiting for all of them to complete. Here we have a vector of futures, an arbitrary number of them; we pass that into join_all and await on it, and when they're all done, we can iterate through all of the results and handle them. select turns that on its head: we have a set of futures, we wait until the first one finishes evaluating, we handle the result of whichever future evaluated first, and we cancel all the rest. So where join_all waits for all of the futures to complete, select waits only until the first future completes. Hopefully that gives you an overview of basic syntax and the common functions used to handle futures. Now we're going to dive into a case study of a modified read-write lock that we had to implement on our team. The use case is that we have a table of data structures that all need to be locked independently, and the IPC mechanism that we're using requires fetching read-only information from all of the data structures at once. 
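To show the mechanics behind a join_all-style combinator, here is a simplified, std-only sketch (JoinAll and spin_block_on are assumed names, not the futures-rs implementation): each time the combinator is polled, it polls every still-pending inner future, and it only returns Ready once all of them have produced output.

```rust
use std::future::{self, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A simplified join_all: completed futures are replaced with None so
// they are never polled again, and their outputs are stored in order.
struct JoinAll<T> {
    futs: Vec<Option<Pin<Box<dyn Future<Output = T>>>>>,
    outs: Vec<Option<T>>,
}

impl<T> JoinAll<T> {
    fn new(futs: Vec<Pin<Box<dyn Future<Output = T>>>>) -> Self {
        let n = futs.len();
        JoinAll {
            futs: futs.into_iter().map(Some).collect(),
            outs: (0..n).map(|_| None).collect(),
        }
    }
}

impl<T: Unpin> Future for JoinAll<T> {
    type Output = Vec<T>;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Vec<T>> {
        let this = &mut *self;
        let mut all_done = true;
        for (slot, out) in this.futs.iter_mut().zip(this.outs.iter_mut()) {
            let mut done_val = None;
            if let Some(fut) = slot {
                match fut.as_mut().poll(cx) {
                    Poll::Ready(v) => done_val = Some(v),
                    Poll::Pending => all_done = false,
                }
            }
            if done_val.is_some() {
                *out = done_val;
                *slot = None; // don't poll a completed future again
            }
        }
        if all_done {
            Poll::Ready(this.outs.iter_mut().map(|o| o.take().unwrap()).collect())
        } else {
            Poll::Pending
        }
    }
}

// Minimal spinning executor so the example runs standalone.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn spin_block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn run() -> Vec<u32> {
    let futs: Vec<Pin<Box<dyn Future<Output = u32>>>> = (0..4u32)
        .map(|i| -> Pin<Box<dyn Future<Output = u32>>> { Box::pin(future::ready(i * 10)) })
        .collect();
    spin_block_on(JoinAll::new(futs))
}

fn main() {
    assert_eq!(run(), vec![0, 10, 20, 30]);
}
```

A select-style combinator inverts the Ready condition: it returns as soon as any one inner future is Ready, and dropping the combinator drops (cancels) the rest.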
We initially considered locking each data structure individually, but that scales linearly in lock acquisition time, and we began to notice a slowdown when fetching these properties. So we ultimately decided that we needed to design a synchronization lock that handled our particular use case. We need to handle a single read lock on an individual element, a single write lock on an individual element, an all-read lock on all elements, and an all-write lock on all elements. So we're going to actually dive into a little bit of code here. I've highlighted the important sections: you'll see we have our all-or-some lock, and inside it we have a lock record, which keeps the state of what is locked at any given moment, and we also have the table inside a mutex, which keeps track of all of the individual data structures that we want to lock. We have an all-read field, and this is an integer because we can have an arbitrary number of all-read locks at the same time, as long as they're not conflicting with, for example, an all-write lock. We can only have one all-write lock at a time. We can have as many single read locks on an element as we want, mapped to which element is locked, and we can have one write lock on each individual element, so we keep track of which element is locked with a count of exactly one for those. As you might notice if you're familiar with Rust, this upholds the mutable/immutable reference requirement: an element can be read-locked as many times as we want, or write-locked exactly once, so this should look very similar to references in Rust. 
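The bookkeeping just described might look something like the following sketch. This is my own reconstruction from the description, not the talk's actual code; the type and field names (LockRecord, Request, can_acquire) are assumptions, and the compatibility rules are simply the "many readers or one writer" rule applied at both the per-element and whole-table level.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical lock-state record for the "all or some" lock.
#[derive(Default)]
struct LockRecord {
    all_readers: usize,           // count of outstanding all-read locks
    all_writer: bool,             // at most one all-write lock
    readers: HashMap<u32, usize>, // element id -> outstanding read count
    writers: HashSet<u32>,        // element ids currently write-locked
}

enum Request {
    SomeRead(u32),
    SomeWrite(u32),
    AllRead,
    AllWrite,
}

impl LockRecord {
    // Mirrors Rust's reference rules: any number of readers, or one writer.
    fn can_acquire(&self, req: &Request) -> bool {
        let no_elem_readers = self.readers.values().all(|&n| n == 0);
        match req {
            // Reading one element conflicts only with writers of it.
            Request::SomeRead(id) => !self.all_writer && !self.writers.contains(id),
            // Writing one element conflicts with any reader or writer of it,
            // including all-readers (they read every element).
            Request::SomeWrite(id) => {
                !self.all_writer
                    && self.all_readers == 0
                    && !self.writers.contains(id)
                    && self.readers.get(id).copied().unwrap_or(0) == 0
            }
            // Reading everything conflicts with any writer anywhere.
            Request::AllRead => !self.all_writer && self.writers.is_empty(),
            // Writing everything conflicts with everything else.
            Request::AllWrite => {
                !self.all_writer
                    && self.all_readers == 0
                    && self.writers.is_empty()
                    && no_elem_readers
            }
        }
    }
}

fn demo() -> bool {
    let mut rec = LockRecord::default();
    assert!(rec.can_acquire(&Request::AllWrite)); // nothing held yet
    rec.writers.insert(1); // element 1 becomes write-locked
    assert!(!rec.can_acquire(&Request::SomeRead(1))); // conflicts with writer
    assert!(rec.can_acquire(&Request::SomeRead(2))); // different element: fine
    assert!(!rec.can_acquire(&Request::AllRead)); // all-read sees element 1
    true
}

fn main() {
    assert!(demo());
}
```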
Next, we have the implementation where we dive into acquiring a single read lock on an element, and as you can see, we just create a future and then await on it. This is a great example of the async/await workflow: we create a future and await on it, and that's really all this method does. It gets more interesting when we get to the future that we're awaiting, the some-read future. I've highlighted the poll definition right here, so we're going to see a more complicated example of implementing the poll method for the Future trait. I also highlighted the acquisition of the mutex, mainly because this is a great example: we're using a blocking mutex here, and because it's only acquired for the length of poll, it is okay to use. Next, we check whether the UUID or name is registered in our table of data structures, and if we determine that we don't have anything with that name or UUID, we return that we're ready and we found nothing. But above that, we also call wake, and one of the reasons this is important is that even though we've determined we're done with our work, anything else that might conflict with this read acquisition needs to be woken, because it can now make progress: we have finished our read acquisition, so we have to wake up everything that was put to sleep in the meantime. 
If we do find that something has this name or UUID in our data structures, but it conflicts with another lock that's already acquired, as you can see on the top right of this next slide, then we add a waiter. With that waiter, we store the waker so that we can later call wake on it to make future progress, and then we return Poll::Pending, which puts the task to sleep. If we determine in the else block below that there are no conflicting locks, then we register that we have acquired a single read lock on this element, get a reference to that element using unsafe code, and put it in this some-lock read guard, which some of you might recognize if you've worked with mutexes in Rust before: a data structure indicating the lifetime of the lock acquisition, through which we can access the individual data structure. One thing I wanted to call out about futures is that dropping one is generally considered canceling it. In this case, we have to augment the future's drop method a little. If you're just canceling a future and there's no state stored anywhere intermediate, you can simply drop the future and it's canceled. But in this case we're storing things like what's waiting for the lock in our all-or-some lock, so we have a cancel method to clear out all of the intermediate state when the future is dropped. If we dropped the future without clearing out that intermediate state, we could end up in a situation where the lock can no longer service any future requests. So always be careful about the intermediate state that you're leaving behind in your asynchronous infrastructure. 
For the some-lock read guard, as I mentioned, this indicates the lifetime of the lock acquisition, and it's a great example of one of the places I said you could call wake: we return the some-lock read guard from the poll method and then call wake in its drop method. What we do there is remove our record of the read lock and then call wake, saying we're no longer holding the read lock, so wake up all of the waiting lock tasks that conflict with it, because they can now make progress and potentially acquire the lock. So that was a lot of information, and I wanted to go through a quick series of closing thoughts about asynchronous Rust, because as you can see, the design definitely requires some knowledge of the implementation to do it right. Some of the benefits: the highly parallel nature of asynchronous Rust allows very good performance on very few threads. In general, if you're handling a similar workload without asynchronous Rust, you'll usually need many more threads; with asynchronous Rust, those tasks can execute on fewer threads, and you get an ergonomic way to interface with that. It also has support for many types of combinators and asynchronous workflows already built into a lot of the existing libraries, so you don't have to build everything yourself. The design and documentation are actually very good, so in general I'm usually able to find answers to my questions, and beyond that, the community is very helpful. And Rust's promise of fearless concurrency also extends to async code: you get all of the benefits of type checking, lifetime checking, all of that, in asynchronous code as well. One of the drawbacks I wanted to mention is that asynchronous programming requires care. 
As you've seen, it can be very detailed, and it can take a lot of thinking about the architecture: where you're blocking, when you're blocking. An OS thread architecture can be used if your requirements don't necessitate async; in general, it's not bad to just use OS threads, but if you're looking for performance, you probably want to go toward the asynchronous side of things. Lifetimes can also be harder to manage in asynchronous code: with multiple threads operating, a lot of the time a lifetime has to be 'static, so managing lifetimes gets a little harder. And deadlocks and hangs can be harder to debug; there are tools like tokio-trace that help you out, but it can be difficult. So with all of that said, thanks for attending. If you're interested in looking into the implementation a little more, I've linked the repository where our modified read-write lock lives, and I hope you have a great rest of your time at the conference. [Audience question, partially inaudible.] So the question is whether I think asynchronous Rust could help in kernel space to solve some IO problems that have popped up. I've heard a lot of discussions; I'm also involved with the effort to put Rust in the Linux kernel, and in general I've heard a lot of people being very hopeful about asynchronous Rust in the Linux kernel. One of the things I've seen work happening on is building an executor that works in kernel space, because as I mentioned in the talk, one of the major challenges is that you need an executor to actually handle the asynchronous tasks, because of the architecture. 
So there has to be some work on building an executor that works in kernel space, and that may be merged by now; I'm a little out of date on my information, but that's one of the things that's popping up. [Audience question.] So the question is: can an executor work without an operating system? In general, yes, there are executors that work without an operating system; I've heard of people implementing executors in even lower-level situations than the kernel. Rust is a pretty low-level language, so as long as you satisfy the executor requirement and the trait requirement, you can implement your executor wherever you want to. The problem gets more complicated in environments that don't have an allocator; that's where you have to get smarter about how you design your executor, because that can get a little complicated. Sure. Okay, I think that's it. Thanks so much. Thank you.