Hello everybody! Super excited to be here with you at Cloud Native Rust Days. For those of you I haven't had the pleasure to meet yet, my name is Luca Palmieri and I work as a lead engineer at TrueLayer, a financial technology company based in London. I've been contributing to the Rust community for the past three and a half years. You might have seen some of the work I did to get people interested in the language, workshops and code dojos with the Rust London user group, or you might have used some of the packages I maintain: cargo-chef for faster Docker builds, or wiremock to test your Rust API clients. For the past 12 months or so, I've mostly been focused on writing Zero To Production in Rust. Zero To Production in Rust is a book that guides a developer who is new to the ecosystem, or new to the domain, in understanding what is available and how to use it effectively to actually build cloud native applications. Some of the topics touched on in the book will also form part of the material for this talk. Coming to the matter at hand: what drives most of the conversation we're going to have today is the work we've been doing at TrueLayer for the past year. TrueLayer built a new product line called PayDirect. The details of it don't matter too much. You can think of PayDirect as an API which allows a merchant to manage what looks like a bank account. They have funds held by TrueLayer, they can receive payments, and they can initiate payments towards other accounts on the network, all of it done via APIs. And there's one question that keeps being asked over and over again: why? Why have you chosen Rust? One side of the question we might frame as a question of novelty. Rust, no matter how awesome we believe it is, is still a young programming language.
1.0 came out barely five years ago, more or less. Compared to the maturity of ecosystems like the JVM, the CLR, Python or TypeScript, it is still fairly young. So there's an inherent element of risk, and we're going to touch on some of that later on. But the other intent behind the question is actually confusion between what people think Rust is and what we're using Rust for. That's because Rust has had a lot of its early victories, and a lot of its most striking successes, in a different domain: not in cloud native applications, but in systems programming. And this is a little scrapbook that I put together in five to ten minutes of high-profile projects that not so long ago made headlines for using Rust. All of these things are massive accomplishments, especially if we go back to what we were discussing a few moments ago, considering how young the language is. But all of these massive accomplishments fall within the domain of systems programming. And as people like to remind me, we are not doing systems programming. So a lot of the things that matter in systems programming, memory safety, squeezing out milliseconds by being very careful when you allocate memory, shipping binaries which are very small because you're deploying them on constrained devices, don't necessarily matter in the cloud native ecosystem. We are often constrained not by how fast our programs are, but by how fast we can develop them: how many mistakes we make along the way, how fast we can fix those, and, most importantly, how fast we can evolve that software over time. Speed of iteration is almost always the biggest constraint in big cloud native deployments, especially microservice architectures. So why are we using a language which was engineered to solve problems which are not the problems of the cloud native ecosystem? Because you don't have to draw a wall between different domains.
Things which are good for systems programming can be equally interesting and equally beneficial to people working in back-end development. The same applies to front-end work and other domains like game development, which are usually seen as very siloed and not communicating with one another. Actually, by foraying outside of what is familiar, we can find out that some of the problems they have are some of the problems we have, if you look a little bit under the surface. And the solutions can be reused and can be the foundation to build something much more interesting for what we want to solve. That's pretty much going to be the theme of this whole talk. You have seen the title: we're going to talk about the good, the bad and the ugly of using Rust for cloud native development. This is of course based on our own very specific experience, and I'm sure that different people coming from different backgrounds, trying to solve different problems, might have a different take. But I think it's important to start to break, to some extent, the strong association that Rust has with systems programming. So let's start from what is good. Once again, people remember the beginning and the end, so we'll put the bad in the middle. What are the best aspects of Rust when it comes to cloud native development? Well, the number one for me, and maybe 80% of why I love to use the language today, is that I'm able to compose smaller bits of functionality into a larger program without having surprises at runtime. So let's look at an example. Let's look at this function, verify_signature. It takes a JSON Web Token as argument and it returns the claims inside the JSON Web Token. Now, if this was written in a different programming language, depending on what that language was, this might look slightly different. In a dynamic language, you wouldn't have the types. So anything that behaves like a token, in some not very well defined way, would be a valid input.
And then you would have to write a bunch of tests to make sure that anything which is not a valid input fails in a predictable manner, for example by raising an exception. And that has its own problems, right? If somebody comes back to the code base tomorrow, they don't know about all those assumptions, because they're not in the types. They haven't loaded all the implicit knowledge required by the code base, and they might make mistakes. And if those mistakes are not tested for, they might arrive in production. We don't like that. If you move to something which is statically typed, like C# or Java, then what you might have is something similar: a function which is marked as asynchronous, so it does some input/output, maybe over the network, maybe over the disk. It takes a token as input, and you would definitely have a JWT type, which has some constraints encoded inside the type, and it would return some claims. But that function might mutate the JWT token you're passing to it as an input. And also, it might fail, but the signature doesn't tell me that it might fail. That's because those languages use exceptions to signal that something might not complete successfully. But I don't see in the signature what exceptions that function can raise, which means that either I need to go and look at the source code, and inspect every line to understand if it might throw an exception, or I need to be very defensive: I need to add try-catch statements which are very broad, to make sure I actually cover all the different corner cases. Now let's come back to Rust. Once again, I have an async keyword on the left that tells me this function might do IO. That's great. The token is not only a JWT, which tells me something about the structure of the input; I also know that I'm passing in a reference to that JWT, and that it's an immutable reference. That means that I know that my verify_signature function reads the JWT, but it won't mutate it.
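To make the discussion concrete, here is a minimal sketch of what that signature might look like. The `Jwt`, `Claims` and `VerificationError` names are hypothetical, not tied to any specific crate, and the body is a stub; the real function would be `async` and would actually check the cryptographic signature.

```rust
struct Jwt {
    raw: String,
}

struct Claims {
    subject: String,
}

// A finite list of failure cases, visible to every caller.
#[derive(Debug)]
enum VerificationError {
    InvalidSignature,
    Expired,
}

// `&Jwt` tells the caller the token is only read, never mutated;
// `Result` tells the caller the function can fail, and lists how.
fn verify_signature(token: &Jwt) -> Result<Claims, VerificationError> {
    // Stub logic standing in for real signature verification.
    if token.raw.is_empty() {
        return Err(VerificationError::InvalidSignature);
    }
    Ok(Claims {
        subject: "user-123".to_string(),
    })
}

fn main() {
    let token = Jwt {
        raw: "header.payload.signature".to_string(),
    };
    // The caller can see every failure case and decide what to do with each.
    match verify_signature(&token) {
        Ok(claims) => println!("verified, subject: {}", claims.subject),
        Err(e) => println!("verification failed: {:?}", e),
    }
}
```

Everything a caller needs to reason about, the IO, the immutability, the failure modes, is in the signature, not buried in the implementation.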
And that already simplifies debugging massively, for example. If you're trying to understand where things go wrong, knowing that your data is not changing unless it's very explicitly marked as mutable makes it much, much faster: you can rule out massive parts of the call tree without even looking at them. And, equally important, at the end I know immediately that this function can fail, because this function returns a Result. A Result is a way to encode inside the type system that if things go well, I'll get the claims, but if things go wrong, I'll get an error. That error, the verification error: I can go and look at its definition and see all the different cases, most likely it's going to be an enum, and I can decide what to do in each of those cases. This allows me to reason about what matters to my software when calling your software, without actually having to go inside and do technical due diligence on your implementation. And that's the basis for powerful reuse: being able to isolate and encapsulate complexity, exposing just what matters to the person using the abstraction. It empowers us to build towers which are much higher and much more robust than what we can do if we stack things on top of each other without controlling all these effects. Let's get to the second pattern, which is actually very important when writing cloud native Rust code: being able to write state machines. State machines are everywhere. In your programs, you have resources that most likely have a lifecycle of sorts. A user might sign up, they might confirm their email, and then they might suspend their account or close their account. A payment gets authorized, then it gets booked, then it gets submitted, then it gets settled, or it fails. They have a finite number of states which are valid, there are very clear business rules on how to go from state A to state B, and it's also clear that you cannot go from state A to state C.
This is extremely difficult to encode if you're using class hierarchies: you can have an infinite number of subclasses. Instead, if you're using Rust, you can encode these finite state machines using algebraic data types, using enums. Enums in Rust are able to hold values, so each of the variants can contain a payload. It's not just a fancy way of having an integer: 0 pending, 1 active, 2 suspended. Each variant is effectively a struct, which has fields and can be constructed and used. And they're finite, so you know whether you have actually handled all the possible paths every time you're using a user. Let's make an example, and this is a classic. When you start modeling a domain, you don't know all the different corner cases. So if you're starting to write your platform, you might say: well, a user is either pending or active. And that's because the world is simple, right? You don't have users yet. Your biggest focus at that point in time is getting more users, so you focus on the user creation part of the user lifecycle. So you go to production with pending and active, and all the different parts of your code base know what to do if a user is pending and if a user is active. At a certain point in time, you realize a user might want to freeze their account. You have become successful, you now have a sizable user base, and this new part of the lifecycle enters your domain model. Now you need to make sure that everywhere in your code base, when you're handling a user, you are also taking the appropriate action if the user is suspended. For example, they shouldn't be able to log in. And Rust helps you massively here, because enums are finite. It's not an open-ended class hierarchy; the finite number of states allows the compiler to do exhaustive matching. So every time you want to use a user, you need to match on that user, and you need to have a match arm for every possible variant.
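The lifecycle above can be sketched as an enum with a payload per variant. The field names here are hypothetical, chosen purely for illustration:

```rust
use std::time::SystemTime;

// Each variant carries its own payload; this is not just a fancy integer.
enum User {
    Pending { email: String },
    Active { email: String, activated_at: SystemTime },
    Suspended { email: String, reason: String },
}

fn can_log_in(user: &User) -> bool {
    // The match must be exhaustive: adding a new variant to `User`
    // turns every match like this one into a compile error until
    // the new case is handled.
    match user {
        User::Pending { .. } => false,
        User::Active { .. } => true,
        User::Suspended { .. } => false,
    }
}

fn main() {
    let user = User::Active {
        email: "a@example.com".to_string(),
        activated_at: SystemTime::now(),
    };
    println!("active user can log in: {}", can_log_in(&user));
}
```

If `Suspended` were added later, every `match` over `User` that lacked a `Suspended` arm would stop compiling, which is exactly the safety net described next.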
The moment you introduce a new variant, the compiler will scream at you in all the different places in the code base which require you to handle the new variant. And you know that once it compiles, you have actually covered them all. And that is just amazing, because you're using technology to scale up your mental model in a complex code base, written by somebody else, evolved by somebody else, which you landed on 15 days ago. You don't know all the code paths. You don't know all the places where the user is used. But technology is there to help you scale your understanding, to help you fall into the pit of success. Third good part of Rust for cloud native development: predictable performance. And I want to stress the word predictable. I'm not here to talk about the fact that Rust is fast. For most applications it doesn't matter. As we said before, the biggest constraint when doing enterprise software is how fast you can iterate. And in most cases, whether your application takes 70 milliseconds instead of 100 or 150 doesn't actually make a massive difference to the user. In certain places it does: there are bottlenecks in the system where you want to be as fast as possible. But there are also plenty of parts where speed doesn't matter. What matters, though, is being able to reason about the system as a whole, and about the way it's going to behave, in a consistent fashion. So what matters, when looking at your latency charts, is seeing something flat. I want to know that fetching the settings for a client is going to take on average 50 milliseconds. And it's much easier to reason about the behavior of the system if it always takes 50 milliseconds. If from time to time it takes 50, but sometimes it takes 300, then the situation is much more complicated. And that happens a lot if you're using garbage collected languages.
But it's much easier to reason about your system if everything you need to look at is the code you have written. And Rust, being able to free memory in a deterministic fashion thanks to the borrow checker and lifetime tracking, actually gives you that super flat line. Now, not everything is roses and rainbows. Certain things are not working at the moment when it comes to using Rust in the cloud native ecosystem. Actually, my list is not particularly long. There's one item, but I think it's massive: compilation times. Rust compile times are long. We have applications at TrueLayer which are reasonably complex and take up to 15 or 16 minutes to compile in release mode when we're building Docker images to release to production. On local machines, with incremental compilation and caching, the experience is fairly smooth. But we are experiencing a lengthening of the feedback loop, which is very important in a company: write code, commit, merge to master, deploy, observe. The merge-to-master, deploy step is too long. I can't wait 15 minutes to go from "code is in my trunk" to "code is running in production". Continuous deployment is the tempo of your organization. If you have an organization that can only deploy once every 15 minutes, that means you can make four experiments per hour, and, with an absolutely perfect usage of your time, which doesn't happen, 32 experiments per day. In reality it's probably something like 20, because I put something in CI, then I do something else, I forget, I waste a bunch of minutes, then I go back, and so on and so forth. If I can deploy every two minutes, I can make the same number of experiments within a single hour. And within a day, I can iterate on so many different configurations that I can converge to what I need much faster. This is probably the most threatening aspect of the Rust programming language today when it comes to adoption within the cloud native ecosystem.
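For the Docker build side specifically, the layer-caching workflow behind cargo-chef, mentioned at the start of the talk, can take some of the sting out of release builds by caching the dependency compilation in its own image layer. A sketch of the multi-stage Dockerfile pattern, assuming a binary target named `app` (the image tags and binary name are illustrative):

```dockerfile
FROM rust:1.56 AS chef
RUN cargo install cargo-chef
WORKDIR /app

FROM chef AS planner
COPY . .
# Compute a dependency "recipe" from Cargo.toml / Cargo.lock.
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Build dependencies only; this layer stays cached until the
# dependency set changes, so source edits don't rebuild the world.
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
RUN cargo build --release --bin app

FROM debian:bullseye-slim AS runtime
COPY --from=builder /app/target/release/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

This doesn't make a cold release build faster, but it means most CI runs only recompile your own crates rather than the full dependency tree.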
Now, that's not to say that all the rest is amazing. The rest I classify as ugly. Ugly means that the situation as of today is not optimal, but there is a clear path to get those things unblocked and make them better. The first one is ecosystem maturity, and we touched on this point at the very beginning. Rust is a young language, and Rust in the cloud native ecosystem specifically is even younger, because Rust really became viable for back-end development from the moment it got async/await. And it got async/await in November 2019. Let's factor in four or five months for the ecosystem to shift to using async/await: we're talking mid 2020. And there's a lot of exploration that still needs to happen in the ecosystem to understand what ergonomic usage looks like, what the best practices are around that feature of the language. Therefore the ecosystem is moving. There's a bit of churn, libraries are still evolving, and that might create a little bit of additional work. The other side of it is that a lot of libraries do not exist as of today. I'm talking about libraries to interact with big vendors. You want to use AWS? There is Rusoto, great. You want to use GCP? Well, you might have to build a bunch of stuff on your own. You want to use Vault? Same story. So you cannot treat the open-source ecosystem as a vendor you are shopping from, as you might do, for example, in Python or in Node. Once you use an open-source package in Rust, you need to take into account that you might have to devote a percentage of your development time to actually go and contribute to that library. It might be that the documentation is missing a bit. It might be that it's an unusual code path that you are exercising, which is not as well tested. Factor that in. Go into the ecosystem with your eyes wide open. It happened to us as well: over a year we contributed, I think, almost ten different patches to fairly well-known projects.
So we're not talking about projects with one or two stars. Be ready to do your part in pushing the ecosystem forward. The second one, and I know this is once again top of the agenda for many different people: Rust does not ship an async executor inside the standard library. Rust ships the Future trait, so it ships the interface for what an asynchronous task looks like, but it pushes the actual logic to orchestrate those asynchronous tasks into libraries. At the moment we have Tokio on one side, the oldest executor, then we have async-std, then we have Bastion, and a bunch of others for specialized use cases. And it's not easy to interoperate. We don't want libraries to have to choose which side they're playing on. I want to be able to write a library that works with most executors, and that requires us standardizing on certain traits and making sure that that experience works. I know this is one of the focuses of the Async Foundations effort that got started in the past couple of months, and I'm sure we're going to see improvements. But as of today, the advice I give to beginners, which I don't like, is: pick an executor and stick to it, because that's going to make your experience simpler. I would like to see a future where this advice is not necessary. Third: intermediate learning resources. Rust is amazing when it comes to getting beginners into the language. You have the Rust book, you have Programming Rust, you have Rust in Action, you have awesome video series on YouTube of different kinds and levels. But when you want to do cloud native development, there's a lot more that is not in those resources. Async/await, for example, is often not covered. What libraries to use? Not discussed. How to stitch them together? Not really discussed. Part of this is why I wrote Zero To Production in Rust. I don't claim it to be the official resource.
It is, as the title says, an opinionated way of getting something that works: putting a bunch of tools inside the box and giving you something that you can use with Rust to build these kinds of programs. But I'd like to see more. I'd like to see more resources with different angles, videos and books that take a different approach. People need to have the possibility to choose how to approach the ecosystem, because everybody learns differently. And there's a void, a massive void, here that needs to be filled. And that was the last item in the list of ugly things that I see in Rust for cloud native today. Thanks a lot for tuning in. I hope this was a useful presentation and that I shared an experience you can learn from. If you have any questions, I'm happy to take them now.