Well, hello everyone around the world. Thank you so much for having me here today. When I work with someone who is just learning Rust, they sometimes question whether the Rust borrow checker is their friend or their foe. If you take a look at the Rust subreddit, it's common to see posts with headlines like these: "Newbie question regarding the borrow checker," "Help fighting the borrow checker," and "Does it ever get easier fighting with the borrow checker?" However, as people gain experience in Rust, they tend to come around to the borrow checker and realize what it protects them from doing. So in answer to the question, is the borrow checker a friend or a foe, I say the borrow checker will become your friend through experience. And along with gaining experience with it, it's also very helpful to understand how it works and why it does the things it does. We're going to dive deep into that, but before we do, let's briefly cover who I am. I'm Nell Shamrell-Harrington. I'm a principal software engineer at Microsoft, a board director of the Rust Foundation, and the lead editor of This Week in Rust. If you're not subscribed, I hope you subscribe right after this to keep up with all the wonderful things the Rust community is doing. You can also tweet at me at @nellshamrell if you wish to get in touch or just say hi. To set expectations, we will first do a bit of background on the Rust compiler in order to put the borrow checker in context, and then we will do a deep dive on the borrow checker itself. Let's go ahead and start with an overview of the Rust compiler. Let's take a look at a code example. This Rust function declares a vector composed of the integers 1, 2, 3, 4, and 5. After it declares this vector, it then uses a for loop to iterate through each integer in the vector and print it on a new line. So let's go ahead and run this code using Cargo. And as expected, we see the numbers 1 through 5 printed out on the screen.
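The example described here might look like this (a minimal sketch reconstructed from the description; the variable name `numbers` is an assumption):

```rust
fn main() {
    // Declare a vector composed of the integers 1 through 5.
    let numbers = vec![1, 2, 3, 4, 5];

    // Iterate through each integer and print it on its own line.
    for n in numbers {
        println!("{}", n);
    }
}
```

Running it with `cargo run` prints the numbers 1 through 5, one per line.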
This seems pretty simple: Cargo builds and runs the piece of code for us. But there is a lot that happens underneath the surface in the compiler when Cargo is building it. There are five general stages to compiling a piece of code. It starts with lexical analysis of the code, then parsing of the code, then semantic analysis of the code (this is where the borrow checker comes in in the Rust compiler), then optimization of the code, and finally code generation, where the compiler creates the executable binary of our code. When we look at the stages laid out in a list like this, it seems like they will run linearly, and in some compilers they do. However, if you've delved at all into the Rust compiler internals, you might be thinking, wait a minute, isn't the Rust compiler at least partially query-based rather than linear? And the answer to this question is yes, but that is out of the scope of this particular talk. For the sake of clarity, I'll speak to the internals of the compiler as if they were functioning linearly. However, if you want to delve more into how the Rust compiler is query-based and what that means, check out the Guide to Rustc Development for more information. This guide has been a big help to me as I've learned how to hack on the Rust compiler. Going back to the stages of compilation, let's start with the first one here, lexical analysis. During lexical analysis, a program within the compiler called a lexer analyzes the raw Rust source code and splits it into tokens (the smallest meaningful units of the code, sometimes called lexemes) to make it easier for the compiler to parse. And that brings us to the next stage of compilation, called parsing. In this stage, a program within the compiler called a parser analyzes the tokens generated by the lexer and translates them into an abstract syntax tree, or AST. Having the tokens in this data structure makes it much quicker and easier for the compiler to do the rest of its work.
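As a rough sketch of what the lexer does (simplified; rustc's real token stream carries extra detail like spans), a line such as `let x = 5;` might be split into tokens roughly like this:

```
"let x = 5;"  →  [ Keyword(let), Ident(x), Eq, Literal(5), Semi ]
```

The parser then consumes this token list to build the abstract syntax tree.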
At this stage, before it moves on to the next stage of compilation, the Rust compiler takes that abstract syntax tree generated by the parser and first expands any macros included in the code. Let's look back at our code example and zoom in on this line. println! is a macro, which means this line would be expanded: this is what the full println! macro looks like and how it will be represented in the AST. The compiler also desugars some of the syntactic sugar that makes writing Rust so delightful. For example, in Rust, the for loop is a piece of syntactic sugar for an iterator. If we were to desugar this section of code, it would consist of both a match statement and a loop. The functionality of this code is identical, but desugaring it makes it easier for the compiler to understand and optimize. At this time, the compiler also resolves any imports in the code. So if you are bringing in an external crate or using internal crates or modules, those will be resolved here as well. After these steps, the compiler takes that AST, that abstract syntax tree, and converts it into the high-level intermediate representation, or HIR. Let's pause here and take a closer look at that HIR. It helps to understand the data structures that make it up. The first data structure is a node. This corresponds to a specific piece of code and is identified by an HIR ID, and that node belongs to a definition. A definition is an item in the crate we are compiling; these are primarily top-level items within the crate, and each is identified by a DefId. And a definition is owned by a crate data structure. This represents the crate we are compiling. It stores the contents of the crate and creates a number of maps and other things that help organize the content for easier access throughout the compilation process. This crate is identified with a CrateNum.
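To make the desugaring concrete, here is roughly what the for loop from our example turns into, written out by hand (a simplified sketch; the real compiler-generated desugaring also spells out full method resolution paths):

```rust
fn main() {
    let numbers = vec![1, 2, 3, 4, 5];

    // Roughly what `for n in numbers { println!("{}", n); }` desugars into:
    // a match statement wrapping a loop that drives the iterator manually.
    let result = match IntoIterator::into_iter(numbers) {
        mut iter => loop {
            match iter.next() {
                Some(n) => println!("{}", n),
                None => break,
            }
        },
    };
    result
}
```

The behavior is identical to the for loop, but every step (obtaining the iterator, advancing it, stopping at `None`) is now explicit for the compiler.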
This is the original Rust source code we are compiling. And let's focus on this section here, the for loop that iterates through the numbers vector. Remember, this for loop desugars into a match statement and a loop. If we look at the node in the HIR that represents this match statement, we would see something similar to this. It's a little hard for humans to read, so let's go ahead and break it down. This Arm structure represents a single arm of the match statement that our for loop desugars into. First, we have the HIR ID for this piece of code. This identifies a node within the HIR. And that node is owned by a definition. The definition is some top-level item in the crate. And that definition data structure is owned by a crate data structure. So our for loop is a node within a definition within our crate. That's how we can identify where this node corresponds to our original code. What also helps us do that is something called a span. A span stores the file path, line numbers, and column numbers of the original source code. This will be very important later as we desugar and optimize the Rust code. If we encounter a problem with the code after it is desugared and optimized, we still want to be able to show the user where in their original source code the error was generated. If we were to show them the desugared code, which is different from the code they wrote, it wouldn't mean much to them. The compiler then takes the HIR and lowers it into the mid-level intermediate representation, also known as the MIR. The MIR is constructed as a control-flow graph. The units within this graph are called basic blocks, which are identified with values like BB0 and BB1. Each of these blocks has a sequence of statements that execute in order. The very last statement is known as a terminator. This controls when and how the program proceeds to another basic block. This is a pretty simple example; there's only one direction the basic blocks can go.
But if our code had an if-else statement, then the terminator of BB0 would have the option to proceed either to BB1 or to BB2. In this case, there is more than one path the program can take when it encounters the terminator in BB0. And there are definitely more data structures involved in the MIR. If you're curious about them, definitely check out the Guide to Rustc Development. Again, it's a great resource for learning how the Rust compiler works underneath the surface. Let's go back to our desugared code, and let's focus on this match statement, which is assigned to a variable called result. If we looked at the MIR for this piece of code, it would look similar to this. I've simplified it for the sake of appearing on a slide. Up here on the top left, we have our basic block, identified here as BB2. And then we have what is called a local. A local in the MIR represents a place in memory or, more specifically, a place on the stack frame. In this case, _5 corresponds to the value of the variable result. And like the nodes in the HIR, we have a span: the piece of the original Rust source code that each item in the MIR corresponds to. Again, if we encounter an error when we're operating on the MIR, we can still easily refer to the lines in the original source code that caused the error. And that brings us to the third stage, and this is a big one in the Rust compiler: semantic analysis. This is where the compiler tries to figure out what the programmer is trying to do in a way the compiler can understand, so that it can later translate it into machine code. At this point, after it has lowered the HIR into the MIR, the compiler will run several checks, including the borrow checker. We'll come back and dive deep into the borrow checker in just a few minutes. But for now, let's focus on the last two stages of compilation: optimization and code generation. These stages are where the code is transformed into an executable binary.
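If you want to see this branching for yourself, here is a small if-else example (a sketch I've written for illustration, not the talk's slide code); you can dump its MIR with `rustc --emit=mir main.rs` and look for the terminator that has two successor blocks:

```rust
fn main() {
    let n = 3;

    // In the MIR for this function, the terminator of the block that
    // evaluates the condition branches to one of two successor blocks:
    // one for the `if` arm, one for the `else` arm.
    if n % 2 == 0 {
        println!("even");
    } else {
        println!("odd");
    }
}
```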
In the Rust compiler, we use LLVM to do this for us. LLVM is a commonly used collection of modular and reusable compiler and toolchain technologies. The Rust compiler uses it to further optimize the code and generate the machine code to run it. Before it uses LLVM, the Rust compiler takes the MIR we created earlier and lowers it into the LLVM intermediate representation, or LLVM IR. The LLVM IR, honestly, is pretty unreadable to humans, but it looks something like this. As you can see, it's organized into basic blocks like the MIR. We then pass the LLVM IR to LLVM, which runs more optimizations on it and emits machine code. It then links the machine code files together to produce the final binary. And that's why, when we call cargo run, we see all those numbers printed out. So, yay! That gives you a bit of an idea of how the compiler works to take your Rust source code and make it into something a computer can execute. At this point, I want to go back and take a deeper look at that borrow checker, because this is really where the magic of Rust happens. And in order to do that, let's use a different piece of code. If you have some experience with Rust and you're looking at this code and thinking it will error out, you are right. We'll see how and why that happens in just a moment. First, let's go through this code line by line. We declare the variable x and give it the type String, and we set the value of x to the string "Hi Cloud Native Rust Day". Then we say that the variable y's value is equal to the value of x. And then we attempt to print both variables. If we tried to build this code with cargo build, we would get an error, and this error is a result of the borrow checker. Let's go through how the borrow checker identified this error. The borrow checker does several things, including tracking initializations and moves.
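Reconstructed from the description in the talk (the exact source line layout is an assumption), the example looks like this; the commented-out line is the one the borrow checker rejects:

```rust
fn main() {
    let x: String = String::from("Hi Cloud Native Rust Day");
    let y = x; // the value of x is *moved* into y here

    // Uncommenting the next line fails to build with
    // error[E0382]: borrow of moved value: `x`
    // println!("{}", x);

    println!("{}", y);
}
```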
How this plays out in our code: we start with this first line, where we declare the variable x with the type String. x is not actually initialized at this point; it won't be considered initialized until it is assigned a value. If we look at the MIR for this line of code, we see that x is represented by the local _1, and _1 is assigned the type String. Now let's look at the line of code where we create and assign the "Hi Cloud Native Rust Day" string. Now that x has a value, it is considered initialized. If we look at the MIR, we create a new place in memory, the local _2, where we store the "Hi Cloud Native Rust Day" string. Then we move the value stored at _2 to _1. Remember, _1 corresponds to the x variable. So we have created the string in memory and moved it to be the value of x. Now let's look at the line where we create the variable y and assign it the value of x. This line is where the value of x is moved to y. If we look at the MIR created for this line of code, we will see that y is assigned to the local _3. Remember, a local represents a place in memory. _3 is given the type String, and then the value at _1 (remember, that represents x) is moved to _3, which represents y. This means that when we get to here and try to print x, x is not initialized at this line, so we cannot print it. And that's what generates this particular error: we're attempting to use the value of a variable after it has been moved, and the Rust compiler says you cannot do that. Something I'd like to specifically call out is how the compiler shows where the error was generated in the original source code through a span.
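Putting those locals together, a heavily simplified view of the MIR for these lines might read like this (an assumed shape for illustration; real MIR output is more verbose and varies by compiler version):

```
_1: String                                  // the local backing x
_2 = String::from("Hi Cloud Native Rust Day")  // build the string in a temporary
_1 = move _2                                // move it into x: x is now initialized
_3 = move _1                                // move x's value into y: _1 is uninitialized again
```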
Even though we had lowered this code to MIR, the mid-level intermediate representation, we still tracked which items in the MIR corresponded to which places in the original Rust code. This is very helpful to the end user. What is also helpful is this: the Rust compiler not only tells you what the error is and where it is, it gives you a command to get even more information about the error so that you can fix it. If we run this command, we see not only an explanation of the error, but also a piece of example code that would generate it. Then the message gives you even more information about how to fix it. This suggests using a reference to borrow a value rather than attempting to move the value, which is what we saw done in the MIR representation of our code. So let's do this and change this line so that y is assigned a reference to the value of x rather than moving the value of x into y. After we've made this change, if we run our code again using cargo run, the compiler will build the code and execute it, and we see the message "Hi Cloud Native Rust Day" printed out twice. Rather than fighting the borrow checker, we used it to make our code even better. Along with tracking initializations and moves, the borrow checker also deals with lifetime inference. Let's go over what this means. Rust uses the word lifetime in two distinct ways. The first is to refer to the lifetime of a value: that's the span of time before the value gets freed. Another way of referring to the lifetime of a value is referring to the variable's scope. Let's see how this plays out in our code, starting with the first line, where we assign the value of x as this string. At this point, x is live; its lifetime begins here. When we get to here and move the value of x into y, this is the end of x's lifetime. When we get down here and try to use x again, x is dead. Its lifetime is no longer in effect, which is why this program, as it's written now, will error out.
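The corrected version, with y borrowing rather than taking ownership, looks like this (again a sketch reconstructed from the description):

```rust
fn main() {
    let x: String = String::from("Hi Cloud Native Rust Day");
    let y = &x; // borrow the value instead of moving it

    println!("{}", x); // x still owns its value, so this is fine
    println!("{}", y); // and the borrow can be used too
}
```

With this change, `cargo run` builds the code and prints the message twice.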
The other way Rust uses the term lifetime is to refer to the lifetime of a reference to a value. This is the span of code in which the reference can be used. Let's take a look at our corrected code, where we assign the value of y to be a reference to the value of x. If we look at the MIR for this line of code, we remember that the local _1 refers to x and the local _3 refers to y, and we see that _3 is assigned a reference to the value of _1. Looking back at our code, let's alter it slightly and try to drop the value of the variable x before we print out the value of y. After we make this change and try to build this code using cargo build, we get an error: we can't drop x because it is borrowed, and that borrow is used later. The borrow checker tells us that x, because it is referenced by y, needs to stay alive for at least as long as y needs to stay alive. x's lifetime must be greater than or equal to y's. Looking at these lifetimes in the code: again, this is x's lifetime, and this is what needs to be the lifetime of y, which is a reference to x. Notice that even though they overlap, x's lifetime ends before y's lifetime is supposed to end. y can no longer reference the value of x after this line, and at that point, y along with x would be dead. In order for this code to compile, the lifetime of x must last at least as long as the lifetime of y. The overarching way the scope and the lifetime of a reference relate to each other is this: if you make a reference to a value, the lifetime of that reference cannot outlive the scope of the value. As I move toward concluding this presentation, I want to make sure you know there is so much more to the Rust compiler and the borrow checker. Again, please, if you want to know more, check out the Guide to Rustc Development. It's a fantastic resource.
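Here is a compiling sketch of that lifetime relationship (my illustration of the rule, with the failing ordering shown in a comment): the borrow's last use must come while the borrowed value is still alive.

```rust
fn main() {
    let x = String::from("Hi Cloud Native Rust Day"); // x's lifetime begins
    let y = &x;        // y borrows x
    println!("{}", y); // last use of the borrow, while x is still alive
    drop(x);           // fine here: nothing uses y afterwards

    // Moving `drop(x)` above the println! fails to build with
    // error[E0505]: cannot move out of `x` because it is borrowed
}
```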
Going back to the question from the beginning: is the borrow checker a friend or a foe? I say it's a friend, though a very strict one. But the best thing about this friend is that it will not only tell you when something is wrong, it will also tell you how to fix it. And I find that to be one of the best qualities I can find in a friend, as well as one of the best qualities I can find in a compiler. Thank you.