Hello everyone. I saw the Rust Survey 2019 results, and while reading through them I came across this little bit: people are asking for more learning material about Rust, specifically intermediate-level material, and a lot of them are asking for video content specifically. If you're watching this, maybe you already know that I have a YouTube channel where I do live and uploaded intermediate Rust content. So I was like, huh, this sounds like my wheelhouse. So I tweeted out: what would you like to see if I were to do something that's a little less advanced, a little shorter, and a little more self-contained than what I usually do? I got a ton of responses, and it was pretty cool to see all the ideas that people had. The suggestions were sort of all over the place. What I basically got from it is that people are confused about lifetimes, and they don't want to see another explanation of lifetimes. They want to see code that actually uses them in order to try to understand what's going on. So I figured maybe that's something I could do. I only have one video that's more beginner friendly than the normal ones I do, which are usually much longer sessions where we build something real in Rust. That is the one where we did a live coding of a linked hash map, and it actually turned out pretty well. I think it's easy to follow if you're relatively new to the language. But I figured I would do something that's dedicated to this, and if it turns out to work well, I might do more in the future. So that's where we are now. Specifically, in this stream, what I want to do is basically have us write a bunch of code in Rust.
Not very much, but enough that we cover multiple lifetimes, a little bit about strings, because that seems to be something that's confusing people, and a little bit of generics depending on how we do on time. My guess is this will be about 90 minutes. That's the target range I'm going for, which is much shorter than my usual streams. I'll try to be less verbose than I usually am. If this is the first time you're watching one of my videos, first of all, welcome. Second of all, my Twitter account is basically where I post about any new videos: to announce the live streams in the first place, to ask for input on what you want to see, and whenever I upload a recording of a live stream, I put it up there too. So without further ado, let's get started. Because this stream is geared more towards people who are still getting to grips with some of the complexities in Rust, I'll also be taking a bunch of questions from chat. So if at any point you feel like you're not quite following, or something doesn't make sense and you would like me to explain it again, please mention it in chat, and I'll try to look over there now and again to make sure that you're all following along. This is particularly important because the people who are watching this after the fact won't have the opportunity to ask questions. If you have a question live, chances are someone else will also have it later, and if you ask it, the answer will be recorded in the stream. Great. Okay. So I also got some questions from people who are relatively new to Rust, like: how do I even start a Rust project? That's not normally something I cover, but given that we're starting something new, I'll start from the very beginning with cargo new. We're going to make a library.
The library we're going to make is one that lets you take a string, split it by another string, and walk the splits of that string. So we're going to call it strsplit. It's not a very original name, but it doesn't have to be. Inside here, you'll see that we have Cargo.toml, where we get to define metadata about our new crate, and also src/lib.rs, which currently has basically nothing of real value. As a starting point, there's a bunch of stuff I like to add as a prelude to any package that I make: I like to add a warn for missing_debug_implementations, rust_2018_idioms, and missing_docs. There are a bunch of others you might add too. These aren't going to matter that much for the stream; we're not going to be writing a ton of documentation, even though normally I do when I publish something like this. Here we're going to focus more on the insides of the thing we're going to build. But I figured it would be good to give you that prelude. It's something you can use in your own crates too. I like this to be warn, not deny, because sometimes these change over time as the compiler gets smarter about some of these lints, and you really don't want a lint to break your compile because someone has a later version of the compiler than the one you originally built with. All right, so here's what we want. We want a type, and it's going to be called, let's say, StrSplit. We're going to add fields to it after a while, but as for methods, there's going to be a new, and it's going to take some haystack. The haystack is usually the thing that you are searching in. And it's going to take some needle, which in this case we're going to call the delimiter. This is the thing that we're splitting by. And it's going to return a Self. Self here is a special type that refers to the type named in the impl block.
It's useful to use Self; we could instead write StrSplit here. Using Self is just nice because it means that if we rename the type later, we don't have to change all the return types of the methods as well. So basically this is saying: I want to split this by this. We could call this something like split_by instead of new, but we're not really doing API design here. This is just to give you a sense for what this type is going to do. Then what we want to do is implement Iterator for StrSplit. The Iterator trait allows you to do something like this: say you have an x that's of type StrSplit. Then you can write for part in x, as long as x's type implements Iterator. The Item here is going to be a &str. And the only thing that you need to implement for an iterator is the next function, which takes a mutable reference to self and returns an Option of the Item. Notice I haven't put in any implementations or any fields yet. I'm just giving you roughly the API we're going to be working with. The idea is that the for loop construct is really turned into a while let Some(part) = x.next(). That's what for desugars to. So it's going to keep calling next while it's still returning Some, and when it no longer returns Some, the loop terminates. We could arguably write this as a test, although this test won't really do much at the moment. The idea is that you have some string like, let's say, "a b c d e", and you want to do for letter in, well, in this case it's going to be StrSplit::new(haystack, " "). That's the idea, and this is going to produce a, then b, then c, then d, then e, ideally. How do we write a test for this? Well, we could say letters is this, collected, or in fact it could just be this. And then we can do an assert_eq on letters with something like this; you can compare iterators as long as they have the same type.
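To make the desugaring point concrete, here is a minimal sketch that walks the same pieces twice, once with for and once with an explicit while let. It uses the standard library's str::split as a stand-in, since StrSplit has no implementation yet at this point in the stream; the two function names are made up for illustration, and the real expansion of for also goes through IntoIterator, so treat this as an approximation.

```rust
/// Walks the pieces with a plain `for` loop.
fn collect_with_for(text: &str) -> Vec<&str> {
    let mut out = Vec::new();
    for part in text.split(' ') {
        out.push(part);
    }
    out
}

/// Walks the same pieces with an explicit `while let`, which is roughly
/// what the `for` loop above turns into: keep calling `next` while it
/// returns `Some`, and stop at the first `None`.
fn collect_with_while_let(text: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut iter = text.split(' ');
    while let Some(part) = iter.next() {
        out.push(part);
    }
    out
}
```

Both functions should yield the same sequence for any input, which is the sense in which the two loop forms are interchangeable.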
Does the basic thing we're building make sense? We'll probably not talk about higher-kinded lifetimes; that probably won't come up. Won't that just add noise when debugging early prototypes? Yes. In the initial phases of development you probably don't want this prelude on, because it's just going to give you more warnings, so it's harder to see which ones matter and which ones are more stylistic, if you will. How do you decide between library and binary, and how do you check the library output while coding? You decide between library and binary depending on whether you're building a binary or a library, right? A binary is if you're building a program that someone's going to run on the command line; everything else is a library. You might build a thing that has both a library and a binary, in which case it doesn't really matter. The only difference between the --lib and --bin flags is that --bin creates a src/main.rs and --lib creates a src/lib.rs. Those are most of the differences, and you can have both in one crate. And if you build a library, the way you check its output is you write tests. What do you do to mock external dependencies? I'm not going to cover that in the stream, but there are good ways to do it. I thought all loops desugar to loop with a break condition. Well, in MIR, I think, while loops desugar to loops as well. It's just easier to explain it as for turns into while, but you're right, deeper down while turns into loop. Yep, all of this is going to be about lifetimes, as you're about to see. Great. All right. Equality comparison between iterators is element-wise. Yeah, it also checks that the lengths are the same. Okay, so now we have a program, and now we're going to have to actually figure out how to write this. There are a couple of things that immediately come up. So let's start with the fields. I'm going to call this one remainder.
And we're going to have a delimiter. Right. So we're going to have a remainder. This is the part of the string that we've not yet looked at; we haven't returned anything from it yet. And delimiter is what we are splitting by, which we need to remember over time. So new is really just going to create a new Self where the remainder is the haystack and the delimiter is the delimiter. Notice that for delimiter, I don't have to write field colon value, because the field and the variable have the same name, so I can sort of deduplicate them. In the case of remainder, the field and the variable do not have the same name, so I give both. I want the field to be called remainder and I want the argument to be called haystack, and that's why I end up with this format. Okay, so new is pretty straightforward. Most of the magic here is going to be in the implementation of next. So the question now becomes, what do we do in order to implement next? Well, the implementation is pretty straightforward. We're going to find where the next delimiter appears in the remainder. Then we're going to chop off that part of the string; that's what we're going to return. And we're going to set the remainder to what remains after the delimiter. So next_delim is going to be self.remainder.find, and we're going to look for self.delimiter. This is an if let Some because it could be that there is no delimiter in the string. The delimiter may no longer appear in the string, in which case we're sort of done. But if it does appear, then what we want is until_delimiter, which is going to be self.remainder from the start until next_delim. And then we're going to modify the remainder to be everything in self.remainder following that delimiter. So that's going to start at next_delim plus the length of the delimiter: everything from there on out.
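The slicing arithmetic in that step can be sketched in isolation. The helper below is hypothetical (split_step is not a name from the stream); it performs one step of the algorithm on a plain &str and returns both the piece before the delimiter and what remains after it. The 'a annotation on the signature is needed for the sketch to compile and is the kind of annotation the stream goes on to discuss.

```rust
/// Hypothetical helper, not from the stream: one step of the split.
/// Returns the text before the first delimiter and the text after it,
/// or `None` if the delimiter does not occur in `remainder`.
fn split_step<'a>(remainder: &'a str, delimiter: &str) -> Option<(&'a str, &'a str)> {
    // `find` returns the byte index where the delimiter starts.
    let next_delim = remainder.find(delimiter)?;
    // Everything from the start up to (not including) the delimiter.
    let until_delimiter = &remainder[..next_delim];
    // Skip over the whole delimiter, which may be longer than one byte.
    let rest = &remainder[(next_delim + delimiter.len())..];
    Some((until_delimiter, rest))
}
```

Adding delimiter.len() rather than 1 is what lets a multi-character delimiter such as ", " work.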
And then we're going to return Some of until_delimiter. And if the delimiter is not found, well, in that case, we can just return None. Actually, if the delimiter is not found, we have two options. Either the remainder is empty, in which case we return None, or the remainder is not empty, in which case we return the remainder. So in that case, rest is going to be self.remainder, self.remainder is going to be set to the empty string, and then we're going to return Some of rest. You'll see there's actually a bug in this code. I'll get to that later. Okay. Let's take some more questions. Is the uppercase Self really the preferred way of implementing that? I'm very much new to Rust and it seems a bit odd coming from other languages. I've seen both. I personally prefer using Self, as I mentioned, because it means that if I change the name of the type, I don't have to change anything else. I think the biggest cost is that you can no longer do local reasoning. Looking at this line of code, you have to figure out which type's impl you're inside of, which is trivial here, but if you have a really long one, it might be trickier. And also you need to rely on a later release of the Rust compiler, although this was added a decently long time ago now, I think. Can I explain later when to use associated types versus generics? There's actually a decent description of that in the Rust book, I believe. The basic idea is: use generics if you think that multiple implementations of that trait might exist for a given type. Use associated types if only one implementation makes sense for any given type. When do you use match versus if let Some? I use match if I care about more than one of the patterns. If I only care about one of the patterns, then I use if let. How should I read line 15? So I'm using relative line numbers, so when you say 15, it's not clear what you mean. Is else if not required? Oh, you're right.
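As a small illustration of the match versus if let rule of thumb, here is a sketch with two made-up functions. The first cares about both the Some and the None case, so it uses match; the second only does something when a value is present, so if let is enough.

```rust
/// Both patterns matter here, so `match` reads naturally.
fn describe(value: Option<u32>) -> String {
    match value {
        Some(n) => format!("got {}", n),
        None => String::from("nothing"),
    }
}

/// Only the `Some` case matters here, so `if let` is enough.
fn add_if_present(total: &mut u32, value: Option<u32>) {
    if let Some(n) = value {
        *total += n;
    }
}
```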
That should say else if. Good catch. Great. All right. So we have an implementation now, we think. I'm just going to write a TODO bug here, because I know there's a bug, but I don't want to talk about it yet. Okay. So in theory, we're now done. We have our implementation. Great. We're going to just cd into strsplit and run cargo test. Oh no, it doesn't compile. What is this? It's telling me missing lifetime specifier, missing lifetime specifier, all over the place. Let's do a cargo check instead so we don't get these duplicated. It's telling me that in all these places where I have references, I need to give them a lifetime, because it can't figure out what that lifetime is. Okay. So we do what we've been told. The compiler said add a 'a, so we're going to add a 'a. Okay. Now it's telling us we have to do that here as well. So we're going to use the new fancy anonymous lifetime here, and we're going to do the same thing here. All right. So we did that. Now they all have a lifetime. I'll talk a bit about what this actually implies, but let's just see if it's happy. It's still not happy down here. Okay. So down here in the iterator, we say that the iterator returns a string reference. Think of this as a pointer to a string. Rust needs to know: how long can I hold on to this pointer for? For example, we know this pointer points into the remainder here. That's where we know it points. But when Rust calls the iterator's next method, it just gets back a pointer to a string. And if that's all it gets, then can it hold on to that until the end of the program? Can it drop the StrSplit and then still use it? It doesn't know how long it's okay to keep using this pointer for. And it needs to know, because otherwise it might use that pointer after the string it's pointing to has already gone away, after the memory has been deallocated.
And that would be a problem. Well, so we have this 'a now, right? This 'a here is really a lifetime. Let's keep calling it 'a for now. It's really saying: how long does this reference live for? What we're saying here is that if you have a StrSplit, then the remainder and the delimiter both live for this long. The pointers are valid for that long. And down here, really what we're saying is: if the remainder is valid for this long, and the 'a here and the 'a here are the same (we could call them different letters; think of this as being generic over a lifetime), then the thing that we return has the same lifetime. You could imagine other lifetimes here. The lifetime of the returned string could be something that's tied to the lifetime of the StrSplit itself. But that's not the case here. There's some lifetime that's longer than the StrSplit. Even after you drop the StrSplit, the thing that you got back from the iterator is still valid, because it's about the lifetime of the string we were originally given, the haystack. That's what matters. All right. So that was a lot to cover. Let's talk about that before we continue. That looks very foreign. Yeah, it is foreign. You don't have lifetimes in other languages, in general. The plugin here is coc with rust-analyzer. Can I be wrong by specifying lifetimes? You can't really be wrong by specifying lifetimes. Think of it this way: it's like using the wrong type. You can write the wrong type, but eventually you have to call a function and provide something of some type, and if you give it some other type, the compiler goes, these are not the same. So the compiler won't let you compile a program with the wrong lifetimes. It won't generally let you do that.
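The claim that an item outlives the iterator that produced it can be checked with a small sketch. It assumes the struct and next implementation described so far (including the known trailing-delimiter gap), builds the struct with a literal instead of new, and adds a hypothetical helper, first_piece, that drops the StrSplit before returning the piece it yielded.

```rust
#[derive(Debug)]
struct StrSplit<'a> {
    remainder: &'a str,
    delimiter: &'a str,
}

impl<'a> Iterator for StrSplit<'a> {
    // The item borrows from the original haystack, not from the iterator.
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(next_delim) = self.remainder.find(self.delimiter) {
            let until_delimiter = &self.remainder[..next_delim];
            self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
            Some(until_delimiter)
        } else if self.remainder.is_empty() {
            // Known gap at this stage: a trailing delimiter yields no final empty piece.
            None
        } else {
            let rest = self.remainder;
            self.remainder = "";
            Some(rest)
        }
    }
}

/// Hypothetical helper: the `StrSplit` is dropped when this function
/// returns, but the `&'a str` it yielded is still valid afterwards.
fn first_piece<'a>(haystack: &'a str, delimiter: &'a str) -> Option<&'a str> {
    let mut split = StrSplit { remainder: haystack, delimiter };
    split.next()
}
```

If Item were instead tied to the borrow of self in next, first_piece would not be able to return the piece after split goes away.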
So you can't really give the wrong lifetimes any more than you can give the wrong type, in the sense that the compiler is going to catch that you did that. How do you tell where an anonymous lifetime can be used? Anonymous lifetimes are places where you tell the compiler: guess what lifetime. And that only works when there's only one possible guess. So one example of this is, let's have some other impl here. I'm just going to make up an impl, and it has something like get_ref. It takes a reference to self and gives back a reference to a str. Here, if I put '_ here, there's only one other lifetime, and that's the lifetime of self. So the compiler can guess what this lifetime is, and in fact you don't need to give it at all. The compiler understands when you write this that it must be this lifetime. So that's an example of where the compiler can guess. What's the difference between 'a and '_? '_ is telling the compiler: you guess the lifetime. It's sort of the same as underscore for types. That's where it originally comes from. Sorry, not underscore for types; it's sort of like a pattern that matches anything. That's not quite true either. But it's something that you can use when you don't want to specify the lifetime and you think the compiler can figure it out. 'a is a specific lifetime. It's similar to a generic; it's like a T. Is there any kind of ordering on lifetime specifiers? Like, is 'a more than 'b? No. Well, yes, you can order lifetimes based on how long they are. For example, the special lifetime 'static is a thing that lives for the entire remaining duration of the program. So you can have some 'a that's shorter than 'static. In general, though, the name you give does not matter, just like the name of a generic doesn't matter. 'a versus 'b is just a name that you choose.
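A minimal sketch of the get_ref example, assuming a made-up Holder type: the three methods below spell the same signature with the lifetime left out, with the anonymous lifetime, and with a named lifetime. Since self is the only reference going in, the compiler has exactly one candidate for the returned reference.

```rust
/// Hypothetical type used only for illustration.
struct Holder {
    text: String,
}

impl Holder {
    /// Lifetime left out entirely: the output borrows from `self`.
    fn get_ref(&self) -> &str {
        &self.text
    }

    /// Same meaning, with the anonymous lifetime written out.
    fn get_ref_anonymous(&self) -> &'_ str {
        &self.text
    }

    /// Same meaning again, with the lifetime given a name.
    fn get_ref_named<'a>(&'a self) -> &'a str {
        &self.text
    }
}
```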
How does the compiler know it's wrong, but it cannot infer it? Think of it this way: I can write a function multiply that takes an x that's a unit and a y that's an i32. This is wrong; I can't write the implementation here, and so the compiler knows that it's wrong. But the compiler doesn't know what this should be. Only you know what this should be. Here I write, I guess, x times y. You know what this should be; the compiler does not. So the compiler can tell you that you're wrong, but it can't tell you what the right answer is. Yeah, so '_ is basically type inference for lifetimes. Why would you not elide the lifetime if you're leaving the '_ in the type? You basically want to elide whenever you can. There are some cases too where you can use '_ to say: don't consider this lifetime for the purposes of guessing. As we expand this a little bit more, this might become clearer. Is there a way to use multiple lifetime specifiers on the same impl? Yes, we'll see that in a second. Yes, you can specify an order for lifetimes. We won't need that here, but you can. Does '_ only get used if there's only one possible lifetime? No. You can also use it if... imagine that you're writing something like this, where it takes an &'a str and a &'b str, and you want to say that this returns something with the lifetime of the first one. We could call these 'x and 'y instead to make this easier to read. If this is the input you have, you can simplify it with an anonymous lifetime by writing '_ for y, in which case y gets ignored. In argument position, '_ gets turned into an arbitrary unique lifetime, and in output position it means lifetime inference, basically. The return type keeps the name 'x, though, because with two reference arguments the compiler will not guess which one the output borrows from. So the result is tied to x, but not to y, because y has its own lifetime.
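Here is a sketch of the two-lifetime function, with made-up names. The output is tied to 'x only, so the second argument can go away before the result is used. As far as I know, the compiler rejects a signature that leaves the return lifetime anonymous when there are two reference parameters and no self, so the shortened form below only anonymizes y and keeps 'x named.

```rust
/// Fully spelled out: the result borrows from `x` and never from `y`.
fn before_delimiter<'x, 'y>(x: &'x str, y: &'y str) -> &'x str {
    match x.find(y) {
        Some(index) => &x[..index],
        None => x,
    }
}

/// Same contract with `y` given the anonymous lifetime. `'x` stays named
/// because the output has to say which input it borrows from.
fn before_delimiter_anon<'x>(x: &'x str, y: &'_ str) -> &'x str {
    before_delimiter(x, y)
}
```

A caller can build the delimiter in an inner scope, call the function, and keep using the result after the delimiter has been dropped.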
So in other words, the lifetime of StrSplit's remainder and delimiter is now tied to the lifetime of the StrSplit itself? No. This is where you'll see that we still get a compile error. So actually let's move on to that, because I think those are most of the questions. Great. So let's do a cargo check. Okay. So what does this do? Now we get an error saying lifetime of reference outlives lifetime of borrowed content. This is where we get into sort of weird lifetime land. This is probably an error that you've seen in the past, where you throw up your hands and go, what is even going on? So let's try to actually read through this. It's complaining about the new function, and it's saying specifically that there's a problem with haystack. The reference is valid for the lifetime '_ as defined on the impl up here, but the borrowed content is only valid for the anonymous lifetime #1 defined on the method body at 10:5. So 10:5, line 10 column 5, is right around here. What it's telling us is: you told me that you are going to give me something with this lifetime. When we say new returns Self, that Self has this lifetime. But the thing that you gave me for remainder, which is supposed to have that same lifetime (the remainder here is supposed to be 'a, where the 'a is the one from the definition of StrSplit), has a lifetime that is just whatever this lifetime is. Those are not the same. Specifically, I don't know that the haystack pointer here lives for as long as this lifetime here. They're both just some lifetime, and we haven't given any relationship between them. And the same thing here for delimiter. The delimiter that's passed in is a pointer to some string, right?
For all we know, the moment that new returns, the strings that haystack and delimiter point to might be deallocated immediately, because we haven't put any restrictions on the lifetimes of those parameters. So imagine if that were the case, if the caller immediately removed those strings from memory. At that point, we still have a StrSplit hanging around with some lifetime that has some random name, and that StrSplit struct still has pointers to those strings. That should obviously not be okay, because it's not okay for us to continue to refer to those strings once the memory has gone away. So clearly, there has to be some relationship between the string pointers that are passed in here and the lifetime of the pointers we hold inside StrSplit. We want the compiler to ensure that as long as the StrSplit is around, those strings are still accessible through the pointers we were given. So how can we express that? Well, what we really want to say is: I can give you a StrSplit with a lifetime 'a if you give me string pointers that are also 'a. You see the difference here? Here we're saying the pointers you give me can live for however long you want, but they have to live for at least some duration 'a, and the type I give you back has that same lifetime. The compiler is now going to check that you only keep using this for as long as that lifetime is still live. Which implies, by the fact that it's connected to this lifetime, that you can only keep using the StrSplit for as long as the input strings are still valid. Does that make sense? Okay, so that was a lot more. Let's iterate on that and then continue. Why do we use generic names for lifetimes and not proper names like typical variables? I mean, why do we use T for generic types? That said, I have seen an increasing number of people using more descriptive names, and my plan is to do the same here.
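The fixed constructor described here can be sketched as follows, assuming the two fields introduced earlier. The 'a on the impl, on both arguments, and (through Self) on the returned StrSplit is what ties the value to the strings it points into.

```rust
#[derive(Debug)]
pub struct StrSplit<'a> {
    remainder: &'a str,
    delimiter: &'a str,
}

impl<'a> StrSplit<'a> {
    /// "Give me string pointers that are valid for 'a, and I can give you
    /// a StrSplit<'a>": the result may not outlive either input.
    pub fn new(haystack: &'a str, delimiter: &'a str) -> Self {
        Self {
            remainder: haystack,
            delimiter,
        }
    }
}
```

With this signature, the compiler should refuse code that drops the haystack String while a StrSplit built from it is still in use.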
We can't currently do it, but I'll get to it a little bit later. How resilient is the anonymous lifetime? Will you get yourself in trouble if you rely on it too much, or is the compiler going to pick correctly the vast majority of the time? Use it if you can, is generally the answer for the anonymous lifetime. Can you put restrictions between lifetimes? Yes, you can. So far, we only have one lifetime, 'a, so there's no relationship to give. But yes, you can give more than one lifetime and then give relationships between them, saying this reference must live at least as long as this other reference. I don't think we'll need that here, but we'll see. Yes, this is very much related to type systems. Lifetimes are like types, and you can use similar language to talk about them. I don't know to what extent this is actually accurate, but in general, you can think of the relationship between lifetimes as sort of like subtyping. Why is the 'a next to the impl keyword needed? Yeah, so notice we're writing 'a here and 'a here. Those are needed for the same reason that, if you have some struct Foo that's generic over T, you cannot write impl Foo<T>. That is not something you can write. If you did, the compiler would say you're using a type T here, and I don't know of a type T. Placing it after the impl keyword is what makes it a generic impl block. It's saying this impl block is generic over T, over any type T. Similarly, this is saying this impl block is generic over any lifetime 'a. The Rust type system is two bottom types. Subtyping is actually the language used for lifetimes in the Rustonomicon. Yeah, that makes a lot of sense. I'm generally not going to be answering questions about other things, because I want to keep this stream short. Great. All right. So let's see whether this works now.
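To illustrate why the parameter has to be declared right after impl, and what a relationship between two lifetimes looks like, here is a short sketch. Foo mirrors the example in the answer; Wrapper and shorten are made-up names.

```rust
/// A struct that is generic over a type `T`.
struct Foo<T> {
    value: T,
}

// `impl<T>` declares `T`; writing `impl Foo<T>` alone would refer to an
// unknown type `T`.
impl<T> Foo<T> {
    fn into_inner(self) -> T {
        self.value
    }
}

/// A struct that is generic over a lifetime `'a`.
struct Wrapper<'a> {
    text: &'a str,
}

// Likewise, `impl<'a>` declares `'a` before it is used in `Wrapper<'a>`.
impl<'a> Wrapper<'a> {
    fn text(&self) -> &'a str {
        self.text
    }
}

/// A bound between two lifetimes: `'long` must live at least as long as
/// `'short`, so a `'long` reference can be handed out as a `'short` one.
fn shorten<'long: 'short, 'short>(value: &'long str) -> &'short str {
    value
}
```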
Okay, so here's another thing that you might expect not to work. If we run cargo check now, you see that the errors we got for new have now gone away. This, oh, that should not be that. It's an empty string. Let me get rid of these because they're not that useful at the moment. Notice that the compiler is not giving us any errors now. So the compiler is totally okay with me having a StrSplit that contains a &'a str and me just assigning the empty string to it. Why is this okay? Think of it this way: self.remainder has type &'a str, and this has type &'static str. So why is it okay for me to take one of these and assign it to something here? Well, this gets back to the static lifetime. The static lifetime is the lifetime that extends until the end of the program. Think of it as basically never ending. And this is where the subtyping relationship comes in. If you have a reference of any lifetime, or a thing that contains any lifetime, you can assign to it anything of the same type but with a longer lifetime. The reason for this is sort of straightforward. If I need something that lives for at least 'a, then something with a lifetime that's longer than 'a trivially meets that description. The other way is not true. If I require a pointer that is valid until the end of the program, I can't give it anything that has a shorter lifetime, because it wouldn't meet those criteria. But going the other way is fine. All right, so let's try our test case here. It does not implement Debug, okay? So we're going to derive Debug up here. Oh, I forget what the trick here is. It's like letters.eq of this, I guess. Great. Okay, so our test passes. So the question now is, are we done? So now we have sort of a complete program and the description of 'static.
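The assignment of an empty string literal into a &'a str field can be shown in isolation. The Remainder type and take_rest method below are made up for this sketch; the point is only that a &'static str is accepted where a &'a str is expected.

```rust
/// Hypothetical type holding a borrowed string with lifetime `'a`.
struct Remainder<'a> {
    remainder: &'a str,
}

impl<'a> Remainder<'a> {
    /// Hands out what is left and leaves an empty string behind.
    fn take_rest(&mut self) -> &'a str {
        let rest = self.remainder;
        // `""` is a `&'static str`. A longer lifetime is accepted wherever
        // a shorter one is required, so this assignment compiles.
        self.remainder = "";
        rest
    }
}
```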
Let's see whether that roughly made sense. So everything by default has a static lifetime? You can sort of think of it that way, although it's not really true. If you have a value that you assign to a variable, say, the lifetime of that value is until that value is moved. If the value is never moved, then it has a static lifetime. But if you store something on the stack of a given function, the lifetime of that value, unless you move it somewhere else, is going to be the lifetime of that function, basically the stack frame for that function. When the function returns, that lifetime ends. And it has to, right? Because if you gave out a reference to something that's on the stack, that reference can't be allowed to continue living after the function returns. That wouldn't be okay. Can I think about StrSplit like a foldr? No, it's not a fold. It's a split. It takes a sequence of characters, well, it takes a string, and it splits it into multiple smaller strings separated by some delimiter. Yeah, so one reason why this empty string over here is static is that any constant string, any string that you write directly in double quotes, is compiled into your binary. It's stored in the program that's stored on disk. When your program is launched, the operating system is going to load that binary into memory. And anything that is static like that, any value that's written into the binary, is in sort of read-only memory that will never move. So if you take a pointer to it, which is effectively what this does behind the scenes, it takes a pointer into the text segment of your binary, well, into a particular segment of your program, and that reference naturally lives for the rest of your program. That pointer is always going to be valid because that part of your program's memory never changes.
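A short sketch of the string-literal point, with made-up function names: a reference to a literal can be returned from a function and carried out of any inner scope, because it points into the program's own data rather than at something on a stack frame.

```rust
/// A string literal is stored in the binary, so a reference to it can be
/// typed as `&'static str` and returned without borrowing from anything.
fn literal() -> &'static str {
    "hello"
}

/// The reference survives the inner scope it was obtained in, since the
/// data it points to is never deallocated.
fn outlives_scope() -> &'static str {
    let kept;
    {
        let inner = literal();
        kept = inner;
    }
    kept
}
```

Doing the same with a reference to a local String created inside the inner scope would be rejected, since that String is dropped when the scope ends.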
Yeah, so eq is a method on iterators. It's very handy. You're right that we don't need it. Another way to do this would be to collect this into a Vec and then assert_eq on letters and this. We can do that instead. That's fine too. This was more to show that it's neat. Don't variables die at the end of scope, not just at return? Yeah, so a lifetime is for as long as a value still lives. It's not like values default to being static. They default to living for as long as they do. There's not really a default. Any value only lives for as long as it lives, until it's moved or dropped, basically until it goes out of scope. But it can be shorter too, right? If you call some other function with that argument and it gets moved, then the lifetime for that value ends and you can't use it even later in the same scope. Yeah, so one reason to prefer assert_eq is that we'll get nicer errors. Okay, so I mentioned a bug. Let's do that before we go on to the next thing. The bug is this. If I run this, it's going to fail. Specifically, here we have a delimiter at the tail of the string. In this case, the iterator should produce an empty element as the last element, because the delimiter was there, so technically it should produce an element there. So we need to distinguish between the remainder being empty and the remainder being an empty element we haven't yielded yet. And these are a little bit subtle. The way we're probably going to end up doing this is I'm going to make the remainder... oops, that's not at all what I meant... an Option of this. And so here we want to do this. In fact, there's a different problem here, which is really, hmm, it's a good question. There's a separate problem here, right? Which is, in fact, even more subtle, or not even more subtle. This is tricky, which is that currently it's not even going to do the right thing for... actually, no, that is the only case it gets wrong.
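For reference, the standard library's str::split already behaves the way the fixed iterator is meant to: a trailing delimiter produces a final empty piece. The sketch below uses it as an assumed model of the expected output (the stream does not state that StrSplit must match it), and also shows Iterator::eq next to the collect-and-assert_eq style. The function name is made up.

```rust
/// Collects the pieces produced by the standard library's splitter,
/// used here only as a reference for the expected behaviour.
fn std_pieces(haystack: &str) -> Vec<&str> {
    haystack.split(' ').collect()
}

/// `Iterator::eq` compares element by element and also requires both
/// sides to have the same number of elements.
fn same_pieces(haystack: &str, expected: &[&str]) -> bool {
    haystack.split(' ').eq(expected.iter().copied())
}
```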
Does the bug make sense? "Why can't the compiler infer these lifetimes?" The compiler does infer these lifetimes. The compiler infers the lifetime for every value. What we're doing here is writing code that is generic over lifetimes. So the compiler can't infer that the lifetime we return here is tied to the lifetime of the remainder, which is tied to the lifetime here. It would have to do some pretty sophisticated code analysis to figure that out. So we're adding these lifetime annotations to tell it how long we need different pointers to live for. "Lifetimes of heap-allocated memory?" If you have a heap allocation, then that still has a lifetime. The heap allocation lives until it is dropped. If it's never dropped, then it would be static. In general, the only way you can get something on the heap and then never drop it is with something like Box::leak, and Box::leak does return a static reference. "If you dump the binary, could you spot the static allocation?" Yes, you could. Not for the empty string, because it gets optimized out, but in general, yes. In fact, there's a program on Unix called strings that prints all the strings in a binary. Great. All right. So let's fix this bug. I think what we want to do here is else if let Some(remainder) = self.remainder.take(), then we're going to return Some(remainder). Actually, we don't even have to do that. We can just do self.remainder.take(). I'll write this out first; it might be easier to follow. Right: if let Some(ref remainder) = self.remainder. Here I could use the new, like, smart matching patterns. I don't really like that feature, but I could. So if there is some remainder, then we're going to search through the remainder. I guess this has to be a mut. People are going to have all sorts of questions about this and I'll get to that in a second. And then this is going to be None. Sorry.
This code is currently ugly. Let me get to that. Yeah. So if there is some remainder that's still to be searched, then we're going to look for the delimiter in that remainder. If we find the delimiter in that remainder, that's inside this sort of nested if let Some, then we do what we did before, right? We extract the stuff until the next delimiter, we set the remainder to be everything past that delimiter, and we return Some. Otherwise we're going to return just what the remainder was. This will trivially be Some, because it was Some up here, but we want to take it so that we leave None in its place. And then I guess this just becomes a None. Now you might wonder, what's this business going on down here? Why do we have to do any of this? Actually, this might even be able to do this. It might be fine. Yeah. So, a comment here: I'm aware of str::split_at. Thanks. One question here is what the ref keyword is. If I did this, that moves out of self.remainder. This is assuming that I own this and I get to move the value. But that's not really what I want to do here, right? I want to get a mutable reference to the value inside of self.remainder if it is Some. And that's what ref mut does here. The type of this here is an Option<&'a str>, and I want the type of remainder to be a mutable reference to the &'a str, so &mut &'a str. (No, I did not want that.) That's what I want remainder to be, and that's what this ref mut does. If I did not have the ref mut here, then what I would get back is this, a plain &'a str, which wouldn't help me, because I need to reassign that value to move it beyond the next delimiter. I don't want to take that value; I want to modify the existing one. Yeah. So ref means that I'm matching into a reference. I want a reference to the thing I'm matching rather than the thing I'm matching itself.
And similarly ref mut means I want to get a mutable reference to the thing I'm matching rather than get the thing I'm matching itself. "What was the &*?" We're going to ignore it because you don't need it. It was an attempt at a reborrow that wasn't needed. "Why can't you write..." Okay, so this is a good question. Why can't I write Some(&mut remainder)? I think that's the question. This sort of does the opposite. It's saying: take what the right-hand side is and try to match it against this pattern. So the mutable reference here is a part of the pattern. This would only match something that was an Option<&mut T>, and then remainder would be the T. Let me try to line these up so you can see it more clearly. remainder would be assigned the T, because the pattern does the dereference. If I write ref mut and you give me something like Option<T>, then remainder is going to be a mutable reference to that T. That is the way in which they differ. They're sort of inverses of each other. "if let Some(remainder) = &mut self.remainder?" Yeah. So with the new, like, automagic things, I could also do this. That would also work. I don't like writing the code this way because it looks weird to me. It does mean you get rid of the ref mut, but there's more magic going on. So I like writing it this way. Yeah, so you can think of ref as "make a new reference" or "take a reference to", and similarly for ref mut. "What's the deref on the left side of that assignment doing?" Ah, so remainder here is of type &mut &'a str, right? But the right-hand side is of type &'a str, and I can't assign something like this to something like that, because they're not the same type. I want to assign this into where remainder is pointing, hence the dereference there. next_delim plus self.delimiter.len(). This. Yeah.
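The difference between `ref mut` in a pattern and `&mut` in a pattern is easier to see side by side. The sketch below uses plain integers instead of string slices to keep lifetimes out of the picture; all three function names are hypothetical.

```rust
/// `ref mut` borrows the matched value: `n` is a `&mut u32` pointing into `slot`.
pub fn bump_with_ref_mut(slot: &mut Option<u32>) {
    if let Some(ref mut n) = *slot {
        *n += 1; // the dereference writes through to the value inside `slot`
    }
}

/// The same thing with default binding modes ("match ergonomics"): matching
/// on a `&mut Option<u32>` makes `n` a `&mut u32` without writing `ref mut`.
pub fn bump_with_match_ergonomics(slot: &mut Option<u32>) {
    if let Some(n) = slot {
        *n += 1;
    }
}

/// `&mut` in the pattern goes the other way: it only matches an
/// `Option<&mut u32>` and binds `n` to the `u32` behind the reference
/// (copied out here, because `u32` is `Copy`).
pub fn copy_through_pattern(slot: Option<&mut u32>) -> Option<u32> {
    if let Some(&mut n) = slot {
        Some(n)
    } else {
        None
    }
}
```

The first two functions modify the value in place; the third only reads a copy, which is why it would not help when the goal is to reassign the stored remainder.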
So this might end up being one past the end of the string. It might point to just beyond the end of the string. That is a valid position to cut a string or a slice at; it basically gives you the empty slice. "What is the take call doing?" So take is a really handy method on options. It's implemented on Option<T>. (Yes. Fine. Let's make it proper.) take takes a mutable reference to the option and gives you back an Option<T>. The idea behind take is: if the option is None, then it returns None. If the option is Some, then it sets the option to None and returns the Some that was in there. And that's what we want here, right? We only want to return the remainder that doesn't have a delimiter once. That's what this does, because the moment you take it, what's left is None. So on a subsequent call to next, this would no longer match and you would get to the None branch instead. Just to check, this should work. Now we can simplify this code even more, because the question mark operator, the try operator, also works on options. So we can do, and people are going to hate me for this, we can do this. Okay, so this is something most Rust programmers would rarely actually write. Remember that every let statement is a pattern match. So here what we're saying is: I want to pattern match on what was inside the Some of self.remainder and take a reference to what is in there. This is weird Rust that you won't see very often. You could also write this as this, and it would do the same thing. They're sort of inverses of each other. "If self is mutable here, why is self.remainder not mutable by default?" You need to keep in mind that mutable references are only one level deep. If you have a &mut self, what that means is you're allowed to modify any of the fields of self. So I'm allowed to modify remainder. I'm allowed to modify delimiter.
But what delimiter is is an immutable pointer to some string. So while I can change delimiter itself to make it point somewhere else, I can't change the thing that delimiter is pointing to. For that, delimiter itself would have to be a mutable reference. "Question mark on Option is available and stable as well?" Should be, at least. Okay, great. So this now works. Let's just double check. Oh, that does, I guess, not work. Because probably... this would be my guess. Oh, huh. So this actually does a move. So we can't do that. This has to be an as_mut. Okay, so this is kind of subtle, so let's go over this as well. This is not something I planned to cover, but we might as well while we're here. What this does is: if self.remainder is None, then it returns None. Otherwise it returns the value inside the Some. Normally that would move the T that's inside the Some. But because the T here is Copy, we get copy semantics instead of move semantics. So it copies this reference out of the option. This means that remainder is no longer the same remainder as the one that's in here. It's not a mutable reference to this; it is just a separate pointer. So when we modify it down here, what we're actually modifying is just our copy of that pointer, not the pointer that's stored inside self. And so we can do as_mut here. as_mut is a function on Option that takes a mutable reference to self and returns an option that contains a mutable reference to the T inside. So now if this is None, then we return None. If this is Some, what we get back is a mutable reference to the thing that's inside the option. And so now remainder will be a mutable reference into the StrSplit. Great. All right. So now we have a working implementation. It doesn't hang. That's all fine. Now you might wonder, well, I came to the stream to learn about multiple lifetimes.
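Putting the pieces described so far together, the single-lifetime version of the iterator could look roughly like the sketch below. It is reconstructed from the spoken description (an `Option` remainder, `as_mut()?`, and `take()`), so field and variable names are best guesses rather than the exact code on screen.

```rust
pub struct StrSplit<'a> {
    remainder: Option<&'a str>,
    delimiter: &'a str,
}

impl<'a> StrSplit<'a> {
    pub fn new(haystack: &'a str, delimiter: &'a str) -> Self {
        Self {
            remainder: Some(haystack),
            delimiter,
        }
    }
}

impl<'a> Iterator for StrSplit<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        // `as_mut` turns `&mut Option<&str>` into `Option<&mut &str>`; `?`
        // returns `None` once the remainder has been taken. Without `as_mut`
        // the `&str` would be copied out and the update below would be lost.
        let remainder = self.remainder.as_mut()?;
        if let Some(next_delim) = remainder.find(self.delimiter) {
            let until_delimiter = &remainder[..next_delim];
            // Write through the mutable reference so `self.remainder` advances.
            // The start index may equal the length, which yields the empty slice.
            *remainder = &remainder[next_delim + self.delimiter.len()..];
            Some(until_delimiter)
        } else {
            // No delimiter left: yield the rest exactly once, leaving `None`.
            self.remainder.take()
        }
    }
}
```

With the remainder wrapped in an `Option`, a trailing delimiter produces a final empty element, and an exhausted iterator keeps returning `None`. An empty delimiter is not handled in this sketch.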
And so that's what we're going to look at next. Imagine that you want to write the following function. It's a split by character, or actually, let's do even better: let's do until_char. It's going to take a string s and a character, and it's going to give you the string until the first occurrence of that character. Right? So if we wanted to write a test for it, we'd write something like until_char_test. I'm expecting that if I do "hello world" and I give it an 'o', then this should return "hell". It's a fun coincidence. I did not plan that. Okay. So that's our plan. And here we'll do a '_ to tell the compiler to just infer this. In fact, we might not even need to, but with the rust_2018_idioms lint, it's going to complain at us. Naively, now that we have StrSplit, this should be pretty straightforward, right? We should be able to just do StrSplit::new, give it the s, just format the c to be a string, do next, and do an unwrap, because we know that there will be a zeroth element. So here, let's make this an expect and say "StrSplit always gives at least one result". We'd sort of hope that we were able to do this. If we run cargo check here, it tells us, okay, it expected a &str and it found a String. So we're going to just take a reference to the string. And then it says: cannot return value referencing temporary value; returns a value referencing data owned by the current function. Okay, so let's dig into what this is actually saying. It's saying you're creating a temporary value here, and you're trying to return a value that references that temporary value. Basically, the &str that we're returning is tied to the lifetime of this String. But that's stupid, right? Because we know that StrSplit only ever returns substrings of the first argument, the haystack. It never returns references into the second string.
The lifetime of the second string doesn't matter for the purposes of what StrSplit returns. But if we look at our definition, we can sort of understand where Rust is coming from here, right? We've said there's only one lifetime. Both of these have that lifetime, and the thing that we return from the iterator has that same lifetime. So when you create a new StrSplit, Rust takes us at our word that these two things have the same lifetime. And when we down here pass two references that have different lifetimes (one has a lifetime that's only the scope of this function, whereas s has whatever the lifetime of this is), Rust goes, okay, these two have different lifetimes, so in order to make them the same, I'm going to take the longer lifetime and turn it into the shorter lifetime. So the 'a for this StrSplit is going to be the lifetime of this scope. When we try to return a reference from it here, that reference has a lifetime tied to the scope of this function. But if we fill out the elided lifetimes in the function definition, what we've really said is that this is the contract we want: the returned &str has the same lifetime as s. The lifetime that this returns is tied to the scope of this function instead. It's not 'a, or, let's call it 's, because that's what the argument is called. So how can we tell Rust that this is okay? Well, what we need to do is have two lifetimes here. Let me see if everyone understands the problem first. Let's see. "Should we copy the delimiter into our struct?" Okay, so one option is that we don't have multiple lifetimes; we just store the delimiter as a String. I sort of don't want to, but let's talk about this without necessarily exploring it fully. Imagine the delimiter was a String instead of a &str. You'll notice that String does not have a lifetime associated with it.
And this gets back to the differences between str and String. So a str is similar to, but not quite the same as, a [char]. It does not have a size, just like a slice that's not behind a reference does not have a size. It's just a sequence of characters. It doesn't know how long that sequence is. Usually you will see str in the context of a reference to a str, &str, just like you would normally see a reference to a slice, &[char]. There's all sorts of things we could talk about here, but the basic idea is that this reference is a fat pointer, not a narrow pointer, I guess. The fat pointer stores both the pointer to the start of the string (or in this case, the start of the slice) and the length of the string or slice. So this is just a thing that remembers both where the string starts and how long it is, and a reference to a slice is the same thing. String is a little bit different. String is more equivalent to a Vec of characters. There are two ways in which this differs. First of all, a String is heap allocated. A &str reference can point anywhere. It could point to something on the stack, something on the heap, something in static memory. It's just a pointer to a sequence of characters. A String, though, has the property that it is heap allocated, and it is dynamically expandable and contractible. It's a heap-allocated thing just like a vector; it can shrink and grow. Now, if you have a String, you can get a reference to a str, because the String obviously knows where the string starts, that is the in-memory representation of it as a sequence of characters, and it knows how long that sequence of characters is.
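A small sketch of the points above: `&str` as a fat pointer, and `String` as the owned, growable counterpart. The helper names are invented for illustration. Two caveats: the two-word size of `&str` is how current compilers lay it out rather than something the sketch can promise for every target, and a `String` stores UTF-8 bytes rather than one `char` per element, so the `Vec` of characters comparison is an analogy.

```rust
use std::mem::size_of;

/// True when `&str` is two machine words wide: a pointer plus a length.
pub fn str_ref_is_two_words() -> bool {
    size_of::<&str>() == 2 * size_of::<usize>()
}

/// Borrowing a `String` as a `&str` does not allocate; the `String` already
/// knows where its bytes start and how many there are.
pub fn borrow_as_str(owned: &String) -> &str {
    owned // deref coercion: &String -> &str
}

/// Growing the text needs an owned `String` on the heap.
pub fn shout(base: &str) -> String {
    let mut owned = String::from(base); // allocates and copies the bytes
    owned.push('!'); // may reallocate as the string grows
    owned
}
```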
And so going from a String to a &str is trivial, and in fact this is why String implements AsRef<str>: if you have a reference to a String, you can trivially get a reference to a str. Going the other way is harder. If you have a &str and you want to go to a String, you don't know where this reference is pointing, so the only way you can construct a String is by doing a heap allocation and copying all the characters over, and now you have a String. So String to &str is cheap and uses AsRef. (That's not what I wanted.) And &str to String is expensive. It basically uses clone. It's not quite clone, but it has to do a memcpy, I guess. So it's true that we could store the delimiter as a String, but this has two downsides. The first is that now we require an allocation, right? In order to create a StrSplit, you have to allocate. This is not great for performance, but it also ties into the second problem, which is that now you need to have an allocator. This means that once we start using a String, this library can no longer be compatible with embedded devices, for example, which may just not have an allocator and don't have a heap. So really we'd like to keep this a &str if we can. Let's see, questions about this. "Can you get that character from until_char and transform it back to a str?" Yeah, so that's basically what this reference in front of the format! is doing, right? format! produces a String and then we take a reference to that. This means that the lifetime of the reference we get back is tied to the lifetime of the String. This might be more visible if we move this out. If I say delimiter is this, this might be more obvious. This String is going to be deallocated; it's going to go out of scope here. So when we take a reference to it, the lifetime of that reference is going to be this scope. And the lifetime of s is 's. When the compiler is told these have to have the same lifetime, it's going to use the shorter one.
And we can't make it longer, right? Because this is going to be deallocated, the memory is going to be gone. So this reference is just not okay the way we've written this above. All right, so how do we fix this? Well, the solution here is to have multiple lifetimes. I will say before we start this: usually you do not need multiple lifetimes. There are only some cases where you do. This is one of them, and it took me a while to figure out a case that needed multiple lifetimes. It is quite rare. The time it comes up is when you need to store multiple references and it is important that they do not have the same lifetime, because you want to return one without tying it to the other. So let's name these lifetimes and say 'haystack and 'delimiter. I told you I was going to name them. Now this impl block is going to be generic over 'haystack and 'delimiter, right? The haystack is going to be &'haystack str, and the delimiter is going to be &'delimiter str. Notice that these now have different lifetimes, so the compiler no longer has to force these to be the same by downgrading the lifetime down here. And now down here, right, we do the same thing. Oops, right. Now that we have access to another lifetime here, we can say that the reference we give back is tied only to the 'haystack lifetime. It is not tied to the lifetime of the delimiter. And notice that the compiler is totally happy with this, because the code we wrote indeed follows that contract: any reference that we return from in here is a reference into the haystack. If I changed this and, let's see, I just made a mistake and somehow returned self.delimiter, now the compiler is going to complain. There was a question earlier, right, about whether you can use the wrong lifetime. Here, the compiler is going to get very mad at us. It's saying: cannot infer an appropriate lifetime due to conflicting requirements.
First, the lifetime cannot outlive the lifetime 'delimiter as defined, so that the reference does not outlive the borrowed content. This is saying the thing you returned, self.delimiter, has a lifetime of 'delimiter. But the lifetime must be valid for the lifetime 'haystack as defined on the impl, up here, so that the types are compatible. This error is a little bad. It should ideally be pointing at Item, at the Self::Item up here, right? We promised in our code that the item would have a lifetime of 'haystack, and that's what the compiler is getting at. It's saying you said that the lifetime should be valid for the lifetime 'haystack; that's the contract, the guarantee you gave over here. But the thing you returned, self.delimiter, has a lifetime of 'delimiter, and these two are not the same. In fact, we haven't even given a relationship between the two. So one stupid way, if I actually wanted to write this code, is I could say where 'delimiter: 'haystack. This is something I'm allowed to write. Now the compiler is going to go, okay, you returned something with lifetime 'delimiter. You promised you were going to return something with lifetime 'haystack. Normally that would not be okay. But here I have a clause saying 'delimiter is longer than 'haystack or, phrased differently, 'delimiter implements 'haystack. This is the subtyping relationship. If 'delimiter lives for at least as long as 'haystack, then a reference with lifetime 'delimiter can also be downgraded to the lifetime 'haystack, whereas the reverse is not true. Of course, this is not the code I want to write, and I don't want that bound there, so I'm going to turn it back to what it was. And now the compiler is going to be happy. If I now run cargo test, it passes, and the until_char function actually works.
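The two-lifetime version walked through above might look like the following sketch. It is reconstructed from the spoken description, so names are best guesses. The `shorten` function at the end is a made-up helper that isolates the outlives bound: it only shows that `'long: 'short` lets a longer-lived reference be used where a shorter-lived one is promised.

```rust
pub struct StrSplit<'haystack, 'delimiter> {
    remainder: Option<&'haystack str>,
    delimiter: &'delimiter str,
}

impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
    pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self {
        Self {
            remainder: Some(haystack),
            delimiter,
        }
    }
}

impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter> {
    // Items borrow from the haystack only; the delimiter's lifetime is unrelated.
    type Item = &'haystack str;

    fn next(&mut self) -> Option<Self::Item> {
        let remainder = self.remainder.as_mut()?;
        if let Some(next_delim) = remainder.find(self.delimiter) {
            let until_delimiter = &remainder[..next_delim];
            *remainder = &remainder[next_delim + self.delimiter.len()..];
            Some(until_delimiter)
        } else {
            self.remainder.take()
        }
    }
}

/// The temporary `String` made by `format!` is dropped when this function
/// returns, but the result borrows only from `s`, so that is fine.
pub fn until_char(s: &str, c: char) -> &str {
    StrSplit::new(s, &format!("{}", c))
        .next()
        .expect("StrSplit always gives at least one result")
}

/// Hypothetical helper: with the bound `'long: 'short`, a `&'long str` can be
/// returned where a `&'short str` is promised. Without the bound this is rejected.
pub fn shorten<'long: 'short, 'short>(r: &'long str) -> &'short str {
    r
}
```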
And the reason, of course, is that even though this String gets deallocated really quickly, that doesn't matter to us. That's totally fine, right? Because the items that are yielded by StrSplit as an iterator have a lifetime that's only tied to the first argument that was given to it. Let's see. "Can you put underscore for the delimiter lifetime to say it's not needed?" Yes, you can. So here, I can do this, right? This block does not care what this lifetime is. We don't need to be able to name it. So we can use the anonymous lifetime, '_, to say any lifetime here will do. It's going to be distinct from all the other lifetimes. It's a good catch. And same thing down here, actually, in until_char. Here, the compiler can just infer that this must be tied to this lifetime, because there are no other lifetimes to attach to. Now, let's say that someone writes... actually, that's not that important. Okay, does what we did now make sense? I think so. Yeah, so this does do a heap allocation, and that's the next thing we're going to look at. "Introduce multiple threads?" I don't see in what way multiple threads are relevant here. Each thread would have its own StrSplit if you ever were to make one. You don't technically need this, I guess. But if you turn on rust_2018_idioms, one of the things that they introduced... oh, actually, I guess it doesn't. Yeah, one of the things they introduced was that if you return a lifetime... oh, they really should require it here, too. Fine, fine, you're right. It can be left out here. I like to give it just to indicate that it is auto-inferred, but it's not required. All right, so I want to cover one more thing in the last 15 minutes or so. And that is: we do have an allocation here, which is kind of sad. Can we get rid of that?
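As a small illustration of the anonymous lifetime, here is a sketch on a made-up two-reference struct. `Excerpt`, its fields, and `text_despite_short_note` are hypothetical names chosen for this example; the stream applies the same idea to the delimiter lifetime of its own iterator.

```rust
/// Hypothetical struct holding two references with independent lifetimes.
pub struct Excerpt<'text, 'note> {
    text: &'text str,
    note: &'note str,
}

impl<'text, 'note> Excerpt<'text, 'note> {
    pub fn new(text: &'text str, note: &'note str) -> Self {
        Self { text, note }
    }
}

// This block never needs to name the second lifetime, so `'_` stands in for
// "some lifetime, distinct from the others, that we do not care about".
impl<'text> Excerpt<'text, '_> {
    /// The returned reference is tied to `'text` only.
    pub fn text(&self) -> &'text str {
        self.text
    }

    pub fn note_len(&self) -> usize {
        self.note.len()
    }
}

/// Elision ties the output to `text`, the only reference in the signature.
/// `note` is dropped at the end of the function, which is fine because the
/// result does not borrow from it.
pub fn text_despite_short_note(text: &str) -> &str {
    let note = String::from("scratch");
    let excerpt = Excerpt::new(text, &note);
    excerpt.text()
}
```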
And this is actually going to end up getting rid of the lifetime for the delimiter. What if, instead of the delimiter being a string, we wanted it to be anything that can find itself in a string? A string is one such example, but it doesn't have to be. So let's say that this is going to be D for delimiter. Now this is generic over D. Notice that D is not a lifetime; D is just a type. And the delimiter is going to be D. Then here, what are we going to do? What are the requirements on D? Well, the only thing we really need is the ability to figure out the bounds of where that delimiter next appears in a string. So we're going to introduce a trait, a pub trait, and let's call it just Delimiter. Why not? The thing a delimiter has to be able to do, at least for the time being, is give us its length so that we can skip past it. Actually, let's do skip: it's given a reference to a string and it needs to return a reference to a string. We can do even better. We can say that it's going to be find_next, given a self and a string, and what it needs to return is an Option with two numbers: where the delimiter starts and where it ends. That's all we need. Let's say here that D has to implement Delimiter. So we want to implement Iterator for StrSplit for any D where D implements Delimiter. Now we just need to write this in terms of find_next. So this is going to be delim_start and delim_end, and this is going to be self.delimiter.find_next of the remainder. And now we know that until the delimiter is everything up to delim_start, and after the delimiter is everything from delim_end. Okay, that wasn't too bad, right? We just sort of flipped it around. Now what we can do is implement Delimiter for a reference to a str. s is a &str, and it needs to return an Option<(usize, usize)>, this. Okay, so for strings, this is pretty straightforward, right?
Finding a string in a string, we already did this. It's essentially just s.find(self). And then I guess we're going to map that start, because we also need to return the end; that's part of the trait contract. So it's going to start at start and it's going to end at start plus self.len(). Let's see whether this still works. I'm going to cheat a little. Ignore that. Okay, so that still works. It's now generic over whatever the type D is. Notice that there's no longer a 'delimiter lifetime, and yet we were allowed to give a reference to a str. The reason is that here we're saying we're generic over any D, and that D could be a reference. It could live for whatever time it wants. There are no requirements on D other than that it implements Delimiter. Where this gets really neat is that we can implement Delimiter for other things. We can implement Delimiter for char. Now we want to find... well, we can start to cheat here. What I'm going to do is s.char_indices().position where c is self. Start plus one. Let's see if that... actually, this is not to be position. I guess this can just be chars. Actually, no, it can't. All right. So what this is doing is iterating over all the characters of the string, looking for one that is the character we're searching for. Then, if it finds one, we map that Some to take the position and return that position and that position plus one, right? Because the character is only one character long. Now let's see if this works. Instead of allocating the string, can we just pass the c here? We can. Okay. So now the allocation is gone. And we can implement this trait for all sorts of other types, so anything that can find itself in a string will now just work. So now we have a generic StrSplit implementation that works for anything that can find itself in a string. All right. Questions. "Why not use Pattern?" I'll get to Pattern.
self here, in the implementation down here, is a reference to this type, so it's a reference to a reference to a str. "This does need to be len_utf8." I think you're right. Is that even a thing? No, I think that's what len will do. I think that's right. "Isn't there a simpler way than char_indices?" There is, but this shows the concept of it. You can do way more efficient things than this. "What does find(self) return?" find is a method on strings: you give it a string and it tells you the start of that string in this string. "That plus one is wrong and will panic your code." Is that true? I think you slice by... oh, it's by byte indices. Yeah. So this is going to have to be... is this a thing? (It's going to be bright, for those of you in dark rooms.) Let's look at char. Does char have, like, a UTF-8 length to it? Yeah. Okay. So this is going to be len_utf8. That's the one you were referring to. So that should do it. All right. So, some questions here. "Can you explain find and map?" Yeah. find gives you an Option of the position where the thing is found. And the map is: if it's None, then just return None; if it's Some, then I want to change the contents of the Some to be this. Because remember, the trait requires that we give both the start and the end. "Why self.len() and not s.len()?" Because self is the thing we're searching for; it's the length of the delimiter. In order to find the end of the delimiter, it has to be the start of the delimiter plus the length of the delimiter, not the length of the string we're searching in. The s here is what we're searching in. Okay. Great. So now some of you have been asking: why are you doing this? find just works, and why don't you use Pattern? My plan was to keep this a secret, and now I successfully have. All of the things we implemented today exist in the standard library. Trust me, I am aware, even though I didn't bring it up.
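The generic version described above, including the `len_utf8` fix raised in chat, could be sketched as follows. It is reconstructed from the spoken walkthrough, so names and exact shapes are best guesses rather than the code on screen.

```rust
pub struct StrSplit<'haystack, D> {
    remainder: Option<&'haystack str>,
    delimiter: D,
}

impl<'haystack, D> StrSplit<'haystack, D> {
    pub fn new(haystack: &'haystack str, delimiter: D) -> Self {
        Self {
            remainder: Some(haystack),
            delimiter,
        }
    }
}

/// Anything that can locate itself in a string.
pub trait Delimiter {
    /// Byte offsets (start, end) of the next occurrence in `s`, if any.
    fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}

impl<'haystack, D> Iterator for StrSplit<'haystack, D>
where
    D: Delimiter,
{
    type Item = &'haystack str;

    fn next(&mut self) -> Option<Self::Item> {
        let remainder = self.remainder.as_mut()?;
        if let Some((delim_start, delim_end)) = self.delimiter.find_next(&remainder) {
            let until_delimiter = &remainder[..delim_start];
            *remainder = &remainder[delim_end..];
            Some(until_delimiter)
        } else {
            self.remainder.take()
        }
    }
}

impl Delimiter for &str {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        // End = start of the delimiter plus the delimiter's own length.
        s.find(self).map(|start| (start, start + self.len()))
    }
}

impl Delimiter for char {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        s.char_indices()
            .find(|(_, c)| c == self)
            // Slicing is by byte index, so advance by the char's UTF-8 width,
            // not by 1; `start + 1` can land inside a multi-byte character.
            .map(|(start, _)| (start, start + self.len_utf8()))
    }
}

/// No allocation any more: the `char` itself is the delimiter.
pub fn until_char(s: &str, c: char) -> &str {
    StrSplit::new(s, c)
        .next()
        .expect("StrSplit always gives at least one result")
}
```

Because `D` is an ordinary type parameter, a `&str` delimiter brings its own lifetime along inside `D`, which is why the separate delimiter lifetime parameter disappears.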
So if you look at str, you will find the find function. Better yet, you will find the split function. split on a str takes a reference to self and some pattern P, and it returns a Split. If we look at Split, Split implements Iterator, and it gives you the things split by that pattern. You'll notice that Split has a lifetime of 'a, and the 'a is the lifetime of the string you're searching in. But then it also takes a pattern, a delimiter that implements this trait Pattern. Pattern is a trait that's a little more convoluted than our Delimiter trait, but it's basically the same thing: it gives you a way to look for something in a string. So really what we've done today is go the whole route to what's in the standard library today for how you split a string, but going through it in such a way that we also cover where multiple lifetimes are useful and how to turn these kinds of things into traits and generics. You could find, actually, that for all of the tests we wrote today, we could just use the standard library instead. We could just do haystack.split(" ") and it would work the same way. Similarly, we could do haystack.split on a char, because chars also implement Pattern. So for all the things we did today, there's no reason to publish this as a crate, because it's already in the standard library. But hopefully it was a useful exercise in understanding how these different pieces fit together: the different kinds of lifetimes, when you might need multiple lifetimes, how to read some of these lifetime errors, and also things like the differences between Strings and strs and references. Okay, so I think that's getting us close to the end of the 90 minutes. But let's take some questions now that I've revealed the big secret. Let's see here. "Why can't you create a String from a str fat pointer?" Because you don't own the memory. A String assumes that it owns the underlying memory.
It assumes that when it's dropped, it has to free that memory, and it assumes that it can grow or shrink that memory as necessary. That would not be true if you took some arbitrary pointer and length and just decided, I own this now. That's not generally true. It's true, we should have a Unicode character test. "It's usually obvious that the top-level thing is being re-implemented for education, but it's usually not obvious that the deeper things are also done for the same reason." That's a good summary of what I was trying to do here. Some people jumped the gun in chat, but that's okay. It's good that you observed that this exists in the standard library. "Don't you think Rust is kind of less readable than other languages?" No, I don't think so. I think if you wrote Rust using only the features that exist in those other languages, it's equally readable. But Rust has additional features that require additional syntax. It's true that when you use those additional features, your code becomes harder to read, but you couldn't even do the same things in those other languages. "The pattern and the haystack seem to be sharing the same lifetime 'a." Yeah, so Pattern in the standard library is a little interesting. There's a reason why it's nightly only, and that is basically that they haven't quite figured out what the design for it should be. The 'a here is the lifetime of the string that the pattern is searching in. That gets communicated sort of all the way down the stack. You'll see there's a second trait called Searcher, which lets you do basically the same thing we did today, right? You see there's a next_match and a next_reject, and all these things get to operate on the 'a reference to the haystack. So this is actually very similar. It's just more convoluted in order to allow more efficient implementations.
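For comparison, the standard-library equivalents mentioned above can be sketched like this. The wrapper function names are made up; `str::split` and its acceptance of both `&str` and `char` patterns are the real standard-library pieces.

```rust
/// Split on a string pattern using the standard library.
pub fn split_on_space(haystack: &str) -> Vec<&str> {
    haystack.split(" ").collect()
}

/// The same `until_char` behavior using `str::split` with a `char` pattern.
pub fn until_char_std(s: &str, c: char) -> &str {
    s.split(c)
        .next()
        .expect("split always yields at least one item")
}
```

As with the hand-written iterator, a trailing delimiter yields a final empty element, and the returned slices borrow only from the haystack.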
When you see something like a type with 'x, how do you know what the 'x is the lifetime of? You don't. Just like if you see a type parameter T, you don't know what that type T is.
What do you think of Rust having a future in the industry? I gave a whole talk on this. If you look it up on my YouTube channel, it's called Considering Rust, and the last 10 minutes or so are basically looking at the future of Rust in industry.
Can you publish this as a gist so we can play with it? Absolutely. I'll do that after the stream and post it in the description for the video as well.
Can you work with standard in as an input instead of a str? That's harder because standard in is a stream. It's not just a constant. It's not something you can seek in, for example, so that might be trickier, but you could try.
How do you think generic associated types will improve trait definitions? I think they will help a lot. So generic associated types will probably not necessarily help with trait definitions. For that you need existential types more so, and a couple of other things. Being able to clone less is one of the big things they will help with.
Does it matter that the second usize in find_next is the end index and not the length? It could be either, to be honest. It might actually be better for it to be the length, and that's something we could modify.
Do you intend to do some lectures for newcomers to Rust? I'm not planning to do any complete beginner streams. I think that would be a good addition, but it's not something I plan to do. I might do more of these relatively focused videos, though. At least it seems like there's some appetite for them and that people enjoyed this style. And so I might do some more similar types of things, but they will still be geared more towards intermediate than beginner. All right.
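On the earlier question about standard in, here is one possible sketch, not something we wrote on stream. It assumes you are willing to read each line into an owned String first and then split that buffer, which sidesteps the fact that a stream cannot be borrowed as one long str. The name split_lines is made up for illustration, and the pieces are copied into owned Strings because each line buffer is dropped at the end of its loop iteration.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: reads `input` line by line and splits each line
/// on `delimiter`, returning owned pieces.
fn split_lines<R: BufRead>(input: R, delimiter: char) -> io::Result<Vec<String>> {
    let mut pieces = Vec::new();
    for line in input.lines() {
        // Each line is an owned String without its trailing newline.
        let line = line?;
        // The slices from `split` borrow from `line`, so copy them out
        // before `line` goes out of scope.
        pieces.extend(line.split(delimiter).map(str::to_owned));
    }
    Ok(pieces)
}

fn main() -> io::Result<()> {
    // An in-memory byte slice stands in for standard in here, since
    // `&[u8]` implements `BufRead`. For real input, `io::stdin().lock()`
    // could be passed instead.
    let input = "a b c\nd e\n".as_bytes();
    let pieces = split_lines(input, ' ')?;
    assert_eq!(pieces, vec!["a", "b", "c", "d", "e"]);
    Ok(())
}
```

The trade-off is an allocation per piece; a version that avoids it would have to keep the line buffer alive for as long as the pieces are in use.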
I think that's about time for now. If you want to hear about other upcoming videos of this style, or of the other videos that I do, check out some of the past recordings on my YouTube channel. And also, I'm on Twitter as jonhoo, and there I will post any and all notifications that are relevant. Sweet. Thank you all for joining me. I hope you learned something. I hope it was possible to follow, and I hope that having it be 90 minutes instead of six hours made it more digestible. Thanks, everyone. Stay safe. Stay home. And I will see you, I guess, next time there's a video. Bye.