The official title of my talk is something informative, like "Who owns this stream of data?", which sort of tells you what my talk's gonna be about. But I'm not here to inform you, I'm here to talk about my feels. And my feels are: oh my god, iterators. Because they give me feels. So, let's talk about three lines of code for 30 minutes.

This is the Iterator trait. Aaron and Niko touched on this a bit, but we're gonna go really deep into this and explain the basics. This talk is gonna be kind of in two parts: basic iterators, and my crazy fever dreams.

All right, so traits. This is Rust's fancy name for an interface. Types implement traits and then you can be like, I want a thing that is an iterator. Give me an iterator. So, saying you implement Iterator, you say "I can yield things", and to claim you implement Iterator, you have to provide two things. You need to say, this is the type of item I give, and here's how I give you items.

So, two interesting things about the next function, which is how I yield them. You need mutability because you want to make progress. If I didn't let you mutate yourself when I iterated, all you could do is yield the same element over and over and over forever. And that would be an awful interface, I think. The Option thing is a tagged union, which you may or may not be familiar with, but all you need to care about is that it's coalescing the idea of "I have another element" and "here's the next element". So an Option can be exactly two things: Some(item), in which case it's saying I have another thing and here it is, or None, which is I don't have anything else, you're done. That's it, that's the whole Iterator trait, and we're gonna talk about it for a really long time.

So, here's basic usage. Everything we saw in the previous slide doesn't matter at all now, because you just use a for loop, and you just say: for, the name of the item that you're gonna be giving me, in, the thing to iterate.
In this case, we're using the really nice range iterator syntax, which is just: yield zero through 10, don't include 10, because that's really useful in most cases and who wants inclusive ranges?

All right, so I went for three slides, let's talk about collections. So, collections have three kinds of iterator. These are not different interfaces, these are not different traits, these are just concrete implementers of the Iterator trait. So, we have into_iter, iter, and iter_mut. Oh my God, why are there so many? And the answer is ownership. Iterators get really complicated with collections because we care about ownership.

So, let's look at into_iter, which is the most basic, fundamental version of iteration. into_iter returns owned values. So, this moves the data out of the collection when you iterate it. This gives you total ownership over the values, which means you can do absolutely anything with them that you want. The most extreme one being, you can destroy them. So, if say you had a collection of files, you could close those files when you into_iter, because you're allowed to destroy them.

So, let's think of a world where all we had was into_iter. Here I have a function. It processes data, which is a vector of strings. A vector is just an array. It's a fancy, growable array. And all I do to get the iterator is call the into_iter method on my Vec. And then I get all the strings out and I can print them. Great, that's awesome. Unfortunately, after I've run my loop, my Vec is gone. I can't use it anymore. And that's really, really awful. Like, this would be a trash language and I would quit forever, if iterating a collection meant it was gone. Like, that's just sadness, right?

So, that's why we have iter, which is shared references. Instead of getting the data out of the collection, it just shares the data with you temporarily. And the important thing about shared references is they're read-only access. Because you're sharing them.
And that means you can hand out lots and lots and lots of them. So, tons and tons of people can be concurrently iterating your data structure with shared references, because it's just reading, and concurrent reading is totally fine, right? So, iter lets you look, but not touch.

So, now instead of process, we just have print. Actually, something I didn't mention with process: you had to pass the Vec of String by value, which meant that just by calling process, you had to give up your data forever, regardless of how we decided to iterate it. So, now I'm only passing a shared reference to a Vec of String, which means the body of the function is very limited in what it can do, but it can still do iter. It can get the normal iter and only read. That also means, after I iterate it, the collection still exists. Isn't that great? What a fantastic language Rust is. You can iterate something and it doesn't destroy it. Awesome.

The really, really cool thing about iter is now everyone can look at the same time. So, here I am. This is my new function, print_combos, which does the same thing. It gets the iter, it loops over it, and whenever it gets an element, it'll print that, and then it'll randomly index into the array and grab a different element and print it too. Because you can, because it's sharing. Everyone gets to read in random places. Who cares? Chaos. Hooray.

This is iter_mut. This thing keeps interfering with my sizes. So, iter_mut is mutable references. With mutable references, instead of sharing the data, you're loaning it out. You're giving them exclusive access, and that exclusive access means you get read-write access. So, you can mutate the things inside of the collection while you're iterating. Awesome. So, here I have the make_better function, to which you have to pass a mutable reference to a Vec of String. But now I can push exclamation marks onto all of my strings and make them better.
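The three functions described so far might look roughly like this. The names `process`, `print`, and `make_better` follow the talk, but the bodies are a reconstruction from the description, not the actual slides.

```rust
/// Takes the Vec by value: `into_iter` moves each String out, and the Vec is gone afterwards.
fn process(data: Vec<String>) {
    for s in data.into_iter() {
        println!("{s}"); // `s` is an owned String here
    }
}

/// Takes a shared reference: `iter` only lends out `&String`, so the caller keeps the Vec.
fn print(data: &Vec<String>) {
    for s in data.iter() {
        println!("{s}");
    }
}

/// Takes a mutable reference: `iter_mut` lends out `&mut String`, so each element can be changed.
fn make_better(data: &mut Vec<String>) {
    for s in data.iter_mut() {
        s.push('!');
    }
}

fn main() {
    let mut words = vec![String::from("hello"), String::from("world")];
    print(&words);            // look, but don't touch
    make_better(&mut words);  // exclusive, read-write access
    print(&words);            // still usable: nothing was moved
    process(words);           // ownership handed over
    // print(&words);         // would not compile: `words` was moved into `process`
}
```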
And now, when this function returns, the vector in the parent function will be changed and every string will be better. The sad thing is, though, I can't share. I can't do this crazy random iteration thing. I can't iterate it multiple times at the same time, because that would cause a lot of problems with people mutating things at the same time, and it would be trouble.

And then we have drain, which is the secret fourth iterator I didn't even tell you about. And it represents the "I drink your milkshake" form of ownership. So, where we had values and shared references and mutable references, drain is about partially moving the data out. This is actually largely just a performance trick. We were really sad that iterating our collection by value destroyed it. But we could actually want to get the values out. We could actually want to get all the files. We just don't want to destroy the whole collection. For instance, it might have an allocation that we don't want to remake. Or we just don't want to have to do some crazy fiddling where we swap out collections to make the owner happier or anything. So, it gives us full access to the elements, but it doesn't destroy the container.

So, here's the most advanced version of drain. Here, I call .drain and I pass in a range. And this is saying, drain out this range of elements from the collection. And then, I don't know, I consume them, because I have full access to the values. And then the collection still exists, but items two and three have disappeared. So, this is a double performance win, because we get to reuse the allocation, and we also basically called remove really, really, really fast, because calling remove in the middle of an array is really, really slow.

So, as a recap, iterators naturally fall out of ownership. We have into_iter, which is values; iter, which is shared references; iter_mut, which is mutable references; and then drain, which is I drink your milkshake.
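A small sketch of the range form of drain described above. The helper `take_middle` and the sample data are assumptions for illustration; `Vec::drain` itself is the standard method.

```rust
/// Hypothetical helper: moves items 2 and 3 out of the Vec and returns them.
/// Panics if the Vec has fewer than four elements.
fn take_middle(v: &mut Vec<i32>) -> Vec<i32> {
    // `drain` yields the removed elements by value; the Vec keeps its allocation.
    v.drain(2..4).collect()
}

fn main() {
    let mut v = vec![0, 1, 2, 3, 4, 5];
    let capacity_before = v.capacity();

    let removed = take_middle(&mut v);

    println!("removed:   {removed:?}");
    println!("remaining: {v:?}");
    println!("same capacity: {}", v.capacity() == capacity_before);
}
```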
That's the secret fourth version of ownership.

So, now we're getting into the fever dream section of the talk. So, Rust doesn't understand indexing. Here I'm trying to do something that seems super duper reasonable. I'm trying to get two mutable references into my array. I'm trying to get one at index zero and one at index one. Those are disjoint, so that seems totally reasonable, right? I'm getting at most one mutable reference to each location. Unfortunately, Rust throws up its hands in disgust. This is because Rust does not have crazy integer theorem provers, and it refuses to try and reason about, oh, obviously you passed in two different numbers to this index, and also I understand indexing, so clearly this is correct. It's just like, no man, I don't know. You could have been passing zero in both of these, and that would be unsound, because then you'd have two mutable references to the same value, and Rust doesn't want that. I would argue this is a good thing, because crazy integer theorem provers make the language really complicated and hard to reason about. It's nice that you index, that's it, you're done.

However, iter_mut just lets you do this. Here, instead of indexing, I make an iter_mut into the data, and then I just call next twice. Here, I'm explicitly using the actual iterator protocol. I'm not using a for loop, so I'm calling next, and then I'm unwrapping the Option, which is basically freak the fuck out if it wasn't a Some. And here, I'm pretty confident it won't freak out. So now I have two pointers into my Vec, and I can mutate them both at the same time, approximately. I could pass these to threads and let them mutate them literally concurrently, and it would all be fine. Is ownership busted? What happened? Because this is basically the same, right? Why is one allowed and the other not allowed? I would like to claim that this is 100% legit and safe.
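The contrast between the two slides can be sketched like this. The function name `bump_first_two` and the arithmetic are made up for illustration; the commented-out lines show the indexing version that the borrow checker rejects.

```rust
/// Hypothetical helper: mutates the first two elements through two live `&mut` references.
/// Panics if the slice has fewer than two elements.
fn bump_first_two(data: &mut [i32]) {
    // Rejected by the compiler: `data` would be mutably borrowed twice at once.
    // let a = &mut data[0];
    // let b = &mut data[1];
    // *a += 10;
    // *b += 20;

    // Accepted: each call to `next` hands out a reference to a different element.
    let mut iter = data.iter_mut();
    let a = iter.next().unwrap(); // panics if there was no first element
    let b = iter.next().unwrap(); // panics if there was no second element

    // Both references are alive and usable together.
    *a += 10;
    *b += 20;
}

fn main() {
    let mut v = vec![1, 2, 3];
    bump_first_two(&mut v);
    println!("{v:?}");
}
```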
So there are several reasons that come together to explain why this is legit and safe. First off, iterators are one-shot. All you can do with an iterator is call next, next, next, next, next, next, next, and then it's done. You can't be like, oh, give me the previous element again. Similarly, each element is yielded at most once. The third element of your array won't randomly be yielded 17 times. That would be weird. Why would you want that? Someone probably wants that. I'm not putting it in the standard library. And the third, most subtle one is, you can't get a fresh iter_mut while there are outstanding references, and this is statically done. So just like the indexing situation, where it's like, dude, you got a mutable reference into there, you can't mutably access the array again while this is happening. In this case, it's like, dude, you have some mutable references that you got from this iterator. I'm not gonna let you call iter_mut again and get a new one.

What about double-ended ones? Yeah, so double-ended iteration is exactly the same idea, except, so you can think of iterators as a stack that you're just popping values off of. A double-ended iterator is just like a deque that you're popping values off both ends of. But once you get to the middle, it'll stop. You can't keep going. So it gives you the same properties. We're not gonna touch on the double-ended iterators, because it turns out they actually don't make anything more complicated, surprisingly.

So you might be wondering, why does the API actually allow this? Because these bullets are all nice and good, but they sound kind of like implementation details to me. It's not clear that it's safe to actually implement this interface, right? And the reason is lifetimes. And I don't really wanna dig super deep into this slide. What I will say is these tick-a's ('a) are what we call lifetimes. They are basically scopes somewhere in the program, but for our purposes, we'll think of them as restrictions.
These are restrictions for where you can put this and how long you can keep it around. And the only thing Rust really cares about is whether lifetimes are the same, or have a clear constraint established between them. In this case, we're just like, oh, the left-hand side, that's some 'b constraint. And the right-hand side, that's some 'a constraint. So obviously they're not connected at all. There's no way these lifetimes are related. So it's totally fine, just call next again. It's totally fine. Don't worry about it, borrow checker, only dreams now.

So this is crazy. Maybe it's reasonable to be able to declare this interface, but how can Rust actually let you implement this? And the answer is, the borrow checker, the thing we love, is really, really, really, really smart. You can actually legitimately statically prove to the compiler that you safely implement this interface.

So, here's an example. Here's IterMut for an array. So this is a bunch of nonsense with lifetimes and stuff. The real meat of this is this line. This line is a really, really awesome function. You give it a slice, that's D in this case, and you say, split that slice into two. And these two slices are completely disjoint, and therefore we know they don't overlap at all, and there's no way to regrow a slice. You can only shrink slices further and further down. So once we do this, we know it's totally sound to treat these two as completely disjoint, and we can just work with them on their own, and it's great. So what we do is we put the right-hand side of our split back into our iterator for later, and then we just yield the first element of the left-hand side. So in this way, every time we yield an element, we shrink the world that we get to know about statically, and the compiler is happy and everything is magical.

This is not just for arrays. You can do this for a singly linked list, or you can even do it for a binary tree, or probably even a B-tree; I haven't tried.
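A minimal sketch of the slice iterator described above, written entirely in safe code. This is not the standard library's actual implementation; the struct layout and field name are assumptions. The split is done with the standard `split_at_mut`.

```rust
/// Sketch of a mutable iterator over a slice; `slice` holds whatever has not been yielded yet.
struct IterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for IterMut<'a, T> {
    // The item's lifetime is 'a, not tied to the `&mut self` borrow of each `next` call,
    // which is why the caller may hold several yielded references at once.
    type Item = &'a mut T;

    fn next(&mut self) -> Option<&'a mut T> {
        // Take the remaining slice out of `self`, leaving an empty slice behind.
        let slice = std::mem::take(&mut self.slice);
        if slice.is_empty() {
            return None;
        }
        // Split into two disjoint halves: the first element, and everything after it.
        let (left, right) = slice.split_at_mut(1);
        // Keep the right-hand side for later calls...
        self.slice = right;
        // ...and yield the single element on the left-hand side.
        left.get_mut(0)
    }
}

fn main() {
    let mut data = [1, 2, 3];
    let mut iter = IterMut { slice: &mut data };
    let a = iter.next().unwrap();
    let b = iter.next().unwrap();
    *a += 10;
    *b += 20;
    println!("{data:?}");
}
```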
I've definitely implemented singly linked lists and binary trees. It works for anything where ownership is really, really clear (so doubly linked lists don't qualify, they have tangled ownership that makes no sense) and that you can get subviews into. So for arrays, you can get subslices. You can subslice, subslice, subslice, and you're not allowed to see the rest of the world anymore. Similarly for linked lists, when you go to a node, you can't see all the nodes before it. And for a tree, if you look at a subtree, you can't see the rest of the tree anymore, as long as you don't have parent pointers.

So what if I don't want them to call next again? This is perhaps strange. We saw that it's really cool that you can call next and next and next again, because you can shard your data structure into lots of little mutable references and give them to people, even on different threads, and it's totally 100% statically safe. Awesome. But there are actually totally valid use cases for saying, no, no, no, you can't do this, that's bad. So some of them are: if you want to mutate your collection's structure during iteration, that's not sound if your user has arbitrary references inside of it. So say you wanted to remove an element while you're iterating, something really common that a lot of us want to do. Another is if you want to have sliding windows into it. So say you want to have a sliding window of five elements throughout your slice or something. You can't do that with mutable references, because you could call next again and the two slices would overlap. Also, there's this weird case where you want to construct elements and store them in one place and only yield them one at a time. That one's complicated, I don't want to get into it, but it's a thing I can talk about later if you want. So none of these are sound, because Iterator says you can always call next again. So none of these APIs are valid to give in Rust with iterators.
So one way to work around this is my favorite borrow checker escape hatch: just use indexing. We saw before that indexing is limited, but that's because we were trying to hold on to the references. If we just do it very transiently, Rust is totally happy and it's like, okay, that's cool.

So here's a classic example I had when I did lots of game dev stuff. You have some array of enemies with health, and while the game's playing, they lose health and they lose health, and whenever an enemy dies, when they run out of health, you want to remove them from the array, because it would be terrible to just have them sitting around. Also, that's an easy way to be like, don't render this anymore, in that it's gone forever. But this is concurrent modification and iteration. So the iterator API would not let us do this, not usually.

So the way we have to do this is, rather than iterating over the enemies directly, we go from zero to the enemies' length, and we also do it backwards, which is what this .rev thing is. That's when people are screaming about double-ended iterators; the magic of double-ended iterators is you just go, oh, reverse this, and then it will just read from the other side. The reason why we reverse, I can get into in detail later; it's not super important, but it makes this work. So it would work if you didn't do this, but you would get dumb results that you would be sad about. I will say that much.

So every iteration, I don't know, our enemies are all in like an acid pool or something, so they're all just slowly dying. So they all lose health every iteration, and then if any enemy gets down to zero health, we print, oh, I'm dead, blah. And then we do a swap_remove, which is where you swap the element with the end of the array and then pop off the array. So that's the reason why we wanna do rev: swap_remove will scramble up our array, and rev makes it okay. Again, I can get into this in more detail offline.
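The acid-pool loop might be sketched as follows. The `Enemy` struct, its fields, and the function name `tick` are assumptions reconstructed from the description, not the actual slide.

```rust
/// Hypothetical enemy type for illustration.
struct Enemy {
    name: &'static str,
    health: i32,
}

/// One step of the game: every enemy loses health, and dead ones are removed.
fn tick(enemies: &mut Vec<Enemy>) {
    // The index range is computed once, up front; each access below borrows
    // `enemies` only briefly, so removing inside the loop is allowed.
    for i in (0..enemies.len()).rev() {
        enemies[i].health -= 1;
        if enemies[i].health <= 0 {
            println!("{}: oh, I'm dead", enemies[i].name);
            // Swap with the last element, then pop: O(1), but reorders the Vec.
            enemies.swap_remove(i);
        }
    }
}

fn main() {
    let mut enemies = vec![
        Enemy { name: "a", health: 1 },
        Enemy { name: "b", health: 3 },
        Enemy { name: "c", health: 1 },
        Enemy { name: "d", health: 2 },
    ];
    tick(&mut enemies);
    for e in &enemies {
        println!("{} has {} health left", e.name, e.health);
    }
}
```

Walking the indices in reverse means the element that swap_remove moves into slot `i` is one that has already been visited, so nothing is skipped and every later index stays in bounds.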
The main point is, indexing lets you do concurrent iteration and modification because it separates out the state from the data structure. You're maintaining the state on your own and it's up to you to do it right. If you do this wrong, you could end up indexing out of bounds or doing something nasty. Rust will protect you from that being a safety issue, but your program's gonna do things that you didn't want. Rust can't protect you from making your program do things you don't want.

The main problem with this thing is, so, you only wanted to work with arrays, right? Like you guys only use arrays, right? You don't use maps or trees or lists, linked lists, or anything? Because this only works for arrays, because that's where we can reasonably, efficiently keep external state for iteration.

So a more robust solution to this problem is what we've been calling for a while the streaming iterator. This is a bit of an urban legend, because it has problems, as we'll see. So streaming iterator is exactly like Iterator except for one small detail. Instead of saying, oh, I just yield Self::Item, we're saying I explicitly always yield a mutable reference to Self::Item. Now that there's explicitly a reference, we can link all the lifetimes up, and Rust understands, oh, there's a constraint here. I can't let you call next again. Yeah, it would take me a long time to explain lifetimes fully enough to grok this. The main thing is Rust cares when lifetimes are linked up, and this lets us explicitly link up the lifetimes. So now you're statically prevented from calling next twice, which lets you do these other fancy APIs like sliding windows or mutating during iteration.

So, good: we have the mutual exclusion, the compiler error for this code. Bad: &mut Item is hard-coded. What if I wanted shared references? What if I wanted some type that contains a reference in it somewhere, maybe? You can't express that.
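A sketch of the streaming-iterator shape with `&mut Item` hard-coded, applied to the sliding-window case. The trait `StreamingIterator` here is defined locally for illustration and is not a standard library trait; `WindowsMut` and its fields are likewise made up.

```rust
/// Like Iterator, except `next` always yields `&mut Self::Item`, and the elided
/// lifetime ties that reference to the `&mut self` borrow of the call.
trait StreamingIterator {
    type Item: ?Sized;
    fn next(&mut self) -> Option<&mut Self::Item>;
}

/// Hypothetical overlapping mutable windows over a slice.
struct WindowsMut<'a, T> {
    data: &'a mut [T],
    start: usize, // where the next window begins
    size: usize,  // window length
}

impl<'a, T> StreamingIterator for WindowsMut<'a, T> {
    type Item = [T];

    fn next(&mut self) -> Option<&mut [T]> {
        let start = self.start;
        let end = start.checked_add(self.size)?;
        if end > self.data.len() {
            return None;
        }
        self.start += 1;
        // This window overlaps the previous one, which is fine here because the
        // previous window's borrow must have ended before `next` could be called again.
        Some(&mut self.data[start..end])
    }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    let mut windows = WindowsMut { data: &mut data, start: 0, size: 2 };

    // No `for` loop support: drive it by hand.
    while let Some(w) = windows.next() {
        w[0] += w[1];
    }

    // Holding two windows at once is rejected by the compiler:
    // let a = windows.next();
    // let b = windows.next();
    // drop((a, b));

    println!("{data:?}");
}
```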
You have to specifically settle on, I'm gonna do &mut Item. That's all my API supports. We can't express this generically today. We need some kind of higher-kinded whatsits. If you are sad about this, please email Aaron Turon at Mozilla. I'm sure he implemented this five years ago during his PhD, before he even worked on the Rust project, and he's been hiding it from us. No, this isn't anywhere near being expressible. No one's even properly proposed it, so this isn't gonna happen for a long time.

However, say we're happy with the &mut thing. Now we can go backwards. We can have a prev method, and that's totally fine. Here I've removed the lifetimes, because this is actually such a pervasive pattern that Rust just assumes all the lifetimes are the same. There are a lot of places like this. You can work with references lots and lots and never see a lifetime in your life, and it's fantastic. But yes, so we can go forward, we can go backward. I can actually yield the third element 17 times now if I really, really wanted. I can go absolutely crazy and say, oh, you know where that iterator is right now? Insert an element there, or remove that element, and that actually works too, because all of these, or next and prev specifically, say: the things you get out, you're not allowed to call any other &mut methods while they exist. You can't even call shared reference methods either, because you get mutual exclusion.

One really nice thing is, we don't need no stinkin' traits. We don't need these higher-kinded whatsits, because it turns out you can't be generic over collections in Rust anyway, because we don't have higher-kinded types. Being generic over iteration stuff generally needs higher-kinded whatevers anyway.
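To make the next/prev/insert/remove idea concrete without any trait, here is a small sketch of a cursor over a Vec. It is not the speaker's linked-list cursor; the type `Cursor`, its fields, and the exact method semantics are assumptions chosen for illustration.

```rust
/// Hypothetical cursor over a Vec. `pos` is the index of the element `next` would yield.
struct Cursor<'a, T> {
    items: &'a mut Vec<T>,
    pos: usize,
}

impl<'a, T> Cursor<'a, T> {
    fn new(items: &'a mut Vec<T>) -> Self {
        Cursor { items, pos: 0 }
    }

    /// Moves forward and yields the element stepped over.
    /// The returned reference borrows the cursor, so no other method can run while it lives.
    fn next(&mut self) -> Option<&mut T> {
        if self.pos >= self.items.len() {
            return None;
        }
        self.pos += 1;
        self.items.get_mut(self.pos - 1)
    }

    /// Moves backward and yields the element stepped over.
    fn prev(&mut self) -> Option<&mut T> {
        if self.pos == 0 {
            return None;
        }
        self.pos -= 1;
        self.items.get_mut(self.pos)
    }

    /// Inserts a value at the cursor; `next` will yield it.
    fn insert(&mut self, value: T) {
        self.items.insert(self.pos, value);
    }

    /// Removes and returns the element `next` would have yielded, if any.
    fn remove(&mut self) -> Option<T> {
        if self.pos < self.items.len() {
            Some(self.items.remove(self.pos))
        } else {
            None
        }
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    let mut cursor = Cursor::new(&mut v);
    *cursor.next().unwrap() += 10; // visit the first element
    *cursor.prev().unwrap() += 10; // step back and visit it again
    cursor.next();
    cursor.insert(99);             // structural change in the middle of iteration
    println!("{v:?}");
}
```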
Also, you don't really want to use the generic for loop infrastructure with this stuff much, in my experience, because you're doing something really complicated, and just having it unconditionally move forward without your consent isn't really what you want. And in fact, I implemented this. I implemented this, like, pre-Rust 1.0. Here's a concrete cursor on a linked list. You can go next, you can go prev, you can peek the next one, you can peek the previous one, you can insert, you can remove. Hey, you can even split off the list right here, or insert an entirely different list right in the middle of the list here, and that all just works because I didn't insist on being extremely generic. Being extremely generic makes your APIs absolutely atrocious and horribly brutal. This is really neat and clean because I didn't insist on being incredibly, incredibly generic.

So, in conclusion, we have iter, iter_mut, into_iter, and drain. They are all natural reflections of ownership. Iterator always lets you call next again, and that's really, really awesome in some cases, but it also excludes other use cases. Streaming iterator prevents you from calling next again while you're still holding an element, and therefore enables you to call prev, insert, remove, or all these other fancy things. And all of this works completely, concretely today. You just can't be like, I work with an arbitrary zipper, cursor, streaming, iterator thing, so that you can map over a tree or a list or your file system all using the same API, which I don't super care about. Some people do. Yeah, that is iterators.