to introduce Trey Hunner. Trey hails from San Diego and he works as a Python and Django trainer, helping companies onboard new hires and improve their skills through exercise-driven team training. He also runs a weekly video chat about Python. I believe one might be coming up this weekend as well. And on top of this, he is a director of the PSF. We are very lucky to have him here today to help us unravel some of the complexities of Python's iterables and to help us learn how to loop better. Please welcome him to the stage. Thanks, Andrew. So I would like to go on a journey with you all through the land of iterables and iterators. There will be a lot of code on my slides. So this is a warning before the talk. I'm going to move very quickly. You are definitely going to miss some things. So I don't want you to worry because I'm going to tweet out a link to my slides after this talk is over. So you can review them on your own afterward. Some of my slides are also assuming you are using Python 3. Who is using Python 3 primarily at this point? Okay. Who is not yet using Python 3 primarily at this point? Okay. So that was most of you actually using Python 3 at this point. So for those of you not using Python 3 yet, I don't know. Upgrade. Sorry. I'm going to introduce myself before we go anywhere. So Andrew already said most of these things. My name is Trey. I help Python and Django teams onboard new teammates and turn frontend developers into full stack Django developers. I also host a live webcast every week on Python-related topics, usually Saturday mornings at 9 a.m. Pacific time. Tomorrow I will be hosting one here for attendees of this event to share with me their experience so that the many folks, most of the Python developers who are not in this room with us, understand what this event was about and what it was like.
I'm also one of the co-organizers of my local Python Meetup in San Diego and I've helped organize a few Django Girls workshops in Southern California and I'm one of the directors at the Python Software Foundation. Okay. So that's enough about me. Let's talk about looping. I would like to start by taking a look at a few head scratchers here. Now we're going to be revisiting these at the end of the talk. I don't expect you to remember these. I just want you to note that there are a few confusing things in Python that involve looping that we will have resolutions for by the end of this talk. Numbers here is a list. Squares is a generator. This generator will give us the squares of each of those numbers in that list. If we pass this generator to the tuple constructor, it will make a tuple out of those squares. And if we pass this generator to the sum function, we'll get the sum of these squares. Except we get zero. We expected a much bigger number than zero. So this is a little bit odd. We're going to have a resolution for that at the end. Another example. If we take that same generator and we ask whether nine is in this squares generator, Python will tell us that is true. Nine is in this generator and is the square of three. If we ask Python the same question again, this time it tells us false. Nine is not in this generator. We asked Python the same question two times. It gave us two different answers. That's a little bit weird, too. All right. One more example. This dictionary has two key-value pairs. If we try to unpack this dictionary into two variables, you might guess here that we would get an error, that this doesn't make any sense. Python does actually allow us to unpack this dictionary, though. This is valid Python code. So that's a little bit weird, but we're going to run with this. So at this point, you might guess that x and y are maybe key-value pairs, tuples of key-value pairs for this dictionary. They are not. They're keys.
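The slides for these three head scratchers are not reproduced in this transcript. A minimal sketch of what they might look like follows; the specific values in numbers and the counts dictionary are assumptions made for illustration, not taken from the slides.

```python
# Assumed sample data; the talk's actual values are not in the transcript.
numbers = [1, 2, 3, 5, 7]

# Head scratcher 1: the same generator gives a tuple, then a sum of zero.
squares = (n**2 for n in numbers)
as_tuple = tuple(squares)
total = sum(squares)

# Head scratcher 2: asking the same question twice gives two answers.
squares = (n**2 for n in numbers)
first_answer = 9 in squares
second_answer = 9 in squares

# Head scratcher 3: unpacking a dictionary works, and yields keys.
counts = {'apples': 2, 'oranges': 1}
x, y = counts

print(as_tuple, total)              # (1, 4, 9, 25, 49) 0
print(first_answer, second_answer)  # True False
print(x, y)                         # apples oranges
```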
So we're going to talk about what's going on here, and we'll revisit those three examples at the end of this talk. Okay. So we're going to explain those three problems that we just saw at the end of this talk. I'd like to do a little bit of review at this point. I'd like to review specifically loops, iterables, and sequences, and how they work in Python. This is not Python code. This is the only bit of code on my slide that is not Python code. This is JavaScript code. This is a traditional C-style for loop written in JavaScript. In this loop, we are starting with i, the variable i set to zero. We check whether i is less than the length of numbers, which is an array, like a Python list, and we loop. We print out the index that we're at there, and then we increment i by 1 each time that we loop. Once this condition, i is less than the length of numbers is no longer true, we will stop looping. So we end up printing out all the numbers in this array. So this is a for loop in JavaScript. Python does not have for loops, at least not for loops like this. Python does not have JavaScript's style of for loop. The style of for loop that we have in C, JavaScript, and many other programming languages. We do have something in Python that we call a for loop. It is a for in loop, and most programming languages call that a for each loop. If you look up for each, all one word in Wikipedia, you will find Python's for loop. So this is Python's for loop. In this for loop, we're looping over each item in our list and printing those items out. Notice that Python's for loop doesn't have any index variables. There's no index lookups. There's no checking the length of an array, index incrementing. There's no indexes at all. Python's for loops magically do all that work for us under the hood. So Python doesn't have traditional C style for loops. We do have something that we call a for loop, but it works very differently. Okay, so that's for loops in Python. A little bit of review there. 
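As a rough illustration of the contrast described above, here is a sketch of Python's for-in loop, with the C-style loop shown only as a comment for comparison. The list contents and the seen helper list are assumptions added for the example.

```python
numbers = [1, 2, 3, 5, 7]  # assumed sample data

# The C-style loop described above would look roughly like this in JavaScript:
#   for (let i = 0; i < numbers.length; i += 1) { console.log(numbers[i]); }

seen = []  # hypothetical helper, only here to record what the loop visits
for n in numbers:  # no index variable, no length check, no incrementing
    seen.append(n)
    print(n)
```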
If you can loop over something with a for loop in Python, it is an iterable, and if something is an iterable, you can loop over it with a for loop. So if you're not sure what that word iterable means, it is something that you are able to iterate over. Iterables can be looped over, and anything that can be looped over is an iterable. Sequences are a very common type of iterable. Lists are sequences, tuples are sequences, and strings are sequences. Sequences are iterables which can be indexed starting from zero and ending at one less than the length of the sequence. Lists, tuples, strings, all sequences can be indexed this way. So iterables are anything that can be iterated over. Sequences are one type of iterable that has some extra functionality here, and we see sequences all over in Python. Lots of things in Python are iterables. Many things in Python are sequences. Many things in Python are not sequences, though. Sets are iterables. Dictionaries are iterables. Files are iterables. Generators, which we saw earlier, those are also iterables. There are even infinitely long iterables. Count in the itertools module is an infinitely long iterable. None of these iterables here are sequences. You cannot index these things the way that you can with lists, strings, or tuples. Okay. So Python doesn't have traditional for loops. We do have something we call a for loop, though. It works differently than for loops do in C, JavaScript, and many other programming languages. Anything that can be looped over with a for loop in Python is an iterable. Sequences are just one type of iterable, but there are many other types of iterables. So we are done with review at this point. Hopefully you learned something new here, maybe, as well. We have talked about for loops. We have talked about how Python's for loops are somewhat magical. They do not work like for loops in other programming languages. There are no indexes. We are going to talk about how for loops actually work in Python.
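To make the distinction between sequences and other iterables concrete, here is a small sketch. The sample values are assumptions; the only library call used is itertools.count, which the talk mentions.

```python
from itertools import count

# Sequences can be indexed from 0 up to one less than their length.
numbers = [1, 2, 3, 5, 7]  # assumed sample data
last = numbers[len(numbers) - 1]

# Sets are iterables but not sequences, so indexing raises TypeError.
fruits = {'apple', 'pear'}
set_error = None
try:
    fruits[0]
except TypeError as error:
    set_error = error

# count() is an infinitely long iterable; we can loop over it,
# as long as we stop ourselves at some point.
first_three = []
for n in count():
    if n >= 3:
        break
    first_three.append(n)

print(last, first_three)  # 7 [0, 1, 2]
```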
But before we do that, let's try to loop over an iterable, any arbitrary iterable, not necessarily a sequence, without using a for loop at all. So you might think that under the hood, Python's for loops use indexes. Here we are manually looping over an iterable using a while loop and indexes. This works for lists, but it won't work for everything. This way of looping actually only works for sequences. This will not work for all iterables. For example, this doesn't work for sets. If we try to manually loop over a set using indexes, we will get an error. Sets are not sequences. They do not support indexing. So at this point, we could try to convert this set to a list and loop over it manually using indexes after we've converted to a list. That would be kind of cheating, though, and that wouldn't work for infinitely long iterables. If you try to convert an infinitely long iterable to a list, your RAM is going to fill up. Your computer is going to be unhappy. You can't loop over that. So we can assume that Python's for loops don't use indexes under the hood. So what do Python's for loops use under the hood for looping over any iterable, if not indexes? Under the hood, Python's for loops rely on something called an iterator. Iterators are the thing that powers all for loops in Python and all looping in Python, in fact. You can get an iterator from any iterable in Python, and you can use an iterator to manually loop over any iterable. Okay, let's take a look at how that works. So we have three iterables here, a list, a tuple, and a string. We can ask each of these for an iterator using Python's built-in iter function. This is built straight into Python. Passing an iterable to the iter function will always give us back an iterator, no matter what type of iterable it is that we're working with. Lists, strings, tuples, sets, any type of iterable in Python we can get an iterator from. 
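A sketch of the index-based while loop described above, followed by the built-in iter function applied to several iterables. The function name collect_by_index and the sample data are hypothetical; it collects items into a list rather than printing them so the result is easy to inspect.

```python
def collect_by_index(iterable):
    """Loop manually with a while loop and indexes (sequences only)."""
    items = []
    i = 0
    while i < len(iterable):
        items.append(iterable[i])
        i += 1
    return items

from_list = collect_by_index([1, 2, 3, 5, 7])  # works: lists are sequences

try:
    collect_by_index({1, 2, 3, 5, 7})  # fails: sets do not support indexing
    set_failed = False
except TypeError:
    set_failed = True

# iter() gives back an iterator for a list, a tuple, a string, or a set.
iterators = [iter([1, 2, 3]), iter((1, 2, 3)), iter('hi'), iter({1, 2})]
```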
And every iterable in Python will provide us with an iterator if we pass it to the built-in iter function. This is the way we get an iterator from any iterable in Python. Okay, once we've got an iterator, there is only one thing we can do with it: get the next item from it. We can use Python's built-in next function to get the next item from any iterator. And if we ask an iterator for its next item and there are no more items left, you'll get a StopIteration exception. So you can get an iterator from every iterable in Python, and the only thing that you can do with those iterators is to ask them for their next item. And if they don't have a next item, you'll get an exception. So you can think of iterators as kind of like one-directional tally counters. They're kind of like one-directional tally counters, except that they don't have a reset button. They keep track of where they are as you ask them for their next item, but they can only go in one direction and they cannot be reset. They're one-directional tally counters without a reset button. This is what an iterator is in Python. You can also think of iterators as like Hello Kitty Pez dispensers. They're like Hello Kitty Pez dispensers that cannot be reloaded. Once you take a Pez out, it is gone forever. So iterators are like Hello Kitty Pez dispensers. You can take Pez out only in order. Once they are empty, they are useless. You have to dispose of them at that point. Okay, we know about iterators now. We've at least seen a little bit about iterators. We're going to try to now manually loop over an iterable, not using indexes this time, but using iterators. So we're going to turn this for loop into a while loop, the same way we did before. This time we're not going to use indexes, though. We are going to grab an iterator from our iterable, and we are going to repeatedly call next on it to get the next item from it to loop over that iterable. All right, so that's what we're going to do here.
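The behavior of next described above can be sketched as follows; the three-item list is an assumption chosen to keep the example short.

```python
numbers = [1, 2, 3]  # assumed sample data
iterator = iter(numbers)

first = next(iterator)   # 1
second = next(iterator)  # 2
third = next(iterator)   # 3

# The iterator is now empty: asking again raises StopIteration.
try:
    next(iterator)
    exhausted = False
except StopIteration:
    exhausted = True
```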
In this function, we now have a while loop. In order to manually loop over our iterable, we need to get an iterator from it. Once we have that iterator, we can repeatedly loop and get the next item from the iterator each time that we loop. Once we have that next item, we can execute whatever the body of our for loop was supposed to do, in this case, action_to_do(item). And if we get a StopIteration exception while we're asking for that next item, we know that it's time to stop looping at this point. We have just reinvented a for loop using a while loop in Python. This is essentially how for loops work under the hood in Python. All looping in Python, not just for loops, all looping in Python actually works this way. So the iterator protocol, the iterator protocol is a very fancy-sounding way of saying how for loops work in Python. It's essentially the definition of the way iter and next work. So the thing that we just saw in that last slide, that is essentially a definition of the iterator protocol for the most part there. It is the thing that powers all forms of iteration, not just for loops. For loops use it. Multiple assignment also uses it though. The iterator protocol is also used by star expressions and it's used by many built-in functions in Python, many third-party libraries, many standard library functions. Anything in Python that works with an iterable in some way probably relies on the iterator protocol. And if you are looping over something, the iterator protocol has to get involved. Okay, so you might be thinking at this point, iterators seem cool but they also just seem like an implementation detail that you might not need to care about as users of Python. Who is a Python core developer? Okay, I see a hand. There might be more but it's a little bright up here. Most of us are not Python core developers. So why do we care about these iterators? Why does it matter? So I have news for you. You have seen iterators before.
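A minimal sketch of the while-loop version of a for loop described above. The function name manual_for_loop is hypothetical; action_to_do follows the wording in the talk. The last lines show multiple assignment and a star expression working on iterators, since they rely on the same protocol.

```python
def manual_for_loop(iterable, action_to_do):
    """Do what `for item in iterable: action_to_do(item)` would do."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # no more items, so stop looping
        action_to_do(item)

collected = []
manual_for_loop({'a': 1, 'b': 2}, collected.append)  # works on a dictionary

# Multiple assignment and star expressions also accept any iterable.
x, y = iter([10, 20])
head, *rest = (n for n in range(4))

print(collected, x, y, head, rest)  # ['a', 'b'] 10 20 0 [1, 2, 3]
```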
You actually saw an iterator before I mentioned the word iterator in this talk. This is a generator. This generator object is an iterator. It is an iterator and one thing we know we can do with iterators is call next on them. You can call next on an iterator to get the next item from it. So generators are iterators. Okay, if you've used a generator before, you know that there's something else that you can do with them. What else can you do with generators? You can loop over them. If you can loop over something in Python, what type of thing is that? It's an iterable. So generators are iterators. We can call next on them but generators are iterables meaning we can loop over them. So generators are iterators. Generators are iterables. What is going on here? How can they be both iterators and iterables? So I haven't quite been telling you the truth, at least not the whole truth yet. There's something important that I've neglected to mention about the way iterators work in Python. This is fundamental to the way all iterators work. Iterators are also iterables. Iterators are also iterables. So what this means is that we can get an iterator from an iterator using the built-in iter function. So we can call iter on an iterator to ask it for an iterator and when we do that it will give us itself back. Iterators are iterables and their iterator is themselves. Iterators are their own iterators. Who is confused at this point? Yeah, so this is a little bit confusing. It's difficult to make this not confusing, which is why I didn't mention it before, but this is also important. So this fact that I've neglected to mention so far, that iterators are also iterables. This is the last part of Python's iterator protocol. It's the last part of the iterator protocol I didn't mention before. If you ask an iterable for an iterator and it gives you itself back, then it must be an iterator. All iterators are iterables and all iterators are their own iterators. 
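The two facts above (generators are iterators, and iterators are their own iterators) can be checked directly. This is a small sketch with assumed sample values.

```python
numbers = [1, 2, 3]  # assumed sample data
squares = (n**2 for n in numbers)

# A generator is an iterator: we can call next on it.
first = next(squares)

# A generator is also an iterable: we can loop over what is left.
rest = [s for s in squares]

# Calling iter on an iterator gives back the very same object.
iterator = iter(numbers)
iterator_is_own_iterator = iter(iterator) is iterator

# A list is an iterable but not an iterator: iter gives a different object.
list_is_own_iterator = iter(numbers) is numbers

print(first, rest)  # 1 [4, 9]
print(iterator_is_own_iterator, list_is_own_iterator)  # True False
```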
Who wants me to say iterator and iterable a few more times? So the similarity between these two words makes this talk really confusing. Iterables are something that you can loop over. Iterators are the thing that helps you loop over an iterable. It just so happens that in Python iterators are also all iterables, meaning you can loop over any iterator. So if you have an iterator, it's handy that you can manually use that to loop to get the next item, but you can also loop over it. You can treat it as an iterable as well. Okay, let's talk about why this matters. So iterators are iterables, but they have no idea how many items they contain. They do not have a length. They also can't be indexed. The only thing that you can do with iterators is get the next item from them, as I told you before, and as you now know, you can also loop over them. And if we loop over an iterator a second time, we will get nothing back. So you can think of iterators as like lazy iterables that can only be looped over one time. Iterators are lazy iterables that can be looped over one time only and then they're done. So some quick review. Iterables are not necessarily iterators, but iterators are always iterables. For example, generators are iterators, meaning they're also iterables. Lists are not iterators. Iterables are not always iterators, but iterators are always iterables. I don't think I've messed up and mixed these two words up yet in this talk. If I have, I'll find out in the video afterward. So this is still a little bit confusing and I haven't really answered the question of why we really care about any of this. Let's get back to that. So there are some really good reasons for understanding the iterator protocol in Python. Understanding iterators will allow you to be lazy and it will allow your code to be lazy. Iterators allow us to both work with and create lazy iterables that don't do any work at all until we get the next item from them.
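A short sketch of the limitations listed above, using a generator as the iterator. The sample values are assumptions.

```python
squares = (n**2 for n in [1, 2, 3])  # assumed sample data

# Iterators have no length.
try:
    len(squares)
    has_length = True
except TypeError:
    has_length = False

# Iterators cannot be indexed.
try:
    squares[0]
    can_index = True
except TypeError:
    can_index = False

# Looping a second time gives nothing back.
first_pass = list(squares)
second_pass = list(squares)

print(first_pass, second_pass)  # [1, 4, 9] []
```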
They allow you to delay code execution. Because we can create lazy iterables, we can even make infinitely long iterables. So they really allow us some interesting functionality with iterables in Python. And we can even create iterables that specifically conserve system resources. We can save on memory, we can save on time depending on the way we use these lazy iterables. You have already seen lots of iterators in Python. You've already worked with lots of iterators in Python. I've already mentioned that generators are iterators. Even if you haven't used generators before, you've probably used enumerate objects. Enumerate objects are iterators. Well, in Python 3 at least. Zip objects are also iterators. Files are iterators. There are a lot of iterators built into Python, especially in Python 3. These iterators all act like lazy iterables. They all have the same weird functionality that iterators have. They don't do any work at all until you start looping over them. They perform work as you loop. So it's useful to know that you've already been using iterators, but I'd like you to know that you can also create your own lazy iterables using iterators. So this class here makes an iterator that accepts an iterable of numbers and squares each of those numbers, but it doesn't do any of that squaring. It doesn't do any work at all until we start looping over it. So the double underscore next and double underscore iter methods (__next__ and __iter__), I'm not going to talk about how they work, but that is how the iterator protocol works under the hood on classes. Usually when you make your own iterator, though, you don't make a class. You make a generator function. Generator functions are a special syntax for making an iterator. If you haven't made a generator function before, that yield probably seems magical and weird. It is magical and weird, but it is also very powerful. So yield allows us to put our generator on pause in between next calls.
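The class and the generator function on these slides are not in the transcript. One possible sketch of each follows; the names SquareAll and square_all are hypothetical. Both are fed itertools.count to show that no squaring happens until next is called, since an eager version would never finish on an infinite input.

```python
from itertools import count

class SquareAll:
    """Iterator that lazily squares each number from an iterable."""

    def __init__(self, numbers):
        self.numbers = iter(numbers)

    def __next__(self):
        # StopIteration from the inner iterator propagates to our caller.
        return next(self.numbers) ** 2

    def __iter__(self):
        return self  # iterators are their own iterators

def square_all(numbers):
    """Generator function version: yield pauses between next calls."""
    for n in numbers:
        yield n ** 2

from_class = SquareAll(count(1))
from_function = square_all(count(1))

class_results = [next(from_class) for _ in range(3)]
function_results = [next(from_function) for _ in range(3)]

print(class_results, function_results)  # [1, 4, 9] [1, 4, 9]
```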
When you call next on an iterator, that yield is where it stops. We could implement this a third way. We could also use a generator expression. This generator expression is equivalent to that generator, equivalent to that iterator, but it uses a list-comprehension-like syntax here. So if you need to make a lazy iterable in Python, I'd like you to think of iterators, and specifically I'd like you to consider making a generator function or making a generator expression. Okay, let's look at some examples of this. So once you've embraced the idea of lazy iterables, you'll find that there are lots of possibilities for making helper functions and for rewriting your code to allow for lazy evaluation or lazy execution. This is a for loop. This for loop sums up all billable hours in a Django queryset. You don't need to understand the specifics of it, but I want you to compare it with the same code that uses a generator expression. So we're using a generator expression here for lazy evaluation. These two blocks of code do the same thing. Notice that the structure of the code is pretty fundamentally different though. We're able to use a sum function in that second example because we have a lazy iterable to work with, whereas we don't have any lazy iterable at all in that first example. Iterators allow you to fundamentally change the way that you structure your code, often for the better. Okay, so this code prints out the first 10 lines of a log file. This code does the same thing, but is using an iterator. That first 10 lines thing is an iterator. So itertools in the standard library is a handy module for doing a whole bunch of stuff with iterators and iterables. This iterator, first 10 lines, allowed us to name something that didn't even have a name before. We had no first 10 lines variable. Our code now has an extra variable that makes it more descriptive. We didn't have first 10 lines before.
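The two before-and-after examples above might look roughly like this. Everything concrete here is an assumption: a namedtuple called Event stands in for the Django model, an in-memory StringIO stands in for the log file, and itertools.islice is one standard way to take the first ten items of an iterable (the transcript does not say which itertools function the slide used).

```python
from collections import namedtuple
from io import StringIO
from itertools import islice

# Hypothetical stand-in for rows from a Django queryset.
Event = namedtuple('Event', ['duration', 'billable'])
events = [Event(2, True), Event(1, False), Event(3, True)]

# For loop version: accumulate billable hours by hand.
hours_from_loop = 0
for event in events:
    if event.billable:
        hours_from_loop += event.duration

# Generator expression version: hand a lazy iterable to sum.
hours_from_sum = sum(event.duration for event in events if event.billable)

# Hypothetical stand-in for a log file with 25 lines.
log_file = StringIO(''.join(f'line {n}\n' for n in range(25)))

# A named lazy iterable replaces a counter plus a break statement.
first_ten_lines = islice(log_file, 10)
shown = [line.rstrip('\n') for line in first_ten_lines]

print(hours_from_loop, hours_from_sum)  # 5 5
print(len(shown), shown[0], shown[-1])  # 10 line 0 line 9
```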
Now we know when we look at this code, we are looping over just the first 10 lines of this file. Also, we got rid of that break statement and break statements are kind of ugly. So iterators and generators allow you to restructure your code often for the better. One more example here. This code makes a list of differences between consecutive values in a sequence. Notice that there's an extra variable we've got in this code that's kind of hanging around outside of a function and inside of a function. We have to assign to that previous variable every time we loop. This code does the same thing using a made-up generator function that you'll just have to trust me exists. Someone wrote it. It's in some library and you can import it and use it. Notice that we don't have an awkward variable assignment here. We don't have a previous equals hanging out in our for loop. The generator function with_previous does that work for us. It gives us two values at the same time. Okay, if you're curious how that generator function could be written, here's one possible implementation. I'm not going to talk about how this works. You can look at my slides afterwards. I do want you to note that we are actually reaching into the iterator protocol here. We're manually getting an iterator, manually looping over it with next for good reason because this works not just with lists, not just with sequences, but with files, with enumerate objects, with generators, with any lazy iterable you might imagine. Okay, so at this point we are ready to jump back to those odd examples that we saw at the beginning of this talk. We have a generator object, squares. If we pass this generator to the tuple constructor, you'll get a tuple of its items back. If we then try to compute the sum of these numbers as we saw before, we get zero. This generator is now empty. We've exhausted it. If we try to make a tuple out of it again, we'll get an empty tuple.
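The slide's implementation is not in the transcript, so the following is only one possible way to write such a generator function, here called with_previous. It gets an iterator manually and calls next once before looping, as described. The except clause is an addition of this sketch: in Python 3.7 and later, a StopIteration escaping a generator body becomes a RuntimeError, so empty input needs explicit handling. The sample readings are assumed.

```python
def with_previous(iterable):
    """Yield (previous, current) pairs, starting from the second item."""
    iterator = iter(iterable)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # empty input: nothing to pair up
    for item in iterator:
        yield previous, item
        previous = item

readings = [1, 4, 9, 16]  # assumed sample data
differences = [
    current - previous
    for previous, current in with_previous(readings)
]

# It also works on a lazy iterable, not just on sequences.
lazy_differences = [
    current - previous
    for previous, current in with_previous(n**2 for n in range(1, 5))
]

print(differences, lazy_differences)  # [3, 5, 7] [3, 5, 7]
```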
Generators are iterators and recall that iterators are like Hello Kitty Pez dispensers that cannot be reloaded. Once we run out of Pez, our dispenser is forever empty. If we ask this generator whether nine is in it, it will tell us true. If we ask the same question again, it will tell us false. When we ask whether nine is in this generator, Python has to loop over the generator to find the number nine. If we kept looping over it at this point, we would get all of the items after nine. Asking whether something is contained in an iterator or in a generator, because remember generators are iterators, asking whether something is contained in it will partially consume it. Iterators and generators are like one directional tally counters without a reset button. There is no way to know whether something is in an iterator without starting to loop over it. When you loop over dictionaries, you get keys. Looping relies on the iterator protocol. Iterable unpacking also relies on the iterator protocol. These do the same thing under the hood. Unpacking a dictionary is really the same thing as looping over it. So we get keys in both cases. This isn't unexpected behavior now that we know that's how looping works with dictionaries, and we know the iterator protocol works the same way for all types of looping. Okay, so I'd like you to remember, iterators are the most rudimentary form of iterables in Python. Sequences are iterables, but not all iterables are sequences. You can't index all iterables. And when someone says the word iterable, you can only assume that they mean something that you can iterate over, and that's it. Don't assume that your iterables have more features than just iteration. Also, if you need to make your own lazy iterable, think of iterators and consider making a generator function or a generator expression. And finally, I'd like you to remember that every single type of iteration in Python relies on the iterator protocol. 
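The explanations above can be checked with a small sketch: a containment check consumes items up to and including the match, and unpacking a dictionary yields the same keys that looping does. The sample values are assumptions.

```python
numbers = [1, 2, 3, 5, 7]  # assumed sample data
squares = (n**2 for n in numbers)

# The containment check loops until it finds 9, consuming 1, 4 and 9.
found = 9 in squares

# Only the items after 9 are left in the generator.
leftover = list(squares)

# Unpacking and looping both go through the iterator protocol,
# so both give the dictionary's keys.
counts = {'apples': 2, 'oranges': 1}
x, y = counts
looped_keys = [key for key in counts]

print(found, leftover)    # True [25, 49]
print(x, y, looped_keys)  # apples oranges ['apples', 'oranges']
```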
So understanding the iterator protocol is key to understanding quite a bit of weird and not so weird looping behavior in Python. Okay, that's all I got for you. These are some links for you to look up. I guess you can try to click on them here on the screen. You can click on them if you find my slides on Twitter afterwards. I don't have time for questions. I'd like to answer questions in the hallway, you know, afterward maybe. Thank you.