 Hi, I'm Steve Oney. I'm an assistant professor here in the School of Information at the University of Michigan. So I teach some of the introductory programming classes in the School of Information. I'm really excited to have you join us because I really think programming is going to increasingly be a fundamentally important literacy and a way of dealing with the increasing amounts of data that we get and deal with in our everyday lives. In my private life, I play soccer. Like Professor Resnick, I am an avid biker as well. And you'll be seeing me in courses one, two, and four. Most of my research deals with making programming tools more usable. In other words, making programming tools that are designed around the ways that people think and the ways that we actually program as well. Hi, I'm Paul Resnick. I'm a professor and associate dean for research at the University of Michigan School of Information. As associate dean, I don't get to teach that much. But I really enjoy teaching the material that I'll be teaching you in this specialization. I'm glad to have a chance to share it with you. You'll see me in courses one and two, almost all of course three and a little bit in course four. I'm a fan of nerdy pun humor, also known as dad jokes. And so I'll be sharing a few of those with you at the ends of some of the lessons. In my research, I'm probably best known for what's now known as recommender systems. I first published on this back in the early 90s. Things like at Amazon where it says people who bought this book also bought these other books. And more recently, I've been working on online communities and then on educational technologies. In my personal life, I like to play tennis and ride a bike. I've ridden very slowly up a couple of the iconic climbs of the Tour de France. Four years ago up Mont Ventoux, and a couple of years ago up the Col du Tourmalet in the Pyrenees. I don't travel nearly as much as Dr. Chuck. 
But I do look forward to trying to adopt his practice of holding live office hours in places when I do get to travel. So maybe I'll get a chance to see you. See you in the lessons. Hi, my name is Jackie Cohen. And I'm a lecturer at the University of Michigan School of Information. I teach a lot of programming courses, including courses a lot like the ones you'll see here. And I also build and design and support a lot of course resources. All of this means that I've seen a lot of different students complete a lot of different programming projects. And what I'll be doing here is orienting you to the end of course projects and giving you some hints and tips about what might be useful and exciting while working on them. I hope you enjoy them because I think they're really fun. And they'll give you a lot of tools for working with programming in your everyday life. Hi, I'm Chris Brooks. I'm faculty here at the University of Michigan. In this specialization, I'll be teaching the last course. In that course, you're going to take image manipulation libraries and large image sets and use Python to change it into useful information. My research focuses on educational technology. And I teach a lot of data science courses, including some on this platform. And I'm very interested in how learners like you approach technology, interact with technology, and use it to enable your learning. I'm looking forward to seeing you in that last course. Hello, my name is Charles Severance. And you may have seen me before in the Python for Everybody specialization, which is some of you took that and then came to this class. I'm really enjoying what I'm doing in this class and that I'm not actually teaching any of the core material, but I'm doing what we call the way of the programmer. And that is sort of I get to play a little bit and not actually teach you anything, but show you something cool. And that gave me a lot of freedom to show things that I consider fun. 
My research area, as some of you may know, is educational technology. The platform that you're using is something I'm very curious about, how we can improve it, how we can make it better. And I have lots of hobbies, but my most recent hobby is racing, racing on road courses. So if you look, you might find a picture of me in a race car. Hi, I'm Lauren Murphy. I took the on-campus version of this course a couple years ago and ended up working as an instructional aide to help other students learn the material. I since returned to help out with this course, building up the quizzes and assessments and projects that you'll be doing. I'm very excited that you'll have the opportunity to learn the same material, and I hope that you have a good time. And in this specialization for Python 3 programming, you learn how to become a competent Python programmer by learning the fundamentals of the language in detail. You'll learn how to navigate complex data structures and accumulate results from them. And you'll learn how to convert data into a format that can be used by other programs. At the end of the specialization, you'll be able to write Python programs of a few hundred lines. You'll be able to use and integrate Python modules into your code. You'll be able to use external tools, like APIs, by reading their documentation as well. We start from the beginning, and we don't assume any prior knowledge, but we do go deep into the fundamentals of Python to be sure that you understand every aspect of code. So you want to say something about what's our runestone interactive environment? Yeah, so the runestone interactive textbook allows you to interleave learning materials with active code assessments that will allow you to actually write code. 
And we find that writing code is really important, because even though you can learn how a concept works in theory, so you might know how some particular feature of Python works, it's really important to actually write code to gain more of a working understanding and to know how to actually apply those concepts in practice. So there's also the way of the programmer segments. So most of the course is about how to use Python and learning about Python features. The way of the programmer segment is more about how programmers can and should work. Programming is a little bit more of an art than a science. There's lots of correct ways to do things, but there are best practices. So there are things like how to write programs incrementally. In the way of the programmer segments, you'll also learn about how to write good automated test cases. So that's going to come in course four. Until then, we are going to write those test cases for you. Lauren has created a whole lot of assessments where not only can you run the code in the browser, but it'll tell you whether you got it right or not. And you get that immediate feedback and you can try it as many times as you want. In fact, we've set up the assessments so that you have to get everything right 100% in order to pass the assessment. And the reason for that is we really want you to build mastery so that you don't go on to the later stuff until you've got the early material really solid. You'll also notice that in all of the projects that you do, you'll find ways of translating the concepts that you learn in the courses and throughout the specialization into your real life. For example, different ways of building programs that might be fun in your job or your school or your work or whatever it is that you do. So one of the things that I really like, as I've watched you all put this together, is in Python for Everybody, and you kind of already said this, in Python for Everybody, I really focus on the program. 
If you get the program, it's like you win. You get the gold star. And we didn't have the time or the luxury to really understand what was going on inside the program. We're just like, we've got the program done and we've got to move on to the next thing. But with some of the stuff that you have in Runestone, you get to say, what's really going on inside of the program and how does this really work? And that's part of the mastery, is so that if you can't, as a programmer, kind of put yourself inside the program and understand how the program is actually functioning, it is difficult to write more sophisticated programs. And so that's where, even though this technically is a beginning course, I think it's really important for people to take more than one beginning course because you have to go over the same material over and over with, in a sense, deeper understanding each time you go through it. Yeah, we have this great code lens tool, I think you're referring to, that lets you visualize what's happening in the execution of the program one line at a time. And you can go forwards and back and see what actually was the value of that variable and when did my list change what its contents were? And so it gives you a way of thinking about it. It's really great for debugging so that you don't have to just do trial and error. Let me change something in the code. You can really think through what is a program. So another thing that the students always ask me at the end of my course is what next? And I think that it's kind of cool that you've built into this specialization kind of a step into what they're gonna do after this, Chris? Yeah, so one of the things that we've added to this course at the very end is a project course and that's really to focus people on how to take other APIs that might be out there or packages and use them and do something novel with them outside of just learning. And it gets to this repeated practice comment that you made. 
And for that, we're actually doing it within the Jupyter environment. So just like you need repeated practice with APIs and with Python fundamentals, there's so many different places that you can write Python code, and Runestone's one of them, and the tools you use in Python for Everybody are one of those. Jupyter is one that's quite common and we teach that in the data science specialization that students could follow this with. And there's other environments too. And so we're trying to really showcase a diversity of learning environments and production environments for Python. Programming is not one environment, right? It's not like you have this one thing you type this stuff in and that's all the programming. When you're out in the real world, each job often has different kinds of environments. Yeah, and practice is so important in the context of programming. I think Lauren has written some great examples of practice problems for you to work on throughout the course as well. And we have this great practice tool that you'll get to see, where it re-presents to you for review some questions that you've already seen in the past. It keeps presenting them to you more frequently if you're having trouble, less frequently if you're showing mastery of them, and it's a way to really reinforce what you've got. So look for that practice tool. It also has these fun fireworks that'll show when you've done all of your practice problems for the day. So as you can tell, we're all really excited to share this material with you and we hope you have a lot of fun and wish you a lot of luck. Here at the University of Michigan, our school colors are maize and blue. You might think of them as yellow and blue, but we call it maize and blue. And if I travel anywhere and I have a Michigan logo thing on, someone will come up to me in the airport and say, go blue. So on three, one, two, three. Go blue. Hi everyone. 
I'm excited to show you some useful features of the free interactive textbook that will be available to you as part of this specialization. Content in the first four courses all track pretty closely to the textbook content. So whichever course you're starting with, you'll want to go through this video to see the important interactive features. You can skip it if you've already seen it in a previous course. The runestone interactive textbook environment is the brainchild of my friend Brad Miller. I've made a few contributions to both the software environment and especially the textbook over the past four years, but Brad deserves almost all the credit. Let's take a look. The first thing you'll need to do before accessing any of the textbook pages is to log in from Coursera. So I just click on this open tool and I'm automatically logged in. You've already logged into Coursera and Coursera is passing the credentials to runestone, so you'll be automatically logged in here. Once you're logged in, all of your work will be saved and we've deliberately disabled any other ways to log in except by doing it through Coursera. So when you first log in following that link, you'll be taken to this practice page in the textbook. It's our way of encouraging you to use the practice feature every day, but we'll come back to that later. Once you're logged in, you'll be able to click on any of the links for the readings and you'll be taken directly to the pages in the textbook for those readings. So here's a link to the runestone page for variables and I'll click on it and now I'm on a textbook page. In the textbook, you'll find text and images, diagrams, but you'll also find some interactive elements. For example, here's what we call an active code window. It's got some code in it and I can click save and run. It'll run and print something out over here in an output window. I can change that code and I can run it and all of your code versions when you save and run them will be saved. 
I have this little scrubber here and I can move it and I can see all of my old versions and they're not just saved while this page is open. They're saved permanently. For example, let's reload this page. When the page loads, we're back to the original window contents, but I can click load history and then I get the scrubber and it shows me my last code run. Now, if I rerun a previous version, it won't show in the scrubber as being the latest version, but if I change it, so instead of 17 I do 18, now it becomes the latest version in the history. Show in code lens is a really useful feature of active code windows. This is an amazing tool developed by Philip Guo, a professor at UC San Diego. It lets you step through the execution of a program one line at a time. I can click forward and it'll just show me what happens after one line is executed, and the next, and the next. You can see it print out just the first message, and so on. That's not such a big deal now but it'll be really useful for you when you start to do more complicated programs with conditional execution and iteration and defining your own functions. Part of our educational philosophy in this specialization is to reveal all the magic. We wanna give you a way to reason about how your programs are executing because that's the foundation for being able to debug your code through understanding rather than through trial and error. Code lens really helps with that. Now, sometimes these code lens examples are built right into the textbook but you can always get to code lens by hitting the show code lens or hide code lens for any active code. Here are some that are built in to that textbook page. There are also other interactive features. Here's a multiple choice question. You can answer those and get immediate feedback by clicking on check me. 
I've actually already answered this one but suppose I said Thursday as the thing that would print out here because day is set to Thursday and I click check me and it gives me some feedback. It's true, Thursday is the value of day but it gets overwritten later so the correct answer is 19. Now, when you get to the bottom of the page I suggest that you click on mark as completed. If you haven't clicked on it this is what it'll look like initially. If you click on mark as completed a couple good things will happen. One is you get the satisfaction of it says, yay, completed well done but you get a couple other things too. First, some of the multiple choice questions or other activities on the page get added to the practice tool which I'm going to show you in a minute. That practice tool will help you review things so that you don't forget them sort of like vocabulary flashcards when you're learning a foreign language. Second, the pages that you've marked as completed will be marked in the table of contents so you can keep track of the textbook of what you've read and what you haven't. Here's the table of contents and you can see these orange dots indicate things that I've completed and I've marked as complete in the check boxes. The check marks indicate things that I've opened but I haven't marked as complete. So this completed button at the bottom of the page is separate from marking a reading as complete in Coursera. You may want to do both of those things. In Coursera, we'll generally provide you with links to particular pages and so you can just read that one page but if you want to, you can navigate through the textbook once you're on the Runestone site and we have these forward and back buttons. This goes to the next page in the book back to the previous page. If you click on the textbook title as I showed you a second ago you'll get to a table of contents that's very detailed with every single page and sometimes subsections within the pages. 
If you want more of an overview look at it, you can click on this chapters and it'll show you the different chapters and you can just see the detail for one chapter at a time. Now notice that the orange dots aren't shown on this detailed view of just a single chapter. That's a little unfortunate and now that I've noticed it I'll try to add that feature at some point. Finally, there's a search option so I can search for variable and it'll tell me lots of pages in the textbook where the word variable shows up. There's also an index. I want to look for various things and I can click on them and it'll take me to where they are in the textbook. Normally if you log in from Coursera you'll be taken directly to the practice feature but you can also get there from within the book by clicking on practice. What this practice feature does is it re-presents to you questions on topics that you've marked as already completed. That thing at the bottom of the page where you mark the page as completed. When you're here in the practice feature you get to answer it again and if you get it right it'll remember that and it won't ask you that same topic again for a long time. If you get it wrong then it might ask you again tomorrow. So this practice tool is the brainchild of my doctoral student, Iman YeckehZaare. He just implemented it last year and in the first semester where we made it available to students in our on-campus classes those students who used it more did a lot better on the course exams than those who didn't. It was a pretty striking result for me because I'd been monitoring for several years to see whether just spending more time in the textbook had a similar effect on student performance and it didn't. In my on-campus classes use of this practice tool is now required and earns a few points towards the final grade. 
For the Coursera courses it's not required but based on the results I've seen with our on-campus students I strongly encourage you to use it a little every day. I think you'll also find it rewarding. Our on-campus students love the fireworks that they get. So here I'm gonna answer a couple of questions. I have only two left to practice for today and I'm gonna say done, ask me another question and it gives me one more. It says hang in there, last question for today and what's gonna print out? Oh, this is a review. The one we just looked at. I say check me and then I'm done and I get these fireworks which are a little fun when you finish all the questions for the day. For those of you who are taking this course for a certificate you'll also see links to graded assignments usually at the end of each lesson or set of lessons. In the first four courses the assessments and projects are in the Runestone textbook and they're all auto graded there. You'll only be able to see these in Coursera if you're paying to take the course for a certificate. If you're not paying you can find similar questions in the end of chapter assessment pages in the Runestone textbook. So let's follow the link for this first assessment and this assessment just has two questions. I've actually already answered one of them correctly before. That was a multiple choice question and they want me to write some code. The answer to this one is print hello world. I'll save and run it and I get some immediate feedback. There's an automatic test in here and it's telling me that I got the right output. If I said hello word instead, I would get feedback saying that I had failed. Now actually, when I tell it to grade me it'll use the best answer I've ever given. So if I ever manage to pass the test I will pass this. We've set up the assessments so that you usually have to get 100% in order to pass the assessment. But you can keep trying and keep getting feedback until you get that 100%. 
We've done that because we think it's really important to master the early material because things keep building on each other. So I click grade me and it comes back. You can see now that it's updated the score to one instead of zero. I've gotten a total of two out of two for this assessment. And if I go back to this page on Coursera and I refresh it, it'll tell me instead of trying again it's gonna tell me that I've passed. Passed with 100%. That's the Runestone environment. It's been a labor of love for all of us who've worked on it as an open source project over the last few years, especially Brad Miller who started the project. I hope you'll find it really helpful to you as you master the fundamentals of Python. I usually end my on-camera segments with a little joke so here's a bit of humorous advice. Procrastinate today, always today. Don't put it off until tomorrow. Okay then, don't listen to my advice. Don't procrastinate today. Go get started with the first lesson in this course. I'll see you next time. Welcome back everybody. This lesson will teach you to deal with outer lists that contain inner lists and dictionaries as items. And similarly, dictionaries whose values are lists or other dictionaries. We call it a nested data structure. It's a little bit like one of these Russian dolls. This one's painted as a robot where there's an outer doll and inside it there's another doll, a sublist. And eventually you get down to something that's not a container, like a string or an integer or a turtle instance. But you might have to descend a few layers to get to it. In this lesson, we'll also get our first look at the JSON data format, which lets you represent a very nested list or dictionary as a long text string so that you can store it in a file and read it later or send it to another computer. At the end of this lesson, you should be able to use square brackets repeatedly to pull data elements out of a nested data structure. 
And you should be able to read from and write to strings that are in the JSON format using the Python JSON module. I'll see you at the end for a recap and some punny jokes. Bye for now. Welcome back. In the past, we've snuck in a few nested lists but without ever really looking closely at what they are or how to work with them. Here we have a list called nested one whose items are themselves lists. So here's the first item, ABC, a second item, DE, and a third item is the list FGH. I'll talk about the whole list, nested one as the outer list and each of these smaller lists inside as the inner lists. So the first item in the outer list is an inner list, ABC. I'm going to comment out some of the later lines of code here and we'll work our way through it seeing what the effect is of just one line at a time. On line two, we're printing the first item from the outer list, which is the first inner list, ABC. Now I'm going to uncomment line three and run it. We're going to ask for the length of nested one. The length of that is three items. It's not ABC, it's item one, item two, and item three. So three items in nested one. I can append a new item as we've seen before and nested one is now going to have four items. The last item will be a list containing just I and I can iterate through these lists and print each of them out. So this is going to print out all four of the lists we get, the first inner list, the second one, the third one, and the fourth one. So it's helpful to see the code lens representation of a nested list. Let's see this in code lens. So when I assign this variable nested one to have a nested list as its value, you can see it's printed representation and it gets a little funny here because code lens is sometimes deciding to make an inner list be just shown right inside the outer list and sometimes it's deciding that it doesn't really have room to put ABC here so it's showing us the pointer to the inner list. So nested one again is a list of three items. 
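As a minimal sketch of the steps just walked through: the variable name nested1 and the letters in it are assumptions based on the spoken description, not a copy of the textbook code.

```python
# Hypothetical reconstruction of the nested list described aloud above.
nested1 = [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]

first_inner = nested1[0]      # the first item of the outer list is itself a list
outer_length = len(nested1)   # counts the inner lists, not the letters inside them

nested1.append(['i'])         # the outer list now has four items

for inner in nested1:         # each item visited is one whole inner list
    print(inner)
```

The point to notice is that len and iteration only ever see the outer list's items; what is inside each inner list does not affect the count.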
The first item is this inner list, the second item is DE and the third item is the inner list, F, G and H. Let's see what happens if I append an item to each of the inner lists. So nested one, square bracket zero, that's referring to the first inner list and I'm going to append a new item to that making it ABC and Z and I'm also going to append something to the second inner list. Let's see what happens in code lens again. So we get this original view of nested one of what the reference diagram looks like and now what happens to that first inner list when I append Z to it? Well, you'll see this odd thing has happened. Instead of having this pointer out to the other inner list it's now magically decided that it has enough room to put it right here. So this doesn't actually make any difference. It's just how code lens is choosing to show it to us and it's put a fourth item in there. If we now see what happens when we append W to the second inner list, we find that it's now put that here and instead of keeping the list right in here it's made a pointer out to here. So either way it's a way of representing that this item in the outer list is itself a list. We can put all the list items right in there or we can make a pointer out to another inner list. So these are two alternative ways that you can represent a nested list in these reference diagrams. You can do double indexing with two square brackets to pull an item out of an inner list. So the first indexing we're gonna do on line two is we say nested one square brackets one that pulls out one of the inner lists and Y is now bound to that list containing D and E. On line three, when I print Y, I should just get the list D, E as my output. On line four, if I ask for Y square bracket zero that should get me the first item in Y. So I should get D printing out. Now let's think about what happens on line six. On line six, I also have two square brackets, but it's not delving into a big nested list. 
The first pair of square brackets is actually used to define a list, to create a list. So we create a list 10, 20 and 30 and then we pull out the item at position one. So what we should get as our output is 20. On line number seven, we also have two square brackets, but in this case, before either of the square brackets, we have nested one, which refers to this big list, the outer list that has inner lists as elements. And we are first grabbing the thing at square brackets one and then from that we are grabbing the thing at square brackets zero. So this is gonna print D. How does Python know? In a situation like this, when there are two square brackets one after the other, is it indexing twice to get inside an outer list and then inside an inner list? Or is it, as in line six, creating a list and then indexing into that list? And the answer is that the Python interpreter can tell just by the context. If there's a list before the square bracket, then the square bracket means index into that list. If there's nothing before the square bracket, as with 10, 20, 30 on line six, then it means create a list. You already saw previously that we could append to a list, even if it's a complicated list, but you can also assign values to positions in both the outer list and an inner list. The way you do that is you make an expression that picks out a list position, like nested square bracket one, but you put it on the left side of an equal sign. On the right side, you say what new value you want. So line number two here is gonna say nested square bracket one refers to the position where currently we have D and E and it's gonna replace that D and E with the list one, two, three. And we can take that same logic one step further on line three. Nested square bracket one, square bracket zero goes to the list that's at position one and it finds within that inner list the item that's at position zero and it replaces that item with the value 100. 
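The double indexing, the two meanings of square brackets, and the assignments described above can be sketched roughly as follows. The variable names and list contents are assumptions taken from the spoken description, and the line numbers mentioned aloud do not correspond to lines in this sketch.

```python
# Hypothetical data matching the spoken description; names are assumptions.
nested1 = [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h']]

y = nested1[1]                # one inner list: ['d', 'e']
first_of_y = y[0]             # first item of that inner list

literal_then_index = [10, 20, 30][1]   # first brackets create a list, second index it
double_index = nested1[1][0]           # index the outer list, then the inner list

# Assignment: an indexing expression goes on the left side of the equals sign.
nested = [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h'], ['i']]
nested[1] = [1, 2, 3]         # replace a whole inner list
nested[1][0] = 100            # replace one item inside that inner list
print(nested)

# Appending to inner lists also reaches through the outer list.
nested1[0].append('z')
nested1[1].append('w')
print(nested1)
```

Whether an inner list is drawn inline or as a pointer in a reference diagram makes no difference to any of these expressions.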
So let's see what happens when we run this in code lens. The first we create this nested list. You can see there are four items. The first item is the inner list ABC, the second item is the inner list DE and so on. On line two, when I replace nested one square bracket one with a new value, let's see what's gonna happen. Notice what happens with the second item. It's now the list one, two, three. It used to be a list DE. Let's just back up in our execution, we'll see that. So it used to be DE but when we move forward that DE got replaced by the one, two, three. On line three, we're taking nested square bracket one, square bracket zero. So nested is this list position one is this list and position zero within that is right there. Currently containing the value one, we're gonna make that one the 100. So let's see that, see if that one becomes a 100. And sure enough, it did. We can have other complex objects as list elements. For example, here, each of the items is a dictionary. Let's see how CodeLens chooses to show that. So we have a variable nested two, its value is a list. That list has three items and each of those items is a dictionary. So the first item nested inside of our outer list is a dictionary, it has two keys A and B and they each have values one and three. The second item in the list is also a dictionary and it has keys A, C and five. Here we have an even more nested list. Our task is to pull out the word willow which is nested pretty far in here. It's not at the top level of the list. It's not even at the second level. You can see that here's one of the list items and it has as one of its values, the list of willow, birch and elm and then willow is a list inside of that. So we actually have one square bracket here, second square bracket, a third square bracket. So we are three levels deep inside of the data structure inside the variable data. 
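Here is a rough sketch of the two ideas above: a list whose items are dictionaries, and an item that sits three levels deep. Both data structures are hypothetical stand-ins shaped like the ones described; any item or value not mentioned aloud is an assumption.

```python
# Hypothetical list of dictionaries; only the first dictionary's values and the
# second dictionary's keys were spoken, the rest is assumed for illustration.
nested2 = [{'a': 1, 'b': 3}, {'a': 5, 'c': 90, 5: 50}, {'b': 3, 'c': 'yes'}]
first_dict = nested2[0]           # a dictionary nested inside the outer list
value_of_b = nested2[0]['b']      # list index first, then dictionary key

# Hypothetical list with 'willow' three levels deep, at the positions described.
data = ['bagel', 'cream cheese', 'breakfast', 'grits', 'eggs', 'bacon',
        [34, 9, 73, []],
        [['willow', 'birch', 'elm'], 'apple', 'peach', 'cherry']]

level_one = data[7]               # the inner list that contains 'willow' somewhere
level_two = level_one[0]          # ['willow', 'birch', 'elm']
plant = level_two[0]              # the string itself

# The same extraction written as one expression: one bracket per level.
assert plant == data[7][0][0]
print(plant)
```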
So if it's three levels in, we're actually gonna need three square brackets total to pull out the word willow from inside of data. Now I like to develop my extraction expressions incrementally, that is I'm gonna do it a little bit at a time. I'm gonna first make the expression that pulls out something, then from within that I'm gonna pull out a little more, and then from within that I'm gonna pull out a little more. So let's do that incrementally and you can see sort of my approach to writing code for something like this. Now they told us to name our variable plant. So I'm gonna say plant is data square bracket something and I'm gonna just print plant to help myself see what I got. Now what item am I gonna get? Where am I gonna index? Well, this is zero, one, index two, index three, four, bacon is five, this inner list is six and this inner list which contains willow somewhere in it is seven. So that tells me I should extract data square bracket seven. Now that's not gonna get me just the word willow and I could try to make a more complicated expression where I fill in the others right away but I do like to do this incrementally because I might have counted wrong or whatever. Let's just see if I got the right kind of thing from data square bracket seven. So I did pick out the right element from data, the right inner list. You can see that willow's in here somewhere but I'm gonna have to do another extraction and since this is an exercise that has an auto checker for us it's told us that we failed. We were supposed to get just willow but instead we got this list that somewhere has willow in it. So we have a little more work to do. We knew we were gonna have more work to do. So inside data square bracket seven I now have some help because I can see what it looks like and it's actually in the first element, square bracket zero. So I'll do this and check, make sure I still have it right. Sure enough willow is in that and which element is it? Position zero, one, two. 
I want willow so I'm gonna say square bracket zero. Sure enough I now have just willow, which is what I was looking for, and the test tells us that we passed. So when you see something like square bracket seven, square bracket zero, square bracket zero, that means you're descending several layers into a complicated data structure. And the best way to create expressions like that is to just do it one at a time and keep building it up until you've descended farther and farther into the list. So that's nested lists. You can index into them with repeated use of the square brackets, or you can assign values or append deep inside the nesting. You can say something like data square bracket seven, square bracket zero dot append of some other value, or you can say data square bracket seven, square bracket zero and have that on the left side of an assignment statement. The append will assume that there's a list at data square bracket seven, square bracket zero and it'll give that list another item at the end, 29. And if we just do an assignment statement, we will replace whatever's there. Even if it's a whole list, it's just going to replace it with one new value, 42. See you next time.
Welcome back. We can have multiple levels of nesting of both dictionaries and lists. Here's a pretty complicated data structure. Our task is going to be to extract the value associated with color. So that's here, and on line six color has a value which is a dictionary: eye blue and hair brown. So we're going to use the same incremental approach that I showed for extracting something from a heavily nested list. We're going to need one square bracket for each level that we're descending. So we're going to need one square bracket to get the stuff associated with personal data. And then within that we're going to need another square bracket to get the stuff associated with physical features. And within that we're going to need another square bracket to just get the color.
So let's do this one step at a time. We're supposed to make a variable called color whose value is going to be that dictionary, eye blue and hair brown. We're not going to get that in one step. We're going to extract something from the info dictionary. And let's just see how we're doing by printing it out. Now what am I going to first extract? Well, first I'm going to extract all of this, all the stuff that is associated with personal data. So I have to ask for the value that's associated with the key personal data. When I run that and ask to have it printed out we're going to see, oops, we got an error. The error says I have a bad token on line number 16. So let's see if we can figure out what I did wrong. Here's line number 16. And the color coding helps me a little bit here too. I can see I forgot to close my string. Let's save and run this. And we can see that color is now bound to a smaller dictionary than before. It's got exactly these contents, just not so nicely formatted. And you can see that, since there's a test to see whether we've finished the problem, we have not finished the problem. We have descended one level into this nested data structure but we haven't pulled out the value that we're supposed to have pulled out. So let's go one step further. In the dictionary that we have so far as the value of the variable color, where does color appear? It appears here, associated with the key physical features. So my second level of nesting is going to ask for physical features. And each time we descend one level our task is going to get a little bit easier. The reason it gets a little easier is that we have less stuff. We've extracted more. We now have a smaller data structure so it's easy to figure out where we are. And we wanna get this dictionary which is associated with the key color in the thing that we have. So let's do one more level of extraction and we ask for color.
And now we've managed to pass the test because we were supposed to get the value associated with color. Now notice that the value is still a dictionary. We didn't get just a simple value. We still have a dictionary, but that's the thing that we were supposed to get. Now one thing to notice about this complicated data structure where we have all this nesting of one dictionary inside of another: a dictionary is always a value associated with a key. We can never have a dictionary like this be a key. Keys of dictionaries have to be immutable objects like strings or numbers or tuples. They can't ever be lists or dictionaries. Now one other thing to try. Suppose we wanted to change the value that was associated with the color key. So instead of what it used to be, let's make it be something else. Let's make it be 95. Doesn't mean very much but let's just try that. Well, we can use the same trick that we did before. Once we have an expression like info square bracket personal data, square bracket physical features, square bracket color, we can just assign to that same expression, because that expression picks out a location in this complicated data structure. I'm just gonna copy this and before I extract it, I'm gonna change it. If I make it be the left hand side of an assignment statement, I can make it be 95. And now when I extract it, I'm gonna get the value 95 instead of getting eye blue and hair brown. Of course I've now failed the test because I haven't picked out the right value, but I'm just trying to illustrate the idea that info square bracket personal data, square bracket physical features, square bracket color is a way of picking out a value deep inside of the data structure, and I can also assign to it. That defines a position and I can assign that position a new value, 95, as I've done on line 15. So that's how you extract from and assign to complex nested lists and dictionaries. See you next time.
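The extraction and assignment steps from this walkthrough can be sketched as follows. The dictionary is a stand-in: only the three keys being descended through and the eye and hair values come from the lesson, while the underscore spellings of the keys and every other entry are assumptions for illustration.

```python
# Hypothetical stand-in for the lesson's info dictionary.
info = {
    "personal_data": {
        "name": "Example Person",             # assumed entry
        "physical_features": {
            "color": {"eye": "blue", "hair": "brown"},
            "height": "unknown",              # assumed entry
        },
    },
    "other": {"favorite_food": "pizza"},      # assumed entry
}

# Descend one level at a time, printing to check each step.
step1 = info["personal_data"]
print(step1)
step2 = step1["physical_features"]
print(step2)
color = step2["color"]
print(color)

# The same extraction as a single expression with three square brackets.
color = info["personal_data"]["physical_features"]["color"]

# The same expression on the left of an assignment replaces the value there.
info["personal_data"]["physical_features"]["color"] = 95
print(info["personal_data"]["physical_features"]["color"])
```

Note that the variable color still refers to the old inner dictionary after the assignment; only the slot inside info now holds 95.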
Welcome back. Nested lists and dictionaries are very commonly shared between computer systems. A standard format's been defined that makes it easy to share that data. It's called JSON, J-S-O-N, which stands for JavaScript Object Notation. It originated with a different programming language, JavaScript, but Python makes it pretty easy to both read in and write out data in JSON format. In fact, JSON format looks almost exactly like the printout of a Python list or a dictionary. In most cases, you wouldn't really notice the difference, but there are a couple of small differences, like the Python value None being represented as the word null, and true and false not being capitalized. We'll use the json module to read in and write out data in JSON format. A little warning: the json module that we have in Runestone only does a subset of what the full json module in a full Python environment would do, but we have the two most useful functions. Let's take a look at them. The first one is called load-s. It takes a string as input and it returns a dictionary or a list. So here we have a code snippet. We're importing the json module and then we've got a variable called a_string whose value is, naturally, a string. That string has some newline characters in it and then it's got the curly brace, which would suggest to you that it's going to be a Python dictionary. It's really just a string, but on line three we take that string and pass it as an input to the load-s function, which is part of the json module. The load-s function in the json module takes a string as an input, and as an output it produces either a Python list or a Python dictionary depending on what the contents of the string were. So in this case the string has as its first non-whitespace character the curly brace, so what's going to be output is a dictionary. So on line four, we print out the type of D and it is a dictionary.
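A snippet along these lines shows load-s, written json.loads in code, turning a JSON-formatted string into a Python object. The string contents are an assumption modeled on the description above, including the camel-case spelling resultCount; the second call is an added illustration of the null and true/false differences.

```python
import json

# Assumed example string: leading newlines, then a JSON object.
a_string = '\n\n\n{\n "resultCount": 25,\n "results": [\n  {"wrapperType": "track", "kind": "podcast"}\n ]\n}'

d = json.loads(a_string)     # string in, dictionary out
print(type(d))               # <class 'dict'>
print(list(d.keys()))        # ['resultCount', 'results']
print(d["resultCount"])      # 25

# A string that starts with a square bracket produces a list instead.
# JSON null/true/false become Python None/True/False.
values = json.loads('[null, true, false]')
print(values)                # [None, True, False]
```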
Since it's a dictionary, we can ask for its keys, and it has two keys, result count and results. We can ask for the value associated with result count and it's 25. That's the 25 that you see there in the string. One important thing to keep track of here is when you have a string and when you have a dictionary. So we start with something that's a character string, and by passing it in to the load-s function we get a dictionary as an output. If I lost track of that, let's say, and I tried to print a_string square bracket result count, because a_string kind of looks like a dictionary, and I try to run this, I'm going to get an error. It says that string indices must be integers, not strings, on line seven. That is, on line seven we're trying to treat a string as if it's a dictionary. It isn't a dictionary, it's a string. We're allowed to ask for a_string square bracket four to get the character at index four, but we can't ask for a_string square bracket result count. We can ask for D square bracket result count because D is a dictionary.
The second useful function from the json module is called dump-s. It does the opposite of load-s. It takes a Python list or dictionary, even a nested one, and converts it into a string that's in the JSON format. Once we have a string in the JSON format we can write it to a file or do anything else that we would do with a string. The dump-s function always takes a list or a dictionary as an input, but it also can take a couple of other parameters. If you pass sort keys equals true, then whenever you have a dictionary it's going to output its keys in alphabetic order. And if you pass the indent parameter it will pretty print the string. It'll do some indentation and line breaks. You can specify how many characters of indentation you want. So let's see what happens when we run this. Oops, I got an error. It's telling me that on line two, json is not defined. And the reason is that I forgot that I need to import the json module.
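With the import in place, a sketch of the dump-s side (json.dumps in code) might look like this. The helper name pretty follows the lesson's description, but its body and the sample dictionary are assumptions; sort_keys and indent are the two real keyword parameters discussed above.

```python
import json

def pretty(obj):
    # sort_keys=True orders dictionary keys; indent=2 adds line breaks
    # and two spaces of indentation per nesting level.
    return json.dumps(obj, sort_keys=True, indent=2)

# Assumed sample data with one level of nesting.
d = {"key2": [None, 2, 3], "key1": {"c": True, "a": 90, "b": 50}}

text = pretty(d)
print(type(text))   # <class 'str'>
print(text)

# The result is a string, so dictionary-style lookups fail on it.
try:
    text["key1"]
except TypeError as err:
    print("TypeError:", err)

# Loading the string back gives an equal dictionary.
print(json.loads(text) == d)   # True
```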
So the dump-s function is in the json module and this code snippet didn't have the json module imported. Now let me run it again. Now you can see when I pass D to the pretty function on line nine that it is creating a string by calling the json.dumps function. And because I said indent equals two, each time I descend one level into this nested data structure it's adding two more spaces. That makes it just a lot easier for a person to read when I print out that string. Now you may have noticed a little funny pronunciation that I've done here: instead of saying dumps and loads I've been saying dump-s and load-s. That's very deliberate. It helps to remind me that that s is for string. The s is not for plural. You still have to remember which of dump-s and load-s reads from a string and which one writes to a string. Here's how I think of it. The JSON format is a set of conventions for how a string should be structured so that it can be loaded into a Python object. Thus load-s means load from a string, or load a string into an object. Dump-s means dump an object to a string. Dumping from a string wouldn't really make sense, so that's how I can keep track of it. Dump-s, dump an object to a string. Load-s, load an object from a string. That's the JSON format, folks. We'll be seeing a lot more of it when we look at REST APIs that let us fetch data from servers on the internet, like from iTunes or OMDB. json.loads loads from a string into a Python object, either a list or a dictionary. And json.dumps dumps from a Python list or dictionary into a string. json.dumps with the indent parameter yields a string with indentation that makes it easier for people to read and understand. See you next time.
Welcome back. You've now seen some multiply nested data structures with inner lists and dictionaries inside outer lists and dictionaries.
You've picked up some vocabulary about outer and inner elements, and you've seen that we just have to use square brackets multiple times to descend into these nested data structures. If there's a list of numbers inside a list, three levels of nesting, then we'll have to use three square brackets to pull out one of the numbers. We can do that all on one line or as several steps, each with its own line of code, but we'll still need three square bracket operations somewhere in the code. You've also learned about the JSON format. Remember json.loads to load from a string, creating a Python list or dictionary, and json.dumps to dump to a string so you can save the string in a file or pretty print it with nice indentation and line breaks. As a reward for your perseverance, a few punny jokes about birds. Birds? Why birds? Because we've been talking about nested data. It's almost winter time here in Ann Arbor as I record this. What do you call a bird in the winter? A brrr-d. Why do birds fly south for the winter? Because it's too far to walk. And did you hear the joke about the broken egg? Yeah, that one cracked me up too. See you next time.
Welcome back everybody. You've already seen that you can use square brackets multiple times to extract an item from deep inside a nested data structure. Often, you'll want to traverse all the stuff that's in a nested data structure, or at least some subset of it. Perhaps you'll print all the items out or accumulate them into a flat, unnested list. In this lesson, you'll do that through nested iteration, a for loop indented inside a for loop, as many levels as you need. We'll also look at the confusions that can occur from a list or dictionary being nested inside other lists or dictionaries. It's analogous to the problems of aliasing that we've seen before, where multiple variables point to the same object. If you mutate a list, say by appending an extra item, you've effectively changed all the lists and dictionaries that contain that list.
And we'll introduce the notion of deep as opposed to shallow copies of nested data structures. Finally, we'll introduce the understand, extract, repeat method of incrementally figuring out how to extract data from a nested data structure. At the end of this lesson, you should be able to write nested for loops to traverse a nested data structure, recognize when an inner list or dictionary is shared among multiple outer objects and, if necessary, make deep copies instead of shallow copies to avoid that. And you should be able to incrementally develop code to extract the data you want from a complex nested data structure using the understand, extract, repeat process. Bye for now, we'll see you at the end.
Welcome back. With nested data structures, we'll often want to traverse all the contents and print or extract some of the items. We do that by nested iteration, with one for loop indented inside another. Here we have a list called nested one. We've seen it before. And on line two, we iterate. We go through all of the items in nested one. We're gonna call this the outer loop because it goes through the outer list. The loop variable or iterator variable X will be bound to one of the items each time. So the first time that we execute lines three through five, X will be bound to the list A, B, C. Then we'll change X and make it be bound to D, E and we'll execute lines three through five again. As part of the loop, I just print level one on line three because we're gonna have a lot of stuff printing. It's a way of making things look nice in the output. And then we're gonna do an inner loop on lines four and five. So on line four, we start another iteration. We have another iterator variable or loop variable. It's called Y, and Y will get bound first to A and then to B and then to C.
So the first time we execute line five, Y will be bound to A and it'll print out a few spaces, level two, colon, and then it'll print out A. We'll go on, Y will get a new value, it'll be B, and we'll print out level two, B. We get all the way through A, B, C and then we're done with the inner for loop for the first value of X, the list A, B, C. Then we go on to let X have a new value. X is gonna be bound to D, E and we do the inner loop again. When we do the inner loop again, Y will get bound to D and then Y will get bound to E. So let's see how the whole thing unfolds when we run it. So you'll notice that line three, print level one, gets executed three times. And the reason is that we have three items in the outer list. So we execute the outer loop three times. Each time we print level one, and then the first time we go through A, B and C. So line five executes three times, once with Y bound to A, once with Y bound to B and once with Y bound to C. Now one thing that's important to keep track of is which variable names we're using for the outer loop and which we're using for the inner loop. You'll notice that the variable X is used as the iterator variable or loop variable in the outer loop, and then we see it again on line four when we're creating the inner iteration. And that's a very common pattern, because when we do this nested iteration, we're letting X be bound to each of the inner lists and then in the inner loop, we're gonna wanna have an iteration through each of those lists. So if you're ever tempted to say for X in nested one and then for X in Y, that's not gonna work. The reason it's not gonna work is that for X is defining the iterator variable. It's saying here's a new variable name that's gonna get bound to something. What you really wanted in this inner spot is for Y in X. You want the X to refer to a variable that's bound as part of running the outer loop.
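The nested iteration being traced can be sketched like this. The first two inner lists follow the walkthrough; the contents of the third inner list and the exact wording of the printed labels are assumptions.

```python
# Outer list with three inner lists; the third one is an assumed example.
nested1 = [["a", "b", "c"], ["d", "e"], ["f", "g", "h"]]

for x in nested1:            # outer loop: x is bound to each inner list in turn
    print("level1: ")
    for y in x:              # inner loop: y is bound to each item of the current x
        print("     level2: " + y)
```

The inner loop header reads for y in x, so it iterates over whatever inner list the outer loop has currently bound to x.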
Here we have a traversal where we're supposed to accumulate something from the inner lists rather than just printing them. So we're supposed to write code to create a new list that contains every person's last name and save that list as last names. So each item in the outer list is itself a list, and those lists contain last names like Turner and Damon and Wiig. This requires an accumulation pattern. What makes me think that it's gonna be an accumulation? Well, we had "create a new list" and it "contains every person's last name". Those are kind of triggers that suggest to me that we're gonna have an accumulation. So to solve this, let's set up our usual template for an accumulation pattern. Since I'm accumulating a list, I start with an empty list, then for something in something I'm going to refer to accum, and at the end I will have my list. Now, instead of calling it accum, I'm gonna call it last names because they told us we're supposed to accumulate it into a variable called last names. Now, what am I gonna iterate over? Well, I'm gonna iterate over the whole giant list. So that's called info, and I can name my loop variable anything I want to. But since these are all entertainers, I'm gonna call my variable entertainer. Now, what am I gonna do with each of these entertainers? Well, I'm going to extract just the last name, and this is a very well-structured piece of data because every inner list has the last name in position one. You'll notice it's the same always. It's always position one where we have the last name, and that means we can always just do something with entertainer position one. What are we gonna do? We're gonna append that last name to the end of the existing list of last names. And I see that I misspelled last names. And just to help me see what's happening, I'm gonna print last names each time. So when we execute this, I will see the list as it accumulates.
Oops, it looks like I accidentally deleted my print statement as I was getting rid of all the markings on the screen. Let me see if I can undo it and make it come back. Well, I can't make it come back so I'm gonna type it again, and now we can see on each iteration when we get to line six, I'm printing, and the first time I only have Turner but the next time I have Turner with Damon appended and so on. So that's how I would go about answering an exercise like this. I'd first figure out, oh, I'm gonna need an accumulation pattern. I would make the little template for that and I would start filling in parts of the template, and I would have to come up with the insight that the last name is always in the same position in each of the inner lists, and therefore I can always refer to position square bracket one, knowing that the entertainer variable was gonna be bound to a different one of these inner lists each time I went through here. The first time I get to line five, it's bound to the first inner list and I get Turner. The second time, entertainer is bound to that second inner list, Matt Damon 1970, and we get Damon's last name appended to our list of last names. Now note that we don't have an inner loop in this problem. We have for entertainer in info but we don't have another for loop listed here. We do have an outer loop, and we have square brackets to extract an item from an inner list. So we have an outer loop here on line four and then we have square brackets on line five to extract a name from one of the inner lists. If we're extracting things from two levels into a nested data structure, like Turner and Damon are getting extracted from info, we will always need some combination of two iterations and square bracket index operations. We could have info square bracket one square bracket zero. So that would be using two square brackets.
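The accumulation described here can be sketched as below. The inner lists are an approximation of the exercise data: the names and the position of the last name come from the walkthrough, while the third entertainer's first name, year, and the occupation fields are filled in as assumptions.

```python
# Approximate stand-in for the exercise's data; last name is always at index 1.
info = [
    ["Tina", "Turner", 1939, "singer"],
    ["Matt", "Damon", 1970, "actor"],
    ["Kristen", "Wiig", 1973, "comedian"],
]

last_names = []                          # start with an empty accumulator
for entertainer in info:                 # one for loop over the outer list...
    last_names.append(entertainer[1])    # ...plus one square bracket into each inner list
    print(last_names)                    # watch the list grow on each iteration

# Two square brackets and no loop reach a single inner value directly.
print(info[1][0])   # Matt
print(info[1][1])   # Damon
```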
And if I do that (I'm gonna get rid of the print statement on line six so that we don't get confused by it in the output), I will just get Matt directly. So I can extract Matt that way, or I can extract Damon by doing square bracket one, square bracket one. Or I can have a for loop, as I do on line four, with a square bracket on line five. Or I could have two for loops, and I'll print a whole lot of stuff, but it's all the inner values, Tina, Turner, 1939 and so on. So that's two for loops or two square brackets. We've already seen a for loop with an inner square bracket. You can also have a weird thing where we could say for val in info square bracket one, print val. I'm just gonna write pass here so that it knows we don't wanna do anything in that for loop. So here on lines 12 and 13, I am only looking at info square bracket one. That's the inner list Matt, Damon, 1970, actor, and I'm printing each of those values. So in any case, if I'm trying to get values like Tina and Turner and Matt and Damon out of this nested data structure, I have to do some combination. I either need two square brackets or I need two for loops or I need one for loop and a square bracket. Two levels of nesting, some combination of two for loops and square brackets.
Here's another problem where we accumulate. In this case, we do need nested iteration because we're not always gonna extract the same thing from each inner list. Our task is to use nested iteration to save every string containing B into a new list named B strings. So this is another accumulation pattern. We're gonna traverse this nested data structure, finding each of the vegetables and fruits and accumulating all of the ones that contain the letter B. B strings is eventually gonna be a list of words that all contain the letter B, like bananas and blueberries, but not lemons. Initially it's empty, and I'm gonna iterate through the outer list.
So for each of these lists in the outer list, I'm gonna check whether the items in it are B words. So I'm gonna say for word in list, B strings dot append of word. And when I finish, I'll print out B strings. Now this isn't quite right because I'm gonna actually append all of the words, even when they don't contain the letter B. So I've got apples when I shouldn't have apples. But I do now have just a simple list, not a nested list. I just have all of the items that appeared anywhere at the second level in the list, and now they're in this flat list. So what do I have to do instead of automatically appending every word? I have to append it only if it has some property. If something, something. If the word has some nice property, I will append it and otherwise I won't bother. Now what do I wanna put into this if expression? At this point, when people start learning about nested iteration, they sometimes forget about the nice built-in things that Python has given us. I'll show you the hard way later, but first I'll show you the easy way. If we wanna check on whether word has the letter B, we have the in operator. So if B is in the word, then we append it and otherwise we don't append it. Now we managed to pass the test. We just got bananas and blueberries and green beans and so on. Now there's a lot of ways that you can go wrong when you start writing this kind of code. It might be tempting to say things like append list instead of append word. Then what do you get? You get a bunch of lists rather than getting single words. You might be tempted to append the letter B instead. Then you just get a list of the letter B a whole bunch of times. But what we really want is to append the word. And by the way, it gets easier if you use mnemonic variable names like word. If I had called this Y and this one X, it'd be a lot harder to remember what exactly I ought to be appending here.
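The easy way with the in operator can be sketched as follows. The nested list is an assumed example in the spirit of the exercise (only bananas, blueberries, green beans, apples, and lemons are named in the walkthrough), and the inner loop variable is called lst here rather than list to avoid shadowing Python's built-in list type.

```python
# Assumed example data: a list of lists of food and drink names.
L = [["apples", "bananas", "oranges", "blueberries", "lemons"],
     ["carrots", "peas", "cucumbers", "green beans"],
     ["root beer", "smoothies", "cranberry juice"]]

b_strings = []
for lst in L:                 # outer loop over the inner lists
    for word in lst:          # inner loop over the words in one inner list
        if "b" in word:       # keep only words containing the letter b
            b_strings.append(word)

print(b_strings)
```

Because the test is a single membership check per word, each matching word is appended exactly once, however many times the letter appears in it.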
I'd be more tempted to do something like appending X, or to check whether B is in X instead of whether B is in Y. I should be saying B in Y. This gets a lot more complicated. I can get the right answer, but it's a lot harder. It was a lot easier when I used mnemonic names: for list in L, for word in list. If B is in the word, then accumulate that word into our final value. Now, what's the complicated way that I sometimes see students do things if they don't remember that we have this nice in operator? Well, they start to iterate a third time: for character in word, if character equals B, then we append the word. And that almost works. If you don't remember about the in operator, it almost works, but it fails when a word has more than one B in it. You'll see bananas came out fine, but blueberries got in here twice, because it got in there once for the first B and a second time when we got another B in the word blueberries. So there's a way to get around this, if you remember the fancy break command that ends the inner iteration. This will break us out of the innermost for loop that we're in. So this actually should work. I think, yes, it does, but it's the complicated way. The simple way is to just check here if B is in word. So that's nested iteration to accumulate some of the values from the inner lists. We'll see you next time.
Welcome back for this way of the programmer segment on good design practice for creating nested data. Basically, be consistent. Don't mix two levels of nesting with three, for example. Consider this example. This is bad design practice. On line one, we've got a variable nested one, which is assigned to a list. It has a couple elements that are numbers and then it's got a few elements that are lists. Having this mixture is going to make it hard to traverse the nested data structure. So for example, if I try to run this now, I'm going to get an error.
The error tells us that we're trying to iterate through an integer object on line number four. But to understand what's going on, we need to go up a couple lines first. So on line number two, we are iterating through all of the items in nested one. So X is going to be bound to one and the next time through X is going to be bound to two and then the third time through X will be bound to the list ABC. Each time we're going to print out level one just to remind us that we've made it through one more iteration. But then the action is supposed to happen at line four. We've seen something like this before. If we really had lists all the way through as all of the elements of nested one being lists, then we could iterate through each of those items where X was going to be a list like ABC. We could iterate through it and print out each of the items saying something like level two A. Our problem is happening when we get to this for the very first time. We've printed level one and then we're getting an error on line four because X is bound not to a list but to the integer one. So when we now say for Y in X, we're saying for Y in this, so that's the integer object and it is not iterable. It is not a list. And the problem is really coming because we wanted to get to items like A and B and C which does require this nested iteration but our data wasn't structured in a clean way. So we couldn't just always assume that X was going to be a list. Sometimes X was going to be a number. So let's see how to solve that problem. The basic solution is to use some special case logic. In our code, we'll check whether the type of X is a list, if it is, we'll do a nested iteration and if it's not, we'll do something else. So that's what's going on on line four. We've started iterating X this first time is bound to the value one and we check is the type of X list. If it is, then we would iterate through it but since it's not, we're just going to print it. 
So I would expect one to show up and then we'll do our second iteration. X will be bound to two and we'll get two to show up in the output window. The third time, X is going to be bound to a list, A, B and C. Now when we check is the type of X list the answer will be true and so we'll iterate and for each Y, the Y will be bound to A and then to B and then to C. For each of those, we'll get something that says level two A and level two B and so on. So let's just check my reasoning there and see if that's what actually happens and indeed that is what's happening. Each time we get to line three, we print out this level one and the first time X was bound to the value one so we just print it. The second time we print two. The third time we iterate through the list A, B, C and we execute line six three times once printing out A, once printing out B and once printing out C. Now you can imagine this would get pretty complicated with lots of special cases if we had more levels of data nesting and we were not consistent about the level of nesting. The solution when we're in control of the data structure is to make it very regular with always the same kinds of items and the same level of nesting. In other words, don't mix integers, lists and dictionaries as items in a single list. Sometimes, however, we aren't in control of the data structure. Maybe our evil predecessor on the project structured the data in a not very careful way or we get it from an external source that structured it however they wanted to. When that happens, you'll have to notice and add some special case logic like the if statement that I've got here on line four. So that's my advice for following the way of the programmer. Don't put trip hazards in your way. Make the nested data structures you're gonna traverse be as regular as possible. But if somebody else puts the trip hazards in your way be prepared to jump over them with some if then statements to handle differently structured data in different ways. 
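A sketch of both the trip hazard and the special-case logic follows. The two leading integers and the list a, b, c come from the walkthrough; the remaining inner lists are assumptions.

```python
# Inconsistent nesting: two integers mixed in with inner lists (last two lists assumed).
nested1 = [1, 2, ["a", "b", "c"], ["d", "e"], ["f", "g", "h"]]

# Without a check, iterating over an integer raises a TypeError.
try:
    for y in nested1[0]:
        print(y)
except TypeError as err:
    print("TypeError:", err)

# Special-case logic: only do the nested iteration when x really is a list.
for x in nested1:
    print("level1: ")
    if type(x) is list:
        for y in x:
            print("     level2: {}".format(y))
    else:
        print(x)
```

When the data structure is under your control, making every element a list removes the need for the if statement entirely.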
See you next time.
Welcome back. Earlier in this specialization we took a deep dive into aliasing and the confusions that can occur when you mutate a list or a dictionary that has many aliases, many different variable names that are pointing to the same list or dictionary. The same thing is true if, instead of variable names pointing to a single list, you have a single list that is included as an element in multiple other lists. If you mutate the inner list that is part of several outer lists, all of those outer lists will look like their contents have changed. The best way to keep track of these confusing situations will be with a reference diagram, of course. So let's take a look at an example. One way that you can get a list included in multiple lists by accident, and thus get this confusing situation, is when you copy a list, as in this code snippet. Let's start by just running it. So you can see that on line one we create a list called original. Its value is a nested list. The outer list has two inner lists as its elements. Let's set ourselves up with a reference diagram. We're gonna say that original is a variable name and its value is a list. That list has two items. The first item is itself a list, and that list has dogs as an element and it has puppies as an element. The second element of original is another list, which has as its values cats and kittens. So that's all that happens on line one. On line two we make a copy of original. We use the slice operator, colon inside of square brackets. You may recall that always creates a slice. If I said original square bracket one colon four, that would make a slice beginning with the item at index one and going up to but not including position four. Of course I can't do that for this list because it only has two items. But if we leave off the value before the colon we start at the beginning of the list. If we leave off the value after the colon we go all the way to the end of the list.
So this says take a slice beginning at the beginning of original and going to the end of original. So that makes a new list and that new list has as its elements the same elements that original had. The first element is dogs and puppies. The second element is cats and kittens. And we assign that to a variable called copied version. So that's what happens on line two. Now just to show you what the output looks like. On line three we print out copied version and sure enough it looks like it has exactly the same stuff that original has in it. On line four we check is copied version the same object as original. And the answer to that is false. Copied version is pointing to one object. Original is pointing to a different object. On line five we check whether they have the same contents. The answer to that is true. So original and copied are not the same object. They're different objects but they're both lists. They both have two elements and those elements are these inner lists dogs, puppies and cats, kittens. It's true that they are equal, equal. It's not true that they are is each other. So copied version is not original but it is equal to the original. It has the same contents. On line six we see where our confusions are gonna happen when we have multiple outer lists that point to the same inner list. So original square bracket zero is the first inner list. That's dogs, puppies. And on line six we say dot append of canines. So that is going to append to the end of this list another item. But notice that this item is canines in square brackets. So this item is actually another list. That list has only one item canines but it is a list. It's not just the string canines. So now on line seven if we print original we get this list here where the first element is dogs, puppies like before but we've appended to the end of that list, the list canines. And then the second element is still cats and kittens. On line nine we print the copied version. 
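As a rough sketch of the snippet being walked through (variable names follow the narration; the line numbers mentioned in the narration refer to the on-screen code, not to this sketch):

```python
original = [['dogs', 'puppies'], ['cats', 'kittens']]
copied_version = original[:]        # slice copy: new outer list, same inner lists

print(copied_version)
print(copied_version is original)   # are they the same object?
print(copied_version == original)   # do they have the same contents?

# Mutate the first inner list by appending a one-item list.
original[0].append(["canines"])

print(original)
print("-------- Now look at the copied version -----------")
print(copied_version)
```

Because the slice only copies the outer list, both outer lists still point at the same inner lists, which is what the reference diagram is meant to make visible.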
Now on line six we just said to change original. We didn't say anything about changing copied version, but on line nine you can see that this square bracket canines has magically appeared after dogs and puppies. And that's because copied version, even though we didn't say to change it, has as its first element this list, and we mutated that list. That's shallow copies for you. See you next time.

Welcome back. Previously I showed you that a shallow copy can lead to some confusions because the same inner list is included in multiple outer lists. What would we do if we wanted to make a copy that was completely independent of the original list, so that no change anywhere, even to an inner list, would change the copied version? That's what's called a deep copy. We can do it with nested iteration, accumulating a copy of each inner list. So let's take a look at this code. I have the same original list from before. It has two inner lists, the first one dogs and puppies, the second one cats and kittens. I'm gonna use our old friend the accumulation pattern. I start by making copied outer list be an empty list. So this is our accumulator variable. I iterate through the sequence original. My loop variable, or iterator variable, I'm calling inner list, and then I'm doing this nested iteration and I'm doing a nested accumulation here. So I make another accumulator variable, copied inner list, and it's set to be empty. I iterate through the items in inner list. Here my second iterator variable is called item, and I append item to the copied inner list. So copied inner list starts as an empty list and we keep adding additional items to it. When we're done with the inner list, copied inner list has all of the same elements that inner list had. And so we then take copied inner list and we append it to the outer list. Let's run that and see what happens. At the end of lines three through seven, we have copied outer list as this deep copy.
If we print it, it looks like it has the same contents as original. Because this is a deep copy, even if I append canines to the end of the first inner list in the original list, as I'm doing on line nine, it's not gonna have any impact on the deep copy, on the copied outer list. So we print original and it does the same thing that happened in our previous example, but when I print the copied outer list, there's no canines here. The reason is that I have made a deep copy. So let's just make the reference diagram that'll go with this. We have original, and it's pointing to this list of two things. The first one is the dogs and puppies. The other is the cats and kittens. When we make the copied outer list, we end up again with a list of two items. But the first item is now not pointing to dogs and puppies. It's pointing to a copy of it with the same contents, but it's a different list. So then on line nine, when I append something here as an extra item with the canines in it, it does not have any effect on the inner list from copied outer list. It doesn't change anything in the deep copy. So we have successfully managed to make copied outer list be a deep copy that is completely independent from the original list. Even if I change something deep inside the original list, it has no impact on the copied outer list. Now you might have noticed lines four through seven are a little bit more complicated than they need to be. I reminded you previously that we can use the slice operator, and we can do that here to make a copy of the inner list. So instead of doing a manual accumulation, we can make the copied inner list just be inner list square bracket colon and get rid of all of this. And this maybe makes the code a little easier to read, if you remember that the slice operator with the colon makes a copy of the list. So this has exactly the same effect as the code that we did before.
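Both versions of the two-level deep copy can be sketched side by side. Variable names follow the narration where possible; copied_with_slices is a name introduced here only so the two versions can be compared.

```python
original = [['dogs', 'puppies'], ['cats', 'kittens']]

# Version 1: nested accumulation, copying item by item.
copied_outer_list = []
for inner_list in original:
    copied_inner_list = []
    for item in inner_list:
        copied_inner_list.append(item)
    copied_outer_list.append(copied_inner_list)

# Version 2: same effect, using a slice to copy each inner list.
copied_with_slices = []
for inner_list in original:
    copied_with_slices.append(inner_list[:])

# Mutating an inner list of original no longer affects either copy.
original[0].append(['canines'])
print(original)
print(copied_outer_list)
print(copied_with_slices)
```

Note that this only works when every inner list holds plain items such as strings; a third level of nesting would still be shared.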
Nested iteration where we copy each of the inner lists works fine for making deep copies if we know how many levels of nesting we have and we have a parallel structure everywhere with the same levels of nesting. But what about this one? There are three levels in some places, but not everywhere. We've got an outer list which has as its first element this inner list, but this inner list has a string, canines, as its first element and a third level of list as its second element. So how would you make a deep copy of this one? This is a case where a technique called recursion would allow us to write fairly elegant code, but we're not gonna teach recursion in this specialization. It turns out you can get pretty far in writing applications and data analysis programs without learning recursion, because in the most common cases where it's useful there's a ready-made function available to you that will do the recursion behind the scenes. This is one of those cases. There's a function called deepcopy in a module called copy. So I've imported the copy module, and inside the copy module there is a deepcopy function. It just takes a sequence as its argument and it produces a deep copy: however many levels of nesting there are, it goes in and always copies things, so it never shares an inner list between the original and the deep copy. So I have my original version, I have a shallow copied version and I have a deeply copied version, and if I were to print those out they would all look the same. But now I'm going to append to the original an extra string at the very end, so it's going to have another item at the end, and then on line six I'm also going to the first inner list and appending the list marsupials to it. At this point we're going to have original, deep copy and shallow copy all have different values, and let's see what those are. So you can see now that original has canines, dogs and puppies, and an extra element, marsupials.
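A minimal sketch of the three-way comparison described here. The canines portion of the list comes from the narration; the second inner list (felines) and the exact variable names are assumptions added to make the example complete.

```python
import copy

original = [['canines', ['dogs', 'puppies']], ['felines', ['cats', 'kittens']]]
shallow_copy_version = original[:]             # copies only the outer list
deeply_copied_version = copy.deepcopy(original)  # copies every level

original.append("Hi there")          # change at the top level
original[0].append(["marsupials"])   # change inside the first inner list

print("-------- Original -----------")
print(original)
print("-------- Deep copy -----------")
print(deeply_copied_version)
print("-------- Shallow copy -----------")
print(shallow_copy_version)
```

The shallow copy picks up the inner change but not the top-level append, while the deep copy picks up neither, which gives three different printouts.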
That happened because on line six we appended the list marsupials to the end of its first inner list. The shallow copy, because it only did the one level of copying, was changed as well. It got marsupials too, but in the deep copy, no marsupials. The other difference we can see is that the original and the shallow copy diverge, because the original gets "hi there" at the end. We appended to the top-level list, and there the original is different from the shallow copy: "hi there" got appended to original and did not get appended to shallow copy. I encourage you to really work through this in detail. Make a reference diagram, go through it in CodeLens, and make sure you understand why these things are coming out with three different printouts at the end. If you can follow it, you'll be able to reason your way through lots of things where you get puzzling answers, and you'll be able to debug your code. So to summarize, a deep copy of a list not only makes a copy of the outer list but also makes copies of all the inner lists and their inner lists, all the way down. A deep copy is completely decoupled from the original. You can make a deep copy with your own code if the structure is simple enough, or use the deepcopy function in the copy module. See you next time.

Welcome back for this way of the programmer segment on how to approach a big, complicated nested data structure. The answer is to take it one step at a time. I gave you a little preview of my approach previously, which I call understand, extract, repeat. To illustrate this, we'll walk through extracting information from data formatted in the way that's returned by the Twitter API. This nested dictionary results from querying Twitter, asking for three tweets matching University of Michigan. As you'll see, it's quite a daunting data structure even when printed with nice indentation. Here we'll just take a little tour through it.
You can see that just to get the information about three tweets, there's a lot of stuff and there's a lot of indentation. How many levels of nesting do we have here? Maybe five or six. So this can be pretty daunting. But we're going to take it one step at a time. And the mantra is understand, extract, repeat. So first we will understand it. To understand it, you might want to just print out the first few characters of it. I'm using json.dumps to pretty print it. And I'm printing only the first 100 characters here so that it doesn't take up too much space. And we can see at least from this that it's a dictionary and that one of its keys is called search_metadata. If you don't want to depend on looking at it and seeing the curly brace, here I've printed out the type of res, and it is a dictionary. And whenever I have a dictionary, the first thing to do in order to understand that dictionary is to ask what the keys are. In this case there are only two keys. One is search_metadata and the other is statuses. When I printed out the first 100 characters you could see search_metadata, but you couldn't see statuses, because that was way more than 100 characters in: before that second key shows up, the printout first shows all of the nested data that's part of the value of the search_metadata key. Once we've done our first level of understanding, we're going to descend one level: we will extract. But I want to show you another tool that can be pretty helpful for getting an overview of what the whole data structure is going to be, which is an outliner view of the whole nested data structure. If I dump this data into JSON format, I can copy it to an external site that lets me look at it in outline view.
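The first "understand" step might look something like this. The real response is hundreds of lines long, so the res dictionary below is a tiny hypothetical stand-in: the key names search_metadata, statuses, count, completed_in, user, screen_name and created_at come from the walkthrough, but every value is invented for illustration.

```python
import json

# Hypothetical, heavily trimmed stand-in for the Twitter search result.
res = {
    "search_metadata": {"count": 3, "completed_in": 0.05},
    "statuses": [
        {"user": {"screen_name": "example_user_1", "created_at": "2015-01-01"}},
        {"user": {"screen_name": "example_user_2", "created_at": "2016-02-02"}},
        {"user": {"screen_name": "example_user_3", "created_at": "2017-03-03"}},
    ],
}

# Understand: peek at the start of the pretty-printed structure.
preview = json.dumps(res, indent=2)[:100]
print(preview)

# Understand: confirm the type, then list the keys.
print(type(res))
keys = list(res.keys())
print(keys)
```

Printing a prefix rather than the whole structure keeps the output window readable while still revealing the outermost type and the first key.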
So I'm going to get rid of these other print statements, and instead of printing out just the first 100 characters I'm going to print out the whole thing in JSON format. Now I'm going to copy the whole contents, which is quite a long thing. This is all data just for three tweets, believe it or not. And finally I've got it all, and I'm going to copy it. We're now visiting a site called jsoneditoronline.org, and on the left side it lets you paste in a string that's in JSON format. I've copied that big long string and I'm now pasting it, all 600 and some lines of it. And I can click on this arrow here and it's now going to give me an outliner view of it. Remember there were two keys, search_metadata and statuses. It's telling me that search_metadata is a dictionary with nine keys. Among them are count, completed_in, max_id_str and so on. Under statuses it's got a square bracket three, which is telling me that it's a list with three items in it. And I can see that the first element in that list is a dictionary with 24 items, 24 keys in it. And I can keep descending down there. My goal is actually going to be to get the authors of the tweets, so I'm going to guess that that's in the user key, and sure enough there's a screen name or a name. And so if I'm very careful about this, I could just write a very complicated expression to go in four levels of nesting and grab a screen name of 31 Brooks. But I'm going to do this one step at a time. I've gotten myself oriented, and I might come back here to help me stay oriented, but then I'm going to work with the code and I'm going to build up my code one step at a time, where at each step I understand what I have at the current level of nesting and then I extract to get something one more level of nesting in. So let's go back to our code.
I figured out that there were two keys and that the information that I wanted to extract the author names were in the second of those they were in the value associated with the statuses. So I've just done something to extract the value associated with the statuses key. So this is my first extract. I did an understand and then an extract and then I'm going to repeat. So I have extracted something at level two and now I want to print some stuff out to help me understand what I've got. And so I'm printing out the word level two and I'm checking what is the type? It's a list. Well, if it's a list the first thing that I always want to do is check its length and it has three items. Now from what we looked at in the JSON online editor we also would have gotten that same information. So this is, you can tell that we've sort of descended to that level of the data. And since I know that this was a query that returned three tweets I'm going to guess that each of the items in this list is representing one tweet from Twitter. So I've now understood at this level it's a list with three items and so I'm now ready to extract. So I could either extract a single item or I could iterate through all the items. What I'm going to want to do is extract all of the authors. So I'm going to iterate for each thing in res two. I'm going to do something with it. So that's why I'm iterating here on line eight. But as I'm developing my code I don't want to have to have lots of stuff showing up in the output window. So I actually want to deal initially with just one item at a time. And that's why I've done what I've done here on line eight. I'm building up a template for my code that I'm going to iterate. But in fact I'm only taking a slice containing one item from the result of level two. So I'm only actually going to execute lines nine and 10 one time. So at level three we're printing out that we've got some information about a tweet. 
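A sketch of the level-two extract and the one-item loop template described above. As before, res is a small hypothetical stand-in for the real response; only the key names are taken from the walkthrough.

```python
import json

# Hypothetical, heavily trimmed stand-in for the Twitter search result.
res = {
    "search_metadata": {"count": 3, "completed_in": 0.05},
    "statuses": [
        {"user": {"screen_name": "example_user_1", "created_at": "2015-01-01"}},
        {"user": {"screen_name": "example_user_2", "created_at": "2016-02-02"}},
        {"user": {"screen_name": "example_user_3", "created_at": "2017-03-03"}},
    ],
}

# Extract: descend one level.
res2 = res["statuses"]

# Understand: what did we get at level two?
print("----Level 2: a list of tweets-----")
print(type(res2))
print(len(res2))

# Template for the eventual loop, but only over a one-item slice for now.
visited = 0
for res3 in res2[:1]:
    visited += 1
    print("----Level 3: a tweet----")
    print(json.dumps(res3, indent=2)[:30])
```

Iterating over res2[:1] keeps the loop structure in place while limiting the output to a single tweet during development.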
And then I'm just getting the first 30 characters of whatever the thing is. I dumped it to JSON and asked for the first 30 characters, and it looks like it's a dictionary. So to help in the understand phase, I'm printing out the type of res3, and its type is dictionary. And because it's a dictionary, I like to print out the keys. In this case there are quite a few keys. And I've got to look through these and try to guess which of them is going to have the author of the tweet. And it's the user key. So I'm going to extract the value of the user key from the res3 dictionary. And that's going to be my level four result. We've extracted into the variable res4, and now it's time to make a few print statements to figure out what's in res4. It turns out that what's in res4 is a dictionary, and it has even more keys. But we're going to have to look in here and decide that maybe we want to extract screen name. So our next extract operation is going to be to take screen name from res4. Here I've got res4 and I'm extracting the screen name. I also decided to just find out the time when the tweet was created. And I've commented out a bunch of the other print statements so that we're not going to have quite as busy a printout. So now we are printing out 31 Brooks and the time at which that tweet was created. So at this point, if I get rid of some of the print statements that are commented out and are distracting in the code, I would start to have something that's reasonably compact that prints out the screen name and the creation time for the first tweet. My next step, once I've got enough of the code to do this, is to generalize and say, hey, I don't want just the one item. I'd really like to get all of them. So here the change is that instead of only getting one item from res2, we're going to get all of them.
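The remaining levels, and the generalization from one tweet to all of them, can be sketched as follows. The res dictionary is again a hypothetical stand-in with invented values, and the authors list is an addition made here so the result can be checked; the final loop shows one way the step-by-step variables can be collapsed once everything works.

```python
# Hypothetical, heavily trimmed stand-in for the Twitter search result.
res = {
    "search_metadata": {"count": 3, "completed_in": 0.05},
    "statuses": [
        {"user": {"screen_name": "example_user_1", "created_at": "2015-01-01"}},
        {"user": {"screen_name": "example_user_2", "created_at": "2016-02-02"}},
        {"user": {"screen_name": "example_user_3", "created_at": "2017-03-03"}},
    ],
}

res2 = res["statuses"]
authors = []
for res3 in res2:                 # was res2[:1] while developing
    # Understand level three: a dictionary, so look at its keys.
    # print(type(res3)); print(list(res3.keys()))
    res4 = res3["user"]           # extract: level four
    # print(type(res4)); print(list(res4.keys()))
    print(res4["screen_name"], res4["created_at"])
    authors.append(res4["screen_name"])

# Collapsed form with the intermediate variables folded into expressions.
for res3 in res["statuses"]:
    print(res3["user"]["screen_name"], res3["user"]["created_at"])
```

The commented-out print statements are left in to show where the understand steps sat before the code was cleaned up.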
And I get 31 Brooks, but I also get for Oyoho and Duncan because there were three different tweets. And once I've gotten this far, now I really have built up my code one step at a time. I can now simplify the code. And we could actually do something as simple as this where we have combined some things into more complex expressions. So instead of making a new variable name for each level that we descend, we're just going to say res square bracket statuses and we're going to iterate through that. Our loop variable becomes res three. And with each of those, res three is still a pretty complicated dictionary, but we can do res three square bracket user square bracket screen name to get the screen name. And we can do square bracket user square bracket created at to get the time that it was published. So I can run this, I get the same results as the more complicated code. Now you could try just writing this code, these two lines of code like this from the very beginning, but if it doesn't work out, it would be really hard to debug. So I really recommend building it up one step at a time where you just descend one level at a time into the data. So in summary, my suggestion is that if you need to extract something from a complicated deeply nested structure, develop your code one layer at a time. At each step, print out what you have, either the keys of the dictionary or the first few characters in the printed representation of the first item. Then extract a little more. At the end, you can remove all the print statements and collapse some of the code into more complex expressions that give you something compact like we see on the screen now. See you next time. Welcome back. I'm going to work a problem that involves traversing a nested data structure and accumulating a value. I've got here a variable called big list. As you can see, it has some nested lists, some inner lists inside of the big list. 
In fact, it's got three levels of nesting that we can see right away because we've got the three square brackets here. So it's going to create a list that has some elements in it. That first element of that list is itself going to be a list that has some elements in it. And the first element of that has a list containing one and two. So in order to traverse this and pull out a word like one or seven or nine, I'm going to have to go three levels into this. So some combination of three square brackets and three for loops. In this case, I really want to go through every single word because I'm supposed to create a dictionary word counts that contains all the words in big list as keys. So I'm going to have as a key the word one and as a value, however many times it appears anywhere deep in this nested list. And two, however many times it shows up. So that's the dictionary I'm supposed to create. To do that, I'm going to have to find all of the words anywhere in the big list. So let's get started on this. As always, I like to make a little template I'm supposed to create a dictionary word counts. I'm going to accumulate that dictionary. So I'm going to start with an empty one. And at the end, I'm going to print word counts just so I can check on my work. I'm going to clear out all of the markings that I have here. And I can try running this. Of course it's not going to give me the right answer but at least it's going to get me started. And you can see that I've failed the test because my actual value was not what was expected but we have a good start. I'm just going to make my code window narrower so that it doesn't overlap with the output window. All right, so we can see that what we've output is just the dictionary that's empty. We need to traverse through big list. And since we're going to have multiple levels here I'm going to use a little mnemonic where I say for inner list, level one in big list, IL one is going to be bound to one of the inner lists. 
Initially it's going to be bound to one that begins here and keeps going off the screen. So that's an inner list. I'm going to have to descend into that as well. So I'm going to say for each second level list in the first level list I'm going to do something, and because I need to descend to a third level, I think I'm going to need three for loops here. And I see that I needed a one instead of a two there. Now, I think that this is the right thing, but I always like to check my work. So I'm just going to see if this gets me the thing that I'm looking for, and sure enough that prints out all the words. And I'm realizing now that I really shouldn't have called my iterator variable at level three IL3. I really should have called it word, because that's going to help me remember what kind of thing I have there. Okay, I changed my variable name from IL3 to word and I still got exactly the same output. Now, instead of printing out all these words, this is where I want to use them to update the word counts dictionary. So instead of printing the word, I'm going to check if the word is already in word counts, and this is a pattern we've seen quite a bit before. If we've already seen the word, then we want to update its count. We want to increment it by one. If we haven't seen this word before, then we have to put it into the dictionary for the very first time. And its initial value ought to be one, because we've now seen that word one time. So if we've seen it before, we increment its count. If we haven't seen it before, we set its count to one. I think that should get us what we want. So associated with the word one is the count of four. That means it's happening four times somewhere in here. And you can see it is showing up more than once somewhere in there. So that's how to work your way up to a solution for a problem.
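The finished solution can be sketched like this. The big_list below is a much smaller hypothetical list with the same three-level shape, so the counts differ from the ones in the exercise; the loop variable names IL1 and IL2 follow the mnemonic from the walkthrough.

```python
# Hypothetical three-level list; the exercise's real big_list is much longer.
big_list = [
    [['one', 'two'], ['seven', 'eight']],
    [['nine', 'four', 'ten'], ['one', 'seven']],
]

word_counts = {}
for IL1 in big_list:          # first level of descent
    for IL2 in IL1:           # second level
        for word in IL2:      # third level: the actual words
            if word in word_counts:
                word_counts[word] = word_counts[word] + 1   # seen before
            else:
                word_counts[word] = 1                       # first sighting

print(word_counts)
```

Printing the words before adding the dictionary logic, as in the walkthrough, is a cheap way to confirm the three loops reach the right level.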
Remember that if the data you want to extract is nested three levels deep, you're going to need to descend three levels with your code, using either square brackets or for loops. In this case we have three indented for loops, one inside the other. And we have the first level of descent, then the second level. And notice that the iterator variable for the first level becomes the sequence for the second one. And the iterator variable at the second level becomes the sequence at the third level. And so on. Hopefully you'll be able to do some of the other exercises on this exercises page at the end of the chapter on your own. Good luck. See you next time.

Welcome back everybody. At this point you have the machinery for extracting data from complex nested data structures. One of the biggest challenges is just understanding the structure of the data you have. Hopefully it'll have a consistent structure. For example, if there's a list of dictionaries, everything will be a lot easier if all the dictionaries have the same keys. Now once you understand the structure, you can figure out how to write the code. In the end you might not need much code, but if you try to just write it all out at once it'll be very frustrating. You'll take a long time to debug it. So go through the understand, extract, repeat process, testing your partial code at each stage, and you'll solve your coding problems a lot faster. One valuable heuristic you may want to remember is that if the data you want to access is nested three levels deep, a list inside a list inside another list, then you'll need that same number, three, of extraction operations: some combination of square brackets and for loops. I'll leave you with a silly joke and some serious pearls of wisdom. Since we talked about deep and shallow copies, what's a tiger running a copy machine called? A copycat, of course.
And some deeper wisdom about complexity, inspired by the more complicated nested data that we've learned how to process. Confucius said, life is really simple but we insist on making it complicated. Folk musician Pete Seeger said any darn fool can make something complex. It takes a genius to make something simple. And management guru Tom Peters, writing about chaos in organizations, said if you're not confused you're not paying attention. I'm not so sure about that last one. I hope you're not confused about nested data structures, no matter how complex they get. I'll see you next time.

Welcome back. We've seen a lot of the accumulation pattern already in the specialization. You have a list and you go through the items one by one, updating an accumulator variable. At the end, the accumulator variable has your result. Two particular kinds of accumulation are so common that Python includes special language features for expressing them. One is a map operation. It transforms each item in the original and accumulates the transformed item into a new list of the same length. The other is a filter operation. It selects a subset of the items from the original list. This week you'll learn to think abstractly about these two accumulation operations. In this lesson you'll learn to use the built-in Python functions map and filter. And then in the next lesson you'll learn about Python list comprehensions, which provide a nice syntax for doing a combination of map and filter operations. At the end of this lesson you should be able to read and write code using the map and filter functions. In particular, you should be able to write the functions that are passed in as arguments to the map and filter functions. Bye for now. We'll see you at the end.

Welcome back. The map function makes a new list where each item in the original list is transformed in some way. For example, suppose we want to transform each item by doubling it.
So we make a new list that has doubles of all the values in the original list. We've done that kind of thing before with a manual accumulation. As you see in this active code window, the new list is going to be our accumulator. We bind it to an empty list. We iterate through a list. Our loop variable, or iterator variable, is called value. And with each of those items, we transform it by multiplying it by two and appending that new element to our accumulator list. Now I've put this inside a function. A list is a parameter and we return the new list. So this is a function that takes a list and returns a new list where each of the items is transformed by being doubled. Let's see what happens when we run it. We start with the list 2, 5, 9, and I then pass that list to the double stuff function. I get back a new value for things, which is the transformation of the list. So we print the original on the first line. We print the doubled version on the second line, and you can see two has been transformed into four, five has been transformed into 10, and nine has been transformed into 18. This manual accumulation where we transform each of the items is something that programmers want to do a lot. And so Python has provided a function that lets us express it in a more compact way that's easier to read and makes it clear what's going on. Map is a function. It takes a sequence as its second input and it takes a function as its first input. So take a look at line five. The sequence is the second input to the map function. The first input to the map function is itself a function. We've defined triple. Triple is a function that takes a value, transforms it into three times that value, and returns the tripled value. So I'm going to call this first input to the map function the transformer function. It's going to be applied to each item in the sequence, each item in a list, and the results will get accumulated and we get back the accumulated list. In this case, I'm just binding it to the variable new list.
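The manual accumulation described at the start of this passage can be sketched as below. The snake_case names double_stuff and a_list are assumptions about how the spoken names are spelled.

```python
def double_stuff(a_list):
    """Return a new list containing doubles of the elements in a_list."""
    new_list = []                    # accumulator
    for value in a_list:
        new_elem = 2 * value         # transform the item
        new_list.append(new_elem)    # accumulate the transformed item
    return new_list

things = [2, 5, 9]
print(things)
things = double_stuff(things)
print(things)
```

The original list passed in is left untouched; the function builds and returns a separate list.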
So this is analogous to how the sorted function worked. If you remember, with the sorted function we passed another function as an input. That was the key function. We passed it as the key parameter. And behind the scenes, when the sorted function was running, it would invoke the key function on each item. And the results of invoking that key function were used to determine the sort order. Here, we invoke the transformer function behind the scenes when the map function is running, and it gets invoked one time for each item. But the resulting values are accumulated into a list. So let's just see what happens when that runs. There's more stuff happening on lines eight to 10, which I'll explain in a minute. I'm just going to comment those out so they don't confuse us in the output. So when I run this, I start with 2, 5, 9. I call triple stuff on line 13, passing in 2, 5, 9. And I'm going to get back 6, 15 and 27. So again, we started with 2, 5 and 9. We passed that whole list as the value for a list and we mapped the triple function, which means we took the value two, we passed it into the triple function, and we got back the value six. Then we passed five in and we got 15 back. We passed nine in and we got 27 back. We never gave a for loop here that says for item in a list, run the triple function on each of those items. That's all behind the scenes inside the map function. Map just lets us specify, hey, apply this transformer function to every single item in the sequence. Now, map expects a transformer function. We can have a named function like triple, but we can also have a lambda expression, just like we did when we were passing a function to the sorted function. On line nine, we're passing as the transformer function the value of a lambda expression: lambda value, four times value. So that's a function that takes one input, which we're naming value, and it transforms that input with this transformer expression, four star value.
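A short sketch of both ways of supplying the transformer function, a named function and a lambda expression. The function names triple_stuff and quadruple_stuff are assumed spellings. One detail worth flagging: in Python 3, map returns an iterator rather than a list, so this sketch wraps the result in list() to get an actual list back.

```python
def triple(value):
    return 3 * value

def triple_stuff(a_list):
    # Named function as the transformer; the sequence is the second argument.
    new_seq = map(triple, a_list)
    return list(new_seq)

def quadruple_stuff(a_list):
    # Lambda expression as the transformer.
    new_seq = map(lambda value: 4 * value, a_list)
    return list(new_seq)

things = [2, 5, 9]
print(triple_stuff(things))
print(quadruple_stuff(things))
```

Neither function contains an explicit for loop; the iteration happens inside map.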
If I run this one, I think you can guess what's gonna happen. I'm gonna get two times four, five times four and nine times four. So I get eight, 20 and 36. Let's work an example together where you have to write some code that uses map. In this case, we're asking you to transform each of the items in abbrevs. So abbrevs is a list. Each of the items in abbrevs is the abbreviation for a country: USA for the United States, then España, China, JPN for Japan and so on. And we're supposed to produce an output with the capitalized versions, USA, ESP and so on. So that's our job. We could do it with our manual accumulation, but our challenge here is to figure out how we can use the map function. As always, I'd like to start with a template. I remind myself, okay, I'm gonna try to make a new variable called abbrevs_upper, and it's gonna be mapping something and something. So you first have to remind yourself, I'm gonna use map. It's gonna have some kind of transformer and it's gonna have some kind of list. The first thing you have to remember is which one of these is the transformer and which is the sequence. It turns out that the second one is the sequence and the first one is the transformer. Let's do the easy one first. The sequence is just abbrevs, and then the transformer is the hard part. Remember that the transformer has to be a function. That function has to take one string as input and return the capitalized version of that string. So let's call it transformer, and we're gonna make it be a function. So we def transformer. It takes one abbrev as an input. No, let's just call it one string, st, as an input, and it's gonna have to return some other string. So how do I transform something like usa into capital USA? For this one, you have to remember that Python has a built-in string method for this. So I actually just need to return st.upper(). I'm working from memory here, so it may turn out that I've misremembered it.
I would have to go to Stack Overflow or some documentation or something. I think I remembered it correctly here. So the transformer has to take one string like usa and it has to return a transformed version of that string. Now I have to tell the map function which transformer to use. In this case, I've called it transformer, but I could have called it f if I wanted to. And I run it and sure enough, I pass the tests. Let's actually print out abbrevs_upper to see what we got as our result. So USA, ESP, all of them are capitalized. So this is one way I could have done it, or I could have done it using a lambda expression. Everything is still the same except that instead of referring to the function f, I'm gonna say lambda of the string, return the uppercase of the string. I get the same result. Let's say I wanted to take only the first two letters of each abbreviation. Well, I could take a slice and now I'll have US and ES and CH as my outputs. So you can have a lot of fun doing things like this. You just have to change your transformer function and you get slightly different outputs. If I wanted only the first letter, I'd just do it that way. So that's an example of how to think about using the map function. I recommend, as always, starting with a template and filling in the easy parts first until you've isolated the hard part. I think the hard part of a call to the map function is the transformer function that you're gonna have to pass in, so I save that to do last. We'll see you next time. Welcome back. Another common processing pattern we've seen before is to filter a sequence. You start with some items and you end up with fewer of them. For example, here we're doing an accumulation that filters for only the even numbers in a sequence of numbers. So we've defined a function keep_evens, and then we're invoking it on line eight by passing it a list, and we get back just the numbers four, six, and zero.
So three got filtered out and seven got filtered out and one got filtered out. So let's look at the code. This is a style of code you've seen before with manual accumulation. So we'll create an accumulator variable. I'm calling it new_list and it starts as an empty list. Eventually we're gonna return that accumulator, but first we're gonna do an accumulation where we append some things onto it. So we iterate through the sequence nums. That's the list of numbers that is passed into this function. We have a loop variable or iterator variable. We're calling it num. And then this is an important part. I'm gonna call this the filtration expression. You have to remember that the percent operator is just returning the remainder when you divide by something. So if I say 9 % 2, I'm gonna get back one, and if I say 10 % 2, I'm gonna get back zero. So all the even numbers are gonna have a remainder of zero. So this filtration expression is checking: is the number even, is the remainder equal to zero? If the current number is even, then we append it. Otherwise we don't do anything and we just leave it out. So this filters out anything that doesn't pass the filtration expression. Now this is the way we would have done this operation before learning this lesson. Now we're gonna see that Python provides a function that will do all of this accumulation in a much more compact way. So the abstraction that Python provides is called the filter function. Here's code that does the same thing we saw before, but now it's gonna use the filter function. You can see on line two that we call the function filter. It's taking two arguments. The first argument that we pass in is that whole lambda expression. The second argument that we're passing in is the sequence. So just like with the map function, where we pass in a function and we pass in a sequence, we're doing a similar thing here. I'm gonna call the function the filtration function because it's filtering the contents.
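The two versions being compared here can be sketched side by side. The name keep_evens_with_filter is made up for this sketch so that both versions can coexist; in the video the second version reuses the original function name, and the line numbers mentioned won't match this layout.

```python
def keep_evens(nums):
    """Manual accumulation: keep only the even numbers."""
    new_list = []                  # accumulator starts as an empty list
    for num in nums:               # num is the loop (iterator) variable
        if num % 2 == 0:           # filtration expression: remainder is zero
            new_list.append(num)
    return new_list

def keep_evens_with_filter(nums):
    """Same result: filter calls the lambda once per item behind the scenes."""
    return list(filter(lambda num: num % 2 == 0, nums))

print(9 % 2, 10 % 2)                              # 1 0
print(keep_evens([3, 4, 6, 7, 0, 1]))             # [4, 6, 0]
print(keep_evens_with_filter([3, 4, 6, 7, 0, 1])) # [4, 6, 0]
```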
This lambda expression is producing a function. That function takes as an input a number and it's producing an output, either true or false. So unlike with map, where the function was a transformer, taking the input and transforming it into something else, here we're not transforming the input. We're just making a binary decision about it. Is it true, meaning keep it in, or is it false, meaning filter it out? So the next thing to notice is what's actually in this lambda expression. We're saying lambda num: num % 2 == 0. This num % 2 == 0, that's our same filtration expression that we saw before when we were doing the manual accumulation. The filter function returns a list-like object. In our Runestone environment, it actually returns a list, but in a full Python implementation it would return one of those iterators, just like the .keys() method for dictionaries. And so I'm wrapping that in a call to the list function to make it into a real list rather than just a list-like object. So if I run this, I'm gonna get the same output that I did before: four, six, and zero. We've only kept the even numbers. We got rid of the odd ones. All of the invocations of this filtration function are happening behind the scenes. It's automatically calling this function once on each of the numbers in the list. Let's try an example where we have to construct an invocation of the filter function. Our instructions tell us that we're supposed to take our original variable list, which has a sequence assigned to it, and we're gonna create a new variable list two that's gonna have only some of the words from the original list: only those words that contain the letter o. So as always, I like to get started on a problem like this by creating a template for what I'm gonna do. I'm supposed to create a new variable list two, and it's going to be the result of invoking filter using some filtration function on the original list.
Of course, if I run this, it's gonna fail because some filtration function is not good Python here. I gotta fix that. So what is it that I wanna do? I want to have a function. That function has to take as an input one of those words like pumpkin or wagon. And for wagon, we want it to return true and for pumpkin, which doesn't have an O, we want it to return false. So I'm going to create a lambda expression that is going to do that for me and it's going to evaluate to being that function. I'm gonna say lambda word. It's gonna take one word as input and it's supposed to return true if the letter O is in the word. And here you have to remember that we have the convenient in operator. I just have to say is O in the word and that lambda expression creates the function that I was just trying to describe down here. Let's try running it, see if I got it right. And sure enough, I did. That's the filter function. It's an abstract way of describing a common programming pattern, one where you iterate through a list and accumulate a potentially shorter list with some of the items filtered out. You pass in a filtration function that gets called behind the scenes one item at a time. The filtration function returns true if that item should be kept and false if it should be filtered out. See you next time. Welcome back everybody. You've now learned about the map and filter functions which perform particularly common accumulation patterns. With map, you pass in a transformer function which takes one item and transforms it into its new value. With filter, you pass in a filtration function which takes an item and returns a boolean. True if the item should be included, false if the item should be filtered out. Here's an apocryphal story about a filter that Socrates applied to gossip. In ancient Greece, as you may know, Socrates was widely lauded for his wisdom. 
One day, the great philosopher came upon an acquaintance who ran up to him excitedly and said, Socrates, do you know what I just heard about one of your students, Plato? Wait a moment, Socrates replied. Before you tell me, I'd like you to pass a little test. I call it the triple filter test. What, triple filter? That's right, Socrates continued. Before you talk to me about what Plato did, let's take a moment to filter what you're going to say. The first filter is truth. Have you made absolutely sure that what you're about to tell me is true? Lambda x: x is true. No, the man said, actually, I just heard about it and... All right, all right, said Socrates, so you don't really know whether it's true or not. Let's try the second filter, the filter of goodness. Is what you're about to tell me about Plato something good? Lambda x: x is good. No, on the contrary. So, Socrates continued, you want to tell me something bad about Plato, even though you're not certain it's true. The man shrugged, a little embarrassed. Socrates continued. Well, let's try the third filter, the filter of usefulness. Is what you want to tell me about Plato going to be useful to me? Lambda x: x is useful. No, not really. Well, concluded Socrates, if what you want to tell me is neither true nor good nor even useful, why tell it to me at all? The man was defeated and ashamed. And this is the reason why Socrates was a great philosopher and held in such high esteem. It also explains why Socrates never found out that Plato was sleeping with his wife. I'll see you next time. Hi, everyone. I have to apologize because the previous lesson forced you to learn some pretty conceptually challenging stuff, passing a transformer function to the map function and passing a filtration function to the filter function. And it turns out that you won't actually use them. Python programmers rarely do.
It was just a pedagogical tool to give you a way to really think through the map and filter programming patterns. In Python, there's actually a simpler syntax for doing the map and filter patterns and even for combining them. It's called list comprehensions. And now that you understand what they're for, I think you're gonna love them. The syntax can be a little confusing because it reuses elements of the for loop syntax. But with an understanding of what list comprehensions do, based on the map and filter lesson, I think you'll find that it's pretty intuitive. List comprehensions are how Python programmers typically write code that does transformation and filtration. At the end of this lesson, you should be able to read and write list comprehensions and you should be able to describe a list comprehension in terms of the transformation and filtration operations that it's performing. So let's go see how they work. See you at the end. Welcome back. A list comprehension is a convenient syntax for doing map and filter operations. One that I think you'll find is a little more intuitive than the map and filter functions. So here's a simple example that does a map operation, transforming every item by doubling it. We started with the list two, five, nine. Two got doubled to be four. Five became 10 and nine became 18. So let's pull apart the syntax on line three that constitutes the list comprehension. We have, first of all, square brackets. So that should be familiar to you. We use square brackets to create a list. But what we're putting inside it is not the actual objects but a Python expression that's going to evaluate that's gonna produce all of the contents of the list. So the next thing to notice is for value in things. That's how we did for loops before but there are some things that are a little different. So the part that's the same is we have the word for and we have the word in. Between, we have a variable name. 
That's gonna be our loop variable or iterator variable. And after the word in, we have some sequence. In this case, the variable things, whose value is the list two, five, nine. The things that are different are that we don't have a colon at the end. No colon. And we don't have an indented block of code underneath. Instead, what we have is some code before the word for. And I'm gonna call this the transformer expression. So we have a general format for this: square brackets, a transformer expression, then the word for, some variable name, the word in, and then some sequence. And we'll have a closing square bracket. I left a little space for something that's gonna come later. So compare this to the version that we had using map. I'm just gonna erase a few things here. Instead of saying your list equals that list comprehension, we could have said your list equals the result of calling map, where we had some function, a transformer function that would take a value and return value times two. And we would apply that transformer function to things. You can see a lot of correspondences between these two approaches to doing it. Let's just make sure to test my code. Yeah, it still works. I managed to do the same thing. We have this value times two. That was the transformer expression in our list comprehension. But we also had that transformer expression as part of the lambda expression that produced our transformer function when we were using map. We can also use a list comprehension to do filtering. In this example, we're defining a function that filters, keeping only the even numbers. We did that previously with the filter function. In this list comprehension, we have an extra clause at the end, the if clause. So after the word for, the variable name, and the sequence, we have the word if, and then we have a filtration expression. So in this case, only keep the number if it's even, only if it has a zero remainder when we divide it by two.
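A minimal sketch of the two list comprehensions described so far, alongside the map version they are compared to. The variable names your_list and same_with_map are assumptions based on the audio, not necessarily what appears on screen.

```python
things = [2, 5, 9]

# General format: [<transformer expression> for <variable> in <sequence>]
your_list = [value * 2 for value in things]

# The equivalent map call: the same transformer expression sits inside the lambda.
same_with_map = list(map(lambda value: value * 2, things))

def keep_evens(nums):
    # Extra if clause at the end: [<expr> for <var> in <seq> if <filtration expr>]
    return [num for num in nums if num % 2 == 0]

print(your_list)                        # [4, 10, 18]
print(same_with_map)                    # [4, 10, 18]
print(keep_evens([3, 4, 6, 7, 0, 1]))   # [4, 6, 0]
```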
Note that when we're doing the filter operation, we don't really wanna transform the items that we keep. So we have this somewhat funny wording here, num for num in nums. Well, you can parse that and then it won't seem so crazy. Nums is our list. That's the list that was passed into the function. This num between the word for and the word in is our iterator variable. And this first num is our transformer expression, except we don't wanna transform the number. We just want the number as it was. And so we say num rather than num * 2 or some other transformation. We just say num for num in nums. And the real action is happening in the if clause, where we're filtering out those things that aren't even. So that's why I left this space here, because the full version of a list comprehension can have an if clause. There can be a filtration expression that goes afterwards. As we did before, let's compare this version of keep_evens to a version of keep_evens that doesn't use a list comprehension and instead uses the filter function. And you'll be able to see some correspondences between them. First, I'm gonna erase a few of these marks. So let's make an alternative definition of keep_evens. I'm gonna filter my original nums and I'm gonna give it a filtration function now. That's gonna be lambda num: num % 2 == 0. I'm gonna apply that filter to nums, and this is a new definition of keep_evens. This one will have the same effect as the original one did. Let's check to make sure. I didn't quite get it right because I forgot to return my new list. Now really, in a true Python environment, the filter operation doesn't return a true list; it returns this list-like object. We can get away with it in Runestone, but it's a better habit to convert it into a list if we really want it to be a list. So now let's do a little comparison of the two versions. In the version using a list comprehension, we had the filtration expression after the word if.
Here, we had that same filtration expression as part of the lambda function. In both cases, I chose to name my variable num and we are applying the filter in both cases to the same sequence. So there's a lot of things that match up syntactically. In either case, you want to be able to read this and make sure that you understand what each of the words is referring to. The word num comes in here twice and you have to realize that the first time you have the word num, that's defining a variable. That's the parameter for this anonymous lambda function. The second time you have the word num, it's referring to the item itself, that variable will have a value when we invoke the function. And so we're defining the variable the first time and we're looking up the variable the second time. By contrast, in the list comprehension, the first time we have the variable name, we are looking it up and the second time we have that same variable name between the word for and the word in, that's where we're creating the variable name num. That's all for now. Let's do an exercise together. Our challenge here is we're given a dictionary tester that has some nested data inside of it and we'll get a little practice with extracting from nested data. Our job is to accumulate all of the names, all the values associated with the key name from any of the inside dictionaries. So one of those names is going to be Lauren and other name is going to be IO and we should end up at the end with a list of Lauren and IO and maybe some others. Now, as always, I like to get started by reminding myself of what I'm supposed to produce. I can't run this yet because that's obviously not going to be the correct answer. I've got some work to do. When I have this nested data structure, I like to do the understand, extract, repeat process. So you can see that those names are not at top level. 
There's a top-level dictionary that has the key info, whose value is a list, and then I'm going to have some dictionaries inside of that list. The dictionaries that I want are all associated with this key info. So my first level of extraction is going to be, I'm going to call it inner_list equals tester["info"]. So I've done my first extraction, but it's always a good idea to check whether I got it right. So I'm going to print that inner_list. I'll print it, and I got bad input on line six because it's not good Python. So I'm going to comment that out. It's still there to remind me that that's what I'm supposed to produce at the end. And there we go. We got a whole bunch of stuff. I'm going to get rid of my markings here, and it is a list with some dictionaries in it. You can see Lauren is here and there's IO. If I wanted to make this a little easier to read, I might use json.dumps to pretty print it. I'm going to import json and I'm going to say inner_list.dumps. Sorry, it's json.dumps of inner_list. And I give it the indent parameter. There we go. That's a little easier to read. I've got a list of dictionaries. The first dictionary has name: Lauren. The second dictionary has name: IO, then name: Catherine, and so on. So I've done my first extraction and now I'm going to iterate through each of those dictionaries. I'm now ready to do my list comprehension that is going to iterate through those dictionaries and just grab the names. It's going to grab Lauren and IO and Catherine. And it's going to be a list comprehension. I'm going to just make our little template here. There's a transformer expression, for some variable name in a sequence, if filtration expression. Now I'm going to fill in some of those things. What is the sequence that I'm going to iterate through? Well, it's this inner_list. That's the thing that has the list of dictionaries. Each of the items in that inner_list is a dictionary.
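The first extraction and the pretty-printing step can be sketched as follows. The contents of tester here are placeholder data invented for illustration: only the key names info, name, class standing, and major come from the lecture, the names are spelled as transcribed, and the other values are made up.

```python
import json

# Placeholder data shaped like the structure described in the lecture.
tester = {
    "info": [
        {"name": "Lauren", "class standing": "Junior", "major": "Information Science"},
        {"name": "IO", "class standing": "Senior", "major": "Sociology"},
    ]
}

# Understand, extract, repeat: the first extraction pulls out the list of dictionaries.
inner_list = tester["info"]

# json.dumps returns a string; indent makes the nesting easier to read.
pretty = json.dumps(inner_list, indent=2)
print(pretty)
```

Note that dumps is a function in the json module, not a method on the list, which is the slip corrected in the lecture.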
So I'm going to choose to call it d, d for dictionary. And now I just have the filtration expression and the transformer expression that I need to fill in. In our case, we've been asked to get all of the names. We don't really have a filter. You could imagine a filter that says, only get me the students in sociology, or only get me the students in information science, and then I would need the if clause. But I really don't need that if clause here. I could say if True, that would get me all the students, or I can just get rid of the if clause entirely. So now I just need the transformer expression. If I have a dictionary d, and d is bound, let's say, to this dictionary that has the keys name, class standing, and major, and I want to get just IO out of d, how would I do that? Well, I would just say d["name"]. So let's print compri at the end of this and let's stop printing that whole big list. And let's see what we've managed to get. I'm gonna get rid of my markings so we can see it better. Sure enough, we got a list of Lauren and IO and Catherine and Nick and so on. And you can see that we've nicely passed all of the built-in tests that are checking to make sure that we created the right list. So that's all there is to list comprehensions. Just remember the basic template for a list comprehension. You have square brackets. You have a transformer expression. Then the word for, a variable name, the word in, some expression that evaluates to a sequence, and then optionally, the word if and a filtration expression. And close the square brackets. We'll see you next time. Welcome back. Practice is really helpful for reinforcing how to do map and filter operations, whether you do them with manual accumulation, with the map and filter functions, or with list comprehensions. Let's work our way through defining a function long lengths using all three ways of doing the map and filter operations.
Long lengths is gonna take as input a list of strings and it should return a list of numbers: the lengths of only those strings that have at least four characters. So there's a filtering operation. Any string that's too short, we don't wanna do anything with it. We don't get any output for it. And there's also a transformation operation. We're mapping, we're taking the strings and turning them into numbers that represent the length of those strings. So the first thing I would do is make an invocation of this function so that I'll be able to tell, once I implement it, whether it's working correctly. So I'm gonna pass it some list of strings for which I know what the correct output ought to be. And I'm gonna print that out. And what list do I wanna give it? Well, let's give it, let's say, a list of some short strings, a three-character string. Whoops, let's give it a four-character string and a five-character string: ghij and klmno. What I'm expecting to get back is a list of four and five. The strings that were too short, a, b, c, d, e, f, those strings are all too short, so I shouldn't get any output for them. And then I should get four for the ghij and five for the klmno. If I run it, of course, I'm gonna get the value None. I'm not even gonna get a list. Even if I make it return a list, I'm not getting the right thing. So I've got a little work to do. I'm gonna write a list comprehension. Instead of returning this empty list, I'm gonna do some transformer expression for each string in strings if some filtration expression is true. So if the string s is bound to ghij, what I would wanna return is the length of s. But I only wanna do it if that length is at least four. So that's my filtration expression. Now if I run this, hopefully I'm gonna get four and five back. I did get four and five back. And in fact, the built-in tests, which called long lengths with some other lists of strings, I passed those tests as well.
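The list comprehension version just described might look like this sketch. The spelling long_lengths and the exact test strings are assumptions; the lecture only pins down that the short strings produce no output and that the four- and five-character strings yield 4 and 5.

```python
def long_lengths(strings):
    # Transformer expression: len(s); filtration expression: len(s) >= 4
    return [len(s) for s in strings if len(s) >= 4]

# Hypothetical test list: two short strings, one of length 4, one of length 5.
print(long_lengths(["abc", "def", "ghij", "klmno"]))  # [4, 5]
```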
So that is one correct answer, but I also wanna do this same one using manual accumulation and using map and filter. Let's try it using a manual accumulation. I'm gonna start my accumulator variable as an empty list. And at the end, I'm gonna return it. In between, I'm gonna iterate through the strings, and I'm deliberately using some of the same variable names just so we can see how the syntax maps between them. For each s in strings, if the length of s is at least four, then I will append something to my accumulator. If the length isn't long enough, I will ignore it. So appending s gets me the actual strings, ghij and klmno, rather than the transformed version of those strings. So instead of appending s, I have to append len(s). And Runestone will grade me on how well I did on that. All right, the third way that we can do this is using map and filter. So I'm going to return the result of mapping my transformer onto the set of strings. If I do that, I will get the lengths of all of the strings. So that's not quite right yet. And by the way, lambda s: len(s) is really just the len function. So I could map len onto strings. I would get the same thing, but I don't wanna do it on all of the strings. I wanna do it on the filtered strings, which is the result of filtering the strings using a filtration function. So that gets me the filtered strings. I can pass those to the map function. And I only got the one because I wrote the wrong filtration expression; it should be whenever the length is greater than or equal to four. Let's see how some of the same expressions show up in these different ways of implementing the map and filter. In our list comprehension, we have a transformer expression, len(s), and we also have a filtration expression, len(s) >= 4. In our manual accumulation, we have a transformer expression, the len(s), where we pass s through the len function to get its length, and we do that before we append it into our accumulator variable.
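For comparison, here is a sketch of the other two implementations worked through above. The function names carry suffixes invented for this sketch so that both can be defined together; in the video each version simply replaces the body of the same function.

```python
def long_lengths_manual(strings):
    """Manual accumulation."""
    accum = []                       # accumulator starts as an empty list
    for s in strings:
        if len(s) >= 4:              # filtration expression
            accum.append(len(s))     # transformer expression, applied before appending
    return accum

def long_lengths_map_filter(strings):
    """Filter first, then map the len function over what survives."""
    filtered_strings = filter(lambda s: len(s) >= 4, strings)
    return list(map(len, filtered_strings))

sample = ["abc", "def", "ghij", "klmno"]   # hypothetical test list
print(long_lengths_manual(sample))         # [4, 5]
print(long_lengths_map_filter(sample))     # [4, 5]
```

Passing len directly to map works because len already is a function of one argument, so wrapping it in lambda s: len(s) adds nothing.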
We also have our filtration expression, len(s) >= 4. Similarly, if we look at how we do this with map and filter, clearly the filtration expression should show up when we invoke the filter function, and the transformer expression should show up when we use map. So we have the filtration expression, len(s) >= 4, but we've made it be part of the filtration function. And with map, we actually don't have the full len(s) expression, but if I do it the more complicated way, where I write out the whole lambda expression, lambda s: len(s), then I do get the corresponding expression len(s), and it's part of our transformer function. So the places where we have to specify exactly how we're transforming, exactly how we're filtering, and which sequence we're doing the transformations and the filtering on, all of those things have to show up no matter which way we express it. What's different is that we are plugging those expressions in in different ways that make it easier or harder to read and write these. And I think the Pythonic way is clearly the first one, the one that uses the list comprehension. This is what I would encourage you to use most, because you can look at it and in one expression, in one line of code on line three, you can read it off. You can see where the transformer expression is, you can see where the filtration expression is, and you can easily see which sequence it is that we're doing these operations on. I encourage you to try some other exercises on this exercises page to get more practice. It walks you through doing some of the same operations both with map and with list comprehensions, or both with filter and with list comprehensions. I'll see you next time. Welcome back everybody. You should now be able to read and write list comprehensions.
You should be able to describe the operations that a list comprehension is performing in terms of transformation and filtration. And if you had to, you should be able to translate a list comprehension into equivalent invocations of the map and filter functions. Here's a whole list of puns for you. I hope you'll comprehend them all. What does a clock do when it's hungry? It goes back for seconds. Wanna hear a pizza joke? Never mind, it's too cheesy. I was up late last night reading a book about anti-gravity. I just couldn't put it down. That's a lot of puns. I actually once did a whole theatrical performance of puns. It was a play on words. See you next time. Welcome back. Python provides one more function that performs a common accumulation. It's called zip. It takes two sequences and zips them together, matching their first items together, their second items together, and so on. It makes it easy to do operations where you have to make pairwise comparisons or pairwise combinations. At the end of this lesson, you should be able to read and write code using the zip function. Bye for now. I'll see you at the end. The zip function lets you match up the items in multiple sequences positionally so that you can do something with all the first items, something with all the second items, and so on. For example, suppose we wanted to add two vectors, three, four, five, and one, two, three. At the end, we wanna get a new vector where we've added the items together position by position. Three plus one gets you four, four plus two gets you six, and five plus three gets you eight. We're sort of zipping these two things together: we're combining the three and the one, we're combining the four and the two, we're combining the five and the three. We can do that by iterating through the indexes, the positions.
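A sketch of the index-based version referred to here. The names L1, L2, and L3 follow the ones used later in the lecture, and the line numbers mentioned in the video may not match this layout.

```python
L1 = [3, 4, 5]
L2 = [1, 2, 3]
L3 = []                          # accumulator for the pairwise sums

for i in range(len(L1)):         # i takes the values 0, 1, 2
    L3.append(L1[i] + L2[i])     # combine the items at the same position

print(L3)  # [4, 6, 8]
```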
Remember that the range function produces a sequence, and in this case we're saying how many items should be in the sequence. It's the length of the list, so it should be the numbers beginning at zero and going up to but not including three. So i is gonna be bound to zero and then to one and then to two, and we're iterating through those indexes or positions. Let's run this in CodeLens and see what happens. So we create the first list, we create the second list, we have an accumulator that's gonna accumulate things pairwise, and now we're gonna iterate through those indexes zero, one, and two. The first time we get to line six, the index is zero, so we're gonna stick into our new list something that combines the value at position zero from the first list with the thing that's at position zero in the second list. So it combines the three and the one to get four. The next time we get to line six, i is now bound to one, and so we're gonna combine the two things that are at position one in the two lists. So you can sort of think of this as that we zipped the two lists together and then combined the elements at position zero, combined the elements at position one, and so on. Now iterating through the indexes or positions can be done in Python, but it's not really the Pythonic way of doing things. It obscures the structure of what's going on, which is that we're matching the two lists together position by position. Instead, Python provides the zip function, which makes that structure clearer. So here's doing the same thing using the zip function. The zip function takes two or more sequences and it makes a list of tuples. Each tuple gathers together the items from the corresponding position in each list. So you can see on line three that I've invoked the zip function, passing it L1 and L2. The zip function actually returns an iterator, that list-like object that we've seen before, the kind that map and filter also return and that we've seen with the dictionary method .keys().
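A minimal sketch of the zip call described here. The name L4 for the zipped list follows the one used later in the lecture.

```python
L1 = [3, 4, 5]
L2 = [1, 2, 3]

# zip returns an iterator; wrapping it in list() lets us see the tuples.
L4 = list(zip(L1, L2))
print(L4)  # [(3, 1), (4, 2), (5, 3)]
```

Nothing has been added together yet: each tuple just pairs up the items that sit at the same position.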
If we're just gonna iterate through the results that zip produces, we don't need to do anything but if we actually want to see it as a list, we have to pass it to the list function. So let's see this in code lens. This isn't actually gonna combine the values. It's not gonna add up the three and the one to get four and add up the four and the two. It's gonna produce a zip-together list. So we start with our two lists and then we make a list that zips the two together. It's still gonna have three elements but now each of those elements is a tuple that is stuck together the three and the one from position zero. It's stuck together the four and the two from position one and stuck together the five and the three from position two. Once we have this zip-together list, then we can do an additional operation to combine the three and the one to combine the four and the two. So let's see how that happens. So here I've created that zip-together list and then on line six and seven, I'm going through each of those tuples and for each of them I'm adding the two elements together and appending that result into my accumulator list, L3. Let's see how that one looks in code lens. We create our two lists, we zip them together. So now the first element is a tuple of three and one and we're now gonna iterate through those tuples and we're doing that clever little unpacking thing so that on our first iteration, X1 is gonna be bound to three and X2 is bound to one. So we've unpacked this tuple into two different variables, X1 and X2. X1 gets the value three and X2 gets the value one. Having done that, we're gonna add the three and the one together to make four and append that into L3. You can see the four is there and we'll go on the next time X1 and X2 have the values four and two and we've appended the value six into L3 and now when X1 and X2 have five and three we end up appending eight. So we get to the end and we print out four, six, eight. 
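The loop traced in CodeLens above can be sketched as follows, using lowercase x1 and x2 for the unpacked variables. Line numbers mentioned in the video may not match this layout.

```python
L1 = [3, 4, 5]
L2 = [1, 2, 3]
L3 = []                      # accumulator for the sums
L4 = list(zip(L1, L2))       # [(3, 1), (4, 2), (5, 3)]

for (x1, x2) in L4:          # unpack each tuple into two variables
    L3.append(x1 + x2)

print(L3)  # [4, 6, 8]
```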
Now having developed this code, we could combine it and simplify some things. So I'm gonna show you how we can do it with a list comprehension that puts it all together. Instead of doing that manual accumulation where we start L3 having an empty list and each time we append to it, I'm gonna use a list comprehension. This was our L4, our zipped list and we are iterating through it each time we're unpacking the tuple into the two variables X1 and X2 and our transformer expression takes those variables X1 and X2, adds them together and the sum is the thing that we accumulate into our final list. If I run this, I'll get the same result. This is the pythonic way of doing things. We zip the two lists together then we use a list comprehension to iterate through that zipped list with each tuple, we transform the tuple by adding the two elements together and we accumulate that set of sums into our final list. So that's the zip function. It takes multiple lists all of the same length and creates a new list of tuples. Each tuple groups together one element from each of the original lists. We'll see you next time. Welcome back. We're gonna write a program today using zip that you might use as part of a program that plays the game hangman. Just to make it clear what we're talking about, I'm gonna play part of a hangman game with Andre, my videographer. Are you ready to play Andre? Yes, I am. All right, well, okay, I've got a gallows here. It's just set up all waiting to hang Andre and I've thought of a word but I'm gonna leave it blanked for now. So the blank word has six letters in it and I'm also gonna keep track of whatever guesses he makes. He's gonna make a guess. If his guess is in the word, I will unblank from the appropriate spot. If not, then one of his body parts is gonna get added to our gallows and hopefully he will guess the word before he runs out of health. So are you ready to play Andre? Yes, I am. Okay, give me a letter. Let's go with A. Going with A. 
Well, that's a reasonable guess. It's in a lot of words in English; however, it is not in my word. So we have a guess of A and we've added a body part to the gallows. Give it another shot. I'm thinking E. It's a good idea to stick with vowels to start. Yeah, another good idea, but it's getting dangerous for you here. Try again. That's not looking good for me. Let's go with S. S is a nice, solid consonant. Oh boy. I have a feeling you're just avoiding all of the common letters with this one. No, no. It's just a regular word. I'm thinking U. Let's go with a U. Oh boy. Man, I am not doing so hot. I'd like to take a guess from our audience out there, but we're not live, so. All right, I'll take a B. A B, oh boy. So. I will go P. Oh, he's got one. I know what you did there. I see what you did. I'll go with a T. A T. Let's go H. An H. I think Andre was well on his way to guessing that word. Let's step back and think about how we might write a program to do what Andre did, that is, make guesses. Imagine that we were gonna use a kind of brute force approach. We have a dictionary that has every possible word in English. And the computer program at every stage of the game is gonna go through every single word and figure out, is this word still possible? And given the set of words that are still possible, it will figure out which letter it should guess next. We're not gonna write that whole program together. I'm just gonna write a component of it, a function called possible, that will take three inputs and will return true or false depending on whether the word that we pass in, bound to the parameter word, is still possible, given what things have been unblanked so far and what guesses have already been made. So rather than the example that I was playing with Andre, imagine that our blanked word is a little bit longer. The hidden word actually has 10 letters in it. And some letters have already been unblanked.
So we have blank, O-N, blank, blank, R, blank, blank, L-L. We have a couple of different possibilities for what guesses have already been made. In one version, O, T, N, Q, U, R, and L have been guessed, and in the other the letter W has also been guessed. So is Wonderwall still possible, given what has already been guessed? Basically, we can rule out the word Wonderwall if it has a letter that's already been guessed but hasn't been revealed, like W. Since W has been guessed but it hasn't been revealed here, Wonderwall is not possible. This ought to return false. Another possibility, not shown here, is where a letter at some position has been revealed but that letter doesn't match the letter that's in Wonderwall. That would be another reason to rule out Wonderwall. So let's actually run this code in code lens. I think it's gonna be clear what it's doing. So we define this function. Now we're gonna invoke the function the first time, checking whether Wonderwall is still possible, and we have the values for blanked and for guesses made. Our first check on lines two and three is just to make sure that the word that we're interested in has the same number of letters as blanked, because if it doesn't, it definitely isn't possible and we should just return false and give up. But in this case it does have the right number of letters. The next thing we're gonna do is index through the positions. We're gonna check at position zero whether W matches blank. We're gonna check whether O matches O at position one, and so on. So remember that len of word just tells us how many letters are in word, and range of that produces a sequence. So i is gonna get the value zero and then one and then two and so on. When i is zero, BC, the character from blanked, is underscore, and WC, the character from the word, is W, and now we do our first check. Is the blanked character compatible with the word character?
If the blank character is an underscore and W, the word character is one of the letters that's been guessed, then we have an incompatibility and we would return false. That's not what's gonna happen here because WC is not in guesses. It hasn't been guessed yet. So we also check if the blanked character is not the underscore. Well that's not the case here because the blank character is the underscore. We passed the test for I equals zero. Now we're gonna go on to the second position. Now I is one and we're gonna have both BC and WC are both the letter O. Well that's compatible so we're not gonna fail there and we keep going. Now they're both N, we don't fail there. We actually get all the way through this. We're gonna get all the way through all the positions. None of them has an incompatibility and now we've finished the iteration. We've gone through all of the positions since we didn't find an incompatibility anywhere we're okay and we can now return true. So the first invocation gives us the value true. But our second invocation is gonna come out differently. Now we have one more guess that's been made and when we get to position zero with a blanked character of underscore, the word character is W and W is one of the guesses that's been made. We return false and as soon as you hit a return false it's an immediate return from the function so that breaks out of the whole iteration. We don't go on to anything else we just return false and we're done. Now I wanna show you how we can simplify and clarify the code by introducing a named helper function and by switching to using zip rather than indexing through the positions in word. So I'm first going to define a little helper function called compatible and it's gonna take a character from the blanked word and a character from the word that we're trying to check on and the guesses that have been made. And it's gonna return either true or false and it's basically gonna take this code. 
So that's one simplification that I'm gonna make that makes it just a little easier to read the code. Now, you know, those four lines of code were really just trying to check whether BC, the character from blanked, and WC, the character from the word, were compatible. If they're not compatible, we're returning false. Let's just test this to make sure that I haven't made any syntax errors or anything, and it looks like I did do something wrong. Yes I did, which is that if I didn't find an incompatibility I needed to return true. So if the pair of characters was not incompatible for one of our two reasons, I need to say that they were compatible, and now everything is working as it did before. The second simplification that I'm going to make is that I'm going to use the zip function rather than going through the positions in word and blanked in parallel. Because I was really trying to go through them in parallel, that should suggest to you that zip is the right thing. So I'm going to say for BC, WC in zip of blanked with word, and that is equivalent to the code that I had above, but once you get used to it I think you'll find it much easier to read. We are first zipping together blanked and word, so at each position we're going to have a tuple, and then I'm iterating through those tuples, unpacking each tuple into the character from blanked and the character from the word. And this should also work. Let's check it, and sure enough we get the same answer. So that's a little example showing you how you can use zip to make it more clear what your code is doing. You can implement the possible function without it, as I did at the beginning, but the switch to zip, along with making a named function for this compatible-character code, makes it easier to understand what the code is doing. See you next time. Welcome back. That was the zip function. You can use it whenever you want to do pairwise operations.
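A sketch of where that refactoring ends up, reconstructed from the description rather than copied from the screen: the function names possible and compatible come from the lesson, while the parameter names and the lowercase test strings are my assumptions.

```python
def compatible(bc, wc, guesses):
    """Return True if blanked character bc is consistent with word character wc."""
    # A blank here means wc has not been revealed; if wc was already
    # guessed, it would have been revealed, so the word is ruled out.
    if bc == "_" and wc in guesses:
        return False
    # A revealed letter has to match the word's letter at that position.
    if bc != "_" and bc != wc:
        return False
    return True


def possible(word, blanked, guesses_made):
    """Return True if word could still be the hidden word."""
    if len(word) != len(blanked):
        return False
    # zip pairs up the characters position by position.
    for (bc, wc) in zip(blanked, word):
        if not compatible(bc, wc, guesses_made):
            return False
    return True


print(possible("wonderwall", "_on__r__ll", "otnqurl"))   # w not guessed yet
print(possible("wonderwall", "_on__r__ll", "otnqurlw"))  # w guessed, not revealed
```

The first call should print True and the second False, matching the two invocations traced in the lesson.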
For example, if you wanted to take three lists of words and generate a single list that had the longest of the three words in each position, you could first zip all three lists together to generate a list of tuples, and then you could write a list comprehension to do a mapping operation, transforming each tuple of three words into a single string, the longest of the three words. I'll close with a little zipper joke. One day I was picking up my son from preschool. It was winter. Hey, this is Ann Arbor. And he was having trouble zipping up his coat. I explained it as I was showing him: the secret is you have to get the left part of the zipper to fit in the other side before you try to zip it up. He looked at me quizzically and said, why does it have to be a secret? I'll see you next time. Hey everybody, welcome back. REST APIs in this lesson. API: it's an acronym that stands for Application Programming Interface. It's an interface that lets one computer program interact with another computer program. And REST: not rest as in take a deep breath and relax. It's another acronym. It stands for Representational State Transfer, which actually isn't all that helpful either. Purists will be happy to explain to you at length the more abstract meaning of representational state transfer. But in everyday parlance, a REST API just means a website that produces data intended for another computer program to consume, rather than something intended to be displayed to people in a browser. A REST API will respond to requests in a particular format. Some things about the format are common across many APIs, but some things are specific to the particular API. This week, you'll learn how to write programs in Python that interact with REST APIs. You'll learn to read the documentation in order to figure out the specific format that the API is expecting. And you'll learn how to use the Python requests module to make the API calls.
Finally, you'll learn about caching as a way to reduce the total number of calls that you have to make to the APIs. Before we dive into details of requests to APIs, though, we're going to take a little detour to give you some background on what goes on behind the scenes when two computers talk to each other over the internet. It'll give you a mental model of what's going on so that if things don't work perfectly the first time, and they never do, you'll know how to think about what might be going wrong and have a good path toward fixing it. At the end of this lesson, you'll be able to parse a URL into its component parts: the protocol, HTTP or HTTPS; the host, typically a domain name; and the path. And you'll also be able to use the Python requests module to fetch the contents of a URL. Let's get started. I'll see you at the end. Welcome back. Before I show you how to write a program that will fetch data from a REST API, from another website on the internet, let's take a little time to figure out how the internet really works and what URLs are and all that kind of stuff. So you're all, I'm sure, used to using a web browser. Like we see here, this is a web browser where we're fetching a page from the umich.edu website. You see in the URL bar the URL http://umich.edu/about. You can think about parsing that as that there's a certain part before the colon, and we'll call that the protocol. In this case, HTTP is our protocol. Then we have a colon slash slash. That's just a separator. And then all the stuff until the next forward slash is called the domain or the server. Then we have another slash as a separator. And then we can have some other stuff, which we'll call the path or the arguments. So the server says where to fetch something from, and the arguments say what to fetch from that remote server. The protocol says how to communicate with that remote server. Next, let's talk about those servers. Those servers are referred to by name.
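As a small illustration of the parsing just described, here is a sketch that splits the example URL into protocol, server, and path. It uses urlparse from Python's standard library, which is my choice for the illustration and not something the lesson itself uses.

```python
from urllib.parse import urlparse

# The URL from the browser's address bar in the example.
parts = urlparse("http://umich.edu/about")

protocol = parts.scheme  # the part before the colon: how to communicate
server = parts.netloc    # everything up to the next slash: where to fetch from
path = parts.path        # the rest: what to fetch from that server

print(protocol, server, path)
```

Note that urlparse keeps the leading slash as part of the path, so the separator shows up in the last component.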
Usually you get a name like umich.edu or si.umich.edu or www.google.com. Those are called domain names, and they say what remote computer you're trying to talk to. Every computer that's attached to the internet has a unique identifier, an address. It's called the IP address. IP stands for internet protocol. And no two computers that are connected to the internet have the same IP address at the same time. When a computer disconnects, another computer might get to reuse that IP address. When we have a domain name like www.si.umich.edu, that's sort of a more permanent name, but the actual server that is responding to that name may change over time. And each server will have an IP address. So the internet has something called the DNS, the domain name system, that is set up to resolve names like www.si.umich.edu and turn them into these unique identifiers, the IP addresses. An IP address will look something like 159.89.239.247. The dots are just there to help us divide it up and think about it as four chunks. In each of those chunks we have to have a number between zero and 255. If you ever see an IP address that has a number bigger than 255, that's not really an IP address. The numbers go from zero to 255. It turns out that you can represent that with eight bits, eight zeros and ones. We're not going to go into the details of binary arithmetic here, but we get these four sets of eight bits, and so we have a total of 32 bits. Sometimes these are called 32-bit addresses, and we have the decimal representation with the dots, 159.89.239.247, just to make it a little easier for people to speak them and write them. So how is it that www.si.umich.edu gets converted into that number, and how does every computer on the internet know to find that number whenever it's trying to communicate with www.si.umich.edu? It turns out there's a distributed lookup system, the domain name system, and you can go to various sites that will let you do this lookup.
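To make the four-chunks-of-eight-bits idea concrete, here is a short sketch using Python's standard ipaddress module (my choice; the lesson does not use it). It takes the example address apart, shows each chunk as eight bits, and checks that a chunk bigger than 255 is rejected.

```python
import ipaddress

address = ipaddress.IPv4Address("159.89.239.247")

# Four chunks, each a number between 0 and 255.
chunks = [int(part) for part in str(address).split(".")]

# Each chunk fits in eight bits, so the whole address is 32 bits.
bits = ".".join(format(chunk, "08b") for chunk in chunks)

# The same 32 bits read as one integer.
as_integer = int(address)

print(chunks)
print(bits)
print(as_integer)

# A chunk above 255 does not make a valid address.
try:
    ipaddress.IPv4Address("159.89.239.300")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```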
Your computer has a way of doing that lookup. It has a server that it talks to in the background, but here's one that's public. So I've gone to look. I've already set this up here and you can see I've looked up www.si.umich.edu and I actually got back two possible IP addresses. The main IP address is the first one, so it's that 159.89.239.247. That's the basics of URLs, domain names, and IP addresses. We'll see you next time. Welcome back. Let's talk about how data gets from one computer to another on the internet. When your browser talks to some remote server, say www.si.umich.edu, it establishes a connection and then it sends some data, pieces of text or other formatted data, and each of those pieces of content gets chopped up into a bunch of packets. Each of those small packets gets sent out onto this big routing system that consists of a whole bunch of routers. Routers are just computers whose sole job is to receive these data packets and pass them on. And each packet will go along, goes from one router to the next. Each of those routers has a big lookup table that says if this packet that I've received is destined for some particular address, what's the next hop? What's the next router I should send it to? And magically, all of these routers have managed to coordinate their routing tables so that when a packet gets sent here, it eventually makes its way to the destination. It's a rather remarkable system that there are trillions of these packets getting routed around and they all mostly managed to get to their destination. Occasionally, things go wrong and packets don't make it. Here's the first few lines of a trace route that I did. This is something that gives some diagnostics about if I'm trying to send data packets from my computer to si.umich.edu, what happens to them? And the first line is telling me where does that packet go first? Well, first it goes to a computer that has the address 100.68.0.5 and it took 71 milliseconds to get it there. 
In fact, when I run this trace route program, it does this three times and so one time it took 71 milliseconds, the next time it took somewhat less, the third time somewhere in the middle at 58.4 milliseconds. The second line is saying where does a data packet go if it's eventually trying to get to si.umich.edu? Where does it go next? Well, it went from a computer at address 100.68.0.5 to a computer at 172.19.242.241. And we can see this is called two hops. The first hop went to 100.68, the second hop to 172.19. The third hop was another router at 10.255.255.253. And so on, I haven't shown you all the way, but after eight or 10 hops, the packets eventually get to si.umich.edu. And typically to go two hops, it takes a little longer than to go one hop. So that gives you a little sense of what's going on behind the scenes. We're not gonna have to deal with all of this technical detail. I just wanted to give you a sense of what's happening when one computer is talking to another. So that's the basics of internet routing, how one computer can send data to another computer by way of intermediate routers. We'll see you next time. Welcome back. Let's talk about what happens behind the scenes when your browser fetches data from a remote server. The particular way that we're gonna be most interested in having one computer talk to another is using the HTTP protocol. So that's when we have a URL that begins HTTP colon slash slash, something like www.si.umich.edu and maybe something after the slash. The first thing that's gonna happen is my computer is gonna take this domain name www.si.umich.edu, convert it into an IP address as I illustrated a minute ago. Then my computer makes a connection to the remote computer. 
If I were using HTTPS, that's the secure version, then the first thing that my computer would do is send some information back and forth to the si.umich.edu server that would establish encryption keys, so that all of the rest of the communication would be encrypted and nobody who intercepted that communication would be able to figure out what we were saying to each other. After that setup has happened, my browser will start to send messages, and it'll actually send text. It'll send the word GET and then it'll send whatever arguments were after the slash. It'll say what path we're looking for. It'll also send some headers saying things like, hey, I'm a Chrome browser, or I'm an Internet Explorer browser. The current time is a certain time. It'll give a timestamp and a few other headers. We will then receive back from that server some response headers. Those response headers will say things like, I'm sending you HTML, and the current timestamp is such and such, and there are a few others. I'm gonna show you this in a minute. And then, most importantly, si.umich.edu is gonna send me some HTML. And that HTML my browser is gonna take and turn into what we are used to seeing in a browser, a web page. So the browser renders that HTML. Let's take a look and see what this looks like for the si.umich.edu website. Here I have a web page open in my browser. The URL is https://www.si.umich.edu. And you can see that it's all pretty on the screen. But what's really happening in the background? To see what's happening in the background, I'm using the Chrome debugger tool, and I've got it set up to show us that when I made that request, my browser opened a connection, and the first thing it did was make a GET request. And this is the full URL that it asked for. We can see all of the request headers. So we asked for a GET. The path that we sent was just slash, because I didn't have anything after the slash.
If I had asked for a particular page, then we would see something more in the path down here. The scheme or protocol was https. Then there's a whole bunch of other things that we won't go into. The user agent was compatible with all of these things. It's actually the Chrome browser that I'm using. So these are all things that my browser sent to the server, and the server responded by sending back a whole bunch of response headers. There's the timestamp of when it sent it back. There are various things here including, you can see, a little information that the website is powered by PHP as its back end. And the content type is text/html in the UTF-8 encoding, and so on. So these are all things that my browser uses. And then the most important stuff that we get back was actually all of this HTML. All of this stuff got sent back and the browser managed to parse all of it. This is just text. This text is not what's normally shown on the webpage. What's shown on the webpage is this pretty thing. And that's the browser rendering all of that HTML. If you don't know how to use the Chrome debugger, it can be a little confusing, but if you just want to see what the HTML is for any page, you can do a right click and View Page Source and see all of that stuff that got sent back from the server. So to summarize, when your browser fetches data from a remote web server, it makes a connection. It sends a few headers and asks for the path, the arguments that it would like to receive. It gets back some headers, and especially it gets back HTML, which your browser renders and makes a pretty web page out of. See you next time. Welcome back. Let's talk about a particular URL structure that's very common. We're gonna talk about what happens after the slash. And a very common format is to have some path name, in this case just /list. And then we have the character question mark. And after the question mark, we get query parameters.
So as before, we have a protocol, HTTPS, we have a host, in this case events.umich.edu, and then we have the arguments, everything after the slash. Now, any web server gets to decide what format of arguments it's willing to accept, but this is a very common one, where you'll have some path name and then a question mark and then these query parameters. And these query parameters are very often formatted as key equals value, key equals value. So we have two of these keys. One key is called filter and the other one is called range. The value of filter is tags, colon, art, comma, and range has the value 2018-10-01, the first of October, 2018. The server is going to interpret this request as a request to search on the server to find those events that are happening on, or beginning, October 1st, 2018, and that somehow are tagged with the word art. So if we visit that URL, the web server is interpreting the request as a query and it is returning information just about artistic activities that are happening around campus starting on October 1st. And you can see some of the interesting artistic things that are happening, especially in the university hospitals, but also elsewhere around campus. So this URL structure, where we have as arguments a path and then a question mark and key equals value pairs, is a very common URL pattern and one that we'll be using in our coming lessons to write programs that fetch data from the internet. We'll see you next time. Welcome back. This lesson takes us deep into using REST APIs. First, we'll see how to use the requests module to generate long URLs with query parameters and automatically substitute for text characters that aren't permitted in URLs. Next, I'll show you what to look for in the documentation for a REST API. Third, I'll give you some hints for debugging requests when they don't work right the first time, and we'll conclude by introducing the caching pattern.
It's especially useful when using the requests module so that you don't run into rate limits or problems of servers being inaccessible from the Runestone textbook. At the end of this lesson, you should be able to read the documentation for a REST API, extract the key information you need to use the requests module to interact with it, debug your invocations of requests.get, and understand how our requests_with_caching module works. Bye for now, I'll see you at the end. Welcome back. Remember this URL that we used before to find out what words rhyme with funny? It's got the base part and then a question mark and then key equals value. Well, how did we know what to put into that base part? And how did we know to use rel_rhy in order to ask for things that rhyme? The answer is you use the documentation for the API. So let's take a look at the documentation of that API, and we'll see they have lots of things in here. So in this documentation, they tell us that the main URL is going to be api.datamuse.com, and then they tell us some things that might go after that. You might have /words, then a question mark and something, or we'll see a little later on that there are some other things. There's one other endpoint besides words. So we'll describe /words as being one endpoint and then the things after the question mark as being query parameters. So you can ask for ml and give a value, ringing in the ears. So in their documentation, they tell you that if you want to find words with a meaning that's similar to ringing in the ears, you would give the query over on the right hand side. And we can scroll down and see some other things that you can do, lots of opportunities, lots of different things you can do. And here's that rel_rhy that I found. That's for words that rhyme with forgetful. Now they've started by giving us a bunch of examples, but then down below they'll give you even more detail. So they're saying you can access most things at api.datamuse.com/words.
So that's going to end up being our base URL. And then we have to figure out what we are going to put into that params dictionary. And we can use the examples above, or we can try to look at some of the things below that are giving more detailed information. So we can have the key ml, we could have a key sl for sounds like, or there's a whole set of rel_ things. There's rel_jja or rel_trg or rel_rhy for rhymes with. So that's how I figured out what some of the possible keys could be in my params dictionary. And then for the value, I really got it from my examples, where I said, oh, well, the value is just the word that I'm trying to pass as an input. I'm trying to get things that rhyme with forgetful or things that rhyme with funny. So that's telling me what kind of queries I can give to this REST API. And at the bottom, they'll tell me a little bit more about interpreting the results. So I've passed over the other endpoint, /sug, and I've found something about what the responses will look like. And it's gonna be a list. That's what the square brackets tell me. And they tell me it's always gonna be in the JSON format, which you've been seeing already. And it's a list of dictionaries. Each dictionary is telling us information about one of the words that is related to our query. If our query was for words that rhyme with funny, then each word that is a good rhyme for funny would have one dictionary in this list. So this is a relatively simple API and the documentation is pretty clear. You can figure out pretty well how to use it. The other thing I wanna show you with this API is just a nice practice. If you know you're gonna make a bunch of calls to this API to get a bunch of rhyming words, maybe you're using this as an aid to your next rap hit. You might make a lot of calls to get rhymes. And so I've defined a little function here that I can call multiple times.
So I can call get_rhymes on a word and I'll get back a list of some other words. So with funny, I'm getting back three things, and I can get rhymes on something else. If I wanna get rhymes with dash, I get cash, flash, and crash. Now how did I implement this get_rhymes? Well, get_rhymes is gonna take as an input a word like funny or dash. So that's gonna be bound to my word parameter, and it's going to return a list of three rhyming words. But you saw that what we actually get back is not a list of words, it's a list of dictionaries with information; each dictionary has information about one rhyming word. So the base URL is always the same, and I'm gonna create a blank parameters dictionary and then I'm gonna fill in that rel_rhy should equal word and max should be three. So we're asking it to only give us three results back, and then I call requests.get, passing in that params dictionary. I get a response object. I turn that into a Python object by calling the .json method on it, and now I'm gonna use the list comprehension, which you learned about recently. We're iterating through the list of dictionaries, and for each of those dictionaries we're just extracting the word. So that produces a list like this, and then I'm just returning it. I just noticed that on line 13 I have another return statement where it's returning something different, but of course once we execute the return on line 12 we won't ever do line 13. That's left over from a previous version of this that I was trying, where instead of extracting just the words I was returning the whole dictionaries. In fact, it might be instructive to look at that. Let's comment out line 12 and see what happens if we return the dictionaries rather than extracting just the words from those dictionaries. So now we get a list of the three dictionaries; instead of just getting money, honey, and sunny, we're getting this whole dictionary instead of just the word money.
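A sketch of what a get_rhymes function along these lines might look like, reconstructed from the description rather than from the screen. The helper names build_params and extract_words are hypothetical, split out so the pieces can be tried without a network connection, and the score values in the sample response are made up for illustration.

```python
def build_params(word, max_results=3):
    """Query parameters asking Datamuse for a few rhymes of word."""
    params = {}
    params["rel_rhy"] = word
    params["max"] = max_results
    return params


def extract_words(results):
    """Pull just the word out of each dictionary in the response list."""
    return [d["word"] for d in results]


def get_rhymes(word):
    """Return a short list of words that rhyme with word (needs network access)."""
    # Imported here so the helpers above work even where the third-party
    # requests package is not installed.
    import requests

    baseurl = "https://api.datamuse.com/words"
    resp = requests.get(baseurl, params=build_params(word))
    return extract_words(resp.json())


# Offline check with a response shaped like the one described above.
# The scores are invented; only the "word" key matters here.
sample = [
    {"word": "money", "score": 3},
    {"word": "honey", "score": 2},
    {"word": "sunny", "score": 1},
]
print(build_params("funny"))
print(extract_words(sample))
```

Calling get_rhymes("funny") with a working connection should return a list of three words; returning resp.json() directly instead would give back the whole dictionaries.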
So this is the kind of thing that's pretty helpful: if you know that you're gonna do the same thing, make a bunch of requests to the same API just passing a slightly different parameter each time, you might wanna make a little function out of it like I did here on lines four through 13. That's an introduction to reading the documentation for a REST API and also to making a function to repeatedly make similar kinds of requests to a single API. We'll see you next time. Welcome back. I'm gonna introduce you to a programming pattern called caching. We have some expensive or maybe unreliable operation. We'll have a little function that takes an input and produces an output, but it might take a long time to run it. Each time we run it, we're gonna take the results and we'll stick them into a cache, and that cache is gonna save our previous results. You can think of it sort of like a squirrel's cache of nuts. It's spelled C-A-C-H-E, pronounced "cash," and it just associates some inputs with outputs. I'll call them keys and results. And now the next time I was gonna ask this expensive operation to do the very same operation that's been done before, I would first check and see, hey, is this result already available to me in the cache? If it is, just send back the same result I would have gotten if I'd run that expensive operation, but don't bother running it; just take the result from the cache. Now in our case, the expensive operation is requests.get. It's expensive because it takes a little bit of time to go out over the internet to make a connection to another server. It's also a little unreliable. Sometimes you don't have a good internet connection; sometimes the server that you're connecting to doesn't respond in the way that it did yesterday. In our textbook, there are some sites that have restrictions on cross-site scripting, and some days it works to connect to them, some days it doesn't.
There's another problem: sometimes when we call requests.get, we're doing some debugging and we run our code a bunch of times, and the site that we're talking to has a rate limit. It says you can only make 15 calls every 15 minutes, and you've run this more than 15 times in the last 15 minutes. So these are all reasons why it's gonna be a good idea to save our results when we run requests.get, put them in a cache, and get the results from the cache the next time rather than calling requests.get again. So we've implemented this caching pattern in a module. It's available only in the textbook and it's called, as you might guess, requests_with_caching. So in the code window here, I've got an import statement, import requests_with_caching, and then requests_with_caching is available to us and we're going to call its get function. This get function in requests_with_caching is going to return exactly the same kind of result, a response object, just like if I were to call requests.get. But the way it works is that it's going to first look in the cache and see if it can find the result there. If so, it'll give us the results from the cache. If it can't find it in the cache, it calls the real requests.get and it returns that, but it also saves the result in the cache so that the next time we'll get the result from the cache. There's one little twist: in this requests_with_caching module we've actually implemented two caches. There's stuff that we've provided as part of the textbook, and that goes into a permanent cache file. And then there's a temporary cache. You can think of it as a second little database. That's stuff that's saved between code runs while you're on the current page, but it disappears when you reload the page. So when you call requests_with_caching.get, it checks in both of the caches. If it's found in either place it returns that.
If it doesn't find it in either place, then it does call requests.get and it saves the result in this temporary, page-specific cache. So let's see what happens when we run this code. So we have a permanent cache file called datamuse_cache.txt. And our first call to requests_with_caching.get goes to the Datamuse API and asks for words that rhyme with happy. And then we can see the results. requests_with_caching.get tells us whether it found the result in the caches or whether it didn't. In this case it didn't find it in the cache, and so it says it's adding it to the cache. Then on line four I've printed the first 100 characters, and you can see the things that are rhyming are snappy and nappy and so on. On line six I'm making exactly the same request. I'm asking for words that rhyme with happy again, and this time we're told that it found it in the page-specific cache because it saved it from the first time that we made the call. In the last call that we make, we're asking for a different word. We're asking for words that rhyme with funny, and we have a cached result in the permanent cache file for words that rhyme with funny, so it found that in the permanent cache. Now if I were to run this whole thing again, it's gonna run where it already has the page-specific cache. And by the way, you can see the contents of that page-specific cache. It's telling us it's stored in this data file called this_page_cache.txt, and it's the things that rhyme with happy, like snappy and nappy and scrappy. If I run this again, instead of showing new here it's gonna tell us that it found it in the cache, so let's do that. It takes a little while to run because of some inefficiencies, but we're gonna see that "new; adding to cache" is gonna change because it's gonna find it in the cache. So now it says it found it in the page-specific cache. If I were to reload this whole page, which I'll do now, I'm gonna clear my markings and I'm gonna reload the page.
Now I've reloaded the page and that got rid of the page specific cache when I run this again the first time it's gonna have to do a call to request.get and add it to the cache. That's the request with caching module that we've provided. It's really easy to use just import request with caching and then you call requestwithcaching.get the same way as you would call request.get and it just makes it so that when you make the same request multiple times the additional times that you make the same request you'll get data from the cache. See you next time. Welcome back. Let's take a look at the code for the request with caching module. It's not very complicated and it's pretty instructive plus it's good practice to read slightly longer code chunks and try to make sense of them. Let's start here with the definition of the get function. It takes the same parameters as request.get a base URL and the optional parameters dictionary but it also has a few extra optional parameters like private keys to ignore. We'll come back to those. In the previous video I described conceptually how a cache works. Now we've got the data store where we're gonna store the results of our previous invocations of request.get but how are we actually gonna implement that in Python? The answer is we're gonna do it as a dictionary where each key will represent one invocation of request.get. The key has to somehow encode what the base URL and what the parameters were of that invocation of request.get and then the results will be some text that was what we got back from running request.get. So in our code here we're going to compute a cache key. We're gonna make this value k by calling a function, a little helper function that I'll show you in a minute called make cache key. And it's gonna take the base URL and the params and turn it into a string. We'll take a look at that make cache key function in a minute. 
But just think of it for now as producing some string that we'll use as a key in the cache dictionary. Now the heart of this get function is this if, elif, else. So we're gonna check whether this cache key is in either of our two caches, the page-specific cache or the permanent cache. So if it finds it in the temporary cache, let's say it's here, then we're going to take these results and return them. So if we found it in the temp cache, that is, the cache key that we're looking for is in the temp cache, then we go get the value associated with that cache key and that's what we're gonna return. And you may recall that we're trying to produce the same kind of object that requests.get produces, and that was a response object that has not only the text but also the URL. Now, if we actually called requests.get, this would be the URL that we fetched from the remote server. In this case, it's gonna be the URL that we would have fetched if we had actually gone to the remote server. So that was computed up here using another little helper function. If it didn't find it in the temporary cache, it'll do the same thing with the permanent cache. If it was in the permanent cache, return the text and the URL. If it didn't find it in either of the two caches, then we will actually call requests.get, passing the base URL and the params. Whatever the result was gets added into the cache. And again, this uses another helper function that we'll look at in a minute called add_to_cache. So the basic structure is we calculate what the cache key oughta be and then we check in the two caches: is that key there? Is it in either spot? If it's in either spot, we return the results from that cache dictionary, and if it's not there, then we call requests.get and we add it to the cache so that it will be there the next time.
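The if, elif, else structure just described can be sketched roughly as follows. This is a simplified illustration, not the textbook module's code: the caches are in-memory dictionaries rather than files, the function returns plain text instead of a response object, and fake_fetch is a hypothetical stand-in for the real network call.

```python
permanent_cache = {"funny": "money, honey, sunny"}  # pretend this shipped with the textbook
temp_cache = {}                                      # filled in as new requests are made


def get_with_caching(cache_key, fetch):
    """Look in both caches before falling back to the expensive `fetch` call."""
    if cache_key in temp_cache:
        print("found in temp cache")
        return temp_cache[cache_key]
    elif cache_key in permanent_cache:
        print("found in permanent cache")
        return permanent_cache[cache_key]
    else:
        print("new; adding to cache")
        text = fetch(cache_key)          # stands in for requests.get(...).text
        temp_cache[cache_key] = text     # so the next identical request is a cache hit
        return text


def fake_fetch(key):
    """Hypothetical stand-in for a real network request."""
    return "results for " + key


get_with_caching("happy", fake_fetch)   # new; adding to cache
get_with_caching("happy", fake_fetch)   # found in temp cache
get_with_caching("funny", fake_fetch)   # found in permanent cache
```

Note that new results only ever go into the temporary cache; the permanent cache is read but never written in this sketch.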
And the only other little tricky part in here that I glossed over is that we want these caches to live beyond the current run of the active code window. And so to make it be a little more permanent, at least to live a little longer than the current run, we want to save it in a file. I've shown the cache as if it's just a dictionary, but actually this dictionary we're going to store in a file. So we have to read this dictionary from a file and I'll show you this underscore read from file as another little helper function. So let's take a look at those helper functions now now that we understand the basic structure. The first one is this make cache key. So it's taking the base URL and the params dictionary and it has to make a string out of that request, a text string that we can use as a key in the cache dictionary. Now the URL itself, we could just make the full URL out of this base URL and the parameters dictionary, but we're making something slightly different in our case because there are some keys that we want to exclude in particular there may be private data that's used for authentication and we'll see more about this when we get to the Flickr API. The other thing that we want to do here is to make sure that we always get the parameters in the same order. If there's a couple of different parameters, remember this dictionary, the params D, the keys could come out in any order and if one time we call it, we get the keys in one order and another time we get them in a different order, we might fail in our lookup trying to find something in the cache. So we've created a canonical order. We alphabetize the keys by sorting them and the default order is to sort them in an alphabetic order and then we go through each of the keys and we just create a little string that puts together the key and the value. So we're getting all of the keys and values from params D and we're sticking them together into one giant string separated by underscore characters. 
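A minimal sketch of a cache-key helper along the lines described above might look like this. The parameter name private_keys, its default value, and the exact way each key and value are joined are assumptions made for illustration; the textbook module may differ in those details.

```python
def make_cache_key(base_url, params_d, private_keys=("api_key",)):
    """Build a string that identifies a request, ignoring private keys and key order."""
    pieces = []
    for k in sorted(params_d.keys()):        # canonical (alphabetical) order
        if k not in private_keys:            # leave out authentication data
            pieces.append("{}-{}".format(k, params_d[k]))
    return base_url + "_".join(pieces)


# The same parameters in a different order, plus a private key, give the same cache key.
key_a = make_cache_key("https://api.datamuse.com/words", {"rel_rhy": "happy", "max": "3"})
key_b = make_cache_key(
    "https://api.datamuse.com/words",
    {"max": "3", "api_key": "secret", "rel_rhy": "happy"},
)
print(key_a)
print(key_a == key_b)
```

Sorting the keys is what makes the lookup reliable: two requests that mean the same thing always map to the same string.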
So what we're gonna get is something that looks a lot like the full URL, but might be a little bit different. Let's look at the other helper functions. First, let's look at this _read_from_file. Basically, we're just taking the contents of a file and reading them into a Python dictionary. So most of the action is right here. We open the file and then we read it and we call json.loads, which takes the text of the file and turns it into a Python object. In our case, it's gonna be a dictionary. And I've wrapped this in something called try and except. If you haven't seen that before, don't worry about it too much. It's covered in the next course in the specialization. It just lets the code fail gracefully. So if it can't find this file or if it can't load it as JSON, then we'll just return an empty dictionary. So that's reading it from the file and turning it into a Python dictionary. The other helper function is this add_to_cache. So we've got some cache file name and we've got some new key and results that we wanna save into the cache. So we first read that dictionary using _read_from_file. We add a key to it with the value that's been provided, and then we just write that back out to the file. And write to file, you can see up here, is another helper function that just opens the file and writes the dictionary out. So that's a simple caching pattern that we've implemented. It's a wrapper around the expensive and unreliable operation requests.get. You can just call requests_with_caching.get. It returns the same thing as requests.get. If it has a saved previous version of the request, it returns those contents, and if not, then it calls requests.get. It's useful because it helps you avoid problems with rate limits when you're debugging your code that's working with external REST APIs. I'll see you next time.

Welcome back.
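The file helpers described in the walkthrough of the caching module (reading the cache dictionary from a file, failing gracefully, and writing an updated dictionary back out) can be sketched like this. The function names, the JSON-on-disk format, and the demo file name are assumptions for illustration rather than the module's actual code.

```python
import json
import os
import tempfile


def read_from_file(cache_file):
    """Load the cache dictionary from a file, or return {} if that fails."""
    try:
        with open(cache_file, "r") as f:
            return json.loads(f.read())
    except (OSError, ValueError):   # missing/unreadable file, or contents that are not JSON
        return {}


def write_to_file(cache, cache_file):
    """Save the whole cache dictionary to a file as JSON text."""
    with open(cache_file, "w") as f:
        f.write(json.dumps(cache, indent=2))


def add_to_cache(cache_file, cache_key, cache_value):
    """Read the saved dictionary, add one entry, and write it back out."""
    cache = read_from_file(cache_file)
    cache[cache_key] = cache_value
    write_to_file(cache, cache_file)


# Try it out in a throwaway directory.
demo_file = os.path.join(tempfile.mkdtemp(), "demo_cache.txt")
print(read_from_file(demo_file))             # {} because the file does not exist yet
add_to_cache(demo_file, "happy", "snappy")
print(read_from_file(demo_file))             # {'happy': 'snappy'}
```

Rewriting the whole file on every addition is simple rather than fast, which fits the small caches used in these lessons.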
At this point, you should be able to read API documentation in order to formulate and debug invocations of the requests.get function. And you should be able to understand and appreciate the requests_with_caching module. And you may be wondering where I get all my corny jokes from. Well, some of them are favorites I've accumulated over the years and retold probably too many times. And of course, there's a REST API to search for these jokes at icanhazdadjoke.com. That's haz with a Z. The base endpoint gives you a random dad joke. Here's what I got back the first time I invoked the API. What do you call a crowd of chess players bragging about their wins in a hotel lobby? Chess nuts boasting in an open foyer. That one's an old chestnut. Here's a more profound one I got another time. What was a more important invention than the first telephone? The second one. The API also lets you do searches. When I asked for /search?term=hipster, I got back three. How much does a hipster weigh? An Instagram. How many hipsters does it take to change a light bulb? Oh, it's a really obscure number. You've probably never heard of it. How did the hipster burn the roof of his mouth? He ate the pizza before it was cool. Okay, okay, I can hear you groaning through the screen. Enough with the puns. I hear you. I'll give it a rest. See you next time.

Hi, welcome back. This lesson just gives you more practice at reading API documentation and constructing calls to requests.get or requests_with_caching.get. Along the way, we'll also learn about API keys, which are the simplest form of authentication that is sometimes required when your program tries to interact with a REST API. There's no new learning objective for this lesson. The goal is just to increase your fluency so you'll be ready for the final project, where you have to decipher the documentation for two different REST APIs and create a mashup that combines results from both of them. So, bye for now.
I'll see you at the end.

Welcome back. Let's take a look at another REST API, iTunes. iTunes is Apple's media store. The REST API lets us make queries asking what's in the store. This page is their documentation. There's a base URL which goes up through search, so, https://itunes.apple.com/search. And then, as we've seen before, there are query parameters. There's a question mark and then some key-value pairs. What are the keys and values that iTunes understands for those query parameters? Let's scroll down and we'll see. We've got a whole table of them. There's a bunch. In the left column are the key names, things like term, country, media, or entity. In the right column, there's some information about what the allowable values are. So, the term, that's gonna be what we're searching for, and you can put any text string in there. But for media, there are only certain allowable values like movie or podcast or music. So, in our URL, when we make a request to iTunes, we're gonna have something like media=podcast. The other two columns tell us whether we are required to include this query parameter. Yes, you have to include term and you have to include country, but the other ones are optional. And then there's just a description that gives you some hints about what you're getting. So, let's try to figure out what we would pass in for the params dictionary when we make a call to requests.get. I'm gonna go back to Runestone and we'll come back and look at this documentation as we need it. In the textbook, we've got a couple of exercises. Suppose we were trying to look for podcasts originating in Ann Arbor and we're gonna pass in a params dictionary to requests.get. What value should we put here as the value associated with term? So, this is the thing we're searching for. And so, since we've said we're looking for podcasts that mention Ann Arbor, let me put Ann Arbor in here. Check me, and it says, oh, don't forget the quotes. It needs to be a string.
And we're gonna be putting a value here and it's gotta be a string. So, let's put that value in quotes. When it gets encoded into the URL, it won't have the quotes around it. But in order to be a value in this params dictionary, it does need the quotes. And that's all good. So, let's look at our next question. There's another parameter. If we're searching for podcasts, we're gonna have to give another query parameter whose value is gonna be podcast. Which query parameter is it? Well, you could try to guess. Perhaps you remember from the documentation page, but let's go look. And it turns out we can say podcast for the query parameter media. If we scroll down a little farther, it turns out we also can say entity=podcast. If we say media=podcast, then we'll be doing a little bit broader search. We can do a slightly narrower search where we say entity=podcast, and that will be limited to searching for Ann Arbor in the podcast title, excluding matches where Ann Arbor is part of the author. So we could say either media or entity. We've actually set this up so that it'll accept either one. If you enter media, it says, yes, you can use that. But in our code below, we're gonna use entity, so entity is also okay. So here's a request that we can make. We're using requests_with_caching rather than requests just to make sure that if there are any problems getting to iTunes on a particular day, it'll still work. And we're passing in term Ann Arbor and entity podcast. It'll take us a minute and we'll get a result. It found it in the permanent cache. Now we haven't chosen to do anything with this, but I could also, for example, print what the full URL was that was created. And that's this URL. Now because we used caching, it didn't actually go and make the request to itunes.apple.com, but this is the request it would have made. It got the saved results, the cached results, for that URL. Now at this point, I've loaded the response text into a Python object.
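To see why the value needs quotes in Python but shows up without them in the URL, here is a small sketch using the standard library's urlencode to build the query string by hand. When you pass params to requests.get, the requests library does this encoding for you; the hand-built version is only meant to make the result visible.

```python
from urllib.parse import urlencode

base_url = "https://itunes.apple.com/search"
params = {"term": "Ann Arbor", "entity": "podcast"}  # values are Python strings, so they need quotes

full_url = base_url + "?" + urlencode(params)
print(full_url)
# https://itunes.apple.com/search?term=Ann+Arbor&entity=podcast
```

The quotes are gone, and the space in "Ann Arbor" has been encoded as a plus sign so that the URL stays valid.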
I've saved that in the variable called py_data. And at this point we would have to go through our usual understand, extract, repeat process to pull out the data that we want from whatever is in py_data. I'm going to skip ahead because I've already done that process of understand, extract, and repeat, and this is what I got to. So I found that the top-level object was a dictionary. It had a key called results, and I have extracted the value associated with results, and that value is a list. It's a list of dictionaries, one dictionary for each podcast result. Each of those podcasts is a dictionary that has a key called trackName, and I've extracted the track name. So when I run this, I will get the names of podcasts that have Ann Arbor in them. So it's telling us that it found it in the permanent cache, and then on each line, we have one track name. As you can see, there's one from the Ann Arbor District Library, the award-winning Ann Arbor District Library, I should say. We love our library in Ann Arbor and we love libraries in general at the School of Information. And there's a bunch of podcasts from churches. You can see the Harvest Mission Community Church, Grace Bible Church, and so on. And actually, there are a few more podcasts from the Ann Arbor District Library. So that's an example of deciphering the iTunes API documentation and using it to construct REST API queries. We'll see you next time.

Welcome back. Let's try querying another REST API, this one from Flickr, one of the first popular photo sharing sites. It's still used by a lot of excellent photographers. It's run by Yahoo now. Here's the overview documentation page for the Flickr API. We're going to use the photo searching endpoint. There are lots of other things you can do with the Flickr REST API, including posting photos and lots of other things.
We'll just look at searching because you can do that even if you only like looking at photographs and you don't have a Flickr account where you've uploaded photos. So first, at the bottom of this page, notice that they give what they call an endpoint. That's what we will call the base URL. Everything after that is going to have question mark key equals value pairs. The next notice that they have a required parameter called the API key. So one of these is going to have to be API key equals something. That's a developer specific key that you get from Flickr by signing up for it. If you access one of our few cached queries, you actually won't need an API key. But if you want to run any other queries that we haven't put into a cache for you, you will need to get an API key from Flickr. A third thing to notice on this page is that they have another required query parameter called method. So we're going to have to have something that says method equals something. In our case, it's going to be something about search. And we'll find that on a more specific documentation page. So let's look at the documentation page for searching. I already have it loaded here. We're going to have to say method equals Flickr.photos.search. There are also a bunch of other keys that we can provide, what they call arguments. I'm just going to clear my marking here and scroll down to show you that there's a lot of options or arguments and it just keeps going. These are all things that are going to affect what gets retrieved. So we can ask for only things that are photos that are taken indoors or only photos that are taken outdoors. We can provide a latitude and longitude and it'll bring us back photos that are within some radius of that location on Earth. So there's a bunch of them here. We're not going to use them all in the examples that I show you, but we can come back to them whenever we need. It's worth noting that the API key is required. I talked about that already. 
And then we're going to search for tags. So for photos on Flickr, the photographer, when they upload them, can give them tags like this is indoors, or this is still life, or this has mountains in it, or they can put a city name, any tag that they choose. So we're going to use that to search for photos of mountains and rivers. And then there's this tag_mode. If you provide more than one tag, that can either be treated as a search for photos that have any one of those tags, that's the or combination, or as a search for photos that have all of those tags, that's the and combination. Now if I scroll all the way down, we'll see something about what the results look like. Here's an example response. You'll notice that it doesn't look much like our favorite JSON format. This is actually XML format. So by default, Flickr returns data not in the JSON format but in XML. It's possible to parse XML in Python. It's not that difficult, but we're not going to go into that in this specialization. Most sites that provide XML also have some way to ask for JSON instead. Flickr does have a way to ask for JSON instead. They don't make it that easy to find out how. I didn't find it anywhere on this page. But I did do another Google search. And if I just search for Flickr API JSON, the first response that comes up is a documentation page about how to get things in JSON format. So the trick is, let me show you somewhere here. It's right here. Send a parameter called format with a value of json. So we're going to have in our URL, format=json. There's one other little tricky thing. I'm just going to get rid of those markings and we'll scroll down. It turns out that at the bottom of the page, they have a little thing that says something about callback functions. It turns out we have to send nojsoncallback=1. It's a little more technical than we're ready for, but here's the basic idea.
Without that nojsoncallback=1, you would get JSON results that are wrapped in basically a JavaScript function invocation called a callback. That wrapping is part of a standard called JSONP. In any case, what we need to do is include nojsoncallback=1 in order to just get pure JSON without any extra characters around it. So with those basics from the documentation, let's look at some code for making API requests. As with the other APIs, we're going to use requests_with_caching so that we can avoid you having to actually get an API key from Flickr. You'll just get data from our cache, where we won't check for an API key. And just like we did for the Datamuse API, I've defined a function, get_flickr_data. We'll pass in as a parameter this tags string that's saying what tags we want to search for. So when I call it on line 26, I'm going to pass in this comma-separated list of tags, river,mountains. Inside this get_flickr_data function, you can see that I'm going to get set up to make a call to requests_with_caching.get. In order to set up for that, I set my base URL, I make an empty parameters dictionary, and then I set a bunch of key-value pairs. I'm going to have api_key equal to whatever I have for my Flickr key, which is just a variable up here. For the tags key, I'm going to set it to be river,mountains, or whatever the current value of the tags string is, whatever has been passed into this function. For the tag_mode, I'm going to say all. I want photos that have both river and mountains as a tag. The method is flickr.photos.search. I only want to get five results back. I would like only photos, not videos, and I want the format to be json and nojsoncallback equal to one. So that creates that whole dictionary. I pass that in to my requests.get, or rather requests_with_caching.get, and I'm telling requests_with_caching to use the file called flickr_cache.txt as its caching file.
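A sketch of the parameters dictionary described above might look like the following. The helper name build_flickr_params is made up for illustration, and the parameter names per_page and media are my reading of "five results" and "only photos, not videos" rather than something spelled out in the lesson; the base URL is assumed to be Flickr's REST endpoint.

```python
base_url = "https://api.flickr.com/services/rest/"  # assumed REST endpoint


def build_flickr_params(tags_string, api_key):
    """Collect the query parameters for a Flickr photo search."""
    params = {}
    params["api_key"] = api_key               # developer-specific key from Flickr
    params["tags"] = tags_string              # comma-separated, e.g. "river,mountains"
    params["tag_mode"] = "all"                # require every tag, not just any one of them
    params["method"] = "flickr.photos.search"
    params["per_page"] = 5                    # only ask for five results
    params["media"] = "photos"                # photos only, no videos
    params["format"] = "json"                 # the default would be XML
    params["nojsoncallback"] = 1              # plain JSON, not wrapped in a JSONP callback
    return params


params = build_flickr_params("river,mountains", "YOUR_FLICKR_KEY")
for key, value in params.items():
    print(key, "=", value)
```

This dictionary would then be passed as the params argument to requests.get or requests_with_caching.get along with the base URL.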
So I get a response object back and I'm just going to print the URL from that. That's the URL that we would have gone to if we didn't find it in the cache. And then we return a Python object. We turn the text string into either a list or a dictionary by calling this .json method. So I'm going to invoke that on line 26, and then once I get the results back as a Python object, in this case it's going to be a dictionary, I have to go through my understand, extract, repeat process to figure out how to pull out the data that I want. Again, I've already done that, so I'm not going to walk you through that process, but I've discovered that the result dictionary has a top-level key called photos. As the value associated with the photos key, there's another dictionary, and that dictionary has a key called photo. I find that a little confusing, but that's the way the data comes back from Flickr. The value associated with that photo key is a list. It's a list of dictionaries, one dictionary for each photograph, and the dictionary for one photograph has a key called owner and another key called id. So for each of the photo dictionaries that's returned, I'm going to extract the owner and the photo ID, and I can stick those two values into a URL, because I've looked up the format that Flickr uses for making URLs to actually load a page with a single photo in it, and I'm just substituting owner and photo ID in to make this URL string. So if I run this, I'm going to get a printout of five different URLs. You can see the five URLs are here, and above them we got the URL that we would have fetched if we hadn't found the result in the cache. If this were a full Python environment, we would be able to import the webbrowser module and we could call webbrowser.open, which would make this URL open up automatically in another tab. I don't have that option available to me, so we just printed it out and I'll go and make a new tab.
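The extraction step just described can be sketched as follows. The sample dictionary is hand-written with placeholder owner and id values so the code can run offline; only the nesting (photos, then photo, then a list of dictionaries with owner and id) comes from the lesson. The photo page URL pattern is an assumption about the format Flickr uses.

```python
def photo_page_urls(result):
    """Build one photo-page URL per photo dictionary in a search result."""
    photos = result["photos"]["photo"]   # top-level dictionary -> nested dictionary -> list
    urls = []
    for p in photos:
        # Assumed page format: https://www.flickr.com/photos/<owner>/<photo id>
        urls.append("https://www.flickr.com/photos/{}/{}".format(p["owner"], p["id"]))
    return urls


# Placeholder data shaped like the structure described above (not real photos).
sample_result = {
    "photos": {
        "photo": [
            {"owner": "owner_one", "id": "111"},
            {"owner": "owner_two", "id": "222"},
        ]
    }
}

for url in photo_page_urls(sample_result):
    print(url)
```

In a full Python environment, each of these strings could be handed to webbrowser.open instead of being printed.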
We will visit that URL and there we have a photo that's tagged with mountains and river. I can see the river. I'm not quite sure where the mountain is for this one and we got some more. Here's another one. I put that URL in instead. Here's one that does have a river and a mountain. Now notice that on line 23 I have the code printing out the URL. That's the URL that we would have visited if we hadn't found it in the cache. And if I copy this, I can go to another tab and see what I would get if I actually went to that URL for the API call. The answer is I get a message saying we failed because it's an invalid API key. I didn't provide a valid API key and so I can't actually make that query. That's one of the advantages of getting the stuff from the cache is that we're not requiring you to have that API key. Now if I were to change this and instead of asking for say river and mountains ask for cows and mountains. I've tried this before and we tend to get beautiful pictures from Switzerland and the Pyrenees in Europe. If I try to do this, it won't find it in that cache and so it will try to fetch it. So it's new and adding it to the cache. Here's what the URL is. But we get an error. And the error is on line 30. When we go to try to extract the photos and photo IDs on line 30, we're finding that we get a key error. In the dictionary of results, we don't have anything called photos. So let's see what the result really was and it's gonna be the same error message that we saw in the browser window. So we get the error message that it was an invalid API key. So you can make this work if you sign up for an API key and put your API key on line eight, you would be able to make new queries with this code. We didn't show it here, but there are other operations with Flickr that require user-specific authentication like if you wanna post photos to somebody's account. You'd need to prove that you have permission from them. 
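One way to avoid the KeyError described above is to check whether the photos key is present before digging into the result. This is only a sketch: the function name is made up, and the shape of the failure dictionary (the stat, code, and message keys and the message text) is an assumption about what an error response looks like.

```python
def photo_list_or_error(result):
    """Return the list of photo dictionaries, or report the API's error and return []."""
    if "photos" in result:
        return result["photos"]["photo"]
    # No 'photos' key: the request probably failed, so show whatever message came back.
    print("Request failed:", result.get("message", "no message in response"))
    return []


# Hypothetical failure response, shaped the way an invalid-key error might look.
failed = {"stat": "fail", "code": 100, "message": "Invalid API Key (Key has invalid format)"}
print(photo_list_or_error(failed))

# A successful response with the nesting described in the lesson.
ok = {"photos": {"photo": [{"owner": "owner_one", "id": "111"}]}}
print(photo_list_or_error(ok))
```

Checking first makes the program print a readable explanation instead of stopping with a traceback when the API key is missing or invalid.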
APIs provide features for doing that kind of user authorization through a protocol called OAuth, but that's beyond the scope of this specialization. And once you've authorized, each request would get digitally signed, and there's a Python module for doing that called requests_oauthlib. Again, it's beyond the scope of this specialization. The key takeaways from this lesson: One, sometimes you need to pass a lot of query parameters. Take a look at lines 13 through 20. Sometimes there are a lot of choices about what the query parameters are. Remember that long list in the documentation. Sometimes you have to specify that you want things in JSON format because that might not be the default. And sometimes you have to specify other idiosyncratic things like this nojsoncallback=1 query parameter. Finally, many sites require you to provide an API key, which you get by registering with that site. See you next time.

Hi, welcome back. Hopefully by now you've seen enough examples of API documentation and constructing calls to requests.get or requests_with_caching.get that you'll be ready to try it on your own. There's nothing like practice and repetition for building fluency, which reminds me of one of my favorite old jokes. A young tourist from a small town in Michigan is wandering around Midtown Manhattan. He looks at his watch and realizes that he has tickets to a symphony orchestra concert at Carnegie Hall starting in half an hour, but he's not quite sure where Carnegie Hall is. So he sees an older woman carrying a violin and figures, she must know. And he asks her, excuse me, can you tell me how to get to Carnegie Hall? She looks at him slowly and says, young man, practice, practice, practice. I'll see you next time.

Hello, and welcome to the project for course three. In this project, you'll be making what's called an API mashup. You'll be getting and using data from two different APIs.
So ultimately in this project, you will get data from two different places on the internet, process all of that data in a step-by-step way and achieve a new result that comes from getting the data from two different places. Just like other APIs you've examined before in this course, you will need to read and understand the documentation that goes with each of these APIs. And your goal with reading and understanding that documentation will be to pull out the information from the documentation that you need in order to make a request to each API. So you might need to read through some stuff that's complicated or includes things that you don't necessarily need or words that you haven't heard before just to find those pieces of information that you really wanna find. Like for example, the base URL for the API request. You'll wanna keep in mind as you work on this project, what you know already about building functions because that's something else that you'll do throughout this process of building a result and processing data. You'll wanna remember that as you work on this project, you'll be using the special tools that we've built in this specialization to allow you to get and cache data from the internet and retrieve data from the cache. As you work through each step, make sure to take each slowly, think carefully about what exactly it is you want to achieve at a given moment, make your plan in words, you've probably heard me say this before, and then translate those words into code. And that will make the process of building up a result from these brand new APIs much easier and make it seem much more like the other things that you have already done in code before. This is a type of program, getting data from two different places and doing something new with it that is useful in all kinds of situations, in all kinds of work, in all kinds of different disciplines. I wish you a lot of luck and I hope you have fun.