Welcome. If you're returning after learning the basics with us, we're delighted to have you back. And if you're just joining us for the first time because you already know the fundamentals of Python, like if-else statements, for loops, and strings and lists, then welcome for the first time. Just be sure that you watch the video introducing the Runestone textbook environment, because we're going to be using that throughout the course. I'm Paul Resnick. And I'm Steve Oney, and we're both faculty here in the School of Information at the University of Michigan. In this course, we're going to cover the rest of Python fundamentals, still focusing on the execution model to help you reason about and debug your programs. We're also going to learn how to read and write files. We'll learn about the Python dictionary data structure, and we'll learn to use the accumulation pattern to accumulate results from complex data structures and to accumulate results into dictionaries and from dictionaries. You're going to learn how to define functions, which I think of as the dividing line between just playing around with programming and becoming a real programmer. And we'll delve into some of the subtleties of functions, including optional and default parameters, keyword-based parameter passing, and anonymous lambda functions. And you'll learn about sorting. You're not going to learn sorting algorithms, but you'll learn how to use Python's built-in sorting capability. You'll pass in the right parameters, and you'll get your items back in exactly the order that you wanted. The final project will be a little sentiment analysis program. We're going to give you a pile of fake tweets, and you're going to write code that counts how many positive and negative words there are. You'll write out a comma-separated values file, upload it into a spreadsheet, and generate a graph so that you can analyze a question: do those tweets that have more positive words in them get more replies and more retweets?
Sounds fun, and that's also going to synthesize all of the things that you learned throughout the course. Now, like the first course, we're mostly going to do screencasts with code examples, but we'll occasionally come on screen for words of wisdom and to introduce topics. I'll also be telling a few more dad jokes. So, let's get to it. Bye for now.

Hi, everyone. I'm excited to show you some useful features of the free interactive textbook that will be available to you as part of this specialization. Content in the first four courses tracks pretty closely to the textbook content, so whichever course you're starting with, you'll want to go through this video to see the important interactive features. You can skip it if you've already seen it in a previous course. The Runestone interactive textbook environment is the brainchild of my friend Brad Miller. I've made a few contributions to both the software environment and especially the textbook over the past four years, but Brad deserves almost all the credit. Let's take a look. The first thing you'll need to do before accessing any of the textbook pages is to log in from Coursera. So, I just click on this open tool and I'm automatically logged in. You've already logged into Coursera and Coursera is passing the credentials to Runestone, so you'll be automatically logged in here. Once you're logged in, all of your work will be saved, and we've deliberately disabled any other ways to log in, except by doing it through Coursera. So, when you first log in following that link, you'll be taken to this practice page in the textbook. It's our way of encouraging you to use the practice feature every day, but we'll come back to that later. Once you're logged in, you'll be able to click on any of the links for the readings and you'll be taken directly to the pages in the textbook for those readings. So, here's a link to the Runestone page for variables, and I'll click on it, and now I'm on a textbook page.
In the textbook, you'll find text and images, diagrams, but you'll also find some interactive elements. For example, here's what we call an active code window. It's got some code in it and I can click Save and Run. It'll run and print something out over here in an output window. I can change that code and I can run it, and all of your code versions, when you save and run them, will be saved. I have this little scrubber here and I can move it and I can see all of my old versions, and they're not just saved while this page is open, they're saved permanently. For example, let's reload this page. When the page loads, we're back to the original window contents, but I can click Load History and then I get the scrubber and it shows me my last code run. Now, if I rerun a previous version, it won't show in the scrubber as being the latest version, but if I change it, say instead of 17 I do 18, now it becomes the latest version in the history. Show in Code Lens is a really useful feature of active code windows. This is an amazing tool developed by Philip Guo, a professor at UC San Diego. It lets you step through the execution of a program one line at a time. I can click Forward and it'll just show me what happens after one line is executed, and the next, and the next. It prints out just the first message, and so on. That's not such a big deal now, but it'll be really useful for you when you start to do more complicated programs with conditional execution and iteration and defining your own functions. Part of our educational philosophy in this specialization is to reveal all the magic. We want to give you a way to reason about how your programs are executing, because that's the foundation for being able to debug your code through understanding rather than through trial and error. Code Lens really helps with that. Now, sometimes these Code Lens examples are built right into the textbook, but you can always get to Code Lens by hitting the Show Code Lens or Hide Code Lens button for any active code.
So here are some that are built in to that textbook page. There are also other interactive features. Here's a multiple-choice question. You can answer those and get immediate feedback by clicking on Check Me. I've actually already answered this one, but suppose I said Thursday as the thing that would print out here because day is set to Thursday. I click Check Me and it gives me some feedback. It's true Thursday is the value of day, but it gets overwritten later. So the correct answer is 19. Now, when you get to the bottom of the page, I suggest that you click on Mark as Completed. If you haven't clicked on it, this is what it will look like initially. If you click on Mark as Completed, a couple of good things will happen. One is you get the satisfaction of it saying, "Yay, completed, well done." But you get a couple of other things too. First, some of the multiple-choice questions or other activities on the page get added to the practice tool, which I'm going to show you in a minute. That practice tool will help you review things so that you don't forget them, sort of like vocabulary flashcards when you're learning a foreign language. Second, the pages that you've marked as completed will be marked in the table of contents, so you can keep track of what you've read in the textbook and what you haven't. Here's the table of contents. And you can see these orange dots indicate things that I've completed and I've marked as complete in the checkboxes. The checkmarks indicate things that I've opened but I haven't marked as complete. So this completed button at the bottom of the page is separate from marking a reading as complete in Coursera. You may want to do both of those things. In Coursera, we'll generally provide you with links to particular pages, and so you can just read that one page. But if you want to, you can navigate through the textbook once you're on the Runestone site. We have these forward and back buttons. This goes to the next page in the book.
Back to the previous page. If you click on the textbook title, as I showed you a second ago, you'll get to a table of contents that's very detailed, with every single page and sometimes subsections within the pages. If you want more of an overview, you can click on Chapters and it'll show you the different chapters, and you can just see the detail for one chapter at a time. Now notice that the orange dots aren't shown on this detailed view of just a single chapter. That's a little unfortunate, and now that I've noticed it, I'll try to add that feature at some point. Finally, there's a search option. I can search for variable and it'll tell me lots of pages in the textbook where the word variable shows up. There's also an index. I can look for various things and click on them, and it'll take me to where they are in the textbook. Normally if you log in from Coursera, you'll be taken directly to the practice feature, but you can also get there from within the book by clicking on Practice. What this practice feature does is it re-presents to you questions on topics that you've already marked as completed, with that thing at the bottom of the page where you marked the page as completed. When you're here in the practice feature, you get to answer the question again, and if you get it right, it'll remember that and it won't ask you about that same topic again for a long time. If you get it wrong, then it might ask you again tomorrow. This practice tool is the brainchild of my doctoral student, Iman YeckehZaare. He just implemented it last year, and in the first semester where we made it available to students in our on-campus classes, those students who used it more did a lot better on the course exams than those who didn't.
This is a striking result for me because I'd been monitoring for several years to see whether just spending more time in the textbook had a similar effect on student performance, and it didn't. In my on-campus classes, use of this practice tool is now required and earns a few points towards the final grade. For the Coursera courses, it's not required, but based on the results I've seen with our on-campus students, I strongly encourage you to use it a little every day. I think you'll also find it rewarding. On-campus students love the fireworks that they get. So here I'm going to answer a couple of questions. I have only two left to practice for today, and I'm going to say done. Ask me another question, and it gives me one more. It says hang in there, last question for today, and what's going to print out? Oh, this is a review, the one we just looked at. I say Check Me and then I'm done, and I get these fireworks, which are a little fun when you finish all the questions for the day. For those of you who are taking this course for a certificate, you'll also see links to graded assignments, usually at the end of each lesson or set of lessons. In the first four courses, the assessments and projects are in the Runestone textbook and they're all auto-graded there. You'll only be able to see these in Coursera if you're paying to take the course for a certificate. If you're not paying, you can find similar questions in the end-of-chapter assessment pages in the Runestone textbook. Let's follow the link for this first assessment, and this assessment just has two questions. I've actually already answered one of them correctly before. That was a multiple-choice question. And here they want me to write some code. The answer to this one is print hello world. I'll save and run it and I get some immediate feedback. There's an automatic test in here and it's telling me that I got the right output. If I said hello word instead, I would get feedback saying that I had failed.
Now, when I tell it to grade me, it'll actually use the best answer I've ever given. So if I ever manage to pass the test, I will pass this. We've set up the assessments so that you usually have to get 100% in order to pass, but you can keep trying and keep getting feedback until you get that 100%. We've done that because we think it's really important to master the early material, because things keep building on each other. So I click Grade Me and it comes back. You can see now that it's updated the score to one instead of zero. I've gotten a total of two out of two for this assessment, and if I go back to this page on Coursera and I refresh it, instead of telling me to try again, it's going to tell me that I've passed. Passed with 100%. That's the Runestone environment. It's been a labor of love for all of us who've worked on it as an open source project over the last few years, especially Brad Miller, who started the project. I hope you'll find it really helpful to you as you master the fundamentals of Python. I usually end my on-camera segments with a little joke, so here's a bit of humorous advice. Procrastinate today, always today. Don't put it off until tomorrow. Okay then, don't listen to my advice. Don't procrastinate today. Go get started with the first lesson in this course. I'll see you next time.

Here we go with processing files. Up until now, the only data that our programs have processed, the only inputs, have come either from literals that we put into the program itself or from things that the user typed during execution as a result of a call to the input function. The only place where outputs have gone is the output window, which doesn't persist after you go to another page in the textbook. A file contains data that persists between executions of your program. As a computer user, you're already familiar with the concept of files.
You've probably worked with image files and spreadsheet files and word processing files. In this lesson, you'll learn how to manipulate files in a Python program. We'll only be working with text files, not audio or other binary formats. At the end of this lesson, you should be able to write a program that reads a file's contents, either as a single string or line by line. You should be able to use relative paths to specify the location of a file. You should be able to write new text to a file. We'll see you at the end with more wrap-up and more geeky humor.

Welcome back. What is a file? It's just a collection of data saved on a hard disk or other storage that persists over time. A file has a name, and files can be organized into folders or directories. We'll be working with text files as opposed to images or sounds or videos. Here, we have an example of a file called olympics.txt that is available to us from Runestone. Each line has information about one athlete's participation in the Olympics. It's actually a simulated file. Runestone can't access the real files on your computer for security and privacy reasons, so we simulate the presence of a few files in the Runestone environment so we can illustrate how file reading works. Python provides some functions for reading data from an existing file. There are two steps. First, you call the open function to open the file. So here we have an invocation of the open function and we pass in two arguments. One is a string, the name of the file, olympics.txt, and the other says what to do with the file. In our case, r for reading. Later on, we'll see w for writing. The open function returns an object. It's a file object, and we are assigning it to this variable called fileref. There's going to be an additional step that we're going to have to do to actually read the contents. So this step that we're showing so far just creates the file object.
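As a rough sketch of the open step just described, the snippet below can be run outside Runestone. It assumes a small stand-in for olympics.txt with made-up rows (the real file only exists inside the textbook), and the setup lines that create that stand-in are scaffolding, not part of the lesson. The close call at the end is included only so the sketch tidies up after itself.

```python
import os
import tempfile

# Setup (an assumption for illustration): create a tiny stand-in for
# olympics.txt, since the Runestone file is not available locally.
# The rows below are invented sample data.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "olympics.txt")
setup = open(path, "w")
setup.write("Name,Sex,Age\nAthlete A,F,23\nAthlete B,M,31\n")
setup.close()

# Open the file for reading. This only creates the file object;
# no contents have been read yet.
fileref = open(path, "r")
print(type(fileref).__name__)  # the class of the object that open returned
print(fileref.mode)            # the mode we asked for

# Tell Python we are done with the file object.
fileref.close()
print(fileref.closed)
```

Nothing from the file itself is printed here, which matches the point in the video: opening a file is separate from reading it.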
Then there are going to be some lines of code that we haven't written yet that will actually get the contents from the file and do something with them. And then there's a corresponding close operation that lets Python know that we're done working with this file object and it's okay to stop keeping track of it. So on line three there's fileref.close. Now if I run this, we're actually not going to see any output, because all we've done is open the file and then close it. We haven't actually read the contents in, and we certainly haven't printed anything out, so you're not seeing anything in the output window. What if we did want to read the contents and print them out? There are a few different ways of working with file objects. The first method that we'll use is .read, which is going to bring in the entire contents of the file as a single string. Let me show you what that would look like. So we call the .read method on the fileref object. That returns a string, and I'm assigning that to the variable called contents, and then I could just print out, oh, let's print out the first 100 characters of it, and now we'll see the output window. So we're seeing the first 100 characters from the file, which got us three lines and a little bit of the fourth line. Now, you'll rarely use this method of reading the entire contents of the file all at once as a big string, partly because if you had a really big file it would be a problem for your computer to handle all of that in memory all at once. The only time you'd use this .read method is if you wanted to grab the whole file as a string and pass it to some other function that parses it, and even then there'll usually be some other function available that will directly read from the file object a little bit at a time and parse its contents. So the second method that I'm going to show you is, instead of reading it all at once, we have a .readlines method.
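The .read approach described above might look like the following when run locally. This is only a sketch under assumptions: the file is a made-up stand-in for olympics.txt created by the setup lines, and because that stand-in is so short, the slice takes the first 20 characters rather than the first 100 used in the video.

```python
import os
import tempfile

# Setup (an assumption for illustration): a tiny stand-in for olympics.txt
# with invented rows, created in a temporary folder.
sample = "Name,Sex,Age\nAthlete A,F,23\nAthlete B,M,31\n"
folder = tempfile.mkdtemp()
path = os.path.join(folder, "olympics.txt")
setup = open(path, "w")
setup.write(sample)
setup.close()

fileref = open(path, "r")
contents = fileref.read()   # the entire file as one string
fileref.close()

print(len(contents))        # number of characters in the whole file
print(contents[:20])        # slice off just the first 20 characters
```

Because contents is an ordinary string, slicing and len work on it just as they do on any other string.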
Instead of getting everything as a single string, it returns a list of strings, one string for each line in the file. So let's print out, let's say, the first four lines of the file this way. And I forgot to rename the variable that I'm referring to. Let's call it lines, because I called it lines on line 2. And you can see now that we're printing out a list, so we've got the square brackets, and inside the list there are four strings. Here's the first string; the second string begins here and is ending here, and so on. Each of the strings, notice, is ending with this special backslash n character. That's the newline character, because in the file we have a bunch of lines of text. So when we read these lines in, each of the strings has a backslash n at the end of it. Instead of just printing out all these lines, I could maybe get a slightly prettier printout if I iterate through them. So for line in lines, and maybe I'll just take the first four lines again, and I'm going to print the individual line. So now when I run it, it's going to iterate through these four lines, and each one of them is going to go on its own line. We're no longer going to get the square brackets to show up, because we're not printing the whole list, we're iterating through the individual strings. We're also not going to get these quote marks, because we're going to pass the strings to print, and when we print those out we just show their contents in the output window. So let's see how that looks when we run it. And sure enough, we get each of the lines separately. Now you might notice something a little strange here, which is that we get these blank lines. The reason for that is that each of the strings, you'll remember, had that newline character at the end, which meant do a carriage return, and the print function always does a carriage return, and so we're getting two of those. One is starting us on a new line and the other one is starting us on a new line again, so we get a blank line. What if we didn't want to
have that extra blank line? Well, you've seen the .strip method before. I can strip the whitespace from the beginning and end of each of these lines. So the .strip method gets rid of any whitespace at the beginning or the end. Whitespace is the space character, a tab character, or a newline character. So if I call this now, I'm going to get the printout that doesn't have the blank lines, and sure enough, we've got the first four lines from the file. Now there's a shorter way to iterate over the lines, if all we're going to do is iterate over all of them. So let me show you that, because it's the more Pythonic way. Rather than reading the entire file into a list, we can just directly iterate over all of the lines by saying for line in fileref. So here it's a file object, it's not a list, but it knows how to be iterated over, and each time we get one more line. So this is going to do exactly the same thing that we had before, except now we're going to get all the lines in the file. So we can iterate over this file object directly. We can't do this thing of taking a slice of it like we did with lists; that gives us an error. So a file object supports iteration; it does not support taking slices. So if we wanted to just do something with the first four lines, we'd have to use .readlines rather than just iterating over the file object. But if we're prepared to process all the lines, which is the normal thing that you're going to do with a file, this is the standard Pythonic idiom. Now when should you actually call .readlines or .read? Well, one reason to call .readlines is if you wanted to take slices. Another reason might be that you wanted to just get a count of how many lines are in the file. So if I get all of the lines and put them in a variable, I could now print out the length of lines and that would tell me how many lines were in the file. I'm going to comment out the other two. Turns out there are 60 lines in the file. If I wanted to find out how many characters
are in the file, I could read the entire file as one character string and then I could ask for its length. So, except in those special cases, the more common thing that you're going to want to do is to just iterate over the file object itself. Now we won't use .read or .readlines; instead, we'll just iterate over the file object itself. This is the most common way that you'll be working with files. So that's Python code for reading from a file. We'll see you next time.

So far, we've pretended that all files live in a single folder or directory, and it's the directory that your Python program is connected to, so your Python program doesn't need to specify a location for the file when opening it. With our simulated files in the Runestone environment, that worked fine, but if you're ever running Python on a local machine, it probably won't be enough for you. So let's see how you can have files organized into folders and directories and still find them in your Python program. Most people do organize their files into folders or directories, because otherwise you just have hundreds of files in one directory and you can't find them. So for example, here's a diagram showing a hierarchy of directories and files. There's a top-level directory called my files, and inside that directory we have two other subdirectories, other files and all projects. All projects itself has two more subdirectories, called mydata and myproject. Inside of mydata, we've got a couple of files; one's called data2.txt. We've got data1.txt and mypythonprogram.py. By convention, we put Python programs into files whose names end in .py. So a Python program, when we run it, is automatically connected to the directory where it's invoked from. We are invoking Python directly in the Runestone book, so we can't really see how that works, but I can give you a way to think about it, to use if and when you do install Python on your own computer to run it locally. In the open function, you can pass just a file name, as we've been doing, like
that, open of data1.txt for reading, but we also can specify a complete path that says where to find the file as well as the file's name. So normally we would use a relative path, which specifies how to get to another directory from the directory that you're currently connected to. So suppose that we're running this mypythonprogram.py, and we're running it from the directory myproject, and we want to open data2.txt. We can't just say open of data2; we have to tell it how to find it. So this is not going to work. We have to instead say, from the current directory, which is myproject, you've got to go up a level to get to all projects. The way to say that you go up a level is to say dot dot. Dot dot says go to the containing, or parent, directory. Within all projects, we have to descend into the subfolder mydata. We've got to go down there, and now we've given the directions: go up to the parent, within that go down to mydata. And it looks like we've got to capitalize the D, let me fix that. And then now we're in the right directory, and now we can say data2.txt. Everything else is just the same. So that's called a relative path. This part is called the path, and then we have the file name. There's also a way to specify an absolute path, absolute meaning here's how to find this file on the computer, rather than relative to the directory that you're currently connected to. When you do that, you'll have a path that begins with slash. So it would be something like slash user slash presnick slash my files, and you'd have to give the whole path: all projects, after all projects you would go to mydata, and finally the file name. I don't recommend using these, what are called absolute paths, because it makes your code and data not portable. If I use the relative path and I take this entire set of folders and I just copy it to someplace else on a different computer, maybe somebody else's computer, not slash user slash presnick, then I can still find it. So generally people prefer to use these
relative paths, because it makes their code and data more portable. You can transfer it to other computers. Now, if you've only been running Python in the Runestone textbook, you haven't had an occasion yet to use these file paths when opening files, and don't worry about it. Just make a mental note to come back to this video or the corresponding page in the book when you are executing in an environment with files grouped into directories. See you next time.

Welcome back. Writing to a file is pretty similar to reading from a file. You still have to open a file object based on a name for the file, but instead of reading from the file object, you'll write to it. Let's see an example. Here we're printing out the squares of all the numbers from 0 up to but not including 13. If I run it, we'll see we get 0 times 0 is 0, and 1 times 1 is 1, and then 4, and 9, and so on. Now suppose instead of writing those to the output window we wanted to write them to a file, where they would be permanently stored. Let's start coding that up. A normal little template for reading or writing from a file is that we have some file object equals open of some file name, I'll call it squares.txt, and say writing instead of reading. And whenever I open a file object like that, I tend to forget that I need to close it, so I'll just put the close in right away. Now if I run this, nothing different happens, because I'm still printing to the output window. So instead of printing to the output window, I want to write to a file, so I'm going to say fileobj.write instead of having a print statement, and I'm going to have to turn that number 0, 1, 4, or whatever into a string in order to be able to write it. The print function is pretty forgiving; we could give it a number or a string and it would figure it out. But here we have to actually give it a string. So if I do this, I will now have in my file object all those numbers, 0, 1, 4, 9, and so on. Now, we have a little simulator for written files, just as we have a simulator for reading from files in
Runestone. As you recall, we can't read or write files from the local file system for security reasons, so we have built in this ability to read a few files that are built into each page, and we can write files that will be available just until the page gets reloaded. So we have this file called squares.txt, and here's the output. Now, that output may be a little different from what you were expecting, because we have 0, 1, 4, 9, and so on, and it's not nice like it was in the output window. We don't have 0, 1, 4 each on its own line. The reason for that difference is that when you call print and you give it a string like 4, you'll automatically get 4 and a newline in the output window. When we call .write, we just get the contents, so we just get the 4, but we don't get a newline character. You have to decide for yourself when you want a newline. So what I'm going to do is, after I've written each square, I'm also going to explicitly write a newline, the backslash n character. Now if I save and run it, we'll see something that looks a little nicer in the data file. Now we've got all the values, each one on its own line. Of course, we could combine these onto a single line. We could have the string of square plus backslash n all on one line; that would work just as well. Sometimes, especially for students who are just learning, I like to make the newline character be its own line, because it's a real reminder that with .write you have to create that newline character explicitly, unlike with the print function, where it does it for you. So that works just the same. Now, when we have a file, for the duration of this page being displayed, that file is available. So I could read it. I can read that file; it's called squares.txt. So new file object equals open of squares.txt, and this time I'm opening it for reading, and let's just print out, the .read gets me all the characters, and if I just want the first ten characters, I'll do that. And now we'll see something in the output window. Let me clear all of my markings, and there
you see we've got the first ten characters showing up in the output window. The first character is the zero, and then there's a second character for the newline, so that's two characters, three, four, five, six, seven, eight, nine, and ten. If I had asked for just the first nine characters, I would have up to the one without the six, and sure enough, there it is. And by the way, this file really is there. Even if I don't recreate it each time with the code, I could now just read it if I wanted to. So let me get the first 14 characters, let's say. You'll see that we get something different; sure enough, we get the 25 in addition. Now, as I said, we've simulated the creation of this file. It's there until we reload the page. Let me just demonstrate that for you. If I reload the page but try to keep this code, we're going to have a problem: it won't be able to find the file. So if I try to run this same code again, it says that there is no such file or directory, squares.txt, so it can't open it. That's how we write to files. It's structurally similar to reading. We open the file just as we do for reading, but when we're writing we do it with a quote w instead of quote r. We call .write as needed, but if we want a newline, we have to explicitly write the backslash n character. And then we have to close the file. It's important for writing, because otherwise the contents might not be fully written and you might lose some of them. See you next time.

Welcome back. We're going to see a little shorthand that makes it even easier to work with files and avoids the need to remember to close them. Python has an advanced feature called context managers. We don't need to worry about it in all of its generality, but it makes possible this nice recipe for working with files. For example, you start with a special word, with, and then there's another special word, as. In between, you have the call to open, so we say open and the file name and whether you want to read or write it. After the word as, you have a variable name, so
this is actually equivalent to saying md equals open of mydata.txt for reading. It's equivalent to that. In the code block that's indented under the word with, we can refer to md, and md will be bound to our file object. But we get a behind-the-scenes action that happens at the end of the code block. It's as if we have an md.close that gets executed behind the scenes. We don't actually have to specify it; it just gets done for us after everything else gets executed in that code block. If I run this, I just get a printout of the first line and the second line, as you might expect. Since we've iterated, printing each one, we get the printout of those two lines that were in the file; they're showing up in the output window. Let's get rid of our markings here and just generalize this a little bit. Suppose we didn't want to just open mydata.txt, but we wanted this to work with any file name. So let's say fname equals mydata.txt. We would open the file name here, and then we would do stuff with the file object. So we would have md.read, or md.readlines, or we might have for line in md, do something with each line. So now we have sort of a general recipe for reading data from a file. We start with this with statement, where we open a file object and we assign it to the variable name md, and then we use one of our three methods of working with the file. We either read all of its data in as a single character string with .read, or we read all of the lines as a list of character strings with .readlines, or we just iterate over the file object, as on line 5. The same recipe will work just as well for writing files. So if I change that r to a w, I can do the same things that we would do for writing to a file. So for example, I might say for num in range of 10, md.write of str of num, and md.write of a newline character. And again, the close happens automatically for us. So when I run this, unfortunately I'm not seeing the file contents, and that reminds me of one little gotcha that you might want
to remember if you're trying to write a file. You can't write to a file that has the same name as one of the built-in files in the page. We have a little protective measure against overwriting it. So if I just change the name of the file, now I get as my output what I was expecting. You can survive a long time as a Python programmer not using the with construct, but experienced programmers will use it. You'll see it on Stack Overflow and other help sites, so it's a good idea to be able to read it, and feel free to use it if you like. See you next time. We're chugging along here. You've learned the basics of reading and writing files. Now, there are two things that trip people up, so just keep these in mind. First, you have to pass a string, the file's name, as the first parameter when you call the open function. If you have a variable name whose value is the file name, don't put it in quotes. If you have a literal file name, do put it in quotes. Second, you have to keep track of the distinction between the file name, a string; the file object, which is the thing returned by the open function; and the file's contents, which you get by doing operations on the file object. I'm glad you made it this far. It's joke time. Reading: I had plans to read a book about sinkholes, but they fell through. Writing: there was once a young man who professed his desire to become a great writer. When asked to define great, he said, I want to write stuff that the whole world will read, stuff that people will react to on a truly emotional level, stuff that will make them scream, cry, and howl in pain and anger. He now writes the error messages for the Python interpreter. See you next time. Welcome back. This lesson introduces the CSV format. CSV is an acronym; it stands for comma-separated values. A file in CSV format is just a text file that follows certain conventions. The CSV format says that values are going to be separated by commas, and it says that every line of the file will have the same structure. So, for example, we've got two commas
on every line, one, two, one, two, and so we've got room for three things on each line: before the first comma, between the first and second, and after the second one. Usually you'll have the first line be special, to give column names, and then afterwards all the rest of the lines are similar to each other; they each have values. The first value on line two is a name, the second value, 98, is a score, and A+ is a grade. The reason for this format is that by having this standard format, you can have it read into lots of different programs. So I've chosen to name this file grades.csv, and it's saved on this computer; it's saved in this folder. On a Mac, when you have a file name that ends in .csv, unless you've configured your Mac some other way, when you try to open it, it will try to open it as an Excel spreadsheet. And so here it's taken that, and instead of showing it to me as a text file, it's chopped it up and actually put the first values into column A, the second values into column B, and the third values into column C. The same .csv format can be read not just by Excel but by all kinds of statistics programs, Stata, R, SPSS, and so on, and you can also open it in Google Sheets. So here's my Google Drive folder, where I have some of my notes for these recordings, and I've put that same grades.csv file here. When I double-click on it, it gives me a little preview of what it would look like, and it gives me an option to open it with Google Sheets, and now we see it again in columns. So the CSV format is kind of an interchange format. It's just a text file, but if you follow the conventions and have the same number of commas on each line, then the file can be read by any of these programs. At the end of this lesson, you'll be able to read a text file whose contents are in CSV format and parse the lines by using the .split method, and you'll be able to write a text file whose contents are in CSV format using either the .format method with a string template or the .join
method. See you at the end of the lesson, when I'll explain the difference between a cat and a comma. Welcome back. Since CSV is just a special format, you can read it like any other file. In fact, we've been reading a file that was in CSV format already. This file didn't follow the convention of using .csv as the ending for the file name, but the actual contents were in CSV format. The advantage when we know that something's in CSV format is that it's easy to parse it. We can just chop up each line into its individual components by looking for where the commas are. For example, take a look at this code. On lines one through four, I'm just reminding you what the contents of this file are, so you can see the first line is a header, and then I'm printing out more lines, up to line six. Each line has somebody's name and the other values, and they're separated by commas. I want to show you how easy it is to process these contents, because we can just use the .split method, looking for commas. So on line six of our code, we're looking just at the header line, that's this one, and we're saying, well, first of all, get rid of the newline character at the end of that line, and then split wherever you see a comma. So where there's a comma, we're going to chop up the text, and we can actually see the output here, because on line eight we're just printing it out. So we have Name as the first value; it's the characters that occur before the first comma. Our next value is the string Sex, and then we have the string Age. All of these are coming from this one line of text, but we've chopped it up to make a list, and that's what the .split method does for us. We're doing something pretty similar with the rest of the lines. We're looping through all of them, and for each of them we're chopping it up wherever we find a comma on the line. Now, once we take a line like A Lamusi, M, 23, China, Judo, NA and we split it up into a list, we can use indexing. I can ask for the value that's
in index 5, the sixth element from that line, and we can check: is its value NA? Well, sure enough, it is. Its value is NA. And we do one thing if it's NA, and we do something else if it's not. In this case, if it's not NA, then we're going to print something out. So we're going to print out something only for the people who actually won a medal. We're going to skip the people who didn't win a medal, and you can see the output that comes down here. So only people who won a gold, a bronze, or a silver will show up in our output. And we're choosing here to not print the whole line. We're printing three elements from that line, one for each of the three curly braces. We're printing out vals square bracket zero, that's the name; it's the string that comes before the first comma. And we're getting the thing from position five. So that's the name and the event and the medal that they won. Now, note that we have to split on commas. I think before, when we've seen the split method, we've tended to just split without specifying a value, and when you don't specify a value, it splits wherever it finds any whitespace: a space or a tab or a newline. If we split that way here, we're going to see that we get something different. We're not going to get this nice list here; we're going to get a different list. And sure enough, what we get is a list with only one element in it. It's one big string, with all of its commas and everything. It hasn't split it up into seven different elements or six different elements; it's just given us one big thing. The reason is that it was looking for whitespace, and in this whole string there are no spaces, no tabs, no carriage returns, so we just get one item. If we had split on something else, let's say the letter e, it would split wherever there was an e, and we'll get some weird thing. Oops, I have to split on the character e, not the variable name e. Sure enough, our first value is Nam, and then there was an e, and after that there's a comma and a capital S, and then there was another e, and so on. So split will split on whatever you tell it
to split on. In our case, we want to split on commas, because the comma-separated values format says commas are the things that separate the values. By the way, there is another, more advanced version of the CSV format that separates with commas but encloses all of the values in quotes. Let's see what that looks like. Here you can see that in this file format, some events have commas in them while others don't. For example, we have speed skating comma 1500 meters, whereas for tug of war or basketball there's no comma in it. That's going to make it harder to parse, because when there's a comma, we don't know whether it's part of a value or separating values. If we were to just split on comma like we did before, vals equals row.split on comma, and then we said square bracket 5 like we did before, the index 5 element, the sixth element, of this row will be NA, but the index 5 element of this other row will be the 1500 meters. So life gets more complicated when we want to parse this more advanced comma-separated format that also has quotes around each of the values. It actually is still possible to unambiguously chop up the lines, but that's a harder programming challenge. I don't recommend trying it yourself. Instead, when you encounter something in this format, you would use Python's csv module to parse the lines for you. We're not going to learn that module right now. I've found that it's good for students to learn how to parse simple CSV by using the .split method at this point, for understanding what's really going on. Later, you can learn to use the csv module for harder formats. In summary, when we have a simple CSV format, with commas separating and no quotes around all the values, parsing is easy. You just read in the file a line at a time, and you use the split method, specifying comma as the thing to split on. That gives you a list of the individual values, or the individual field names on the header line. We'll see you next time. Welcome back. To write a CSV file, you just need to write
text strings that follow the comma-separated values format. Here we have the basic structure. You write a header line with field names separated by commas. You iterate through your list of objects, and for each one you generate a line of output. What I've got here is just going to print the output to the output window, but it's in the CSV format. As you can see, we have one header row with the three headers, name, age, and sport, and then each line has the same structure: there's somebody's name and then a comma, there's an age and then another comma, and then a sport. Now, if we want to change this so that, instead of printing to the output window, we write to a file, we'll do our usual transformation, where instead of print we'll use .write, and we'll have to open the file and close it. I've actually already got that code written out; I'm going to switch to it. So I've opened the file for writing, and I've assigned the file object to this variable name, outfile. Then I'm writing the header on lines 8 and 9, and with the row strings, instead of printing them out, I'm doing outfile.write. And because we're writing to a file, we have to explicitly put in the line breaks that we want. Instead of putting it in the output window, it's now in this data file, which is available until we reload the page again. The key to outputting CSV format is just generating a string that contains one line and has commas that are separating either the field names or the values. There are two options that work well for generating a line with the commas separating the values, and a third one that I don't really recommend. As shown here, we're using the first method that I do like, actually the one I like the best, which is to use a format string. It's easy to see with this format string that we're going to have three values, because we have three pairs of curly braces, one, two, three, and it's easy to see that we've got a comma separating each of them. And then the values are just going to get substituted in: the first element, John Aalberg; the
second element goes there, that's the 31; and Cross Country Skiing being the third element. A second possibility is to use the .join method. You may recall the .split method for chopping up a string into component parts. Its counterpart going the other way is the .join method, and I would write something like this: row string equals comma .join of some values, and those values are going to be the same values that we used here. I'll use this instead; I'll comment out the old version. I always find this a little confusing, and I think many students do too. You might think that the natural thing is to call a join method and pass comma as a parameter, like we did with .split, passing comma as a parameter. But here we're saying join is an operation that we do on the comma object, and we have to pass in as values the things that are going to be joined together. You just have to remember that that's the opposite order of what you might expect. So it's comma .join. There are a couple of other tricky things about the join operation, and I'll show them to you here. First, I'm going to get rid of our markings. When I run this, I'm going to get an error. There are actually two problems. The first problem is that join is expecting two arguments and I've given it four. So it really wants a list of things, not a bunch of different values. So I have to give it a list; I can put all of these values into a list. One other tricky part about it is that join wants to have a list of strings, or a sequence of strings, and olympian square bracket one is the number 31. It isn't a string, it's an integer. So we get an error that it expected a string but it actually got something different: sequence item one, and that was olympian square bracket one. So if I turn it into a string, I'll finally have something that works. Now I get the same output that I was getting before. So this looks pretty complicated, and you're probably thinking, why would anyone ever want to do this? Well, if my values were all strings, then I might be tempted to do
it. Now I don't need to say str of olympian; I can just say olympian square bracket one there and it'll work. And the thing that really makes this attractive in this situation is that I can just refer to olympian, which is already a sequence of strings. It's a tuple with three strings in it. I can call .join and pass that tuple of strings, and I still get this lovely compact code and the same output. So if you have a list of strings, then this .join method might be pretty attractive. If not, and you're going to have to do any kind of converting integers to strings or things like that, stay with the version on line 13 that uses the format string. A third possibility is to just use string concatenation, but it gets really kind of hard to read, and it also is still going to require us to convert the number to a string. So we're going to have something like this, and I think I'm pretty unlikely to get it right the first time, but I'll have row string equals olympian square bracket 0 plus a comma plus olympian square bracket 1 plus another comma plus olympian square bracket 2. That might work; let's see. I got the right output again. So, a little hard to read. My preference generally is to use the format string, like on line 13, unless I really have a sequence of strings, in which case I might be tempted to use the comma .join from line 12. Now suppose we had slightly different data, where one of the event names now has a comma in it, but not all of them do. Here you can see that for cross country skiing, the 15 kilometer is specified rather than the 100 kilometer, and some of the other events don't have commas in them. You may remember that one of the ways we can handle this kind of thing is with the advanced CSV format, where we put all of the values in quotes. This is one of the nice things about Python having both single quotes and double quotes as a way of delimiting a string: if we want to have double quotes as a character inside the string, we can use the single quotes as the delimiter, as I've done
here on line 8. And with our format string, this is not too bad. We just have double quotes around each of the pairs of curly braces, and we're still going to substitute in where the curly braces are, so the value is going to be surrounded by the double quotes. This is a situation where I really appreciate the .format method. I wouldn't want to try this with .join, and I definitely wouldn't want to try it using concatenation with the plus sign. You'll see that our outputs have those quotes around all of the values. In particular, we've got quotes around the whole cross country skiing comma 15 kilometers. So there's a comma that's inside one of the values, inside the double quotes, and the other commas are separating the different values. To summarize, the overall structure for writing a CSV file is to write the header line, that's what we did on lines 8 and 9, then iterate through all of your objects, and for each of them you're going to write one line into the CSV file. So I'm creating the row string and I'm writing it, tacking on the backslash n to indicate the new line. We'll see you next time. Welcome back for a few tips on file names. Some older computer operating systems placed a lot of restrictions on file names. For example, early versions of Windows only allowed 8 characters before the period and 3 after the period. Modern computer operating systems have fewer restrictions on your file names. Still, it's a good idea, for maximum portability of your files between computers, to place a few restrictions on yourself. Here are our suggestions. First of all, don't use commas in a file name. So don't say olympics comma winter; don't make that be a file name, even if your computer operating system allows it. Don't use more than one period, like .txt.3.csv; don't do that. Don't use a forward slash or a backward slash in your file name. Olympics slash winter: that's usually a convention for making subdirectories. You don't want to have a slash in your file name, or a backslash in your file name. Don't use
spaces, even though most operating systems will permit that now. Olympics space winter is also not a good idea. Do follow some conventions for what happens after the period: .txt should be for plain text, .py if you've got a file that's Python code, .csv if you've got text that's in the comma-separated values format. So we've actually been bad here. We've got this file olympics.txt; it really should have been olympics.csv. Another file ending is .xls; that's for files that are in Microsoft Excel format. And .doc or .docx is for things that are in Microsoft Word format. The reason for following these conventions about what goes after the period is that computer operating systems will usually look at that extension, .txt or .csv or .py, in the file name, and they'll decide which program to open it with. So if it's a .py file, they'll try to open it in a text editor, or they'll try to execute it as a Python program. If you have a .xls, it'll try to open it in Microsoft Excel, and so on. So that's all for now, just a few words on naming conventions for your files. I'll see you next time. Welcome back. You should now be able to read a CSV file that came from a spreadsheet or a stats program, and you should be able to write a text file whose contents are in CSV format. You can then import it into a spreadsheet or a stats program. CSV, comma-separated values, so a joke about commas. What is the difference between a cat and a comma? Well, a cat has claws at the end of its paws, and a comma is a pause at the end of a clause. Which reminds me of another one in the same vein. What's the difference between a prince, a bald-headed man, a monkey, and an orphan? Right, what's the difference between a prince, a bald-headed man, a monkey, and an orphan? Well, the first is an heir apparent, the second has no apparent hair, the third has a hairy parent, and the last has nary a parent. See you next time. Welcome back. In this lesson, we're going to learn about a new Python type called dictionaries. Dictionaries, like strings, lists, and tuples, are a collection of items, but
unlike strings, lists, or tuples, they're an unordered collection of items, meaning that they don't have a first, second, or third item. They're kind of a bag of key-value pairs. In order to create a dictionary, we use curly braces, so this expression creates an empty dictionary. We assign this empty dictionary to the variable English to Spanish, or eng2sp, so that's what line 1 does. In order to assign one key-value pair to this dictionary, we say eng2sp and then specify a key that we want, and then we set it equal to whatever value we want. So if I go forward to line 2, you can see that first eng2sp starts out as an empty dictionary. Then, when we run line 2, we set one key-value pair, so the key here is one and the value is uno. Unlike lists, strings, or tuples, dictionaries store key-value pairs instead of just items, and what that means is that every dictionary has items that contain one key and one value. You can think of the key as the thing that you use to actually access the value. So, for example, in a physical dictionary, the keys would be words and the values would be their definitions. In the context of Python dictionaries, values can be any Python object, and keys can be almost any Python object, but we'll get to that more in a bit. So here we have the key one associated with the value uno, and that's because in this dictionary, eng2sp, we're going to associate English words with their Spanish equivalents, and we're going to use the English words as the keys and the Spanish equivalents as the values. So again, in our code we first created an empty dictionary using curly braces, and then we assigned the value associated with the key one to be uno by saying eng2sp sub one. And notice that this is a string, because we have quotation marks around it. So the key is the string one, and the value is the string uno. The next line associates the key two with the value dos, and you can see that our key-value pair here gets added to the dictionary. Again, one thing that I mentioned in the introduction was that dictionaries are
unordered, and that's actually kind of important. So there's no notion of what's the first item, the second item, the third item, and so on. Instead, just think of dictionaries as kind of a bag of key-value pairs. You don't know what order you're going to get them in, but you know that if you set two to dos and one to uno, these key-value pairs will be associated with each other no matter what order they're in. On line 4 here, we set the value associated with the key three to be tres. So you can see again that this kind of changes around in our dictionary, but the important thing is that every key keeps its value: three is associated with tres, two is associated with dos, and one is associated with uno. Now, when we print out our dictionary, we print out a list of key-value pairs. Here you can tell that this is a dictionary because we have curly braces, and then every key-value pair is separated by a comma, so we have 2 commas here. Our key-value pairs are: three is associated with tres, and we can tell because here we have a colon between the key and the value; two is associated with dos, and you can tell again because we have a colon between two and dos; and one is associated with uno. So here we always have the key first, and then we have a colon, and then the value. Every one of these key-value pairs is separated by commas, and all of this is wrapped in curly braces to specify that it's a dictionary. You can also set these key-value pairs inline. So whereas in this code we set one to uno on line 2, two to dos on line 3, and so on, we can also set them all on line 1. So suppose we instead declared our dictionary like this. Here again we can tell that this is a dictionary because we have curly braces, and then we have a list of key-value pairs. Every one of these key-value pairs is separated by a comma, and then we have the key, so three, associated with the value tres, and we specify that using a colon. So one colon uno, two colon dos. So now when we run line 1 and we look at our frames and objects, we'll see that eng2sp is now associated with
the dictionary that has 3 key-value pairs. In order to look up the value associated with a particular key, we use square brackets. So here on line 1 we create a dictionary that has 3 key-value pairs, just like before, and on line 3 we assign value to be eng2sp sub two. So in order to get a particular value, we first say the name of the dictionary, eng2sp, then we use square brackets, and then inside of the square brackets we put the key that we want to get the value for. So here the key is two, and if we look at our dictionary, we can see the key two is associated with the value dos. So the value of this overall expression, eng2sp sub two, is going to be the string dos, and what that means is that when we print out value on line 4, we're going to print out dos. So you can see that by running lines 3 and 4. First, on line 3, we assign value to be the string dos, which was the value of this expression, and then when we print out value, in our program output we get dos. On line 5 we print out eng2sp sub one, and the value of this expression is the value associated with the key one in our dictionary eng2sp. So if I look at the dictionary here, I can see that the value associated with the key one is uno. So the value of this expression is the string uno, and that's what gets printed when we print out eng2sp sub one. That's all for now. Until next time. So let's go over some questions. This is a true/false question. True or false: the dictionary is an unordered collection of key-value pairs. That is absolutely true; that is what a dictionary is. Then we have a multiple choice: what's printed by the following statement? So here we declare a dictionary with 3 key-value pairs, cat, dog, and elephant, and then we print out mydictionary sub dog. So the value of this expression is whatever value is associated with the key dog in mydictionary. In order to answer this question, we have to look at what's associated with this key dog. We find the key-value pair for dog here, and we see that the value associated with the
key dog is 6, and so we should expect 6 to be printed out. Then this question asks us to create a dictionary that keeps track of the USA's Olympic medal count. Each key of the dictionary should be the type of medal, so gold, silver, or bronze, and each key's value should be the number of that type of medal that the USA has won. So currently the US has 33 gold, 17 silver, and 12 bronze. So again, the keys here should be gold, silver, and bronze, and the values associated with those keys should be the number of that medal that the US has won. Then they want us to create that dictionary and save it in the variable medals. So in the code, I'm going to say medals equals a dictionary, so I'll use curly braces, and first I want to have the key gold associated with the value 33. If I just typed gold colon 33, which is common to do, then I would actually get an error, because when I say gold, it's looking for a variable called gold. Instead, I want to make this key a string, so I want the string gold to be associated with the value, the integer 33. Same thing with silver, so silver is associated with the value 17, and bronze is associated with the value 12. And here I just got each of these, 12, 17, and 33, from the problem statement. So now when I run my code, I should see that it passes. This next question is very similar. Here we're told: you're keeping track of Olympic medals for Italy in the 2016 Rio Summer Olympics. At the moment, Italy has seven gold, eight silver, and six bronze medals. Create a dictionary called olympics where the keys are the types of medal and the values are the number of that type of medal that Italy has won so far. So, exact same idea. I'm going to say olympics equals, and then we want to assign that to be a dictionary, so we use curly braces, and we have three key-value pairs. So we have gold associated with the value seven, and then, just for the sake of writing this out slightly differently, I'm going to set silver on the next line by saying olympics sub silver equals eight, and so this is going to add a new
key-value pair associating silver with the value eight. And then I need to also say olympics sub bronze equals six. So here I just did something slightly different. I could have added three key-value pairs right in here, like I did for the previous problem, but I can also just set the key-value pairs using an assignment statement, like I did on lines two and three. And again, the keys are all strings, so here I'm using double quotes to create the string and here I'm using single quotes, but the effect is that we have three key-value pairs. All the keys are strings, gold, silver, and bronze, and all the values are integers, seven, eight, and six. That's all for now. Until next time. Welcome back. In this lesson, we're going to learn about some operations that we can use to modify dictionaries. So here on line one we create a dictionary. In this dictionary we have four key-value pairs, so the key pears is associated with the integer 217, the key, the string apples, is associated with 430, and so on. So overall we have four key-value pairs, and that's all created on line one. Now, on line two, we do an operation called del. So del is short for delete, and it deletes a key-value pair from our dictionary. So when we say del, and then we say the name of the dictionary, and then in square brackets the key whose key-value pair we want to delete, that's going to get rid of that key-value pair from our dictionary. So here we're saying del inventory sub pears. Here our key pears is associated with the value 217, and line three says get rid of this key-value pair. So when I actually run line three, you'll notice the number of key-value pairs will go from four to three. Suppose instead we created the same dictionary, so again we're back to having four key-value pairs, and now on line three, rather than deleting the value associated with pears, we're going to set it to zero. So if you remember from when we first introduced dictionaries, we can set a key-value pair by saying the name of the dictionary, sub, and then whatever
key we want to set, equals whatever value we want it to be associated with. But the difference here is that by the time we run line three, we'll already have a key-value pair that associates the key pears with a different value, 217. So when we say sub pears now equals zero, that's going to update the value associated with the key pears. In other words, line three is going to say pears is not 217 anymore; it's now associated with the value zero. So here, if we now run line three, you'll see the value of pears is now zero. So let's reuse that dictionary in another example. On line one, we create the same dictionary that has four key-value pairs that we saw in the previous examples, and now on line two we're setting an item of that dictionary, inventory sub bananas, to inventory sub bananas plus 200. Now, when we look at this, the expression is a little confusing, because here we're repeating this expression inventory sub bananas. But remember that when we do an assignment, what Python does is it first evaluates the value that we're going to be assigning. So the first thing Python does on line two is it asks, what's the value of this expression, inventory sub bananas plus 200? After it computes this value, it's going to assign whatever that value is to inventory sub bananas. So in other words, to figure out what this is going to do, we first have to ask, what's the value of this expression? To figure out the value of this overall expression, let's break it down. We add inventory sub bananas to 200, so we ask, what's the value of inventory sub bananas? To figure that out, we look at this dictionary and see that bananas is associated with 312. So this value is 312, and then we add 200 to that to get 512. And then Python takes this integer 512 and makes it the new value associated with the key bananas. So this is going to now be 512 when we run line 2. So let's run line 2, and we see bananas is now 512. Now, if you remember the len function from strings or lists or tuples, remember that len gives you the
number of items in a collection. So in the case of strings, len gives us the number of characters in that string. len also works with dictionaries. So if we pass in the dictionary inventory when we call len, the value of this expression gives us the number of key-value pairs in this dictionary. So the value of this expression is going to be 1, 2, 3, 4, because there are 4 key-value pairs. So we'll see that num_items in our frame right here is going to be assigned the value, the integer 4. So let's run line 4, and we see that num_items is now 4. So let's do some more questions. In this question, we ask what is printed by the following statements. So here we create a dictionary which has three key-value pairs. Then we say mydictionary sub mouse equals mydictionary sub cat plus mydictionary sub dog. So the value of this expression, mydictionary sub cat, is 12, and the value of this expression, mydictionary sub dog, is 6, meaning that the value of this overall expression is 18. So then we assign the value 18 to be associated with the key mouse. And so by the time we print out mydictionary sub mouse, we're going to print out 18. So our answer is C. So this question asks us to update the value for Phelps in the dictionary swimmers to include his medals from the Rio Olympics by adding five to the current value. So Phelps will now have 28 total medals. And it tells us: do not rewrite the dictionary. So here on line two, we assign swimmers to be a dictionary that has one, two, three, four, five, six key-value pairs. But we want to update the value associated with the key Phelps. And the way that we want to update it is by adding five to the current value. So the way that we're going to do that is we're going to say swimmers sub Phelps is now its previous value plus five. So swimmers sub Phelps equals swimmers sub Phelps plus five. So whatever it started out with, it's now going to be that plus five. That's all for now. Until next time. Welcome back.
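As a recap of the dictionary operations walked through above, here is a minimal sketch that can be run outside the textbook page. The inventory values are the ones read out in the lecture; the order of the steps is just one reasonable arrangement for illustration, not necessarily the code shown on screen.

```python
# Inventory values as read out in the lecture.
inventory = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

# Look up a value with square brackets.
print(inventory["bananas"])

# The right-hand side is evaluated first (312 + 200),
# then the result becomes the new value for the key "bananas".
inventory["bananas"] = inventory["bananas"] + 200
print(inventory["bananas"])

# Assigning to a key that already exists replaces its value.
inventory["pears"] = 0

# len counts the key-value pairs.
num_items = len(inventory)
print(num_items)

# del removes the whole key-value pair, not just the value.
del inventory["pears"]
print(len(inventory))
```

The same update pattern applies to the swimmers exercise: the existing value is read, five is added, and the sum is stored back under the same key.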
So we've seen ways to get the value associated with a particular key in a dictionary. So for example, if we have this dictionary, and we assign it to the variable inventory, and it has four key value pairs, then if we wanted the value associated with the key oranges, we would say something like print(inventory["oranges"]). But sometimes we don't want the value associated with just one key; we want the value associated with every key in that dictionary, or we want to iterate over every key value pair in that dictionary. In order to do that, dictionaries have a set of methods that we'll find useful. The first method that we'll find useful is .keys(), which lets us loop over every key in our dictionary by using a for loop. So I say for key in inventory.keys(), and then for now, I'm just going to print out the key. So I'll say print(key). When I run this code, I get all of the keys in my dictionary. Remember that dictionaries aren't ordered, so there are no guarantees about what order I'll actually get these keys in. The only guarantee is that this for loop is going to run once for every key, regardless of order. So when I just print out the key, I can see that apples, bananas, oranges, and pears are the four keys in our dictionary. If I want to get the value associated with each key, I can print the key, then "has the value", and then, to get the value associated with the key, I use inventory[key]. Now notice here that I'm not putting key in quotation marks. In other words, what I want is the value of the variable key. I don't want to look up the value associated with a key named "key". So if this dictionary had a key named "key" associated with the value 50, and I put key in quotation marks, then I would get 50. But here instead, I'm saying I want the variable key's value to be the key that I'm fetching. So now we'll see apples has the value 430, bananas has the value 312, oranges has the value 525, and pears has the value 217.
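The loop just described can be sketched as follows, using the inventory values from the lecture. The `lines` list is only there so the output can be inspected afterwards; it is not part of the original example.

```python
inventory = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

lines = []
for key in inventory.keys():
    # key is a variable, so it is NOT quoted: inventory[key] looks up
    # whatever string key currently holds, not the literal string "key".
    line = key + " has the value " + str(inventory[key])
    lines.append(line)
    print(line)
```

Writing `inventory["key"]` instead would raise a KeyError here, because this dictionary has no key that is literally named "key".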
So in order to actually get those keys into a list, one thing that I can do is just say keys = list(inventory.keys()). And if I now print out the value of keys, then I should get a list of the keys in our dictionary inventory. But remember again that there's no guarantee about ordering here, so these keys could be in any order. The only guarantee is that I'm going to get a list that has all of the keys in some order. Now notice here that when I got the list of keys, I cast it to a list by calling the list function. The reason we do that, we're not going to get into too much yet. But at a high level, we always cast to a list because in Python 3, inventory.keys() doesn't actually return a list directly. It returns something that we can iterate over, so we can put inventory.keys() in our for loop. But in order to actually get a list of keys, we're always going to have to cast it to a list by calling the list function. If we wanted to iterate over inventory in a slightly less verbose way, we could just say for k in inventory. So whereas here we're saying for key in inventory.keys(), here we're just saying, and I'll rename this to key to be consistent, for key in inventory. When we say that, Python automatically assumes that we want to iterate over the keys. So when I run this code, I see "Got key" and then every key in our dictionary. Again, no guarantees about order, just that we'll loop through every single key that we have. Dictionaries have two other methods that are somewhat similar to .keys(). So .values(), rather than getting a list of keys, gets a list of values: 430, 312, 525, and 217. So .values() would give us a list that has all of these integers. But again, there are no guarantees about ordering. So we know that it's going to be a list with these four items; we just don't know what order inventory.values() is going to be in.
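Here is a short sketch of the three ideas above: casting `.keys()` to a list, iterating over the dictionary directly, and casting `.values()` to a list. The variable names `keys`, `seen`, and `values` are illustrative choices.

```python
inventory = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

# .keys() returns a view object that can be iterated over, not a list,
# so wrap it in list() when an actual list is needed.
keys = list(inventory.keys())
print(keys)

# Iterating over the dictionary itself visits the keys.
seen = []
for key in inventory:
    print("Got key", key)
    seen.append(key)

# .values() works the same way, but yields the values.
values = list(inventory.values())
print(values)
```

One hedged note on ordering: since Python 3.7 the language guarantees that dictionaries keep insertion order, so on a current interpreter the keys come back as written above. The lecture's advice not to depend on any particular order is still the safer habit, and some teaching environments may behave differently.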
inventory.items() instead gives us a list of key value pairs as tuples. So inventory.items() is going to give us a list, and the first item might be apples and 430. Again, I say might be, because we don't know what the ordering is. But the second item might be oranges associated with the value 525, and then we might have bananas, and so on. So .items() gives us a list of tuples, where every tuple is a key value pair. So on line three, we print out the value of inventory.values(), and on line four, we print out the value of inventory.items(). So let's see what these two lines output, and for now I'm going to comment out lines six and seven. So we can see that when we printed out inventory.values(), we got all of the values. They aren't necessarily going to be in the order that we created the dictionary in, but in this case, they just happen to be in that order. When we called inventory.items(), you can see that we have a list of tuples of key value pairs. So here, this first tuple says that the value of the key apples is 430, the value of the key bananas is 312, and so on. We can also use the same in operator that we saw on lists and strings on dictionaries. So if we print out the value of the expression "apples" in inventory, the value of this expression is going to be a Boolean. That Boolean is going to be True if this is a key in our dictionary, so if apples is a key in inventory, and False otherwise. Now, it's crucial here that I said it's True only if apples is a key in our dictionary. It can't be a value. It has to be a key. So when we print out the value of "apples" in inventory, we can see here that we have a key value pair where the key is apples. So this should be True. If we print out the value of "cherries" in inventory, then if we look at our key value pairs, apples, bananas, oranges, pears, we don't see anything that has the key cherries. So this is going to be False.
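A small sketch of `.items()` and the `in` operator as described above. The last check, `430 in inventory`, is an added illustration of the point that `in` looks only at keys; it is not one of the lines shown in the lecture.

```python
inventory = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

# .items() yields (key, value) tuples; list() makes them printable as a list.
pairs = list(inventory.items())
print(pairs)

# Each tuple can be unpacked directly in the for loop.
for key, value in inventory.items():
    print(key, value)

# The in operator tests keys only.
has_apples = "apples" in inventory      # a key, so True
has_cherries = "cherries" in inventory  # not a key, so False
has_430 = 430 in inventory              # 430 is a value, not a key, so False
print(has_apples, has_cherries, has_430)
```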
So if we comment out this code and just run lines two and three, then we should see that the value of "apples" in inventory is True, and the value of "cherries" in inventory is False. Now, we can also write code that depends on the values of these Boolean expressions. So we can check whether our dictionary has the key bananas by saying if "bananas" in inventory. Again, this expression is a Boolean that's True if bananas is a key in our dictionary. So if that key is in our dictionary, which in this case it is, then we print out the value of inventory["bananas"]. Otherwise, if bananas is not a key, we say we have no bananas. So here we're going to print out 312, which is the value associated with bananas. If I modified this key to be something else, and I'll literally call it "something else", then we would see "We have no bananas", because bananas is no longer a key in our dictionary. Another method that we can use on dictionaries is .get(). So .get() works almost just like indexing. We can say inventory.get("apples"), and the value of this expression is going to be the same as the value of inventory["apples"], which in this case is going to be 430. So if I just run line three and comment out lines four and six here, then I see that I get 430. Now on line four, we say print out inventory.get("cherries"). Notice here that cherries is not in our dictionary, so we don't have any key named cherries. So when we run this code, we get the value None. This is the difference between .get() and actually indexing, because if we indexed here, so if we said print out inventory["cherries"], we would instead get a runtime error, because cherries is not a key in our dictionary. If we instead call .get(), then we get kind of a softer error.
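The difference between indexing and `.get()` can be sketched like this. The try/except is only there so the example keeps running after the failed lookup; in the lecture the indexing version simply stops with a runtime error. The last line uses the optional second argument of `.get()`, which the lecture turns to next.

```python
inventory = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}

if "bananas" in inventory:
    print(inventory["bananas"])
else:
    print("We have no bananas")

# For a key that exists, .get() and indexing give the same value.
apples = inventory.get("apples")

# For a missing key, .get() returns None instead of raising an error.
cherries = inventory.get("cherries")

# Indexing a missing key raises KeyError.
try:
    inventory["cherries"]
    raised = False
except KeyError:
    raised = True

# With a second argument, .get() returns that fallback for a missing key.
# It does not add "cherries" to the dictionary.
cherries_or_zero = inventory.get("cherries", 0)
print(apples, cherries, raised, cherries_or_zero)
```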
So instead of actually giving us a runtime error that stops our program, .get() says the value is None. .get() also takes an optional second argument, which says that if this key isn't there, then this is what the value of the expression should be. So here, when we say inventory.get("cherries") and we pass in a second argument of zero, this is going to say, if cherries is in our dictionary as a key, then get the value associated with it. If cherries is not a key in our dictionary, then just use this as the value. This isn't going to add a key value pair with cherries. It's instead just going to say the value of this overall expression should be zero. So when we run our code, we see that now, when we call inventory.get("cherries") with the optional argument zero, we get zero, because cherries isn't a key in our dictionary. If we had passed 999, then we would get 999. And if we had a key cherries in our dictionary, let's say with the value five, then we would instead get five. That's all for now. Until next time. So in this question, we ask what's printed by the following statements. mydict is a dictionary with four key value pairs, and we print out mydict.get("cat") // mydict.get("dog"). Here we're using division without a remainder, which just gives us an integer. mydict.get("cat") is 12, and mydict.get("dog") is 6, and so the value of 12 over 6 is going to be 2. In this next question, we're asked what's printed by the following statements. So we create that same dictionary, and now we print out the value of "dog" in mydict. That asks, is dog a key in my dictionary? And we can see that it is; it has the value 6. But all we care about is that it is a key. Here, we create that same dictionary, and we print out the value of 23 in mydict. We do have a key, elephant, that has the value 23. But remember that in only asks, is 23 a key in my dictionary? And in this case, it's not. Our only keys are cat, dog, elephant, and bear.
It doesn't matter that 23 just so happens to be a value; the value of this overall expression is going to be False. So here, we first assign total to be zero, and then we create the same dictionary that we had before, and we loop through all of the keys in our dictionary. So the key is going to be cat, dog, elephant, and bear, not necessarily in that order. Every one of these keys is a string, and we say if the length of that key is greater than three, so if that string is longer than three characters, then add its value to total. So the first thing I would ask is, what are the keys that are longer than three characters? And that's just going to be elephant and bear. For cat and dog, this statement is not going to run, because the key cat is not longer than three characters, and the same goes for the key dog. So for elephant and bear, we're going to say total equals total plus the value that mydict associates with that key. Let's suppose that elephant comes before bear. Then we first assign total to be zero plus mydict["elephant"], so zero plus 23, and total gets the value 23. And then by the time we get to the key bear, we're going to say total equals its old value, 23, plus the value associated with the key bear, which is 20. So 23 plus 20, which is going to leave total at 43. So the answer here is going to be B. That's all for now. Until next time. Welcome back. One more thing to note about dictionaries is aliasing. So on line one, we create a dictionary that has three key value pairs. And on line two, we assign a new variable called alias to be that dictionary. On line four, we print out alias is opposites, and this is asking, are they pointing to the same object? So if we run CodeLens on lines one through four, then we'll see again that line one creates a dictionary, and line two assigns alias to point to the exact same object as that dictionary.
Now, this is important, because here, if we say alias["right"] = "left", reassigning right from its previous value, which was wrong, and then we print out opposites["right"], then we're actually going to print out left on line seven. So here we can see line four prints True, because alias is the same object as opposites. But now when we print out opposites["right"], we get left. And that's despite the fact that we first assigned opposites["right"] to be wrong, and in all of these lines, we never directly changed opposites["right"]. The culprit here is that on line six, when we changed alias["right"] to be left, that also changed opposites["right"]. Again, we can use CodeLens to see why that is. So on line one, we create a dictionary, and then we assign alias to be that exact same dictionary. And we can see alias is opposites. And now when we assign alias["right"] to be left, that modifies that dictionary, so that right is now left. And that changed the value associated with the key right to be left for both opposites and alias. So let's go through a question related to this. Here on the first line, we create a dictionary with four key value pairs. And then we assign yourdict to be mydict, so yourdict is pointing to this same dictionary. And then we say yourdict["elephant"] = 999. Now remember, yourdict and mydict are the same object here. So when we assign elephant to be 999, that erases the value 23 and replaces it with 999 for both of these names, because they're pointing at the same object. So now when we print out mydict["elephant"], we're going to get the value 999. That's all for now. Until next time. Welcome back. So we've already gone over the accumulation pattern, where you iterate over a sequence and update an accumulator variable as you iterate through every item in that sequence.
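Looking back at the aliasing discussion in the previous video, here is a minimal sketch of it. Only the pair right/wrong is stated in the lecture; the other two entries in `opposites` are assumed for illustration. The values in `mydict` are the ones read out in the earlier questions.

```python
# "right": "wrong" comes from the lecture; the other two pairs are assumed.
opposites = {"up": "down", "right": "wrong", "true": "false"}
alias = opposites  # no copy is made; both names refer to one dictionary

same_object = alias is opposites
print(same_object)

alias["right"] = "left"    # changes the one shared dictionary
print(opposites["right"])  # so the change is visible through opposites too

# The follow-up question works the same way.
mydict = {"cat": 12, "dog": 6, "elephant": 23, "bear": 20}
yourdict = mydict
yourdict["elephant"] = 999
print(mydict["elephant"])
```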
In this lesson, we're going to go over the dictionary accumulation pattern, which is the same idea, except our accumulator variable is going to be a dictionary that has multiple key value pairs. So let's start out with the standard accumulation pattern example. On line one here, we open a file named scarlet.txt, and we open it to read it. And then we say txt = f.read(). So in other words, txt is a variable, which is a string, and it's the contents of the file scarlet.txt. So suppose scarlet.txt represents the text of the book The Scarlet Letter, and let's say that we want to keep track of how many t's are in scarlet.txt. The way that we do that is with the accumulation pattern. So here, t_count is our accumulator variable. We initialize it to be zero at first, to say that we've seen zero t's so far. And then we iterate through every character in our text file, and we say, if that character is the letter t, then t_count = t_count + 1. In other words, we just saw one more character t. So by the time we're on line eight and we're done with our for loop, if we print out the value of t_count, we should have the number of t's in scarlet.txt. So if I run my code, I'll see that there are 17,584 occurrences of the character t in scarlet.txt. So that's great. That's the standard accumulation pattern. But let's suppose that we wanted to count more letters than just the letter t. So suppose that we also wanted to keep track of how many s's are in scarlet.txt. Well, we could do that in almost the same way. So here we open up scarlet.txt again, and then we read it. Except now, on lines four and five, we create two different accumulator variables. We have t_count to keep track of the number of t's, and we initialize that to zero. And then we have s_count to keep track of the number of s's. Like before, we still iterate over every character in txt.
But now we say, if that character is a t, then t_count is t_count plus one. If the character is an s, then we instead update s_count. So by the end of our for loop, t_count is going to be the number of t's, and s_count is going to be the number of s's. You could imagine doing this for any number of characters, but for every character that we wanted to accumulate, we would have to create a new accumulator variable here. So we might have a_count, then b_count, then c_count, and so on. And you can imagine that if we wanted to count every letter in the alphabet, initializing 26 accumulator variables might make for just a little bit lengthy code. So one alternative way to do this, and it's going to seem a little bit weird at first, but I'm going to get to why we want to do this, is to use a dictionary instead. So like the code before, we open up the file scarlet.txt, and then we read it in. And now I'm going to have one accumulator variable, which is a dictionary. So I say x equals an empty dictionary. Inside of that dictionary, we're going to have multiple key value pairs. So if we still only want to count t's and s's, then rather than saying t_count = 0, I'm going to say x["t"] = 0. And rather than saying s_count = 0, I'm going to say x["s"] = 0. These are just different key value pairs in this same dictionary x. So now what we can do is loop through every character in our file once again. And again, we say if that character is a t, then x["t"] equals its previous value plus one. If that character is an s, then x["s"] gets incremented by one instead. And again, by the time our for loop is done, we're going to have x["t"] as the number of t's in scarlet.txt, and x["s"] is going to be the number of s's in scarlet.txt. So when we run our code, we can see the number of t's and the number of s's. So now I'm going to make one really small change to our code.
So here on line nine, this statement is inside of if c == "t". So what we can do is replace x["t"] here; we can replace the hard-coded t with the variable c. We know that this is going to be the same, because we only run this code if c is the character t. So we can say x[c] = x[c] + 1. And down here we can also say x[c] = x[c] + 1, because this line is inside of this elif c == "s". So in other words, what we're going to do in the next piece of code is replace the hard-coded s and the hard-coded t with the value of the variable c. If we do that, we get something that looks like this. So we say, if the character c is t, then x[c] equals its previous value plus one. Now, because this is inside of an if statement, we know that c is going to be t here, but we'll get to why we actually want to make this change in a little bit. Same thing with this elif. We know that c is going to be the character s here, but we just replaced the hard-coded s with the value of the variable c. If we run our code, we're going to get the exact same result as before. So now I want to go into why we actually wanted to replace that hard-coded t and hard-coded s with the value of the variable c. Let's suppose that rather than just counting the number of t's and the number of s's in scarlet.txt, we wanted to count the number of every single character. So the number of a's, b's, c's, s's and t's, spaces, exclamation points, and so on. We could do that by replacing line four with a whole bunch of accumulator variables: a_count, b_count, c_count, an exclamation point count, etc. But then our code would get really long and really repetitive, because we would need to initialize a separate accumulator variable for every single character that might be in scarlet.txt. Instead, what we're going to do is the dictionary accumulation pattern.
So we're going to have one accumulator variable, which is a dictionary. On line four, we say x equals an empty dictionary. And then like before, we loop through every character in txt. And what we do is have an if statement to say, if the character c is not in our accumulator dictionary, so if c not in x. In other words, if we haven't encountered this new character c yet, then we initialize x[c] to be zero. What that means is that the first time we see the character a, we're going to initialize x["a"] to be zero. The first time we see the character t, we initialize x["t"] to be zero, and so on. Now here on line 11, still inside of this for loop, we say x[c] = x[c] + 1. In other words, we add one to the previous value for whatever this character c is. So if the character c is the letter a, then we say x["a"] = x["a"] + 1, and that says that we saw one more a. If the character is the letter t, then we say x["t"] equals its previous value plus one, and so on. So here on line 13, we just print out the number of characters t, and on line 14, the number of characters s. What we should find is that we get the exact same values for t and s as before. So if I run my code, we'll see that we get the correct number of t's and s's. But what's great here is that we have more than just t's and s's collected. We can print out the count of any character we want. So I can print out the number of a's just by saying the number of a's is x["a"], and the number of b's is x["b"]. You'll see that our dictionary is keeping track of every single character that occurs in scarlet.txt. So to illustrate how this works, I'm going to go with a slightly simpler example. Rather than assigning txt to be the contents of scarlet.txt, I'm just going to say txt equals the string "michigan". And I'm going to leave the rest of the code the same, except I'm going to comment out lines 12 through 15.
Now I'm going to run my code in CodeLens. So you can see line one initializes txt to be "michigan". Line three initializes our accumulator variable x to start out as an empty dictionary. And then we're going to loop through every character inside of our string "michigan". So c is first assigned to the character m, because that's the first character. And we ask, is c in our dictionary already? In our case, it's not, so we assign x[c], or x["m"], to be zero. And now on line 10, we immediately increment the value associated with m. So we say we've seen one m so far. With the character i, we see that i is not in our dictionary, so we initialize x["i"] to be zero, and then we set its value to its previous value plus one. With c, it's not in our dictionary, so we initialize it to zero and then increment it. With h, we set it to zero and then increment it. So you can see that our dictionary keeps building up the number of times that we've seen every given letter. At this point, we've seen m, i, c, h, and so far, none of these characters were in our dictionary. So our dictionary was just adding a new key value pair for every letter that we saw. Now the next character is going to be this letter i, so you can see c is the character i. The key point here is that i is already in our dictionary. So this if is not going to execute, because i is in our dictionary. And so rather than setting x["i"] to be zero, we just increment it on line 10. So you'll see x["i"] go from one to two, to say that we've seen two i's so far. In the case of g, it's not in our dictionary, so we add a new key value pair and increment its value, and so on. So by the time our for loop is done, we end up with a dictionary where every key is a character in our text txt, and every value is the number of times that we've seen that character. So we'll see dictionary accumulation a lot. And really, I find that it just takes a lot of practice to get used to.
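The "michigan" walkthrough above corresponds to code along these lines. This is a sketch reconstructed from the narration, so the line numbers will not match the ones mentioned in the video.

```python
txt = "michigan"

x = {}  # the accumulator: maps each character to how often it has been seen
for c in txt:
    if c not in x:
        # First time this character appears: start its count at zero.
        x[c] = 0
    # Runs for every character, whether it is new or already seen.
    x[c] = x[c] + 1

print(x)
```

The only character that takes the "already seen" path in this string is the second i, which is why its count ends at 2 while every other count ends at 1.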
It's a little bit counterintuitive at first, but we're going to go through more examples with dictionary accumulation, and I think it's going to make more and more sense with more practice. That's all for now. Until next time. Welcome back. So let's go over some questions that use dictionary accumulation. In this question, we're provided a string, sentence. And the question asks us to split the string into a list of words, and then to create a dictionary that contains each word and the number of times it occurs. And that dictionary should be named word_counts. So the first thing that I'm going to do is split our sentence into words. We can do that by calling the split method on the string. I'll say words = sentence.split(). And so words is going to be a list of strings. The first item is going to be "The", the second item is going to be "dog", the third item is going to be "chased", and so on. So then we want to use dictionary accumulation. In this question, we're told to use the dictionary word_counts, so I'm going to name my accumulator variable word_counts, and I'm going to initialize it as an empty dictionary. Then we need to loop through all of the words, so I'll say for word in words. And we want to check if that word isn't in our dictionary so far, so if word not in word_counts. If it's not there, we initialize its value to zero, so word_counts[word] = 0. And then, still inside of our for loop but outside of the if statement, we're going to increment the value of word_counts[word]. So word_counts[word] = word_counts[word] + 1. So let's run our code. And we can see that it produces the correct value by actually printing out the value of word_counts. So if I say print(word_counts), we can see that the word "The" with a capital T appeared one time, the word "the" with a lowercase t appeared three times, "into" appeared once, "rabbit" appeared twice, and so on.
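The word-count solution narrated above can be sketched as follows. The exact sentence from the exercise is not given in the transcript, so the string below is a hypothetical stand-in chosen to reproduce the counts that are mentioned ("The" once, "the" three times, "into" once, "rabbit" twice).

```python
# Hypothetical sentence; the real exercise text is not in the transcript.
sentence = "The dog chased the rabbit into the forest but the rabbit was too quick."

words = sentence.split()  # splits on whitespace into a list of strings

word_counts = {}
for word in words:
    if word not in word_counts:
        word_counts[word] = 0
    # Inside the for loop, but outside the if statement.
    word_counts[word] = word_counts[word] + 1

print(word_counts)
```

Note that split() does no cleanup: "The" and "the" are different keys, and the final word keeps its trailing period, so it is counted as "quick." rather than "quick".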
In this question, we're asked to create a dictionary called char_d from the string stri, so that each key is a character and its value is how many times that character occurred. So this is again a straightforward application of the dictionary accumulation pattern. I'm going to name my dictionary accumulator char_d, and then I loop through every character in stri. So I'll say for c in stri. And if I haven't seen that character before, so if c not in char_d, then char_d[c] = 0. And then inside of the for, but outside of the if, I say char_d[c] equals its previous value plus one. So when I run my code, I can see that it worked correctly. But let me run this code in CodeLens just to see again what's going on here. On line one, we initialize stri to be the string "what can I do". On line three, we initialize our accumulator variable, or accumulator dictionary, to be an empty dictionary. Then we loop through every character in stri. So we first say c is the character w, the first character. And then we say if c is not in char_d, which it is not, so we assign char_d[c], or char_d["w"], to be zero, and then we increment it. Same thing with the letter h. h isn't in our dictionary, so we're going to add it, and after the increment its value is one. Same thing with a: the character a was not in our dictionary, so it ends up at one. Same thing with t. Same thing with the space character and with c. So at this point, we're at this second a, and we're going to see that a is in our dictionary. We're going to skip this if, and we're going to say char_d["a"] equals its previous value plus one. So we'll see this update to be two. And then we keep adding characters as we go on. The second time we get to a space, so at this point, we're at this space, we'll see this value increment by one, and so on. That's all for now. Until next time. Welcome back. So we've learned how to do the dictionary accumulation pattern.
But now we want to learn some things that we might want to do with our accumulated dictionary. Rarely do we just want to do something like accumulate the count of every character into a dictionary. We typically want to actually do something with that data. So here we have the same code as before. We open up the file scarlet.txt, load its contents into the variable txt, and initialize our accumulator variable letter_counts to be an empty dictionary. And then through this for loop, we associate every key in letter_counts, which is going to be a character, with the number of times that character appears in scarlet.txt. So in other words, by the time we get to line 12, letter_counts is going to be a dictionary where every key is a character, such as t, and every value is the number of times that character appeared. In the case of t, it's 17,584. In our code before, we just printed out the number of t's by printing out "There are", and then something like letter_counts["t"], and then "t's in scarlet.txt". We can do the same thing with other characters, of course. So we can say that there are this many a's and b's and so on. But suppose that we wanted to print out the count of every single character in letter_counts. Well, we could do that by using a for loop. So for every character, and I'll just call my iterator variable y, we say for y in letter_counts. And then inside of our for loop, we can just say "There are", and then letter_counts[y], and we can say that there are that many of the character y. So what this code is going to do is loop through all of the keys in letter_counts. On every iteration, y is going to be one of those keys. And then we get the number of times that character y appears by printing out letter_counts[y]. So we say there are that many, and then whatever that character y is.
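A sketch of looping over an accumulated dictionary, as just described. Because scarlet.txt is not available here, a short stand-in string is used for `txt`; with the real file the first line would instead read the file's contents. The `report` list and the exact wording of the printed message are illustrative.

```python
# Stand-in text; the lecture reads the contents of scarlet.txt instead.
txt = "a study in scarlet"

letter_counts = {}
for c in txt:
    if c not in letter_counts:
        letter_counts[c] = 0
    letter_counts[c] = letter_counts[c] + 1

# Iterating over the dictionary visits each key (each distinct character).
report = []
for y in letter_counts:
    line = "There are " + str(letter_counts[y]) + " of the character " + repr(y)
    report.append(line)
    print(line)
```

Using repr(y) puts quotes around each character, which makes the space character visible in the output.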
So when we run our code, we can see that there are this many t's, this many s's, this many spaces, this many capital T's, and so on. And you can see that there are quite a few characters in scarlet.txt. So let's suppose that we had called our dictionary x instead of letter_counts. In this question, we're asked which of the following will print out True if there are more occurrences of the character e than the character t in the text of A Study in Scarlet, and False if t occurred more frequently. So this is assuming that our previous code has run, except our dictionary is called x instead of letter_counts. So we want to write an expression that's going to be True if there are more e's than t's. The way that we do that is we get the number of occurrences of the character e by saying x["e"]. This is going to get the value associated with the character e in the dictionary x. And then we get the number of t's by saying x["t"]. If we want to know if there are more e's than t's, then we can write the expression x["e"] > x["t"]. So in other words, the answer here is going to be B. So if you've ever played the game Scrabble, then you know that different letters in Scrabble have different scores. Letters that are more rare have a higher score than letters that are more common. So let's suppose that we want to open up scarlet2.txt, and we want to figure out what the, quote unquote, Scrabble score for this is. In other words, for every character, suppose that we want to know not just how many times that character appeared, but what the Scrabble score for that character would be. So from lines one through 11, we have the dictionary accumulation pattern, and it accumulates the frequency of every character in scarlet2.txt into this dictionary x. And then on line 13, we have a different dictionary, which represents the Scrabble letter value of every character.
So an a, which is a really common letter, only has a score of one, whereas a z, which is a lot less common, has a Scrabble value of 10. So if we wanted to get the Scrabble score for scarlet2.txt, what we could do is loop through every character in our dictionary of counts and look it up in the dictionary letter_values. So I'll say for every character, I'll call it y, for y in x. And then, because not every character has a Scrabble score, so for instance, numbers or exclamation points don't have Scrabble scores, but they're in our dictionary x, we want to first check whether that character y is in letter_values. So I'll say if y in letter_values. And we want to keep track of what our total letter value is so far. The way that we do that is with the standard accumulation pattern. So I'm going to initialize an accumulator variable, I'll call it scrabble_score, and we initialize it to zero. And then we say, if this character y has a letter value, then scrabble_score equals its previous score, so scrabble_score = scrabble_score +, and then we want to add the score for that letter. In other words, a has value one, b has value three, c has value three, and so on. We get that score by saying letter_values[y]. Again, y is going to be a, b, c, d, e, f, g, and so on. And then we want to multiply that by the number of times that character appears. So I'll say letter_values[y] * x[y]. x is again our dictionary accumulator variable from the previous problem. So in other words, we're saying the score is the previous score plus the value of that letter times the number of times that letter appeared. So this is using the standard accumulation pattern here. Our accumulator variable is scrabble_score, and we update it by saying scrabble_score = scrabble_score + letter_values[y] * x[y]. So if I print out scrabble_score at the end of this for loop, I should expect it to be the actual Scrabble score of scarlet2.txt.
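The Scrabble-score computation described above can be sketched like this. A short stand-in string replaces the contents of scarlet2.txt. The lecture only reads out the values for a, b, c, and z, so the rest of `letter_values` is filled in with the standard English Scrabble tile values, which is an assumption about what the exercise's dictionary contains.

```python
# Stand-in text; the lecture uses the contents of scarlet2.txt.
txt = "quiz time!"

# Dictionary accumulation: count every character.
x = {}
for c in txt:
    if c not in x:
        x[c] = 0
    x[c] = x[c] + 1

# Standard English Scrabble tile values (assumed to match the exercise).
letter_values = {"a": 1, "b": 3, "c": 3, "d": 2, "e": 1, "f": 4, "g": 2,
                 "h": 4, "i": 1, "j": 8, "k": 5, "l": 1, "m": 3, "n": 1,
                 "o": 1, "p": 3, "q": 10, "r": 1, "s": 1, "t": 1, "u": 1,
                 "v": 4, "w": 4, "x": 8, "y": 4, "z": 10}

# Standard accumulation: value of each letter times how often it appeared.
scrabble_score = 0
for y in x:
    if y in letter_values:  # skips spaces, punctuation, digits
        scrabble_score = scrabble_score + letter_values[y] * x[y]

print(scrabble_score)
```

Because the keys of `letter_values` are lowercase, uppercase letters in the text would be skipped by the `if` check just like punctuation is. Whether that is intended depends on the exercise.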
So you can see that this has a pretty high Scrabble score overall.
Let's do some more questions that involve dictionary accumulation and doing something after we accumulate the results from the dictionary. In this question, we're told the dictionary travel contains the number of countries within each continent that Jackie has traveled to. And we're asked to find the total number of countries that Jackie has been to. In other words, our result is going to be two plus eight plus three plus four and so on. And we're asked to save that into the variable named total. In this question we're given a dictionary, and it doesn't matter how we arrived at this dictionary. We might have 20 lines before this that compute this dictionary's value. Or, in this case, we're just given the dictionary as a literal object. But regardless of how we get that dictionary, what we want to do is accumulate the sum of every value associated with every key in that dictionary. We can do that with the standard accumulation pattern. So I'm going to first initialize an accumulator variable total to be zero. Then we want to loop through every single continent. So I'll say for continent in travel. So continent is going to be North America, Europe, South America, Asia, Africa, etc. For every continent in travel, we get the number of countries within that continent that Jackie's been to by saying travel sub continent. And we want to say total equals total plus travel sub continent. So this is us updating our accumulator variable. So again, what we're doing is looping through every key in our dictionary, getting the value associated with that key (two, eight, three, and so on), and adding that value to our previous total. And by the time we're done with our for loop, we should have the total number of countries that Jackie has been to.
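A minimal sketch of that accumulation. The continent names and the first four counts follow the explanation; the count for Africa is not stated, so the value used here is an assumption for illustration only.

```python
# Countries visited per continent. The Africa value is an assumed placeholder.
travel = {
    "North America": 2,
    "Europe": 8,
    "South America": 3,
    "Asia": 4,
    "Africa": 1,
}

total = 0                               # initialize the accumulator
for continent in travel:                # iterate over the keys
    total = total + travel[continent]   # add the value for this key

print(total)
```

The same result could be obtained with sum(travel.values()), but the explicit loop shows the accumulation pattern the lesson is practicing.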
In this question, we're told that schedule is a dictionary where a class name is the key, and its value is how many credits it's worth. We're asked to go through and accumulate the total number of credits that have been earned so far and assign that to the variable total_credits. So this is the same idea with just a slightly different dictionary. Our accumulator variable is going to be named total_credits, and we're going to initialize it to zero. Then we want to loop through every course in our schedule. So I'll say for course in schedule. Again, here, course is going to be UARTS 150, Spanish 103, English 125, and so on. We get the number of credits for that course by saying schedule sub course. And we add that to the total number of credits by saying total_credits equals total_credits plus schedule sub course. That's all for now. Until next time.
Welcome back. In this question, we're asked to create a dictionary called d that keeps track of all of the characters in the string placement and notes how many times each character was seen. In order to do this, we're going to use the standard dictionary accumulation pattern. But this isn't all that this question asks us to do. It says to then find the key with the lowest value in the dictionary and assign that key to min_value. And in order to do this part of the question, we're going to use the standard accumulation pattern. More specifically, we're going to use min or max value accumulation. So let's do the first part, which involves dictionary accumulation. I'm going to create a dictionary d and initialize it to an empty dictionary. Now loop through every character in placement, so I'll say for every character c in placement. If that character is not in our dictionary, so if c is not in d, then initialize d sub c to be zero. And then, regardless, we're going to say d sub c equals its previous value plus one.
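The first part, counting characters into a dictionary, might look like the sketch below. The contents of placement are not given in the explanation, so the short string here is a hypothetical stand-in.

```python
# Hypothetical stand-in; the exercise supplies its own string.
placement = "see the sea"

d = {}                      # start with an empty dictionary
for c in placement:
    if c not in d:
        d[c] = 0            # first sighting: start this character's count at zero
    d[c] = d[c] + 1         # then, regardless, add one for this occurrence

print(d)
```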
And now, by the time we get to line nine here, we've accomplished this first part of the question. We have a dictionary d where every character is associated with the number of times that character appears in placement. Now let's do the second part, where we want to find the key with the lowest value in this dictionary and assign that key to min_value. The first thing that I'm going to do is just get a list of keys. So I'm going to say keys equals list of d dot keys. So keys is going to be a list, and every item in that list is going to be a character. We might have one item that's the character a, one item that's the character e, one item that's the character c, one might be a space, and so on. So keys is a list of the distinct characters in placement, or a list of all of the keys in our dictionary d. And now we want to get the key that has the lowest value in our dictionary d. So we're going to use max value accumulation. In this case, we actually are searching for the minimum value, but it's the same concept. I'm going to initialize min_value to be keys sub zero. I'm just going to do this arbitrarily to give us a starting point. In other words, min_value is going to start out as the first key in our list of keys. So it might start out as something like the character a. And then we're going to search for any key that has a lower value associated with it. So I'll say for every key in keys. Now I'll ask whether the value associated with this key, d sub key, is less than the value associated with min_value. So if d sub key is less than d sub min_value, then we say our new lowest key is this new key. So I'll say min_value equals key. So again, we kind of have two parts in this question. The code from line three through eight addresses the first part, where we create a dictionary with character frequencies. And then lines 10 through 15 address the second part of finding the key with the lowest value. We use information from this first part.
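The second part, finding the key with the lowest count, can be sketched like this. To keep the example short it starts from an already-built dictionary of hypothetical counts rather than repeating the counting loop.

```python
# Hypothetical character counts, as the first part might have produced them.
d = {'s': 2, 'e': 4, ' ': 2, 't': 1, 'h': 1, 'a': 1}

keys = list(d.keys())
min_value = keys[0]               # arbitrary starting point: the first key
for key in keys:
    if d[key] < d[min_value]:     # found a key with a strictly smaller count
        min_value = key

print(min_value)
```

Despite its name, min_value holds a key, not a count. Because the comparison is a strict less-than, ties keep whichever key was seen first, so here the result is 't' even though 'h' and 'a' have the same count.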
In other words, we use this dictionary d in the second part, but we apply the accumulation pattern two different times to solve this problem. So when I save and run my code, you can see that I get the correct values.
In this question, we're asked to create a dictionary called let_d that keeps track of all of the characters in the string product and notes how many times each character was seen. And then we find the key with the highest value in that dictionary and assign that to max_value. So again, here we have two different parts. The first part uses dictionary accumulation. The second part uses max value accumulation. To do the first part, we're going to create a dictionary, let underscore d, and initialize it to an empty dictionary. Then we're going to loop through every character c in product. And we're going to say, if that character is not in let_d, then let_d sub c is initialized to zero. Then we say let_d sub c equals its previous value plus one. So if I print out the value of let_d at this point, then I should get a dictionary where keys are associated with the number of times that character appears. Now we want to find the key with the highest value in this dictionary. In this case, it's going to be n. So I'm going to get a list of keys. I'll say keys equals list of let_d dot keys. Then, first, I'm going to say max_value equals the first key. And for every key in keys, if the value associated with that key is greater, so if let_d sub key is greater than let_d sub max_value, then I've seen a new largest value. So I'll say max_value equals key. Now, when I run this code, I should see that I pass every test, and max_value ends up being n, which is the correct answer. That's all for now. Until next time.
Welcome back. Take a look in your calendar and mark this date.
Today is the day you are crossing over the line from someone who can just write code that does something to being a real programmer, someone who can abstract from a bit of code that works on one piece of data to writing a function that will operate on any similar piece of data. Today is your day. I hope you'll celebrate in some way. Have a special treat, brag to your friends, or at least make a celebratory post in the forums. At the end of this lesson, you will be able to define functions with appropriate names for formal parameters, identify formal parameters and parameter values in a code sample, and you'll be able to predict the return value of a function given sample parameter values. See you next time. Welcome back. The basic syntax for defining a function is the word def, def, it's short for define. Then you get any Python variable name, and then you have an open and closed parenthesis. Inside the parentheses, we're going to see later that you could have more variable names, but we don't have any for this function. Then we've got a colon. Below that, you've got an indented block of code. You've already seen indented blocks of code with the while loop, the for loop, or if statements. It works the same way for function definitions. All the lines that are indented by that same number of spaces, they're all part of the function definition for the hello function. And when we get another line of code down here on line five that's outdented at the same level as def, that's going to be the end of the function definition. We have an optional comment on the first line of the function. If it's included, it's called a doc string. And there's some tools in Python for automatically generating documentation of a program. They'll show the doc strings that are associated with functions. It's often a multi line string. That's why we use this triple quotes. We could have some more doc string here. The triple quotes lets you have a string that goes onto multiple lines. 
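A minimal sketch of the kind of definition being described. The docstring wording and the exact greeting text are approximations, not a copy of what is on screen.

```python
def hello():
    """This function says hello and greets you.
    A docstring may continue onto more lines like this one."""
    print("Hello")
    print("Glad to meet you")
```

Running just this code prints nothing. The def statement only creates a function object and binds it to the name hello; the docstring is stored on that object as hello.__doc__.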
Now when we execute just the code that we have here, nothing is going to print out. Even though there are two print statements on lines four and five, they won't actually be executed. All that happens from lines one through five is that the function object gets created. That function doesn't get executed. Lines four and five here are only going to execute if we invoke the function. I'm going to have some code that's outdented at the same level as the def. And I'm going to invoke the function hello. Now when we execute it, lines four and five will run. It says hello, glad to meet you. Suppose I print something that says we are here, and then another invocation of hello. See if you can predict what we're going to get in the output window. So what we get is the first two lines come from our first invocation of hello, and then we get that we are here, and then we get two more lines that come from the second invocation of hello. Let's see that in code lens. So you can see that executing lines one through five just creates a variable called hello, whose value is a function object. It doesn't execute that function object. When I invoke the function on line seven, it passes control to the function. So we then actually execute the contents that are inside of it. So we execute line four, which prints out hello, we execute line five, and now we're done. So we're going to resume back here after the execution of hello. So we get to line eight, and we print, we're here, and then we have another invocation. So that passes control to the function. It does its stuff, and then it passes control back after line nine, and we're at the end of the function execution. So that's the basics of function definitions. The syntax, we start with the word def, and then we have a function name, and then parentheses, and then a colon. We've got an indented block of code for the contents of the function. Executing the def statement, that just creates the function. It doesn't execute it. 
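The sequence just stepped through can be sketched as follows, with greeting text approximated from the narration.

```python
def hello():
    """Print a two-line greeting."""
    print("Hello")
    print("Glad to meet you")

hello()                  # first invocation: control passes into the function
print("We are here")     # runs after the first invocation finishes
hello()                  # second invocation

# Output:
# Hello
# Glad to meet you
# We are here
# Hello
# Glad to meet you
```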
We need other code afterwards, like we have on line seven. On line seven, when we do a function invocation, it passes control to the code for the function. So those lines of code inside the function get executed, and then we resume right after the spot where the function invocation happened. See you next time, when we add formal parameters.
Welcome back. A function can be defined so that it does different things, depending on parameter values that are passed to it. For example, here's a definition of the function hello2. The only difference in this function versus our previous hello is that we have another variable name inside the parentheses. We've called it s. Any variable name inside those parentheses is called a formal parameter name. So sometimes we call it a formal parameter or a parameter name. Then, when we invoke the function down on line five, we'll pass some string into it. In this case, we're going to pass the string Iman. By the way, that's Iman YeckehZaare. He's the guy who developed the practice tool that I hope you're using a lot. When we do that invocation, the variable s gets the value Iman at the beginning of the execution of the function. It's like having a behind-the-scenes assignment statement. It's as if we have a line of code that says s equals Iman. And then in the rest of the execution, we'll be able to refer to the variable s. So let's actually do this using CodeLens. We first create the function and assign it to the name hello2. Then we get to line five. We're going to invoke the function. But behind the scenes, there was this assignment. s is now bound to Iman, so that when we print hello plus s, we get hello Iman in the output window. And then it says glad to meet you. And we're done with that invocation. The second time we invoke it, we're going to get s bound to Jackie, because Jackie is the value that's passed in.
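A sketch of hello2 and the two invocations, with the printed text approximated from the narration.

```python
def hello2(s):
    # At each call, s is bound to the value passed in, as if by s = "Iman".
    print("Hello " + s)
    print("Glad to meet you")

hello2("Iman")      # s is bound to "Iman" for this invocation
hello2("Jackie")    # s is bound to "Jackie" for this one
```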
So now s is bound to jacky and we'll get jacky's greeting appearing in the output window. Hello, jacky. Often we'll refer to the parameter values as inputs. Don't confuse that with the input function. Remember, the input function asks the user to type in a value. Here, we're talking about a value that is passed into a function as an input to it. So this jacky, it's an input to the hello2 function or sometimes we'll call it an input parameter or a parameter value. A function can take more than one input parameter. Here, the function hello3 has two formal parameters s and n. We're going to refer to both of those parameters. s we refer to on line two and we refer to on line three. Remember that star when applied to strings means to repeatedly concatenate the string together. So the greeting is going to be something like hello way or hello kitty and we're going to concatenate that together to itself a bunch of times. So it just repeats that string n times. Let's step through that code. We create the string that we create the function rather and then we call it. The first time we call it, we're passing in two parameter values, way and four. The very first parameter value automatically gets matched with the first formal parameter name. So we get s bound to way. The second value goes to the second parameter name. So n gets the value four. We call this positional parameter passing. The first parameter value goes with the first parameter name. Then when we execute on line two we'll get a value for greeting that's hello way and on line three we will print that out a bunch of times. You can see down here we have said hello way four times. The next time we execute this we have s is bound to an empty string and n is one. So our greeting is just going to be hello with a couple of spaces. We're going to print that out one time and it looks just like hello on the output because we can't see the spaces. 
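A sketch of hello3 with positional parameter passing. The exact way the greeting string is built is an assumption chosen to match the description (a trailing space, and two spaces after Hello when s is empty), and the name the transcript renders as "way" is written "Wei" here as a guess.

```python
def hello3(s, n):
    greeting = "Hello {} ".format(s)   # e.g. "Hello Wei "
    print(greeting * n)                # * repeats the string n times

hello3("Wei", 4)      # s = "Wei", n = 4 (matched by position)
hello3("", 1)         # greeting is "Hello" followed by two spaces
hello3("Kitty", 11)
```

Swapping the arguments, as in hello3(4, "Wei"), would bind s to 4 and n to "Wei", and the multiplication would then fail with a TypeError. Positional passing depends entirely on order.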
The last time we invoke this, we've got Kitty and eleven as our two parameter values, our two inputs, and so we get hello Kitty, hello Kitty, hello Kitty, eleven times. In summary, formal parameters, sometimes called parameter names, are inside the parentheses in the function definition. So s and n are our formal parameters. And then we have parameter values, sometimes called arguments or actual parameters. Those go inside the parentheses on the function invocation. They get matched up positionally: first parameter value with first parameter name, second parameter value with second parameter name. The values are bound to those parameter names at the beginning of the function execution with this behind-the-scenes assignment. It's as if we have s equals Wei and n equals four, and those assignment statements are executing behind the scenes at the beginning of the invocation of hello3. We'll see you next time for functions that produce outputs via return values.
Welcome back. One metaphor I find helpful is to think of a function as a machine. It may take some inputs, the parameters, and then when you run it (that's the kind of shaking that you see in this animation) it can produce an output. So let's look at how to make a function return a value. Here's a definition of the square function. Notice line three. We have a special word, return, and then afterwards any Python expression, in this case just a reference to the variable y. The value of that variable is some Python object, and that's the value that will be returned from the function. Let's take a look at line six. We're invoking the function square. We're passing in an input, an argument. That's going to be bound to the first formal parameter name, x, and then we'll execute this function. x is going to be whatever toSquare was, and we'll multiply it by itself on line two and assign it to the variable y.
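The square example being described looks roughly like the sketch below. The variable names toSquare and result follow the narration as closely as the transcript allows and should be read as approximations.

```python
def square(x):
    y = x * x
    return y          # the value of y becomes the value of the call expression

toSquare = 10
result = square(toSquare)     # x is bound to 10 behind the scenes
print("The result of {} squared is {}.".format(toSquare, result))
# Output: The result of 10 squared is 100.
```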
We're going to return a value, and whatever is returned becomes the value of the entire expression. So that is what is going to be bound to the variable result. Note that the word result here is not any kind of special word; that's just a variable name. Return is a special word. That means to return a value from the function. So let's see that in CodeLens. We first define the square function. Then toSquare is bound to the value 10. That happened on line five. Then on line six we're going to invoke the square function. So behind the scenes we're going to get an assignment: the variable x is going to get whatever value toSquare had. toSquare was 10, and so x is going to have the value 10. We then execute the code that's inside the square function. We get y is 10 times 10, or 100. And then we return y. So the value of this entire expression, square of toSquare, becomes 100. And that is the value that gets assigned to the variable squareResult. Now, if you were paying close attention there, you will have seen that when we finished the execution of the square function, some of the variables disappeared over here. We had them, and then they disappeared when we finished. We're going to talk more about that in later screencasts, so hold on for some of those details. Now, if a function doesn't have a return statement, it will automatically return a special value called None. Let's see that. Here's an example illustrating a common confusion for students. Printing a value doesn't return that value. So on line three, we really should have a return statement in our definition of the square function. We've calculated 10 times 10, or whatever it is, and we should be returning it, but instead we're printing it. If we don't have a return statement at all in the function definition, it's as if we had a statement at the end saying return the special value None. So let's see what's going to happen in this execution. We define the function square.
We set toSquare to have the value 10, and now we invoke the function. In our behind-the-scenes assignment, x gets whatever value toSquare had. So x has the value 10, and we execute: y gets the value 100, and then we print y. We can see in the output window that we've already printed y, but we didn't return anything, and so it's as if we have returned the value None. And that means that squareResult is going to have None as its value. Now you see here squareResult has the value None, and that means that when we go to print this formatted statement, the result of something squared is something, we're going to get that the result of 10 squared is None, rather than the result of 10 squared being 100. What we really needed to do was not print but return y, and then we would have gotten that the result of 10 squared is 100.
One other important thing about the return statement: it interrupts the execution of the function. No other code in the function executes after a return statement is executed. If you've played the board game Monopoly, it's like the go-to-jail move: do not pass go, do not collect $200. In our case, do not execute any more code inside this function. Even if you're in the middle of a for loop or an if statement, a return statement makes you skip the rest of the code in that function. So here's an example. I've defined this function weird. It's not actually useful, but it's going to be very illustrative of how this immediate return works. Lines one through five define the function, and line six invokes it. When we invoke it and we get to line two, we'll print out the word here, and then we'll return the value five. As soon as we return, we are not going to do anything with the remaining lines of code. They are orphaned. We will not pass go, we will not execute print there. We will return the value five, x will have the value five, and we will print five.
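Both points can be sketched in a few lines. The first function is named square_printing here (a hypothetical name, used so the sketch does not clash with a correct square), and the final statement inside weird is an assumption, since the narration only says that the remaining lines never run.

```python
def square_printing(x):
    y = x * x
    print(y)              # shows 100 to a person, but returns nothing

squareResult = square_printing(10)
print(squareResult)       # None: a function without return gives back None

def weird():
    print("here")
    return 5
    print("there")        # never executes: return has already ended the call
    return 10             # assumed line; also never executes

x = weird()
print(x)                  # 5
```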
So we do get the here printing we don't get there there never shows up in our output and the five is coming from line eight. That's not a very useful function which is why I called it weird but let's look at a common programming pattern where return does occur in the middle of a function and it's a good idea. This function called longer than five returns a Boolean value true or false. It takes as an input this one formal parameter called list of names it's expecting that to be a list of strings if any of those strings has more than five characters the function returns true and otherwise it returns false. So with list one all the names Sam, Tara, Sal, Amida they're all five characters or less return false because none of them have a name longer than five characters with list two both Lauren and Natalie are names that have more than five characters and so longer than five will return true. How does it work? Well we loop through all of the names that's what's happening on lines two through five and if we ever get a long one if the current name is long then we will execute line four and we will return the value true. As soon as we do that we're done with the for loop we're done with everything do not pass go don't collect $200 get out of this function and just return the value true. If however we managed to get through the entire iteration without ever exiting the function without ever executing line four then we will execute line six which says to return false. So the reason this logic works is that we're never going to get to line six if any of the names are long. We'll only get to line six if we've processed all of the names and none of them caused us to return true. So that means that none of the names were long and it's safe to return false. So let's see this in code lens we'll go step by step. So we define this function we invoke it the first time on the list one Sam, Tara, Sal and Amita. 
So behind the scenes we get list of names getting bound to to that list then we're going to start iterating through that list of names and the first iteration our variable name is bound to Sam. Is it longer than five characters? No it's not. So we go on and get a new value for name. Name is now bound to Tara. Tara is not long. Sal is not long. Amita is also not long and now we've iterated through all of the list of names and we get to line six where it says I guess none of them are long so let's return false. The second time we invoke it we get the other list of names and now when we iterate through Ray is not longer than five, Io is not longer than five but Lauren is longer than five and so finally line four executes we return true we never even get to look at Natalie but Lauren was long and that's enough to know that one of them was too long and we get an answer of true. So in summary to return a value from a function we have the word return and then some expression that evaluates to a value in this case the value false. 
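The early-return pattern walked through above can be sketched as follows. The name spellings follow the transcript and may differ from what is on screen.

```python
def longer_than_five(list_of_names):
    for name in list_of_names:
        if len(name) > 5:
            return True       # one long name is enough; stop immediately
    return False              # reached only if no name was long

list1 = ["Sam", "Tara", "Sal", "Amita"]
list2 = ["Ray", "Io", "Lauren", "Natalie"]

print(longer_than_five(list1))   # False
print(longer_than_five(list2))   # True
```

Placing return False after the loop, rather than in an else branch inside it, is what makes the logic work: an else inside the loop would return False as soon as the first short name was seen.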
If we're in the middle of the function and we have a return, we ignore the rest of the code in the function. We exit out of the for loop, we exit out of the if, we don't pass go, we don't collect $200; we're just done with this function. If we have a function that never executes any return statement at all, we will return the special value None. And the other thing to remember is that print is for people. You can use it inside of a function and it'll generate an output in the output window, but it doesn't cause anything to return from the function. Return is for functions. It doesn't print anything, but it returns a value to the spot where the function was invoked. See you next time.
Welcome back for a little Way of the Programmer segment on how to decode a function. Here's a habit that I hope you'll cultivate: whenever you see a function definition, or whenever you write one, decode that function definition. Ask yourself three questions. First, how many parameters? Second, what types of values will be bound to those parameter names? And third, what is the type of the return value? So let's work through a few examples. Here's a function, cyu3. How many parameters does it have? That's our first question whenever we try to decode a function, and it's the easiest one to answer, because you can just look inside the parentheses and see that there are three variable names separated by commas. So, three inputs, and that's exactly the question that's being asked. Sure, three inputs: x, y, and z. The next question is, what will their types be? So what are the types of x, y, and z? The first question it's asking you is for x and y, because it turns out they have to have the same type. How can you tell that? There's nothing in the function declaration that tells you that, so you have to look in the code and see how those variables are used. In particular, we have x minus y in this expression. What are the types of objects that you can do a minus on? The answer is numbers. They could be integers, they could be floats;
strings, not so much, and not lists either. So let's try that: integers and floats. Sure enough, we are correct. How about the type of z? Well, we have to look where z is used, and z is used down here in the else. We're running an append method on it. Which kinds of objects can you do append on? The answer is only lists. Remember that lists are mutable; you can do append. Strings, you're not allowed to. So C is correct. If I had answered D, I'd get a little feedback telling me append can't be performed on strings. And then the final question for decoding a function is, what kind of return value does it give you? There are two spots in this code where we're returning a value, either y minus two or x plus three. We previously inferred that both y and x had to be numbers, either integers or floats, and this doesn't tell us any more about whether they're going to be integers or floats. Minus and plus are both operations that work on both of them. So the return value could either be an integer or a float. Your debugging sessions will be a lot shorter if you can always answer those questions about any function that you're working with. So build the habit: whenever you see a function, decode it by figuring out how many input parameters there are, what their types are, and what kind of value will be returned. See you next time.
Welcome back. Let's take a snippet of code that we've seen before and turn it into a function, so we can invoke it at any time instead of copying and editing the code. We've seen something like this before. We're trying to eventually get to defining a function total, but here's some code that computes the total of a list, the sum of all the values, and it uses the accumulator pattern. Here we've got a particular list: one, five, and seven. We start our accumulator with zero. We iterate through the list, each time updating the total. The accumulator gets its old value plus the current number. So the accumulator tot starts at zero, and it ends up being zero plus one for one, one plus five makes six, six
plus seven makes 13, and we print out 13. Now the question is, how can we make it work for any list of integers rather than just for one, five, seven? You can see here that the problem that we're being asked to do is define a function called total, and so we're getting an error, because total wasn't even defined. We never created the function called total. So let's do that. Let's define a function called total, and we'll put all of this code inside that function. So we're defining a function total. It's going to have some parameters; we're going to come back to that in a second. And then all of this code we'll put inside here, and we're going to have to make some adjustments. The first thing is that we want total to take an input. We don't want it to just work on this particular list of one, five, and seven. We want it to work on any list. So let's make that be a formal parameter. When the total function gets invoked, that's when one, five, seven or some other list will be specified. So we're going to have that behind-the-scenes assignment to whatever list we want to run this on, rather than having the assignment to a particular value. And then, of course, we need to invoke total. We're going to invoke it and assign the value to a variable called y. Now let's see what happens in CodeLens with this. We first get the variable total bound to the function. We've defined the function, and now we're invoking it on the list one, five, seven. So there's a behind-the-scenes assignment: the variable lst now gets bound to the list one, five, seven, and we're ready to execute the lines of code. We're going to set tot equal to zero, and we're going to iterate. num is first bound to one, then it'll get bound to five, and then to seven. This time it's bound to one, and tot gets updated to be zero plus one. Now num is five, tot gets updated to be one plus five, and so on. Now we're printing out that total and we're returning it. Oops, no, we're not. So this is that gotcha that often gets people when they're new to writing
functions: we don't want to print out the total inside the function, we want to return the total. What'll happen here is that y is going to be bound to None, when we really wanted y to be bound to 13. So the way to fix that is, instead of printing the total, we're going to return the total. And before I do that, I just want to show you one other thing that's kind of instructive. I'll hide CodeLens. If I run this, I'm going to get errors because our function is not right, but I'm also going to get a lot of outputs. The reason is, and this can be kind of confusing, we have tests when we give you these exercises. For example, for writing a function named total, we have some tests where we, behind the scenes, are invoking the function total and checking to make sure that it's giving the right output. So, for example, we're invoking it on the list one, two, three, four, five, and we're expecting the total to be 15, one plus two plus three plus four plus five. But the actual value that's getting returned is None, and so we're getting an error. But each time we invoke the function, our line five is running, and so we're actually printing out the 15. The 13 is coming from our invocation, but the 15 is coming from that behind-the-scenes invocation for the test. Then we have zero when we invoke it on a list of zeros, and we have another zero when we invoke it on an empty list, and we have the list of just two. It's printing out every time, but it's returning the value None, and so our tests are failing. Let's fix that. Let's return tot. And return does not use parentheses; it's a statement, not a function call. Now when we run it, we will pass the tests, and we will not print out anything here in the output window, because we had no print statements. If we wanted to print out y, we could get the value 13 to print. The 13 prints, but we don't get any printouts from the rest of the tests. All of those cause the code inside total to run, but it doesn't have any print statements, so we don't get any confusing outputs there.
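The finished function, with the print replaced by a return, can be sketched like this. The names total, lst, tot, num, and y are the ones used in the narration.

```python
def total(lst):
    tot = 0                  # initialize the accumulator
    for num in lst:
        tot = tot + num      # old value plus the current number
    return tot               # return, don't print

y = total([1, 5, 7])
print(y)                     # 13
```

Because the function itself prints nothing, calling it from tests, for example total([1, 2, 3, 4, 5]), produces no stray output. The only printed line comes from the explicit print(y).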
Let's look back at what we did. We started with our accumulation snippet with just one hard-coded list, a snippet that worked to find the total of a particular list. Then we converted it to a function definition. The list became a parameter of the function rather than being hard-coded as a particular one. The specific list that we were originally working with became an input, or an actual parameter; it gets bound to the formal parameter. And then we needed to change the print to return. We needed to return the total and print it out here, rather than doing a print inside the function. This is a common process for abstracting from a bit of code that works on a particular value: you make that value be a formal parameter name of the function, you pass specific values in when you invoke the function, and the challenge is to make the bit of code inside the function be more general so that it works on any possible input, in this case any list of numbers. See you next time. Hooray! You now have the tools to write reusable functions rather than just one-off bits of code. You should now be able to define functions with appropriate names for formal parameters, identify formal parameters and parameter values in a code sample, and predict the return value of a function given sample parameter values. I'm going to assume that you've managed to work your way through the exercises we've given you for this lesson, and so, by the power vested in me by the University of Michigan and by Coursera, I hereby dub you programmer, with all the privileges and responsibilities that entails. Let's finish with a joke about functions. Why did the functions stop calling each other? Because they had too many arguments. See you next time. Welcome back. In this lesson, we're going to highlight a few subtleties with functions, including that each execution gets a fresh set of local variables that disappear at the end of the function execution, that functions can call other
functions, and that functions can have side effects on mutable objects. At the end of this lesson, you should be able to, one, avoid the use of global variables in function definitions by creating formal parameters for all values that are needed, and two, identify whether a function has any side effects, including mutations to lists and dictionaries. We'll see you at the end. Welcome back. The scope of a variable is the set of statements where a variable name can be accessed. In the function square, on line two, we assign y to have a value. Since that assignment is inside the function definition, the scope is only the rest of that function definition. On line six, we really can't refer to y; in fact, we get an error on line six. The error message is telling us the name y is not defined, and that's true even though line two will already have executed. Line five causes the square function to run, so line two will have executed, but even so, we say that the variable y was local to the function square and its scope is local. We can't access it out here on line six. Now suppose we had a variable y at the top level, outside of any function definition. Say we did y equals five. Now when we get to line six, y will be defined, but the value will be five and not a hundred. What we get is the value five that we assigned on line four, and not the value one hundred that we would have assigned on line two. It's going to help to introduce a little vocabulary here: the idea of a namespace. A namespace is an environment where all names are unique. In the city of Ann Arbor, Michigan, we can only have one Main Street and a single Stadium Boulevard. But names can be reused in other namespaces. Other cities have a street named Main Street, but it doesn't refer to our street; it refers to theirs. In Python, there is one global or top-level namespace, and then each invocation of a function creates a new namespace. In CodeLens, we show these namespaces as stack frames. There's a global frame where we've
defined square and we've defined y, and then we invoke the square function. That creates a new stack frame, and CodeLens is nice enough to label it for us: it says this is the stack frame for the invocation of the square function. The behind-the-scenes assignment, where the formal parameter name x got its value of 10 because we passed in the value 10 for it, goes into the local stack frame. That's in the local namespace for the invocation of square. Then we set y to be 100. That's also in the local stack frame. So it's really interesting to notice here: we've got y is five, we've got y is 100. This works just fine because they're different namespaces. Just like we have Main Street in Ann Arbor and Main Street in another city, we can have y in the global frame and y in the local frame. So here are the gory details of how variable lookup and variable assignment work with namespaces. I'm just going to adjust this example slightly. The first rule is: if you refer to a variable in the code inside a function, and that variable name is ever assigned a value inside the function definition, even further on in that function definition, then it treats the reference as local. If it finds a value on the local stack frame, it uses that value, and otherwise it's an error. So here it's going to be an error. On line two, we are referring to the variable y. y is assigned a value somewhere in the square function, which means that y should be a local variable throughout. But when we are on line two, y doesn't have a value. It does not go and get the y equals five from the global stack frame. It just says y is referenced before assignment on line two. So that's an error. If, however, the variable name is never assigned a value inside the function definition, then it will look in the global stack frame. Suppose I do w equals q plus one. Well, let's show it in CodeLens so we can really see what's happening. We define all these variables in the global frame, and now we start running square. So we get to line two. It tries to look up q. q
is never going to be a local variable in the square function, so when we look up q, we do go look it up in the global frame, and we get seven, add one to it, and w gets the value eight. So if we look up a reference to a variable and that variable is ever local inside the function, it's got to look it up only in the local frame. If it's never going to be a local variable in that function, then it looks it up in the global frame. Now, this can be pretty confusing, and your programs will be a lot easier to understand if all of your variable references are just references in the local frame. So my recommendation is: avoid referring to global variables inside a function definition. Don't do this. It's legal Python, but it's not a good idea. Make another formal parameter for the function, and have whatever value you need get passed in as a parameter to the function. So that's how variable lookups, or references, happen inside of functions. What about assignment? Inside a function, assignment is always done in the local namespace, the stack frame for that function call. So if we go back to our original code, notice that line two creates a variable y on the local stack frame. There is already a y in the global frame, but when we get to line two, we create one on the local stack frame. So the rule for assignments is that we're always assigning to a local variable. There actually is a way to force lookup and assignment inside of a function to use the global stack frame. I'm going to show you how to do this for your understanding, but don't do it in your programs. It'll lead to confusing programs that are hard to debug. Other programmers will shun you, you'll have no friends, and you'll live a miserable life. Okay, maybe it won't be that bad, but just don't do this. I'm going to show you for your understanding. I can declare y to be global. If I now say w equals y plus one (again, don't do this, it's just so you can understand it), y is five on the global frame. We say, please treat y as the global y. Now
line three works because we can get to the y equals five, and w gets to be the value six. Now we're going to change y in the global frame, and we're going to return our 100. So this is doable, but it leads to very confusing code. That's local and global variables for you. Inside a function definition, variable assignments create local variables. They can't be referenced outside the code for that function, and that includes the formal parameters of the function; those will also be local variables. Variable references use the local version of the variable. You can reference a global variable inside a function, either by explicitly declaring it global or by referencing a variable name that is never assigned a value inside the function, but your life will be better if you never do that and make all of your references to variables inside of functions be local references. We'll see you next time. Welcome back. You've already seen that a function is a useful way to abstract from a bit of code to something that has a name and has parameters. The parameters help to make the code more reusable: you pass in a different parameter value and get a different result. And the name makes it so that you can refer to the whole action of the function with a single name. It makes your code easier to read and understand. In this video, we're going to carry the abstraction process one step further. We're going to decompose what a function does into a set of actions, and some of those actions may be implemented in other functions. We'll just invoke those other named functions. So, in other words, inside of a function's code, we're going to invoke another function. For example, here we have a function sum of squares that takes three inputs, x, y, and z. It takes the square of each one on lines six, seven, and eight, and it adds them all up and returns the sum. And in order to get the square of each of those inputs, we invoke the square function, which is defined up here. So we're inside the function sum of squares, and we're invoking the
function square. Let's see that execute in slow motion. So at first we define the two functions, and then we set values for a, b, and c. Those all get set in the global frame. And now we're ready to invoke sum of squares. So we get a new stack frame, and we get the values for the parameter names. x gets whatever a is bound to, so that minus five came from there. y gets the value that b had, which was two. z gets the value that c had, which is 10. Now when we execute sum of squares, we get to line six, and on line six we are invoking the square function. We're invoking that inside of the definition of sum of squares. So when we invoke it, we get another stack frame. This is going to give us a local namespace for variables for the execution of square. And it's also interesting to note that we have a variable x whose value is minus five. In sum of squares there's also an x whose value is minus five, but these are different variables. So we could change the value of x in square, and that would not have an effect on sum of squares. In fact, we'll see that, because we do a similar thing with the variable y. On line two, we create a value for y in the square stack frame, so now y has the value 25. There's also a value for y in the sum of squares frame, but it has a different value, two, and we have not changed anything here. Once we continue the execution, the square stack frame is going to disappear. We're going to return the value 25, but the y in sum of squares is completely unaffected. That explains the mechanics. Now let's try doing functional decomposition to write a little bit more useful function. Remember from last week, when Steve taught you about dictionaries, you had some code to accumulate a dictionary of counts. I've turned that into a function, count freqs, short for count frequencies. It takes one formal parameter, st, short for string, to remind me that the input parameter is going to be a string. And it's going to count frequencies for whatever string is
passed in. Now, the code should look a little bit familiar from last week. We start by making an empty dictionary. We iterate through all of the characters in the string. If the character is not yet a key in the dictionary, meaning this is our first time seeing it, we add it as a key in the dictionary and we give it an initial value of zero. Then, regardless of whether we've seen this character before or not, we're going to increment the counter that's associated with it. So if this was the first time we saw it, it starts with a zero and now it's got one for its count. But if we've already seen this character, say, three times before, the count will get updated to be four. So the new things here, over what you saw last week, are that we've turned the bit of code into a function. Before, you just saw how to do this with a particular string; now we're making it work for any string. We've made the string be a formal parameter of the function. And the other thing that's different is, instead of printing out the result at the end or assigning it to a variable, we just return it, and then whatever code is invoking count freqs will decide what to do with the return value. Similarly, you had code last week, given a dictionary, to find the best key in the dictionary: to find the key that had the maximum value associated with it. So again, I've turned that into a function. I called it best key, and it will take any dictionary as an input. That dictionary should have keys and values where the values are numbers, and it's going to return the key that has the highest value. Given both of these functions, count freqs and best key, I can now compose a bigger function, most common letter. It's going to take a string as an input, and just to distinguish it, I've chosen to give it a different parameter name. I could have called it st, just like I used for count freqs, but I decided to call it s here. And we're doing this in two steps. In step one, we count the frequencies in s, which creates a dictionary, and I've
assigned that to a variable called frequencies. I pass that dictionary into the best key function, and what I get back is one key, the key that has the highest value. I return that, and that should be the most common letter in our string s. So let's run this and make sure that it actually works. Down at the bottom, I'm invoking most common letter on line 21, and I'm passing in a string that has an a, a bunch of b's, and a bunch of c's. You can see that it's got more b's than any of the other letters, and what we should get as our output is just the letter b. And sure enough, that's what we get. If I change my string and give it a whole lot more c's, now when I run it I should get the letter c instead, and sure enough, I do. Now, I've described this as a composition process. Often, though, you'll actually solve problems in the opposite order, by decomposition. You might start by saying, hey, I want to find the most common letter, so let's decompose that into first finding the frequencies of all the letters and then picking the one with the highest frequency. When I start that decomposition process, those functions may not exist yet, but I just refer to them by name, and then afterwards I fill in definitions for them. So really, how I wrote this code: I started by defining most common letter, and I referred to the functions count freqs and best key even though those functions didn't exist. Because I'd given them names that made me know what they would do, I was able to write that function and have it be clear what it was going to do, and then I had to go and fill in the other two functions. As a little aside, you may be wondering why it works to have on line two a reference to the count freqs function, which has not been defined yet, when we talk about executing Python code from top to bottom. We shouldn't be able to refer on line two to a function that isn't defined until line five. The reason we don't get an undefined variable error is that even though on line two we're referring to count freqs, we don't actually
execute line two until after we invoke, on line 21, the most common letter function. So by the time we actually execute line two, count freqs has been defined. If we were not inside a function definition, we really would have a problem. For example, if I say print of x and then x equals four, I will get an error because on line one x is not defined. The difference is that inside a function, we're not actually going to refer on line two to count freqs until line two executes, and by that time the count freqs function has been defined. So to summarize, functions can call other functions. It's called composition. You get multiple stack frames when that executes, and each one has its own namespace and its own local variables. As a problem-solving strategy, it's helpful to decompose: define a function by referring to other functions that don't yet exist, and then write those functions. See you next time. Welcome back. There's one more tricky thing that we want you to be able to reason about with functions. It's called side effects. If a function makes a change to a mutable object, like a list or a dictionary, that's called a side effect. The vocabulary here is that the main effect of a function is the value it returns, and any other lasting impact that it has is a side effect. One other thing that I'll refer to as a side effect is printing something out in the output window. Let's see a side effect of mutating a list. First, this code gives us a little reminder that variables are local. So we create these two functions, we make a variable y that's in the global frame, and then we invoke the double function. The double function takes as input some value, which it assigns to a formal parameter y. So notice that we have a different y. They happen to have the same value, but they are completely different variables here. And then we're going to assign, on line two of the code, a new value for y. Watch what happens in the double stack frame; it's not going to affect what happens in the
global frame. So when we execute line two, in the double frame y now has the value 10. That did not have any effect in the global frame. You might also have noticed that the double function doesn't have a return statement, and therefore it returns the value None. It doesn't actually matter in this case because we don't do anything with the value of double; we're not assigning it to anything. When we do finally get to line 10 and we're going to print out the value y, we get the value y from the global frame. So it's the value five that's going to print out, not the 10 that we had in the local frame for the double function. And sure enough, we get the value five. Now, that was just a reminder that we have local variables, and changing a local variable doesn't affect the global variable. But our next lesson is that even though we don't affect a global variable, we might affect a value that is shared by a local variable and a global variable. So variables are local, but objects are not. Here we're going to see that we mutate an object inside a function and it stays mutated. On line 12, we've created a list called my list. Its value is this list of four strings: our students are awesome. That's you. And then on line 13 we call change it. That creates a new stack frame for the change it function, and its formal parameter lst gets a value. The value it gets is whatever my list had as its value. Well, my list was pointing to that list object, and so the lst variable in the change it stack frame also points to the same list. This is important because now, when we get to line number five, lst square bracket zero gets a different value. Instead of being our, it becomes Michigan, and instead of students, we get Wolverines. So notice that my list and lst are still two different variables, but because they're aliases for the same list, when we finish this execution and then we print my list, my list has
the mutated value. The variable my list is pointing to the object which has been mutated, and we get Michigan Wolverines are awesome instead of our students are awesome. So we call this a side effect. The change it function is having a side effect on the list object. Just as we talked about earlier, when we first introduced the idea of multiple aliases for the same list object, this can get confusing if you're not careful. Sometimes it's clear that a function is going to have side effects and you expect it, but sometimes you'll be surprised, and debugging can get difficult. What happened to my list, you'll ask. To avoid potential confusion, it's best to just avoid side effects in your functions whenever you can. If functions never, ever have side effects, that's a style called functional programming. There are programming languages built around that principle of functional programming, but Python is more flexible, and we will sometimes make use of side effects. But you should do it sparingly and consciously. See you next time. Phew! You've now seen some of the subtleties of passing parameters, accessing global variables (don't do it), functions calling other functions, and functions having side effects. You should now be able to avoid the use of global variables in function definitions by creating formal parameters for all values that are needed, and you should be able to identify whether a function has any side effects, including mutations to lists and dictionaries. As much as possible, I encourage you to avoid side effects. Come as close to strict functional programming as you can. Speaking of functional programming, why did the functional programmer return her TV? Because she kept muting the sound by accident. She returned the TV and asked for one that was immutable. Okay, that one was a stretch. We'll see you next time. Welcome back. This will be a short lesson. I just want to introduce a cool feature, packing and unpacking. It doesn't really let you do anything new, but it lets your code be a little more
readable. At the end of this lesson, you should be able to recognize when code is using implicit tuple packing, use implicit tuple packing to return multiple values from a function, and read and write code that unpacks a tuple into multiple variables. See you at the end. Welcome back. In many places where the Python interpreter is expecting a single value but the code provides multiple expressions separated by commas, it automatically packs all those values into a single tuple. This is just a convenience that makes the code look a little nicer. It looks like you're working with multiple values when really it's just making one tuple out of them. For example, in this code we have several pieces of information about an actor, Julia Roberts, and a movie she was in, Duplicity. We can explicitly make those different pieces of information into a tuple using parentheses, like we do on line number one. So we have an open parenthesis there and a closed parenthesis. This is our usual way of creating a tuple. In between, we have a bunch of values: there's Julia, Roberts, 1967. All of these values are separated by commas to indicate where one expression ends and another begins. So that's the syntax that you've already seen for creating tuples. It turns out that line three does exactly the same thing. It's exactly the same, except that we've left out the parentheses. No more parentheses here, and it implicitly reads that as, oh, we've got a bunch of values, we've got to pack them into a tuple. On line four you can see, as with any tuple, julia is the name of a variable. We look up its value; its value is a tuple. Square bracket four says go and get the fifth item. So in julia that's one, two, three, four, five, and what prints out here should be 2009. So line one and line three are just synonyms for each other. Line three is just a way that maybe looks a little bit better, as if you have multiple values. One place where this is especially useful is when a function wants to return multiple values. For example, in this code
circle info is a function, and it wants to return two values: the circumference of the circle and the area of the circle. You can only return one value from a Python function, but that value can be a tuple, as we've done here. So that works fine. I can print the circle information of a circle with radius 10, and it has a circumference of 62.83 and an area of 314. But I also have the option, if I think it looks better, to leave the parentheses out, and it will automatically return a tuple, and I'll get exactly the same output. There you go. So that's tuple packing. We'll see you next time for unpacking. Welcome back. Last time, you saw that on the right side of an assignment statement, multiple expressions separated by commas will be automatically packed into a tuple. So on line one, the variable julia gets bound to a tuple. Square bracket zero will be Julia, square bracket one will be Roberts, and so on. But instead of referring to the elements of that tuple with square bracket zero and square bracket one, we can unpack them all at once into a bunch of different variable names. So we've put a bunch of variable names here on the left side, and it's as if we've said name equals julia square bracket zero, surname equals julia square bracket one, and so on. So it's just going to take julia, which is a tuple, and take all of the values from that, and assign them positionally, just like we do when we're calling functions: the first value goes to the first variable name, the second value goes to the second variable name, and so on. Notice, however, that the number of variable names on the left-hand side of the assignment statement has to equal the number of values that are on the right-hand side. If I run a, b, c, d equals one, two, three, I will get an error: we need more than three values to unpack on line one, because we have four variable names. So if I have four values, that'll be better; it works. But if I have five values, I also get an error: too many values, not enough
variable names. There's even a way to pass a tuple to a function and have its values automatically unpacked into the parameter names. In this code, we've defined a function add that just returns the sum of its two inputs. If I say to add three and four, that will give me seven. Now, I can take five and four, and I can tell the add function to get its values from the tuple z. On line six, I'm saying to add the contents of the tuple z, and it unpacks the first element of z into the variable x. So x gets five, y gets four, and we see that nine prints out. Now, I had to use a special little notation, this asterisk, in order to tell the Python interpreter that I wanted z to be treated as a tuple whose components would be unpacked and assigned to the two parameter names x and y. If instead I tried to just say add of z, I'm going to get an error, because what it's going to try to do is treat z as a single value, assign that value to x, and then it looks for what value it should give to y, and there's nothing for y. So I get an error here: add takes exactly two arguments. It has two parameter names, x and y. That's what the error message is telling us, but only one was given. We only gave it a single value, the tuple five, four. That didn't work. There was no problem on line six, though, because the star z said, hey, even though it's a single value, a tuple five, four, you should unpack that value, and five should go to the first variable name and four should go to the second one. Unpacking is particularly useful for making iteration code more readable, for example when you iterate over a list of tuples. In this code, we iterate over the key-value pairs in a dictionary. So we ask for all of the items from d. That's going to give us a tuple with k1 and three, and then another tuple with k2 and seven, and so on. We get a sequence of all of these, and we're iterating through that sequence. So on line four, the value of p will be one of these tuples, and we can ask for p square bracket zero to get k1 and p square bracket
one to get three. Let's see what happens when that runs. We have this key k1, value three. That's coming from the first tuple. Remember, we have this format string where we're substituting in p square bracket zero there, so we get key, colon, and then we're substituting in k1, and then just the word value, colon, and we substitute in whatever p square bracket one is at this point in the string, so we get the three. On the next iteration, p is bound to k2 and seven, and so we get key, colon, k2, value, colon, seven. Reading that code is a little hard because, when you're using this p square bracket zero, you have to remember: let's see, p was a tuple, that tuple was representing a key-value pair in the dictionary, and therefore p square bracket zero must be the key. We can use mnemonic variable names to help us make that a little bit more readable. Let's do this in two steps. Suppose I say k equals p square bracket zero and v equals p square bracket one. Then I can use k and v here. If I pick a good variable name, like k for representing a key and v for representing a value, it's easier to read line six and remember that what I'm printing out is the key and the value. And once we have that idea, we can go even further, and instead of unpacking ourselves on lines four and five, we can have Python do it for us. So we can stick two variable names here, k comma v, and it's going to automatically take each item and unpack it into the two variables k and v. This is going to give me exactly the same result. So that's unpacking. You've got one tuple and multiple variable names. The tuple has to have the same number of values as the number of variable names. We can do that with explicit assignment, where there's an equals sign, or we can do it when there's behind-the-scenes assignment, like iterator variables in a for loop. As you've seen here, k and v are having a behind-the-scenes assignment to a tuple, and we're getting it unpacked into the two variable names k and v. There's also behind
the scenes assignment whenever you have parameter names in a function, and we can also do the automatic unpacking there, but that requires you to use the star notation to tell Python to do the unpacking. We'll see you next time. Well, that was quick. You now know how to recognize when code is using implicit tuple packing, use implicit tuple packing yourself to return multiple values from a function, and read and write code that unpacks a tuple into multiple variables. For a quick lesson, a quick joke. You know what I love doing more than anything? Trying to pack myself into a small suitcase. I can hardly contain myself. See you next time. Welcome back. So you've already seen the for loop as a way to iterate over every item in a sequence. A while loop is a much more general way of iterating. A while loop is kind of like a hybrid between a for statement and an if statement. So a while loop looks like this: you say while, and then, just like an if statement, you have a conditional that comes after the while, and then you have a piece of code that will run if this condition is true. But unlike an if statement, where you're done after running this piece of code, if you use a while instead, then by the time you get to the end of this code we loop back up here and check: now, is this condition still true? So in the sense that it loops back up to the top, that's kind of like a for statement. So again, when we get to a while statement, we check: is this condition true? If the condition is true, then we run this piece of code, and then we go back up and check: is this condition still true? If this condition is not true, then, just like an if statement, we skip this block of code and execute the next code block. So let's see this in action. Here I have some code that's going to take the sum of numbers: one plus two plus three plus whatever n we pass in. So here we pass in a number called aBound as an argument to our sumTo function, and we're using a while loop in order to do this. So on line 11 we print out the
value of sumTo of four, which should be one plus two plus three plus four, which is going to give us 10. And then here we print out sumTo of 1000, which is going to be a much larger number. So first I'm just going to run my code to make sure that it still works correctly. And good, I see I get 10 here, and I get this number from line 13. Okay, so now that we've done that, let's look a little bit at how it works. You see that we initialize two variables here. We start out theSum to be zero, and this is going to be kind of akin to our accumulator variable. What it's going to keep track of is the sum so far. And then we have this other variable, which we call aNumber, and what this is going to keep track of is where we are. That's going to start out as one, but then we're going to set it to two, to three, and so on, until we get to n, which in this case is aBound. So now here we have a while statement, and our condition is while aNumber is less than or equal to aBound. Again, aBound here is whatever we're adding up to. So as long as our aNumber variable is less than or equal to aBound, what we do is we first say theSum equals its previous value plus aNumber. So when aNumber starts out as one, theSum is going to be zero plus one, which is going to be one. Then after that, what we do is we say aNumber equals aNumber plus one. Again, aNumber keeps track of where we are, so aNumber is going to start out as one, and then it's going to go to two, and then three, and then four, and so on, as long as aNumber is less than or equal to aBound. Again, aBound here is the number that we're adding up to. So it's going to keep going up by one until we get to aBound, and as we're incrementing aNumber, we're adding that number to the sum that we have so far. And then by the time this while loop is done running, theSum is going to have the correct answer. But let's inspect this code just to make it a little bit more clear. So I'm going to open up code
Lens, and here I have the same piece of code, except I'm only printing out sumTo when called with four. The first thing that we're going to do is evaluate the function definition, so we can see that sumTo is the function that we declared right here, and then we print out the value of sumTo when called with four. What that means is that aBound is going to have the value four, and then we start out theSum with value zero and aNumber with one. Remember, theSum keeps track of the sum so far and aNumber keeps track of where we are. And now we're at the important bit, the while loop. Again we say while aNumber is less than or equal to aBound. The first time that we run this, aNumber has the value one. One is less than or equal to aBound, which is four, and so yes, we do run this code. In this code we say theSum equals theSum plus aNumber, so theSum is going to go from zero to one, and then we say aNumber is aNumber plus one, so it goes from one to two. Now the next time we run this, we check: is aNumber still less than or equal to aBound? In this case aNumber is now two, and we ask, is two less than or equal to four? Yes, it is, and so we run the code in here and we say theSum equals theSum plus aNumber. Now aNumber is two, so theSum is going to go from one to three. Then aNumber gets its previous value plus one, so aNumber is going to go from two to three. We ask again, is aNumber less than or equal to aBound? Yes, three is less than or equal to four, so we run this code. So theSum goes from three to three plus aNumber, which is three, so theSum goes to six, and aNumber is going to go from three to four. Then we ask, is four less than or equal to four? Yes, it is, so theSum is going to go from six to six plus four, or ten, and aNumber is going to go from four to four plus one, or five. And now here's another key point. Now aNumber is five, and when we ask, is five less than or equal to four, it is not. So this condition is false, meaning that we're
done running our while loop, and we can see that theSum here has the value that we actually want. So now that we're done running our while loop, we skip to line 9, which returns theSum, and we get ten, which is the correct answer. So let's answer a few questions. First, true or false: you can rewrite any for loop as a while loop. Well, this is true. A while loop is a much more general form of iteration. It's capable of expressing what you can express in a for loop and more, as we'll see in a bit. This question asks which type of loop can be used to perform the following iteration: you choose a positive integer at random and then print out the numbers one up to and including the selected integer. In this case, we could actually use a for loop if we wanted to, because we could use the range function, and anytime you can use a for loop you could also use a while loop. So I'm going to say here that the answer is A. In this question, we're asked to write a while loop that's initialized to zero and stops at 15. If the counter is an even number, append the counter to a list called eve_nums. So I'm going to say count equals zero, and I want it to stop at 15, so I'll say while count is less than or equal to 15, and I'm just going to say count equals count plus one. What we're doing in this code is initializing count to be zero, and then inside of the while loop, as long as count is less than or equal to 15, we assign it its previous value plus one. So count is going to go from zero to one to two to three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, and then 15, and then it does get assigned to 16. But as soon as it's 16, we exit this while loop, because 16 is not less than or equal to 15, and so by the time we're done with this while loop, count is going to be 16. Now this question is asking us to do a little bit more. It's saying if the counter is an even number, append the counter to a list called eve_nums. I can check if the counter is an even number by saying if count modulo two is zero. Again, that's
just saying if the remainder when divided by two is zero. What we want to do then is append it to this list, which I'll call eve_nums. We're going to start it out as an empty list, and if count is even, I want to say eve_nums.append(count). So let's test our code to be sure that it works. OK, so we can see that now our code passes all the tests, and eve_nums has the value zero, two, four, six, eight, and so on to 14. This question says: below, we've provided a for loop that sums all of the elements of list1. Write code that accomplishes the same task but instead uses a while loop. Assign the accumulator variable to the name accum. The strategy that we're going to take is to have one variable that keeps track of the current index. I'm going to call that variable idx. It's going to be zero, and then one, and then two, three, four, five, six, and then at seven we want it to stop, because there isn't an item at index seven. So let's write the code to properly set idx first. I'm going to say idx equals zero, and I'll say while idx is less than the length of list1, then idx equals idx plus one. Now an important point here is that I said while idx is less than the length of list1, and I initialized it to zero. So why did I do both of these things? Well, first, I initialized it to zero because, again, lists and sequences are zero-indexed, meaning that we have to start out at zero to get the first item. I said less than the length of list1, rather than less than or equal to, because here list1 is one, two, three, four, five, six, seven items long. So the length of list1 is seven, but we only want to go until the last item, and because we're zero-indexed, the last item is actually item six. So by the time we get to idx equals seven, we want to break out of this loop. We only want idx to be zero, one, two, three, four, five, six. So I'm going to say less than the length of this list, not less than or equal to. OK, now we don't just
want to loop through all of the indices of this list; we also want to assign a new variable, accum, to be the sum of every item. So I'm going to start out with accum equals zero, to say that we haven't seen anything so far, and then, just like on line six here, where we say total equals total plus the value of that element, I want to say accum equals accum plus the value at index idx, and I get that by saying list1[idx]. So when idx is zero, this is going to be eight; when idx is one, this is going to be three; and so on. Again, as idx iterates through all of the indices, this is going to be the value at that index, and so we add the value at that index to the previously accumulated value. Now let's run our code and be sure that it's correct. We can see that our code works as we wanted it to. That's all for now. Until next time. The listener loop is one of the most common patterns that you'll encounter when using while loops. A listener loop is essentially a pattern that waits for some input or some value before deciding to terminate the loop. You can only write a listener loop with a while loop, because by definition a listener loop doesn't know how many times it's going to run: even after your program starts, you don't know how many times you're going to have to execute the code in the loop. For example, what this piece of code does is keep asking the user for a next number to add up. It asks the user for input, and it asks the user to enter zero if there are no more numbers to enter. So again, before we run this program, we have no way of knowing how many numbers the user is actually going to input, and so we can't use a for loop. We instead need to use a while loop, and we need to say while the user has not entered the number zero. So let's look at this code. In this code, again, we're adding up the numbers that the user inputs. We're going to add those numbers into this variable, theSum, which is going to start out as zero.
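As a rough sketch of the listener loop being described here: the variable names theSum and x follow the walkthrough, while the sum_until_zero wrapper, its read parameter, the prompt wording, and the scripted answers are assumptions added so the loop can be run without someone typing at a prompt.

```python
def sum_until_zero(read=input):
    """Keep reading integers until a 0 is entered, then return the total."""
    theSum = 0
    x = -1  # any nonzero starting value, so the loop body runs at least once
    while x != 0:
        # read() stands in for input(); the prompt text is approximate
        x = int(read("next number to add up (enter 0 if no more numbers): "))
        theSum = theSum + x
    return theSum


# Scripted answers used in place of live typing (hypothetical test data).
answers = iter(["10", "20", "0"])


def scripted(prompt):
    """Return the next scripted answer, ignoring the prompt."""
    return next(answers)


print(sum_until_zero(scripted))  # 30
```

Calling sum_until_zero() with no argument would fall back to the built-in input function and wait for real user input.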
We're also going to use another variable, x, to store what the user put in. Here we're just going to arbitrarily initialize x to be negative one. The reason we do that is because in our while loop we say while x is not equal to zero, so if we started x out as zero, we would skip our while loop. In our while loop, we assign x to be whatever the user inputted, but we cast it to be an integer, so x is going to be an integer representing whatever the user just entered. Then we add that value to the previous value of theSum and reassign theSum; in other words, we add x to theSum. When the user has entered enough numbers and they enter zero, we're going to exit this while loop and print out theSum. So let's run our code. Here we're asked, what's the next number to add up? I'm going to say 10, and then for the next number to add up I'll say 20. I've entered 10 and 20, so that's enough for me, and I'm just going to say zero to say that there are no more numbers. When I do that, I can see that I get the sum of 10 and 20, which is 30. I can do this with any number of values: I can enter five, four, three, two, one, and then zero, and I'm going to get 15. Again, Python isn't going to know how many numbers we actually add up before entering zero, which is why we need to use a while loop here. So this pattern is called the listener loop pattern. We're kind of listening for a particular input; in this case, we're listening for x equal to zero, and while the input is anything other than the termination value, we keep running our code in the while loop. Let's look at a slightly more complex example. In this code, we're just going to ask the user if they like lima beans, and we ask them to enter Y for yes or N for no. But here's the problem: when the user inputs a value for whether they like lima beans or not, they could input anything. They could put in a number, or they could put in any other letter.
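The validation loop that this lima beans example relies on might look roughly like the sketch below. The names get_yes_or_no, valid_input, and answer follow the walkthrough in this video; the read parameter, the scripted replies, and the exact message wording are assumptions added so the sketch runs without a live prompt.

```python
def get_yes_or_no(message, read=input):
    """Keep asking until the reply is Y or N (case-insensitive); return it uppercased."""
    valid_input = False
    while not valid_input:
        answer = read(message)
        answer = answer.upper()  # so "y" and "n" are accepted too
        if answer == "Y" or answer == "N":
            valid_input = True
        else:
            print("Please enter Y for yes or N for no.")
    return answer


# Scripted replies used in place of live typing (hypothetical test data):
# two invalid answers, then a valid one.
replies = iter(["maybe", "7", "y"])


def scripted(prompt):
    """Return the next scripted reply, ignoring the prompt."""
    return next(replies)


response = get_yes_or_no("Do you like lima beans? Y)es or N)o: ", scripted)
if response == "Y":
    print("Great! They are very healthy.")
else:
    print("Too bad.")
```

With these scripted replies, the reminder is printed twice before the loop accepts the final answer.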
What we really want is for them to enter Y or N. So what we're going to do is keep asking them for a valid input until they enter either Y or N, and then, once they've produced a valid input, if they say Y we print out great, they are very healthy; if not, we print out too bad. So let's run our code first to see what it does. Again we ask, do you like lima beans, Y or N? I'm going to put in something else. When I hit OK, I get this prompt again, and it says, do you like lima beans, yes or no? So I'm going to enter something else, and I'm going to keep getting this prompt until I enter either Y or N. I'm going to say Y for yes, and I get great, they are very healthy. A couple of notes here. First, all of these print statements appeared kind of suddenly because of a quirk in the interpreter that we use; these should have been printed out as I was entering my answers. The important thing to focus on in the output is that we printed out great, they are very healthy when I finally entered Y for yes. OK, so the important part of this code is this function, get_yes_or_no. What this function does is keep pestering the user until they enter either Y for yes or N for no. If they enter any other, invalid input, it's going to ask them again. In this code, we first create a variable that keeps track of whether their input is valid. We initialize that to be False, because the user hasn't actually put in anything, and then we say while the user's input is not valid, ask the user for an answer, and we assign that to the variable answer. We then convert that to uppercase, so lowercase y gets converted to capital Y and lowercase n gets converted to capital N. Then we say if answer is either of the valid possible inputs, either capital Y or capital N, we assign valid_input to be True, and what that means is that the next time we actually go through this while loop,
valid_input is True, and so not True is going to be False, and False is actually good in this case, because that means we're leaving the while loop and returning whatever the user answered. If the input is not valid, then we instead print out please enter Y for yes or N for no. That's what gets printed out here. Then we go back up to the top of this loop, and we say while not valid_input. Here valid_input is going to be False, meaning that not valid_input is going to be True, meaning that we're going to ask the user again to enter Y or N. So as long as the user enters anything other than Y or N, this while loop is going to keep asking them to enter yes or no until they actually enter Y or N, in which case we're going to return what they entered. That's all for now. Until next time. Welcome back. Break and continue are two special kinds of statements that can be used within while or for loops. A break statement breaks out of whatever loop contains it. So here, if we're within a loop and we encounter a break statement, Python is going to break out of the loop immediately. This break statement says that the program should jump from here to here, and it's going to skip the rest of whatever is in the body of the loop, and it's not even going to check the condition again. A continue statement is similar. A continue statement, like a break statement, does skip whatever is in the rest of the loop, so it's not going to run this code. But unlike a break statement, which takes us to the bottom of the loop, a continue statement instead says to continue at the top of the loop, and so it's going to check this condition once more. So let's see what break and continue statements do in code. Here we have a while loop. This condition is True, meaning it's always going to be, well, true. What that means is that without a break statement, this is almost by definition going to be an infinite loop. But here in the body of the loop we first print out this phrase will always
print, and then we have break, and then we say print does this phrase print, and then here we print out we're done with the while loop. I want you to think a little bit about what this code is actually going to print. What I expect to happen is that when we run this code, even though this says while True, this is only going to print out once, because after this prints out we break out of our while loop. We skip what's here, because that comes after the break, and then we print out we're done with the while loop. So I expect this and this to print out. Let's run our code to be sure that that's the case. You can see the only things that print out are this statement and this statement. Now what would happen if I replaced the break with a continue? When I run this code, what I should expect is that it's going to get stuck in an infinite loop. The reason is that on line 2 we print out this phrase will always print, and then this continue statement jumps to the top of the loop and checks the condition again. Here this condition, again by definition, is going to be True, and so we're going to print this out again and then continue, and so we have an infinite loop. I would expect this phrase will always print to be printed out, well, a huge number of times before our program actually gets stopped. And you can see when we look at our output that that's exactly what happened. So let's look at another example to see how a continue statement works. Here we have a slightly more complicated piece of code. We have a number x, which we set to zero, and then we say while x is less than 10, and as long as x is less than 10, we print out we are incrementing x. Now what we do is we say if x is even, in other words if x modulo two is zero, then we add three to the value of x, so x would jump from zero to three, and then we say continue, which takes us back to the top of this while loop. Then we say if x modulo three is zero, in other words if x is divisible by three, then we assign x to be
x's value plus five, and then, almost regardless, we add one to x. By the time we're done, we print out done and we print out the value of x. Now let's run this code in CodeLens to see what happens. Again we start x as zero, and we say while x is less than 10, print out we're incrementing x. Here x is divisible by two, so x is going to jump from zero to three. Now again, when we hit this continue statement, we're going to go to the top of this while loop. So we jump back up to the top and we ask, is three less than 10? Yes, it is. In this case three is not an even number, so we check whether it's divisible by three, which it is, and so we add five to x and then we add one more on to x, and so x ends up with the value nine. Nine is less than 10, so we print out we're incrementing x. Nine is not even, but it is divisible by three, so x gets the value 14 and then 15. Now when we check our condition, 15 is not less than 10, so we say we're done with our loop, and x has the final value of 15. So again, here we used the continue statement to ensure that we weren't going to run what was in the rest of this loop; we instead jumped to the top of the while loop. That's all for now. Until next time. Welcome back. As we talked about before, a while loop is generally more capable than a for loop, but there are a few drawbacks to using while loops. The first is that it can be a little bit more tedious to express the same concept, as you might have seen in some of the code samples from before. The second is that you can get stuck in what's called an infinite loop. An infinite loop is a loop that never terminates. In other words, your program, if it could, would keep running forever. Now remember that every while loop has a condition, and by the time we get to the end of the while loop, we go back to the top and check if this condition is still true. What that means is that if you ever run the code in this while loop, it had better have a chance of switching this
condition from true to false. If that's not the case, in other words if this condition is always true and we always reach the end of this while loop, then our code is going to be stuck in what's called an infinite loop, because if we always reach the end of this while loop, we're always going to go back to the top, and if this condition is always true, we're always going to run this code once more. That's called an infinite loop because your program would keep running infinitely if it could. In reality, this textbook has a mechanism to prevent your code from running too long, and pretty much every interpreter has some way of breaking out of an infinite loop, but they can be frustrating nonetheless, and they're important to be able to recognize. Now, this code is actually stuck in an infinite loop. If I run my code, I see first that my cursor is stuck on a pointer, so that tells me that something's wrong. If I even try to click or type anything, I'm not going to be able to, because this computer is working so hard trying to run this little while loop that has three lines. After a while, the Python interpreter is going to say that this code ran too long, but until then I'm not going to be able to do anything on the page. So as it's evaluating this, let's try to analyze why we're actually stuck in an infinite loop. Here we go, the code finally finished running, and you can see that it printed out Bugs many, many times. I'm going to scroll to the bottom here. You can see that it printed out Bugs a lot, and at the end I get a message that this program exceeded the runtime limit. In other words, this program tried to run for too long, and this probably indicates that your program is stuck in an infinite loop. So let's look at why this program is stuck in an infinite loop. Here we initialize a variable b to be 15, so that looks good so far, and we say while b is less than 60. OK, so this condition is of course going to be true while b is 15.
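A sketch of the loop being analyzed here, based on the statements the walkthrough names. The iterations counter and the MAX_ITERATIONS cap are not part of the original code; they are assumptions added only so that this version stops on its own instead of hitting a runtime limit.

```python
MAX_ITERATIONS = 1000  # hypothetical safety cap, not in the original code
iterations = 0

b = 15
while b < 60:
    b = 5            # resets b at the start of every pass
    iterations += 1  # stands in for the print of "Bugs" in the original
    b = b + 7        # b becomes 12, which is still less than 60
    if iterations >= MAX_ITERATIONS:
        break        # without this cap the loop would never end

print(b, iterations)  # 12 1000
```

The loop only ends because of the added cap: at the end of every pass b is 12, so the condition b < 60 never becomes false by itself.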
When b is 15, we're going to run this code, and then we say b equals five. Well, that looks a little bit out of place, but let's come back to it in a little bit. On line 5 we print out Bugs, which as you can see gets printed out a lot, because this while loop is run a lot, and then on line 6 we say b equals its previous value plus seven. Now let's come back to this line, b equals five, because what assigning b to five does here is kind of reset the value of b. Again, our check is whether b is less than 60, and so if we keep resetting b to be five, then even if we increment its value to be 12 later on by adding seven to it, b is still always going to be less than 60, because every time we run this loop we reset b's value to five. So maybe line 4 was accidental. If we commented it out, then our code would actually be able to finish. Again, the problem here was that we were resetting b to be five every time we ran this code. It's important to be able to recognize how infinite loops occur. They can occur in many, many ways, so this example is just one of many ways to accidentally write an infinite loop. As you get more experience in writing while loops, you'll become more adept at identifying and fixing infinite loops. That's all for now. Until next time. The fact that optional parameters' default values are only evaluated once can lead to some unintuitive outcomes. Here we define a function that has two parameters, a and L. The default value of L is an empty list. Now note again that because this value is only evaluated once, we have an empty list that lives somewhere in the frames and objects as the default value for L. So here, after we define our function f, we first print out f called with the value one. Now in the body of f we actually mutate the value of L. What that means is that when we call f on line 5, we pass in one for a, and L is going to be that empty list that we created right here. Now we run line 2, which says L.append(a).
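The function being traced here can be sketched as follows, laid out so that the line numbers match the ones mentioned in the walkthrough. The exact capitalization of the string in the last two calls is an assumption.

```python
def f(a, L=[]):   # the default list is created once, when def runs
    L.append(a)   # mutates whichever list L refers to
    return L

print(f(1))              # [1]
print(f(2))              # [1, 2]      same default list as the call above
print(f(3))              # [1, 2, 3]
print(f(4, ["Hello"]))   # ['Hello', 4]  a separate list object
print(f(5, ["Hello"]))   # ['Hello', 5]  another separate list object
```

The first three calls share one list because they all fall back on the same default object, while each ["Hello"] expression builds a new list.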
Then we append one to our empty list, and now our list is a list that has one item whose value is one. On line 3 we return that list L. But the thing to note here is that even though this list is going to go away in CodeLens, it actually keeps its value of a list that has one item whose value is one. So even though this disappears in CodeLens, by the time we get to line 6 this list still has the value one. So now on line 6, when we call f with a equals two, you can see that L is this list that has one item in it already. Now on line 2, when we append a to our list, we add two onto this list, so our list is now a list that has one and two in it. If we kept calling our function f to add new values to our list, we would get a longer and longer list as our default value for L. So here, by the time we get to line 7 and we call f with three, our list L already has two values in it, and on line 2 we add three on as a third value in our list. The important thing to note again here is that as our default value gets mutated, it affects future calls to this function f. Now an important distinction here is between lists that are different objects but have the same value, versus this list, which is the same object. For example, on lines 8 and 9 we pass in two different values for L. On line 8 we pass in a list that has the string Hello as its one item. On line 9 we pass in a list that looks identical, but because these are separate expressions, these are actually separate objects. What that means is that by the time we get to line 8 and we print out the value of f called with a equals four and L equals the list with the item Hello, then when we actually call L.append(a), we're mutating this list, and that has no effect on this list. So here we're mutating this list to add four onto it, which you can see here, and then when we get to line 9, we're mutating a
different list. So note again here that we have a list whose only value is Hello; it doesn't have the four that we added on to this list. That's all for now. Until next time. Keyword parameters are closely tied with optional parameters. Here we have our function f that takes in three arguments, x, y, and z, and y and z have default values of three and seven. Now here on line 5 we can see that we can call our function by only providing x, and then y and z get their default values. On line 6 here we call our function with x equals two and y equals five, with z getting its default value. But suppose we wanted to call our function and provide a value for x and z without actually passing in a value for y. In other words, we want to give y whatever its default value should be. If we wanted to do this, we would want to kind of skip a value for y. If we try to do something like skipping y by saying two, comma, blank, comma, eight, then we would get a syntax error, because Python doesn't understand this format. Instead, what we can do is use a keyword parameter. Rather than skipping an argument, we can just say that this eight is intended to be the value for z, and we do that by saying z equals eight when we call our function. Now what this does is x gets the value two and z gets the value eight, and we never specified a value for y, so y gets its default value, three. When we run our code, we can see in this second call here that this is the case: x is two, z is eight, and y gets its default value. So these are keyword arguments. We could also do this for x, so we could say x equals two and z equals eight, and this would have the same outcome. The nice thing about keyword arguments is that they allow us to put our arguments in any order. So here I'll say x equals 20 and z equals eight, but I'll specify z before I specify x. Now note that keyword arguments aren't going to give default values to arguments that don't have them. So for example, if I tried to call our function f
and just specified z equals eight, then we haven't passed in anything for x, and we get an error because we haven't actually specified what x is. If we wanted to do this, we would have to specify a value for x somewhere, or we would need to give x a default value in our function definition. It's also possible to accidentally specify multiple values for the same argument. The easy case to see would be if we specified z equals eight and then z equals four. If we did that, we would see that we have a keyword argument repeated on line 5. It's less easy to see if we accidentally do it by passing in ten for x as the first argument and then later on accidentally specifying that x equals eight. Here we get the same kind of error: we see that we have multiple values for the argument x on line 5. That's because here it looks like x should be ten, by virtue of the fact that we passed ten in as the first argument, and here it looks like x should be eight, by virtue of the fact that we specified x equals eight. Another thing to note about keyword arguments is that they always have to be expressed after positional arguments. I'm going to modify our code to say that x should be ten and z should be eight, and this works just fine. But if I change the order of this to say z equals eight and then pass in ten, we're going to get a syntax error, and that's because, again, keyword arguments always have to come after any positional arguments that don't specify a keyword. In other words, we have to put z equals eight after specifying that x is ten. If I change the order here again, we'll see that our code is fixed. Now let's look at some questions that involve keyword arguments. In this question, we're asked what value will be printed for z. We call our function f, we specify that x is two, we specify that y is five, and then we have z equals initial, and so what that means is z is going to be the value of
initial at the moment this function is declared. We can see that initial is set here, and so the value of z is going to be seven. In this question, we're asked what value will be printed for y. We have the same function definition: we specify initial equals seven, and then we provide default values for y, which is going to be three, and for z, which is going to be seven. Then when we call our function f, we pass in a value for x, which is two, and a value for z, which is ten, and so what that means is that y gets its default value, which is three. So we should get three printed out for y. In this question, we're asked what value will be printed for x. We have the same function definition, with x, y, and z, and when we call our function f we specify that x is two here, but then we also specify that x is five. This isn't going to fly: Python is going to give us a runtime error, because we tried to specify two different values for x, two and five. So the answer here is E. In this question, we're asked what value will be printed for z. We have the same function definition as before, with arguments x, y, and z. We specify that z's default value is going to be initial. When we define our function, the value of initial is seven, but it just so happens that we overwrite that value later on so that it's zero. But again, these default values are only evaluated when we declare the function, and when we declared this function f, initial had the value seven. So we can almost just replace this with the value seven, and it doesn't matter that we changed initial later on: z is going to have the value seven by default. So the answer here is B. That's all for now. Until next time. Lambda expressions are an alternative way of defining functions. Suppose that we have this function definition, and here I'm using args as a stand-in for a list of arguments. So I might have x, y, z; these arguments might even have default values, so I might have x equals one, etc. And we have some return value expression, so that might be
return x plus five, or any expression that we want to return. If we have that kind of function, then we can express it using a lambda expression. This is a lambda expression. Whatever we wrote for our list of arguments here, we can transplant to our lambda expression and reuse it down here. So we could have lambda, and then our arguments might be x and y, and then we say colon, and then whatever value we want to return we can again specify in our lambda, say x plus y. This expression has a value whose type is a function. We don't even need to give this function a name like we do when we use def. If we do want to give it a name, we can say func equals and then provide our lambda expression, but we can also make it an anonymous function, in other words a function that doesn't have a name. So let's look at some code examples. Here on line 1 we define a function f. It takes in one argument, x, and it returns x minus one. Again, this is the traditional way of defining a function. If we call our function by saying print f called with three, then we should get the value two. If we print out the function object itself by saying print(f), then we should get that this is a function named f, and if we print out the type of f by saying print(type(f)), then we should get that f is a function. Now if we wanted to write this function with an equivalent lambda expression, then we would say lambda x, which is our argument, and then we want to return x minus one, so we just write x minus one. Again, we don't need to add a return statement for lambda expressions; in a lambda, the return is implicit. If we print out the value of this expression, then we should see that it's a function, in this case a lambda function specifically. If I instead say lf equals this value, and I print out lf called with three, then we should get two, and if I print out the type of this lambda function, then we should see that, just like f, our lambda function lf is a function. So suppose that we wanted to convert this
traditional function definition called last_char into a lambda function. What we could do is say last_char equals a lambda that takes in one argument, s, so lambda s, colon, and then, again, remember that the return is implicit with lambda, so we can just say s[-1]. Now the value of this expression is a function which is equivalent to this function. We don't necessarily need to give it a name, but here we chose to give it the name last_char. So let's answer some multiple choice questions about lambda functions. This question asks: if the input to this lambda function is a number, then what's returned? Here we have a lambda expression. It takes in one argument, x, and it returns negative x, and so what that tells me is that if we pass in one we should get negative one, if we pass in negative 10 we should get positive 10, etc. That to me is B, a number of the opposite sign: positive numbers become negative, and negative numbers become positive. So I would answer B. That's all for now. Until next time. Welcome back. In today's lesson you're going to learn a powerful new command for sorting. Sorting is a big deal in the computer science curriculum. In a typical computer science curriculum, you would learn several different sorting algorithms and you'd analyze properties of those algorithms. In my first programming course, back as an undergraduate, I had to implement something called merge sort, where you take the items and keep chopping them in half until you get to very small lists, which are already sorted. It turns out that taking two already sorted lists and merging them together into an even bigger sorted list is something that you can do pretty easily, and it goes pretty fast, and so we build up from these small sorted lists until we have the whole thing sorted. I spent days and days in the computer lab. This was before we had personal computers, so I had to go to the computing lab, and every time I hit run I had to wait for the mainframe to run the
program and print out a stack of papers telling me how the program had run. After many hours over several days, I very proudly took a stack back home and showed all my roommates: yeah, you know, I finally did it. It won't be so hard for you. We are not going to look at details of sorting algorithms. We're just going to use a built-in Python function. But we do want you to have a little mental model of what happens inside of a sorting algorithm, because it's going to help you figure out how to invoke it well. We've given you these great videos from Sapientia University illustrating sorting algorithms using Hungarian folk dances. You'll watch a couple of alternative Hungarian dances showing different sorting algorithms. Don't worry about the details of the algorithms, but do notice something that they have in common: they always involve a bunch of pairwise comparisons, which are interactions between a pair of dancers. Two dancers will look at each other. They're each wearing a number, and they look at their numbers and they do some dancing, and the one with the higher number always ends up on the right at the end of that interaction. After they've done a whole bunch of these pairwise comparisons, the dancers are in order by their numbers. So again, the particular sorting algorithm is not our focus here. We're just going to call a function that does the sorting for us, and whatever sequence we give it comes back sorted in the order we want. At the end of this lesson, you will be able to invoke the sorted function to sort any sequence. You'll be able to specify either low to high order or high to low using the reverse parameter. You'll be able to specify a property to sort by using a key function. And in a later lesson, you'll learn some more advanced sorting things. So we'll see you at the end. Let's get started. Python provides two ways to sort a sequence: the .sort method and the function sorted. We'll start with the .sort method. It operates on a list and it doesn't
return anything, but it changes the order of items to be from lowest to highest. For example, on line four, we first specify the list L1, and we have the dot notation saying we're going to call a method on L1, and we're going to call the sort method, not passing any parameters to that sort. Once we've executed that on line four, when we get to line five and print, by magic L1 is going to have its items sorted from lowest to highest: minus two, then one, then three, and so on. So let's run that, and sure enough, we get them sorted, based on the print statement on line five. On line six, we're doing the same thing, except instead of sorting the first list, we're sorting L2, which has three strings in it: cherry, apple, and blueberry. What does it mean for one string to be smaller than the next? Well, we use dictionary order, or alphabetic order. Apple begins with a, and blueberry begins with b, so apple comes before blueberry, and we get the list apple, blueberry, cherry. Our second option is the sorted function. It sorts the items in exactly the same order, but there are a few things that are different about this way of telling Python to sort. First, of course, is that we're using the function syntax instead of the method syntax, so no period. The sequence that is going to be sorted, we pass in as a parameter to the sorted function, and we're going to get a value back. That's our second difference: we get a value back, and we can either assign that value to a variable, as we've done on line three, or we can use it in an expression. So, for example, on line five, we're passing it directly to the print function. A third difference is that when you invoke sorted on a list, it doesn't change the original list. It produces a new list that has the same items in a different order. So when we get to line six and we print the original list, that original list won't be changed at all. So let's run that, and you see that we get, from line three, apple, blueberry, cherry, and the same thing from line five. The same result of calling sorted is
getting printed again on line five. On line six, we're asking to print whatever is the current value of L2, and that is unchanged from the original list. So notice the difference here. On lines 10 through 12, we're doing it the original way. We see from line 11 that we invoke using the dot notation, and then we don't have to pass the list as a parameter, because it's being specified before the dot. When we use sort, if we then print on line 11, we see that the list L2 itself has been modified. One other minor thing to notice here is that when we call sorted, it returns a list. When we invoke the sort method, as we do in this lower half on line 12, we don't get a useful value back. We just get the value None. All of the action of the .sort method is a side effect: it changes L2, but it doesn't return anything useful. The value that we get back is just None. So that's the basics of sorting in Python. For the rest of these lessons, we'll be using the sorted function rather than the sort method. It's just safer that way. We've emphasized previously how confusing things can get when you use mutation operations, so we avoid them whenever we can. One other nice thing about the sorted function is that we can apply it to any sequence, not just to lists. We could, for example, try to sort a string rather than a list. So you might wonder what this is going to produce, and I encourage you to pause this and make a prediction. What will this do? Is this just going to give us apple back? Well, no, it's not going to give us apple. It's going to treat apple as a sequence of characters, a, p, p, l, and e, so it's going to give us a list of single letters, and they're going to be in alphabetic order. So we get a, e, l, p, and p. The comparable .sort operation will fail, because we can't destructively sort a string. Strings are immutable, and that's going to give us an error: this sort attribute is not available for strings. So .sort you can only do on lists, but the sorted function you can use even on immutable sequences
like strings. In general, though, we're going to use sorted and not sort. See you next time, when we sort things from highest to lowest. Welcome back. What if we don't want to sort from the smallest item to the largest, but instead we want the reverse order? Well, that's easy. You've already seen how you can reverse a list. There's actually a reverse function, or you could do a list accumulation. But it's actually even easier than that, because we can specify an optional parameter for the sorted function called reverse. That's an optional parameter. Its default value, if you don't provide a value for it, is False, but if you pass the value True in, you get the list back in the opposite order. You can see that on line number two: in addition to saying what list we want to sort, we're also saying that the reverse parameter should get the value True. And when we do that, we'll get the things in reverse order: cherry, blueberry, then apple. I hope I'm not making you too hungry with these examples. So this reverse equals True is just passing a parameter, the usual thing that we've seen before for functions. The actual value that we're passing in here, the word True, is just a Boolean value, if you'll recall from when we were doing Boolean values. Instead of passing the Boolean value True, I could pass the Boolean value False, and that would say don't give this back in reverse order. So we would get apple, blueberry, and cherry. Now, False is the default value for the reverse parameter, so if I leave it out entirely, I get the same thing that I would get if I said reverse equals False. So if I don't want it reversed, I don't have to say it. If I do want it reversed, I have to say reverse equals True, and I get it in the reversed order. So that's sorting a sequence in the opposite, or reverse, order. We'll see you next time, when we specify a custom order for sorting based on some property of the items that are getting sorted. Welcome back. We're going to learn something that's extremely
useful and powerful, but conceptually a bit tricky, so hang in there on this one. Once you get it, you're going to think, wow, this is really cool. At least I do. So let's say we have a list of numbers like L1, created on line one, and we want to sort them on some property, like their absolute value. From line nine, we're printing out the absolute value of three, which is just three, but the absolute value of minus 119 is 119. And on lines 12 and 13, we're just going through each of the items in the list and outputting their absolute values. So we get one, seven, four, then instead of minus two we get its absolute value, which is two, and finally three. Now suppose we want to sort L1 based on the absolute value. We can just tell the sorted function to use absolute value as the property that we want to sort by. The way we do that is we use this other optional parameter called key. Just like we had reverse in the previous video, here we've got key, and we can specify a value. We're going to specify absolute, and that tells the sorted function to sort by the absolute value. Here you can see that instead of having minus two first, minus two is coming after one, because its absolute value is bigger. We've printed out the sorted list, L2. We can also do it in reverse order by combining the use of reverse equals True with key equals absolute, and we've got that here. Now, that all seems pretty straightforward until you start to really think about what's going on. The thing that we're passing as a value for the key parameter is the value of the variable absolute. That value is a function. So we're passing a function, absolute, to another function, sorted. I hope your first reaction to that is that it blows your mind, like, huh, we're passing a function to a function? So yeah, that's really weird, but eventually you're going to say, wow, that's really powerful. To make sense of this, we need to have a little mental model, a way to think about what's going on inside the sorted
function. We're passing in this function, and what is sorted doing with it? Well, before it starts comparing any of the items to each other (remember, comparisons are like the pairs of dancers comparing their numbers), the sorted function uses the function that you pass in, absolute, to determine the numbers to assign to each of those dancers. That is, behind the scenes, when you call sorted, sorted is going to call the function that you provide, and it's going to call it once for each of the items in the sequence. It does that to determine some property of that item, like its absolute value, and it writes it down on a little post-it note that the item carries around. Then the sorted function does all the comparisons between the items, but the comparisons are always between the values that are on those post-it notes. So, for example, how is this working? We've got this sequence: one, seven, four, minus two, and three. Before we do any of those comparisons between pairs of items, we're going to run our function absolute on each of the items in turn, and we're going to annotate each item with a little post-it note. So we're going to have a post-it note for one saying that its absolute value is one, for seven that its absolute value is seven, and for four that its absolute value is four. Those are kind of uninteresting, but for minus two we get that the absolute value is two, and that's going to change things a bit. Then we go and do these pairwise comparisons: one against seven, and seven against four, and seven against minus two, and all of that. There's a whole bunch of those that happen behind the scenes in the sorted function, but whenever it's doing a comparison, it uses the post-it notes to determine the order. So we end up with one, and then minus two, and then three, four, and seven. The reason we end up with minus two coming after one is that we had the comparisons based on
their post-it notes: one had one, but minus two had two, so minus two, with its post-it note of two, ends up coming after one. So think of that as being the process that's going on: you invoke the sorted function, the sorted function calls your function, like absolute, and it calls it once on each of the items in the sequence. I'm going to prove to you that that really is what's going on. The way I'm going to prove that is I've modified the absolute function here just a little bit by having it print something out. I just added this thing on line four, so every time the absolute function gets invoked, we're going to print something out. On line 10, we're saying to print something, then we call the sorted function, and then we print something to say, hey, we're done with sorting. You'll see that line 10 generates that, and line 12's print statement saying that we're done is there. Everything in between is stuff that's happening because of calls to absolute that happened during the execution of line 11. So we pass in the value absolute to the sorted function. We never actually say invoke the absolute function. We never do that. But inside the execution of the sorted function, it is calling absolute, and it's calling it one, two, three, four, five times, once for each of the five items in the list. You can see it's calling it for the value one, for the value seven, four, minus two, and three. So that's what's going on behind the scenes when the sorted function is executing, just so it can get the right numbers onto the post-it notes, so that it can use those for doing all the comparisons that lead to sorting of the items. So the thing that you pass in for the key parameter has to be a function, so often we'll call it the key function. It has to be a function, like absolute is here. It has to be a function that takes one input, and that input is going to be one of the items at a time from the list. And it has to return some value that's going to go on the post-it note, some property of the
item, so typically a number. Now, I just want to mention one thing here about passing in a function. We can either pass a function by name, as we did originally, where we just say the absolute function, or we can pass in a lambda expression. Remember from a previous lesson that lambda expressions are expressions that produce anonymous functions. So instead of specifying absolute here, we could specify a lambda expression. So this is a lambda expression. This whole expression evaluates to a function object. That function takes one input, x, and returns as its value whatever this expression evaluates to. Now, if you write it like this, it might look a little silly to an experienced programmer, because really this lambda expression is producing a function that does exactly the same thing that absolute does. But you might still want to do that if it makes it clearer for you. This is a conceptually challenging thing, this idea of passing in a function for the key parameter, and I find a lot of students really just understand it better this way, where the lambda reminds them that they're producing a function. When they just see key equals absolute, it doesn't quite click that absolute is referring to a function object. So if this helps you, you're welcome to do it this way. There are other places where we'll have other lambda expressions which make even more sense, and we'll see that in a later lesson. So to summarize: if you want to sort a sequence based on some property of the items, call sorted and pass a value for the key parameter. The value for the key parameter has to be a function. That function takes one item as input and returns a value to write on the post-it note, a property of the item. The value for the key parameter can either be the name of a function, like absolute, or a lambda expression, like I've shown on line nine here. So play around with this a little. Do some exercises. Keep trying it until it makes sense, and then you'll be able to sort anything, anytime. See you
next time. Congratulations, rah rah! You've learned how to get Python to sort sequences for you. The sorted function handles all the details. You just have to specify the desired sort order. You should now be able to invoke the sorted function to sort any sequence. You should be able to specify low to high or high to low sorting using the reverse parameter. And you should be able to specify a property to sort by using a key function. It's joke time. I'm sure you've heard some rags-to-riches stories about the guy who started out working in the company mail room and eventually became the CEO. Well, this one has a little twist. It's Jerry's first day at the company, and he's assigned to the mail room. He's given the task: sort the incoming mail. And Jerry sorted the letters so fast that his motions were literally a blur. His supervisor was very impressed. At the end of the day, the supervisor approached Jerry and said, I just want you to know that I'm very pleased with the job you did today. You're one of the fastest workers we've ever had. Oh, thank you, sir, said Jerry, beaming, and tomorrow I'll try to do even better. Better? How is that even possible? Jerry replied, tomorrow I'm going to read the addresses. We'll see you next time, when we sort dictionaries and learn how to break ties. Hi, glad to have you back for a little more in-depth look at sorting. We're going to look at sorting dictionaries, which can be a little confusing even though it follows the same mechanics that you learned in the previous lesson. We're also going to look at how to break ties in the primary sort order with a secondary sort order. At the end of this lesson, you should be able to sort a dictionary's keys based on their values or some property of their values, and you should be able to break ties by having the key function return a tuple. Bye for now. Welcome back. One sorting task that comes up frequently, but is a little tricky, is sorting a dictionary. The way we sort a dictionary is to sort its keys. After that, we can iterate
through the keys and look up values as we need to. So remember this code. You've seen something like it before, maybe with a slightly different list. We've just got a list with a bunch of items, some of them repeated, and we're creating a dictionary that counts, for every distinct value in the list, how many times it comes up in the list. So A comes up once, twice, so at the end, in our dictionary d, we're going to have the value two associated with the key A. And then on lines nine and ten, we're just printing out the values from that dictionary. So E appears twice, F appears once, A appears two times, as I just manually counted, and so on. Notice that the keys E, F, A, B, C, and so on appear not in any special order, and that's just the way dictionaries work. When you ask for the keys, you get all the keys back, but there's no promise made about what order they'll appear in. Suppose I cared about the order and I really wanted to say, you know, A appears two times, B appears twice, and so on. The way to do that is going to be, on line nine, to sort the keys before doing the iteration. So let's suppose that instead of just asking for the keys, I pass those keys into the sorted function. Now I'm going to get the results in alphabetic order: A appears two times, B appears two times, and so on. Well, that's all well and good. How about if we wanted to sort based on the counts instead? So we really want D to get printed out first, because it appeared the most times, and after that we'll get the things that appeared only twice. Now we can specify a property of these keys that we want to use for sorting. There's a little confusion that's going to go on here, because we're using the word key in two different ways. We use the word key to refer to a key in a dictionary, like in this dictionary d, where we have the letter capital A as a key, and capital B as a key, and so on. But then we have a second use of the word key, which is the parameter name in the sorted function. So if we want to say sort
these keys based on some property of them, we say key equals, and we pass in a function here. These are just two different meanings of the word key, and you've got to keep them separated. So remember, the key function is going to take one list item as an input. In our case, we have a list of all the keys, A, B, C, D, E, F, and so on, and so that's what we're going to have as one input. I'm going to call the parameter for this function k, just to remind me that the thing that's getting passed into it is one key from the dictionary, like the letter A or the letter F. And then we're going to return a property of that key, and the property we want is the value associated with that key in the dictionary d. So if I have the key C, what I want to do is get the value one, and I want to use that for the sort order. For D, I want to use four. The way I can do that is I refer to d square bracket k: just look up, for the current key, what its value is in the dictionary. If the current key is C, I'll get one, and if the current key is D, I'll get four. And that's really all I need to do in order to re-sort this output in the way that I want it. Let me just clear those markings for you. So now we have D appearing at the end, because we're going lowest to highest, from the items that occur least frequently to the ones that occur most frequently. If I wanted to do it in the reverse order, I just use reverse equals True, like we've done before. Now D will get printed out first. So the things to remember here are: we're not doing anything new with sorting. There are no new mechanics here. We're just passing in a function for the key parameter. But we have perhaps a little bit of a confusing function. This function is taking one key from the dictionary as its input and returning a property of that key, the lookup of its value in the dictionary d. So we have key as in a key from the dictionary, that's
our letter k, which we've chosen to remind us that we're dealing with a dictionary key, and then we have the parameter named key for the sorted function. One other thing I want to point out is that when we tell the sorted function to sort the keys, there's a shorthand we can use. You may recall we've said before that anytime a list is expected, some place in the code where the Python interpreter is expecting a list, if you provide a dictionary, it will automatically grab all of the keys as the list. So this is equivalent: we can either say d.keys() or we can just say d, because we're passing them to sorted. If we pass the dictionary, it automatically figures out that we want to sort the keys. So this is sorting a dictionary's keys based on their values. There are several useful exercises at the bottom of the page in the textbook that I encourage you to work through in order to solidify your understanding. We'll see you next time. Welcome back. What if we really want to control the sort order, specifying how to break ties on the primary property we're using for sorting? The answer is that we take advantage of Python's built-in sort order for tuples. So take a look at this code. I have a list containing five tuples. Each tuple has three items in it. The first tuple has A, three, and two. The second tuple has C, one, and four, and so on. On line six, I am going to sort those tuples, so we're still going to get a list of five tuples, and then we're just going to print each of them with an iteration. What order do you think they're going to come out in? Is it just going to do them in the original order, because it doesn't know how to sort tuples? Is it going to put all of the A's first, so we'll get this one and then this one, and then the B, and then the two C's? Is it going to do some random order? What's it going to do? Well, let's see. What it's going to do is put the A's first, and then the B's, and then the C's. So when you sort tuples, you're really sorting by the first element in those tuples. But there's more.
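The tuple-sorting example just described might look something like the sketch below. Only the first two tuples are read out in the lesson. The other three are assumed values, chosen to match the description of two A's, one B, and two C's, and the variable names are also assumptions.

```python
# Five three-item tuples. The first two come from the lesson;
# the remaining three are assumed values for illustration.
tups = [('A', 3, 2),
        ('C', 1, 4),
        ('B', 3, 1),
        ('A', 2, 4),
        ('C', 1, 2)]

# sorted returns a new list and leaves tups unchanged.
sorted_tups = sorted(tups)
for tup in sorted_tups:
    print(tup)
```

Running this prints the two A tuples first, then the B tuple, then the two C tuples, and the sorted list still has five tuples in it.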
We have a built-in tie-breaking mechanism with tuple sorting. Here the first values were both A, so it went on to the second value, and two comes before three. There was only one B, so there was no tie-breaking that needed to happen. But between the two C's, the first elements are equal, so it goes on to the second element. If those are equal, it goes on to compare the third elements, and it would even do fourth and fifth, as many elements as you had in the tuples. So when it compares two tuples, it first compares their first elements. If one of them is smaller, then that whole tuple is smaller. But if they're equal, it goes on to compare the second elements of the tuples, and then the third elements, and so on. That's going to turn out to be useful for us when we try to control a sort order for breaking ties, even when we're not sorting things that are tuples. Suppose we had really duplicate items, so we had another A, three, two. It's exactly the same as the first element. In that case, it's just going to put both of them in there, one after the other. So we've got both of the A, three, two showing up. It's never going to collapse them. If you have six elements to start with and you sort the list, you're always going to end up with six elements at the end, even if two of them are identical. All right, so that's sorting tuples. We're going to take advantage of the Python sort order for tuples in order to be able to specify fine-grained control over our sort orders for other things, using tuples to create a tie-breaking mechanism. The way we're going to make this tie-breaking mechanism is that we're going to make our key function, as always, take one item as input, and it's supposed to return a property of the item. But instead of returning one property of the item, we're going to return a tuple containing two properties of the item. So here's an example. We've got a list of fruit names, and we're going to sort them, and the property that we're going to use to sort them is defined by this lambda expression.
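A key function that returns a tuple of two properties might be sketched as follows. The list contains only the fruit names mentioned in this lesson, and the names fruits, fruit_name, and by_length are assumptions made for the sketch.

```python
# Fruit names mentioned in the lesson; the exact list and its order are assumed.
fruits = ['peach', 'kiwi', 'apple', 'blueberry', 'mango', 'pear']

# The key function returns a tuple: the length of the name is the primary
# property, and the name itself is the secondary property used to break ties.
by_length = sorted(fruits, key=lambda fruit_name: (len(fruit_name), fruit_name))
print(by_length)
```

Here the post-it note for peach would be the tuple (5, 'peach') and the one for kiwi would be (4, 'kiwi'), so the shorter names come first and names of equal length come out in alphabetic order.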
It takes as its input one fruit name, and it returns as its output a property, but in this case it's two properties as a tuple. The first one is the length of the fruit name, how long the word is, and the other is the fruit name itself. So this is going to produce, for peach, a tuple five comma peach. For kiwi, it's going to produce four comma kiwi. Remember this idea that the key function is sort of producing a post-it note that's associated with the item. So peach has associated with it this tuple, and kiwi has associated with it this tuple. When sorted is going to decide what order they should go in, it's sorting them based on these post-it notes, these tuples. So four, kiwi is going to go before five, peach, because the tuple ordering says look at the first element of the tuple first. However, when it comes along and sees four comma pear, it's going to have a tie when it compares four with four, and it's going to then use alphabetic ordering as the secondary sort order to break the tie. So as our output, we get the four-letter fruits first, kiwi and pear, and then the five-letter fruits, apple, mango, and peach, and those are in alphabetic order: apple before mango, and mango before peach. Now, what if we want to have the long words first? This is just our standard mechanism with the sorted function. We can add the reverse equals True parameter, and now we'll get blueberry to show up first. That's all fine, except it's completely reversed the sort order from what we had before. So now we have peach before mango before apple. Those are all the five-letter words, and we now have them in reverse alphabetic order, in addition to reversing the long words to short words. What if we wanted to have longest words first, but break ties with alphabetic order rather than reverse alphabetic order? This starts to get pretty tricky. One solution that's available to us is a little trick. Instead of using reverse equals True to reverse our sort order, which will make it so that we reverse both the primary
and the secondary property, I'm going to try to just reverse the primary property. There's a trick I can use for numeric properties like the length of the fruit name: I just make all of them be negative values. So blueberry is now going to be minus nine, and kiwi is going to be minus four. Minus nine is less than minus four, and so I'm going to get the longer ones to come first. But I haven't reversed everything, so my secondary sort order is still going to be from lowest to highest, which will be alphabetic. So I've still got blueberry first, but I now have apple before mango before peach, in alphabetic order. That trick only works if we have a numeric property. If you had two alphabetic properties and you wanted to do reverse on one and not on the other, it would be harder, and I don't have an easy solution for you. So, summarizing: if you want to specify a tie-breaking property, have your key function return a tuple. Like key functions everywhere, it always takes one item from the sequence as input, but now it's going to return a tuple, where the first element of the tuple is the primary property to sort by, the next element is the secondary property to sort by, and you could even have more elements in the tuple. We also saw that if you just want to reverse the order for one of the properties but not the others, instead of using reverse equals True, you can make the key function return the negative of that numeric property. See you next time. Glad to have you back. Here's a little Way of the Programmer advice on when to use a lambda expression and when to use a named function for your key parameter when sorting. Basically, my rule of thumb is: if the lambda expression is short and simple, so that it's pretty easy to understand what it's doing, use the lambda expression. As soon as it gets too complex, refer to a named function instead, and give it a good name that describes the property you're trying to sort by. For example, here's one that's just at the outer limits of what I'd consider simple enough to
put in a lambda expression. We have a dictionary called states. It's got as its keys the names of states in the United States, Minnesota, Michigan, and Washington, and the value associated with each key is a list of city names: St. Paul, Minneapolis, St. Cloud, Stillwater for Minnesota, and Ann Arbor, Traverse City, and so on for Michigan. Now we want to sort the keys based on some property of their values. So we have a generic structure where we have a dictionary called states, and we're going to sort its keys. We just asked to sort the dictionary, but that always means we'll get to sort the keys. Our key function is going to take one item from that list as an input, and it's going to look up some property. In fact, it's going to look up that state in the states dictionary. But in this case, we're not just trying to get the whole list as the value associated with Minnesota. We're trying to get some property of that list. In this case, we're taking square bracket zero, so that would get us St. Paul, and we're passing it to the len function, so that's going to give us eight. For Michigan, states square bracket state gets us the whole list of cities, we take square bracket zero from that, which will get us the city of Ann Arbor, and we'll pass that to the len function. So we'll get one, two, three, four, five, six, seven, eight, nine: Ann Arbor has nine characters. And similarly for Washington, we'll get the length of Seattle, which is seven letters. So when we sort this, we should get a list of the states, Minnesota, Michigan, and Washington, but we should get them in the order Washington first, because its first city has a shorter name. And sure enough, we get Washington, and then Minnesota, and then Michigan. So that's great when the property we wanted was the length of the first city name. And as I said, I think this is sort of just at the outer limits of what's understandable in a lambda expression. This got to be kind of complicated.
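As a rough sketch, the states example described above could be written like this. The Minnesota cities, the first two Michigan cities, and Seattle come from the lesson. The remaining city names and the variable name by_first_city are assumptions added only to fill out the lists.

```python
states = {
    "Minnesota": ["St. Paul", "Minneapolis", "St. Cloud", "Stillwater"],
    # Only Ann Arbor and Traverse City are named in the lesson; the rest are assumed.
    "Michigan": ["Ann Arbor", "Traverse City", "Lansing", "Kalamazoo"],
    # Only Seattle is named in the lesson; the rest are assumed.
    "Washington": ["Seattle", "Tacoma", "Olympia", "Vancouver"],
}

# Passing the dictionary to sorted sorts its keys (the state names).
# The key function looks up each state's city list, takes the first
# city, and uses the length of that city's name as the sort property.
by_first_city = sorted(states, key=lambda state: len(states[state][0]))
print(by_first_city)
```

The lengths used for comparison are 7 for Seattle, 8 for St. Paul, and 9 for Ann Arbor, so Washington comes out first.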
It worked pretty well to understand it, though, because states square bracket state is sort of a chunk, a pattern that we've seen before: we're taking a dictionary key and we're looking up its value in the dictionary. If you think of that as a little chunk, as, oh, that's the value of the state in the dictionary, then we can just parse the square bracket zero to say, oh, take the first element of the list, and pass that to the len function. Anything much more complicated than this, I think, would be pretty hard to read as a lambda expression. Well, let's think about one of those harder things. Suppose that we wanted to take a different property for each state: we want to find the number of cities in its city list that begin with the letter S, and we want to sort in that order. So, okay, what does that property sound like? The number of cities that begin with the letter S. Number of cities, that sounds like it's going to be a count accumulation, and only those that begin with the letter S, that sounds like we're going to need to filter as we count. So it's a count-and-filter accumulation. It's not too hard to write it as a function, but it would be pretty hard to write just as a lambda expression. So I'm going to write a function. I'm going to call it s_cities_count, and it's going to take as an input a city list, and what it's going to produce as an output is a count of how many cities begin with a capital S. Well, this is a count-and-filter accumulation, so I'm going to say ct equals zero, then for city in city list, if the first letter of city is S, then increment our counter by one. When we finish the iteration, ct will have the total number of cities that began with S, so we'll return ct. Now how am I going to use that? The key function is going to take one state as input, and I wrote this s_cities_count to do our accumulation as though it's taking a city list. So I can't just say key equals s_cities_count. That's not going to work, because the key function has to take one state as an input, and my function is
taking a list of cities as an input. So this is going to be a little bit of a hybrid. I'm going to take one state as an input and look it up to get the city list. That's our canonical way of sorting a dictionary by its values: we look up the key, state, in the states dictionary. But that's going to give me a city list, and I'm going to pass that to my new function. Let's see. When you write that much code, the chances of it running without an error aren't very good. Let's see if I happen to get lucky. Wow, I did. Okay, so we got Michigan first, then Washington, and then Minnesota. So hopefully, if we go look at the city lists, we'll see that Minnesota is the one that has a lot of cities beginning with S. And sure enough, it does. Minnesota has St. Paul for one, St. Cloud for two, and Stillwater for three, so it had three. Washington had Seattle for just one, and Michigan had zero. So sure enough, we got these in order of the number of cities that begin with the letter S. So let's review what we did here. Again, we made a helper function, a named function called s_cities_count, that did the thing that was a little too complicated to put into the lambda expression, and we passed some property of our key, the list of cities in that state, as an input to this helper function, s_cities_count. Now, it's actually possible to not use a lambda expression here at all and just refer to a named function. But we would then have to make our function take as input not the city list but the state name. So I'm going to do a version of that. I'm going to call it s_cities_count_for_state, and it's going to take a state name as its input. It'll make the city list equal to the lookup of the state in the states dictionary, and then everything else can be the same as before. If I do that, I can say that my key function is just s_cities_count_for_state. That time I wasn't so lucky. So this is always instructive. Let's look at this error message. It says SyntaxError: EOF in
multi-line statement on line 19. Now, EOF might be a little confusing. It's an acronym: E stands for end, O for of, and F for file. End of file. Basically, it means it got to the end of the program while it was in the middle of trying to parse something. It's saying line 19, so I always like to go back and look at the first thing before that. Here's line 17, and sure enough, I am missing a closing parenthesis. So it was waiting around, expecting there to be another closing parenthesis, and it didn't find it, and so it got an error. Let's run now. And sure enough, that works exactly the same as the original. Now, I actually prefer the original over this second one, even though line 17 now looks very clean: we just say key equals this function name. But I prefer the previous version (we can use our scrubber to go back to it; let's see, yeah, this one), where we had key equals lambda state: s_cities_count(states[state]). The lambda expression is more complicated here, but the lookup of one key in the dictionary is just a very common idiom when we are sorting the keys from a dictionary. So that's a little advice on when to use lambda expressions versus named functions. My basic advice is: if the lambda expression is simple enough, do it that way, and when it gets too complicated to read or write, then it's time to use a named function, to move some things out into another function that you can give a good name to. We'll see you next time. Congratulations! Now you really know how to sort: simple sorts, complex sorts, all sorts of sorts. You can sort dictionary keys based on some property of their values. You can break ties by having the key function return a tuple. You can use lambda expressions. You can use named functions. You are a pro at sorting. Now, in the intro to this week, I made a big deal out of saying we're not going to go into details of which sorting algorithm gets used and how long it takes to run. So instead of a straight-up joke, let me give you a little
programmer's humor: a sorting algorithm that runs really slowly. It's called bogosort, and it works by trial and error. Take the items, shuffle them into some random order, and check if they happen to be sorted. If they are, we're done. Otherwise, try again: shuffle, see if they're sorted, and keep going until you get a lucky shuffle. Remarkably, the code for this in Python is short. In the random module, there's a built-in function called shuffle. I wrote the function to check if a list is in order in five lines, and another four for the trial-and-error while loop. Here's the full code. Of course, because you do a random shuffle every time, it takes a random amount of time to finish. But as you can imagine, this is not a fast way to do sorting. If you've got a bunch of items and you just shuffle them, the chances that they're going to be sorted aren't very good. I just ran it once on a list of 10 items, and it took 68 seconds to complete. I didn't dare try it with 11 items. By contrast, Python's built-in sorted function was able to sort a million items in just over half a second. Now you have a party trick. Ask your friends who took some computer science courses some time ago to try to remember the slowest sorting algorithm they studied. They'll have fun recalling bubble sort and insertion sort and trying to decide which one is slower, and then you can regale them with bogosort. You'll be the life of the party, trust me. See you next time. Hello, and welcome to course two and your end-of-course project. In this project, you'll be building a program in a few steps to perform what's called a sentiment analysis. Programs like these are widely used in a bunch of different companies and situations, and in this case you'll be working with Twitter data, although it's fake Twitter data. You will eventually make a CSV file and use that to produce a graph of your results, which is a super useful way to visualize data and share the results of programs you build with other people. Then you'd be able
to build a program like this one to use real Twitter data, which we can't use in this particular case. In order to build this analysis and your eventual CSV file, we're going to take you through a series of steps to build different functions so that you can put together this complete program. As you proceed through the project, you should focus on individual steps one by one, and make sure that you understand the instructions for each step before you move forward to writing the code. You should always focus on one step at a time, because thinking about multiple steps in your program can often get overwhelming and can confuse you about what code needs to happen first and what should happen later. So make sure you focus on the instructions for one step at a time, build your plan, and translate that into code as you work through the project. As you think about each step, you should think carefully about what you know about functions: what is the input for each function, what is its return value, and what does each function have to do after the input and before it returns its final output? As you work on each function, you'll be able to put them together to build the full project. Remember also that in each step you may need to copy some work you have done earlier into the next step. To come up with a chart like this, you'll want to make sure that as you work on each concept, you isolate what you have to do without focusing on earlier pieces, and focus on the examples from the course that will be useful to you for the project: for example, function definition, advanced functions, and dealing with files, since you're creating a file in order to visualize your data here. I also think that this project is a particularly exciting way to think about how you can apply these concepts to things you might want to do in real life, so to speak, understanding different things about how programs can apply to you. So good luck and have fun.
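To make the "input, return value, and what happens in between" advice concrete, here is a minimal sketch of how two small functions might fit together to produce a CSV file. It is not the project solution: the word lists, function names, column headings, sample tweets, and output file name are all hypothetical and chosen only for illustration.

```python
import os
import tempfile

# Hypothetical word lists; the real project supplies its own.
POSITIVE_WORDS = {"good", "great", "happy"}
NEGATIVE_WORDS = {"bad", "sad", "awful"}


def count_matches(text, vocabulary):
    """Input: a string and a set of words. Return value: how many words match."""
    ct = 0
    for word in text.lower().split():
        # Drop common punctuation from the ends of the word before checking it.
        if word.strip(".,!?") in vocabulary:
            ct = ct + 1
    return ct


def write_scores(path, tweets):
    """Input: a file path and a list of (text, retweets) pairs.
    Writes a header row and then one CSV row per tweet."""
    with open(path, "w") as outfile:
        outfile.write("Retweets,Positive,Negative,Net\n")
        for text, retweets in tweets:
            pos = count_matches(text, POSITIVE_WORDS)
            neg = count_matches(text, NEGATIVE_WORDS)
            outfile.write("{},{},{},{}\n".format(retweets, pos, neg, pos - neg))


# Made-up sample data, not taken from the course files.
sample_tweets = [("What a great, happy day!", 3), ("Sad and awful weather.", 1)]
output_path = os.path.join(tempfile.gettempdir(), "sentiment_sketch.csv")
write_scores(output_path, sample_tweets)

with open(output_path) as infile:
    print(infile.read())
```

Each function can be written and checked on its own before the two are combined, which is the one-step-at-a-time approach described above.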