 In this lesson we're going to learn how to use Python to create the bar charts that we've seen in the previous couple of lessons, and learn a little bit more about the data type that makes that all possible: the list. As we saw in previous lessons, we could take any selection of text, say for example the text of the book Pride and Prejudice, and create an associated bar chart that shows the frequency of each character as its proportion of the text. We're going to learn today how to do just that using some Python. Now, we're going to be working with strings, probably no surprise as we've been working with them for our text throughout the entire course, but we're going to learn about a new string method, .count, that'll be very useful for our work. In the code snippet here on the screen you see that we have some ciphertext stored to the variable called message, and we can print out message.count. Inside the parentheses, as the argument to count, we specify a substring, and count tells us how many times that substring appears within the larger string. So it's going to count how many times the string capital A appears inside of the message, and we can see here that it appears nine times. We could easily do the same thing and count for B, and now we can see there are 56 of those, and we could do this for all 26 letters, although we'll see there's a more efficient way to do that. But maybe more relevant to the work we're trying to do with our bar charts, we don't want to just straight up count the number of times these characters appear; we want to compute the proportion of the message that they take up. 
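As a small illustration of counting substrings with .count, here is a minimal sketch. The message below is a made-up stand-in, not the ciphertext shown on screen, so its counts differ from the nine and 56 mentioned in the lesson.

```python
# Hypothetical message for illustration only (not the lesson's ciphertext).
message = "ABRACADABRA"

# str.count returns how many times the given substring occurs in the string.
count_a = message.count("A")
count_b = message.count("B")

print(count_a)  # 5
print(count_b)  # 2
```

The same call works for any of the 26 letters, which is what makes it a natural fit for a loop later on.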
So in order to get that proportion we also need to divide by the length of the message. Don't forget our good helper function that we can use with strings, len (l-e-n), to compute the length. Of course, we would want to make sure that message, the variable that has the string stored to it, is cleaned-up text, so to speak: all capitals, no spaces, no punctuation. We'll have fully cleaned it so it is just the letters; that way, when we calculate the length, we're not counting anything that doesn't contribute to the overall number of letters in our string. When we do that, we can see that in this particular ciphertext the character C takes up almost 10 percent of the overall message, because its proportion is 0.098.
So let's figure out how we can use Python to compute the proportions not just for one letter at a time but for all of the characters in a message. Let's start out with a familiar setup for our code. We'll define the alphabet that we'll be using for this message, in this case the 26 uppercase characters A through Z, and then we'll define the message we're working with as a string, stored to the variable named message. We're going to go ahead and calculate the length of that message and store it to the variable named message_length. We could just call len on message every single time later on in our code, but we'd end up redoing the same calculation over and over again; it's a little bit more efficient to calculate it one time and then store it for use later on. 
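The setup described here might look like the sketch below. The raw text and the cleaning step are assumptions added for illustration (the lesson only says the message should already be cleaned), and the name raw_text is hypothetical.

```python
letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Hypothetical raw text; the lesson assumes the message is already cleaned.
raw_text = "Hello, World!"

# One possible way to clean it (an assumption, not the lesson's code):
# uppercase everything, then keep only characters found in letters.
message = "".join(ch for ch in raw_text.upper() if ch in letters)

# Compute the length once and store it for reuse.
message_length = len(message)

# Proportion of the message taken up by a single letter.
proportion_l = message.count("L") / message_length

print(message)          # HELLOWORLD
print(message_length)   # 10
print(proportion_l)     # 0.3
```

Because spaces and punctuation were removed before calling len, the denominator counts letters only.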
Next up, since we want to calculate a proportion for each character in the string called letters, that's the cue that we're going to be using a for loop to do this. We're going to say for char (for character) in our uppercase letters, so we're going to iterate over that string one letter at a time. We're going to take the character A, store it in char, count the number of A's, and then turn that into a proportion; then do the same thing for B, B goes into char, again and again and again. See if you can think about what code we can put underneath that header for the loop in order to do just that. For now, just compute the proportion; don't worry about saving it or doing anything else with it. We'll take a look at that line of code in three, two, one: message.count(char) divided by message_length. So for each of the 26 letters that are currently stored in the letters string, we're going to count the number of occurrences of that single-character substring in the message and then divide by the message length. I've chosen to use print here just so I can see what it's doing; we'll see that we're going to have to tweak that just a little bit. So there are all of our proportions. I didn't show all 26, they wouldn't fit on the screen, but I've shown the first six here. 
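A minimal sketch of that loop follows, using a made-up message rather than the ciphertext from the screen, so the printed values will not match the ones shown in the lesson.

```python
letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
message = "HELLOWORLD"          # hypothetical cleaned-up message
message_length = len(message)   # computed once, outside the loop

# Visit each letter in turn; char holds "A", then "B", and so on.
for char in letters:
    # Count occurrences of this letter and convert to a proportion.
    print(message.count(char) / message_length)
```

This prints 26 numbers, one per letter, but nothing is stored yet; the values are gone as soon as they are printed.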
But just printing those numbers is not going to be helpful for the computer to make a bar chart. Remember, printing is really just for us humans: it allows us to see what the code is doing and what our calculations are, but the computer can't take what you've printed and carry it on to go do something else with it. We're going to have to save this somehow so that we can pass it off to the bit of code that's actually going to make the bar chart. So we need to figure out exactly how to do that, and it turns out the function that we're going to be using in Python requires this to be stored not as an integer, a float, or a string, but as a new type of data that Python works with called a list.
In short, a list is a data type in Python that can hold a collection of all of the other data types, all the other objects that we've seen so far. For example, on the screen I've made a very simple list that's called a, and I've assigned it to be the list that contains the string science, the string math, the number four, the boolean False, and the float 3.14. Notice how I constructed that list: it's a square bracket on the left and a square bracket on the right, and the elements in that list are separated by commas. When you print out that list, it shows you all of the elements that are inside of it.
There are a lot of important characteristics of lists, and I want to list them out below; we're going to dig a little bit deeper into some of these to see why they're so important. The first one is that lists can contain any number of arbitrary objects, so, as we already saw, strings, floats, integers, booleans, it doesn't matter; they can even hold lists of lists, and we'll touch on that again in a moment. Lists are ordered. We'll see that's helpful, because if the elements always show up in the same order then they're predictable, and we can pick certain things out of the list because we know where to find them. We can access those elements, now that they're in order, by 
using an index. Just like how we can pick a particular character out of a string using an index, or several with a slice, we can do the same thing with lists. Lists can be nested to any arbitrary depth, which is basically a fancy way of saying that if you wanted to put a list inside of your list, you can do that, and the items inside of your nested list are still easily accessible. Lists are mutable, which means that after you've defined them you can change what's in them and how many things are in them, which makes them really dynamic. Other data types you've seen don't have all of these characteristics; they're a little bit more rigid, less flexible. So lists are really a great tool to be using in Python, and we'll see why throughout the rest of this lesson.
Let's talk about indexing lists in Python. This should look very similar to when we work with a string. To index into a list, you use the name of that list, open up a square bracket, and put the integer that represents the index that you want to work with. Just like with strings, our first index is zero, so index zero is the first element, which would have been science. So when I ask to print a[1], I'm going to get back math, because math is actually at index one even though it's in the second position. We can use slices just like with strings. Notice when we do a[2:5], we get the items starting at index two, which is four, up to but not including index five, so we'll get indices two, three, and four; that's why we get four, False, 3.14. Notice that when you do a slice on a list you get a list back, kind of like how when we do a slice on a string we get a string back. And just like with strings, we can use negative indices to get access to things from the back of the line: 3.14 is at index negative one, False is at negative two, and the integer four is at index negative three.
There are a few other 
helpful list commands that we can use as we're working with these. The len function that we've used with strings works the exact same way with a list, so you can quickly compute how many things are in a list by taking len of that list; here we can see we've got five elements. You can use the function min (m-i-n), and assuming your list contains all numerical values, it'll give you the smallest value inside of that list. If it were all strings, min would give you the one that comes first alphabetically (for strings in the same letter case). Then, kind of the opposite of that, we have max: again, if your list contains all numerical values, it'll return the largest value inside of your list, and if it were a list of strings, it would give you the last item when sorted alphabetically.
Let's take a look at one of those other cool characteristics of lists: how they can contain other lists inside them. Here we've got a pretty complicated-looking list. You can quickly see the pattern: we use a single-character string if it's a direct element of list x, double-character strings for the lists nested inside one time, and triple-character strings inside a list nested in one of those, to indicate a third-level item. When it's all flattened out on a single line it gets kind of hard to read, so we've used the code visualizer to show it this way. We've got our variable x, which is the list in the global frame, and its arrow points down to a list with five elements in it. At index zero is the string a. At index one is the four-element list that has bb, another list, ee, and ff in it, and inside that list, at index one, is another list that has the elements ccc and ddd. Going back to the original list x, at index two we have the character g, at index three we have another list with two elements in it, hh and ii, and then back to list x and 
index four, where we have the single character j. So we can see we can really combine a lot of lists together, and we'll see a couple of reasons, as we're working with them, why we would do that for cryptography.
Now, to get access to these lists that we've been nesting inside of each other, we're going to have to chain together some of our indices. If I were to just do x[1], it's going to give me the list that's circled in green here, which contains another list inside of it. If I wanted to get a specific element out of that list, I could do x[1][2]: it's going to go to the list at index one, which is the one that contains bb, a list, ee, and ff, and then go to index two in that one and pull out ee. Likewise, I could do an even deeper nested lookup by going to the list that's at index one, going to the list inside that list at index one, and then pulling out the string that's at index zero, so x[1][1][0] would give me back ccc.
We mentioned that you can change what's inside of a list. The fancy name for that in programming is that it's a mutable object: once you've defined it, you can change things right in place. We couldn't do that with strings. If I had a string and I wanted to swap a particular character out for another one, I'd have to basically construct an entirely new string and then save that. Here I can just directly alter the elements inside of a list. So here we've got a list that's called fruits; it contains the three strings apple, banana, and orange, and I'm going to redefine the element that's at index one. I can just say fruits[1] = "pear", and you can see it changed it right in place; I didn't have to do anything fancy, I can just directly change it. Likewise, if I had another element I wanted to add to the end of the list, I could call .append, so fruits.append with this new fruit, peach, just sticks it on the end. It extends the list out: it used to contain 
three items, now it contains four items. No special code to say that it had to extend out to four or anything; it just knows to add a new item onto the end. This append method is going to be really helpful for our work when we're iterating over a long string or a long list and, for each element, we want to stick the result of a computation onto the end of a list.
All right, so we know enough about lists to go ahead and start computing our proportions, or our frequencies as we'll call them. The tool that we'll be using, which we'll learn about in just a moment, requires that we have all of our frequencies stored in a list, and we should store them in the exact same order as our letters string. So if the letter A is at position zero in the string, we'll put the frequency of A at item zero in our list. Let's pull up the code that we had from earlier; we just make one small tweak. Notice we now have an empty list being defined that's called frequencies, right under the message, before we compute the message length. That's where we're going to put all of our frequencies as we compute them. I've got the same setup from before, for char in letters, so we're going to iterate over that letters string: first A goes to char, then B, then C, and so on. Think about what we're going to want to do for each one of those letters. We want to compute that proportion, and now we want to stick it into the frequencies list that we've already started to create. Take a moment, see if you can finish out the code; we'll go over it in just three, two, one.
So next up, let's compute the proportion. We'll break this down into a couple of steps. We're going to take our message, count the number of times the character shows up in that message, and, because in this case we care about frequencies or proportions, divide by the message length. Now, unlike before, we're not going to want to print this proportion out; we're going to want to append 
it into the frequencies list, so we can just say frequencies.append(proportion). One thing to call out here about the way that we use .append is that I didn't need a line that said frequencies = frequencies.append(proportion) (in fact, .append returns None, so that assignment would throw the list away). That was the strategy we saw when we were putting a letter onto the end of a string: we would say something like string name equals string name plus the character we're adding on, and that would append it and then save it. With a list we can just straight up call .append and it will change the list right in place, with no need for another assignment statement. So that's one big difference between strings and lists, and it comes down, again, to the fact that lists are what we call mutable: you can change them without having to redefine them.
All right, next up, let's see what we did. That for loop ran once for each character in letters, so it ran 26 times; each time we appended a proportion onto the frequencies list, and when we're done we should have all 26 proportions right there in the list, ready to go. The benefit, again, of doing it this way is that not only can we print that list out so we can see it, we have also saved those proportions in the list, in the variable that's named frequencies, so that other code that follows can use those numerical values as well.
Let's take a look at how we can start to use these values to create a bar chart. In this course we're going to be using a Python package, or a library as it's also called, named matplotlib to create some visualizations for us in Python. Much like how we can write our own functions to do some specific computations or print something out to the screen, other people who have written their own functions can share them with you, and a collection of such functions is called a package. You can see here that there are a lot of functions in this matplotlib package that do some mathematical or statistical things that maybe you're 
already really familiar with from other courses. You can make graphs, you can make scatter plots, there's the bar chart that we're interested in, but there's a lot of other functionality in this library that we won't even begin to touch on in this course. If you're interested in learning more, I recommend you go check out the matplotlib website to learn about all of the different functionality and how you can start using this tool in more detail.
For now, though, let's take a look at how we can use this library to make a bar chart. Since somebody else did all the hard work of writing the functions to create the bar charts for us, we just need to learn how to start using those functions. The reason most libraries exist is that the default Python language doesn't contain everything out of the box; you add in only the functions that you want for a particular project. So how do we do that? This line of code here will get us started. You say import, which is the keyword that tells Python to go look for a specific library and make it available. We're going to import matplotlib, that's the name of our library, dot pyplot, and pyplot is the specific component in that library that we are planning to use. Since matplotlib.pyplot is so long for us to type, we can give it a nickname to make it easier to reference in our later code, so we say import that as plt. Instead of having to type matplotlib.pyplot, I can just type plt; it's an alias to shorten up our code.
Next up, we're going to call plt.bar, the bar function from the library we just pulled in. Here plt.bar takes in two arguments. The first is a list of the labels for our bar chart. It's got to be a list, so I'm going to use the function that's called list: if you give it a string, it will turn your string into a list of characters. We'll talk more about that in a little 
bit, but it's an easy way to make a list from a string. The second thing that plt.bar takes in is another list, of your values, or the frequencies in our case. So the first argument is your labels, which are text, and the second argument is your heights, the frequencies, which are numerical. Next up, we can customize the plot we're about to make with some labels: you can label your x-axis, label your y-axis, and give your whole plot a title by calling plt.xlabel, plt.ylabel, and plt.title and giving each a string. Now we've set everything up. No picture has shown up yet; we're just telling it what to make: we're going to make the bars, here are the labels, here's the title. Then you end with a plt.show, and when you run all of that code your bar chart will appear with all of the settings that you have indicated. There are a lot more settings that you can play around with, and you can read all about them on the matplotlib website. If you are making more complex plots, with maybe two graphs side by side or things like that, there's actually a better way to do that. We just kept things really simple for this example, because in this course we're really just going to be making one bar chart at a time, so code like this should work just fine. But if you're going to keep working with matplotlib and other visualization libraries in Python, definitely read the documentation to learn how to do those more complicated graphs, because it'll be a little bit different from what we saw here.
You might have noticed there's a function we used in there to make this all possible, this list function, which we can see on the right-hand side of the screen. The list function takes in a string and returns a list where each element is a single character of that string. The example we have on the screen is hello, and it's going to give us a list with h as the first element, e as the second, and so on. This is what's called 
casting variables in Python, where you can turn a value of one type into a corresponding value of a different type, here string to list. There are two more examples on the screen. One of them is int (i-n-t): you can give it a float, or a string that contains only a whole number, and it will convert that into an integer for you, so you can use it for things that require integers. Likewise, in the bottom left corner there's the str (string) function, where you can give it a number, or really anything, and it will convert that into a string, and then you can index right into that string. So I converted 5.0 into the string 5.0, and now I can access a particular character out of it; I chose index one to get the dot. The reason why we would usually cast one data type to another is that these functions that we're pulling in, that other people have written, make assumptions about what the variable type should be in order to work. So we're going to have to do the legwork on our side of the code to make sure that we're constructing our data in a type that works with the function that we're pulling in.
So, putting everything together that we've seen, from top to bottom in our cell: we first import matplotlib.pyplot as plt. Normally our import statements come at the top of our notebook, or at the top of the cell in which we're going to be using those functions. We define our letters and our message, we've got our frequencies as an empty list, and we compute the message length. We run through our letters one character at a time, and for each character we calculate the proportion and append it to frequencies. Then we specify all of the things that we want for our bar chart by calling plt.bar with the list of labels and the list of values, specifying an x label, a y label, and the title, and then we show our plot at the end. This is going to be a really nice way for us to do our 
frequency analysis and get our bar charts up and running in no time, so make sure you've got a good copy of this code and you understand how each piece of it works. So that's it. At this point we now know how to construct lists, edit their values, and use matplotlib to turn those values into a bar chart.
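To recap, here is a sketch that pulls the pieces of the lesson together. The list examples reuse the values described in the lesson; the message, the axis label strings, and the function name show_bar_chart are made up for illustration. The import is placed inside the plotting function only so that the frequency computation runs even without matplotlib installed; in the lesson's notebook it sits at the top of the cell.

```python
# --- Lists: construct, index, slice, nest, and edit in place ---
a = ["science", "math", 4, False, 3.14]
second = a[1]       # "math" (index 1 is the second position)
middle = a[2:5]     # [4, False, 3.14] (slicing a list gives a list)
last = a[-1]        # 3.14

x = ["a", ["bb", ["ccc", "ddd"], "ee", "ff"], "g", ["hh", "ii"], "j"]
deep = x[1][1][0]   # "ccc" (chained indices reach into nested lists)

fruits = ["apple", "banana", "orange"]
fruits[1] = "pear"        # replace an element in place
fruits.append("peach")    # grow the list; no reassignment needed

# --- Casting between types ---
chars = list("hello")     # ['h', 'e', 'l', 'l', 'o']
whole = int(5.0)          # 5
dot = str(5.0)[1]         # "."

# --- Frequencies: one proportion per letter, in alphabet order ---
letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
message = "HELLOWORLD"    # hypothetical cleaned-up message
frequencies = []
message_length = len(message)

for char in letters:
    proportion = message.count(char) / message_length
    frequencies.append(proportion)


def show_bar_chart(labels, heights):
    """Draw a bar chart of heights labeled by labels (needs matplotlib)."""
    import matplotlib.pyplot as plt

    plt.bar(list(labels), heights)      # labels as a list of characters
    plt.xlabel("Letter")                # hypothetical label text
    plt.ylabel("Proportion of message")
    plt.title("Letter frequencies")
    plt.show()
```

Calling show_bar_chart(letters, frequencies) after this would display the chart, with one bar per letter in the same order as the letters string.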