So let's look at a really common application of a dictionary, just to get the idea: something known as getting word frequencies. Let's imagine I have some file. In our case I'm using a text file, Alice in Wonderland from Project Gutenberg, and as you can see it's literally just the entire text of Alice in Wonderland in a text file. Now the idea is, maybe I want to look at this and ask: how many times does the word Alice appear in this file? How would I go about doing that? Well, it's a lot of words and a lot of text, but that's where coding can do things for us.

Just to walk through a few things: you may have noticed that the words in the file have a lot of punctuation attached to them. Even here, "book" has a comma attached to it, and there are question marks. When I'm counting how many times the word "her" appears, I don't want "her" and "her." with a period to be separate values. So the first thing I have here is a quick little function that looks at a string and uses string.punctuation, which is just a string holding all the ASCII punctuation characters in Python. Just to see that in action, let's print punctuation. You can see all the different characters inside of punctuation. I don't want to use any of those or consider them in my word frequency. So what I'm doing is, for each one of those characters, going into our particular line and simply replacing that character with a blank string. That effectively gets rid of all the punctuation. Fair enough.
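As a rough sketch of the punctuation-stripping function just described: the name strip_punctuation is an assumption for illustration, since the name used in the lecture's own code isn't stated. string.punctuation is the real standard-library constant.

```python
import string

def strip_punctuation(line):
    """Return `line` with every ASCII punctuation character removed."""
    # strip_punctuation is a hypothetical name; the lecture's code may differ.
    for char in string.punctuation:
        # Replace each punctuation character with an empty string.
        line = line.replace(char, "")
    return line

# Show which characters count as punctuation, then clean a sample line.
print(string.punctuation)
print(strip_punctuation("what is the use of a book, without pictures?"))
```

Note that apostrophes are punctuation too, so a word like Alice's comes out as Alices.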
So with that, I'm setting up a reference to my file. Again, we have the relative file path structure going on here, where I say go up a directory, and I happen to have another folder beside my directory called data, and in there is a text file called Alice in Wonderland. There's the data folder, and there's Alice in Wonderland. Then I'm establishing an empty dictionary that I'm calling word count. The idea is that every time I see a word, if that word doesn't exist in my dictionary, I create a new entry and then add to it. So let's walk through that.

Let me clear the results so you don't get to see everything right away. I've loaded that function into memory, and I've established that I have two variables, a file path and a word count. Now let's look at my file processing. Once again we're using the with statement to say: with this file open, give that open file a name. Then we go through each line inside of that file, because as you can see with Alice here, there are multiple lines. This is one line, this is one line, this is one line. For each one of those lines, the first thing I want to do is convert all of the punctuation into blank strings. Just here, for example, I want to get rid of that comma and that colon.

Then I'm going to say: now that I've gotten rid of all of it and I'm just dealing with words, split those words up based on a space, separating them out into individual words in a list. Once I've done that, I can traverse that list of words, so for every word inside of a particular line of text. In this case that means every single word on this line without the punctuation: bank, and, of, having, nothing, to, do, etc.
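Here is a minimal sketch of that line-by-line processing. The names strip_punctuation and words_by_line are assumptions, and the sample text stands in for the real file so the example runs on its own. It uses split() with no argument, which splits on any whitespace and so also drops the trailing newline; the lecture describes splitting on a space, so the original code may differ slightly.

```python
import io
import string

def strip_punctuation(line):
    # Hypothetical helper: remove every ASCII punctuation character.
    for char in string.punctuation:
        line = line.replace(char, "")
    return line

def words_by_line(lines):
    """Return one list of punctuation-free words per line."""
    result = []
    for line in lines:
        cleaned = strip_punctuation(line)
        # split() with no argument splits on any run of whitespace.
        result.append(cleaned.split())
    return result

# Stand-in for the real file. With an actual file this would look like
#   with open(file_path) as f:      # file_path is an assumed variable name
#       words_by_line(f)
sample = io.StringIO(
    "bank, and of having nothing to do: once or twice\n"
    "she had peeped into the book her sister was reading,\n"
)
for words in words_by_line(sample):
    print(words)
```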
For every single one of those words, I'm doing a little bit of cheating and making everything lowercase. That's mostly just a little more cleaning up. Take Wonderland here, where it's all caps, and then another time where we see Wonderland with only one capital letter: they are considered the same word. So I'm just doing a little parsing there. Then, as you can see in the comment, what I'm saying is: let me take a look into my dictionary, let me look in word count. If the particular word I'm dealing with is not in that dictionary, make it.

Again, I'm using line 10 of the text as my point of reference. This is the first time "bank" is being read or processed by the Python file. "bank" is not in the dictionary, so create a new entry for bank and set it equal to zero. The reason is that right after you do this, you look up the value at that particular key, in this case bank, and add one to it. So I'm just going through and looking at each word: if you haven't seen that word yet, make it exist, and either way, add one to it.

No crashing, so that means it worked. The last little bit I've got here for right now is to take the first 10 words of our dictionary, not ranked in any way, and just show them. So: Alice's, Adventures, Wonderland, Lewis, Carroll, The, Millennium, Fulcrum, Edition. We're just seeing all those words, and you may notice that the dictionary entries aren't in any sorted order, because as you can see Wonderland is appearing before Carroll and whatnot. That's perfectly fine. We'll get to that in a little bit. But again, we're just looking to see what we can do, and now we convert these into counts for each one. Ah, so we've gotten word frequency counts. Alice's appears 12 times, and again, we removed the apostrophe.
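The counting pattern described here, create the entry at zero if it is missing and then add one, can be sketched like this. The function name count_words and the sample words are made up for illustration.

```python
def count_words(words):
    """Count how often each word appears, ignoring case."""
    word_count = {}
    for word in words:
        word = word.lower()          # treat WONDERLAND and Wonderland alike
        if word not in word_count:
            word_count[word] = 0     # first sighting: create the entry
        word_count[word] += 1        # then add one, new entry or not
    return word_count

# Made-up sample words, not the real book.
sample_words = ["ALICE", "was", "beginning", "Alice", "was", "bank"]
word_count = count_words(sample_words)

# Show up to the first 10 entries, in the order the words were first seen.
for word in list(word_count)[:10]:
    print(word, word_count[word])
```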
Adventures appears six times. "In" is a very common word, so it appears 366 times. "The" is a super common word, so it appears over 1,500 times. You can start to go through and see, oh, here's how I can get the frequencies.

There is a problem, though. One of the issues with the dictionary is that it isn't ordered by count. Even though we see that "the" has a large number, we don't know if it's the largest number. And then you've got "edition", which only appears once, but there may be other words that appear a lot. There is no fancy quick way built into the dictionary to give you a sorted list, but through the power of the internet you can find very quick ways to do that sorting process, and they start by converting the dictionary into a list of tuples. Tuples are just short, unchangeable lists. So in this case, let's say I want to take my word list, which is just another dictionary, and turn it into a sorted list of tuples, or a sorted list of lists.

The first thing I'm doing here is creating a blank list, and then going through every entry in the dictionary. Again, looping over a dictionary gives me the key for every entry, so every single one of the words in my dictionary. Then all I want to do, nothing terribly crazy, nothing terribly fancy, is add them: I want to add the count and then the key, in that order specifically, the count and then the key, to my empty list. Okay, fair enough.
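A small sketch of that conversion step, assuming tuples are used for each entry. The name to_count_pairs and the counts in the sample dictionary are invented for illustration.

```python
def to_count_pairs(word_count):
    """Turn a {word: count} dictionary into a list of (count, word) tuples."""
    pairs = []
    for key in word_count:
        # Count first, then the word, so sorting later compares counts first.
        pairs.append((word_count[key], key))
    return pairs

# Made-up counts, not the real numbers from the book.
word_count = {"the": 5, "alice": 3, "edition": 1, "in": 2}
print(to_count_pairs(word_count))
```

Putting the count first matters because Python compares tuples element by element, starting from the left.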
Then I'm going to use some of Python's built-in functions. In this case there is something called sorted, and what sorted does is look at a list and give it back in sorted order. Since each entry starts with its count, that puts the smallest counts first.

Now the last little piece here is some of Python's witchcraft, black magic. It looks mysterious, but it does its thing. The idea is that we're going to use list slicing, and very specifically, if you use colon colon negative one, [::-1], which is a slice with a step of negative one, what that will do is reverse the order of the list. So think about what I've just done: sort our list. That is going to put the things that are one, you know, edition one, millennium one, lewis one, carroll one, at the very top of our list. The slice is just saying put them on the bottom, which conversely puts whatever the largest numbers are at the top. So we've ordered the list in descending order. Awesome.

Now we can take that same traversal approach. I can go in and say, give me the top words, and I'm using just 10 here because there are a lot of words, so give me the top 10 words in the Alice in Wonderland text file. For each one of those entries, we extract the count and the word and then just print them out.

Let me see where I left off. Let's load the function into memory, and then let's see what the top 10 words of Alice in Wonderland are: the, as we expected, and, to, a, she, it, of, said, I, Alice. What you would expect: a lot of very common filler words and articles. But once again, we can see not only how to get the word frequencies, but also how to sort those word frequencies into the most commonly used words.
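The sort, reverse, and top-N steps can be sketched as follows. The function name top_words and the sample pairs are assumptions for illustration, not the lecture's actual code or data.

```python
def top_words(count_pairs, n=10):
    """Return the n (count, word) pairs with the highest counts."""
    ascending = sorted(count_pairs)   # smallest counts first
    descending = ascending[::-1]      # step of -1 walks the list backwards
    return descending[:n]             # keep only the first n entries

# Made-up (count, word) pairs.
pairs = [(5, "the"), (3, "alice"), (1, "edition"), (2, "in"), (1, "fulcrum")]
for count, word in top_words(pairs, n=3):
    print(count, word)
```

Calling sorted(count_pairs, reverse=True) is a built-in alternative to the slice, and it gives the same ordering here because every word appears in the list only once.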