So the first thing that we need to do is import our pandas library. We're going to say import pandas. Now this will import the pandas library, but it's pretty commonplace to give it an alias, and as a standard when using pandas people will say as pd. So the full line is import pandas as pd. That's the alias I've always used because that's how I learned it, and it's the convention you'll see almost everywhere, so that's how we're going to do it in this video. Let's hit Shift+Enter.

Now that that is imported, we can start reading in our files. Right down here I'm going to open up my file explorer, and we have several different types of files in here: CSV files, text files, JSON files, and an Excel workbook, which is a little bit different from a CSV. I'm going to show you how to import each of those within pandas, as well as some of the things you need to be aware of when you're importing.

So the first thing that we need to say is pd. and let's read in a CSV, because that's a pretty common one. We'll say read_csv, so pd.read_csv(), and this is literally all you have to write in order to call that in. Now it's not going to call it in as a string like it would in one of our previous videos, where we were just using regular Python to read the file. When you're using pandas it calls it in as a data frame, and I'll talk about some of the nuances of that.

So let's go down to our file explorer. We have this countries of the world CSV. You just need to right-click on it and choose Copy as path, and that's going to copy the file path for us. You don't have to type it out manually, although you can if you'd like. We're just going to paste it in between these parentheses. Now if we run it right now it will not work; I'll do that for you. It's saying we have this unicode error.
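To make the string-versus-data-frame point concrete, here is a minimal sketch. The file name countries_sample.csv and its three rows are made up for illustration (they only stand in for the countries of the world CSV), and the file is written to a temporary folder so the snippet runs anywhere.

```python
import os
import tempfile

import pandas as pd  # "pd" is the conventional alias

# Hypothetical stand-in for the countries CSV used in the video.
sample_text = "Country,Region\nAlgeria,Africa\nBrazil,South America\nJapan,Asia\n"
csv_path = os.path.join(tempfile.mkdtemp(), "countries_sample.csv")
with open(csv_path, "w", encoding="utf-8") as handle:
    handle.write(sample_text)

# Plain Python hands the file back as one long string.
with open(csv_path, encoding="utf-8") as handle:
    raw_text = handle.read()
print(type(raw_text))

# pandas parses the same file into a DataFrame with rows and columns.
df = pd.read_csv(csv_path)
print(type(df))
print(df)
```

Because the path here is built with os.path.join, it does not run into the backslash problem that a pasted Windows path does.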
Basically what's happening is that Python is reading the backslashes in that path as escape characters instead of as plain text. For example, a \U at the start of a folder name like Users looks like the beginning of a unicode escape, and that kind of thing is what triggers the error. What we need to do is read this in as a raw string. So we're just going to put an r right before the opening quote, and now it's going to read this as a literal string, backslashes and all, which does make a big difference.

When we run this it's going to populate our very first data frame, so let's go ahead and run it. Now we have this CSV in here with our country and our region.

Now let's pull up the file itself really quickly, this countries of the world CSV. Pandas automatically populated those headers for us in the data frame, but the file doesn't have any column for that 0, 1, 2, 3. If we go back, as you can see right here, that's the index, which pandas adds for us. That's really important in a data frame; it's a big part of what makes a data frame a data frame, and we use the index a lot in pandas. We're able to filter on the index, search on the index, and a lot of other things, which I'll show you in future videos. But this is basically how you read in a file.

Now if we go right up here in between these parentheses and hit Shift+Tab, this is going to come up for us. Let's hit this plus button. These are all the arguments, all the things that we can specify when we're reading in a file, and there are a lot of different options. So let's go ahead and take a look really quickly. The first thing is obviously the file path.
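Here is a short sketch of the raw-string fix and the automatic index described above. The Windows-style path is a made-up example and is never opened, and the data comes from an in-memory string with invented rows rather than the real file.

```python
import io

import pandas as pd

# Hypothetical Windows path. Without the r prefix, Python would try to read
# the backslash-U in "Users" as the start of a unicode escape and raise the error.
raw_path = r"C:\Users\example\countries of the world.csv"
escaped_path = "C:\\Users\\example\\countries of the world.csv"  # same text, backslashes doubled
print(raw_path == escaped_path)

# Made-up sample rows standing in for the real CSV.
csv_text = "Country,Region\nAlgeria,Africa\nBrazil,South America\nJapan,Asia\n"
df = pd.read_csv(io.StringIO(csv_text))

# The first row became the column headers; pandas added the 0, 1, 2 index itself.
print(df.columns.tolist())
print(df.index.tolist())
```

Doubling every backslash works too, but the r prefix is less typing when the path is pasted straight from the file explorer.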
We can specify a separator, sep. The signature here doesn't show an explicit default, but when we're reading in a CSV with read_csv it's going to assume a comma, because it's a comma-separated file. You can choose delimiters, headers, names, index columns, and a lot of other things, as you can see right here. Now I will say that I use almost none of these. The few that I'm going to show you in just a second are at the very top, but you can do a ton of different things with the rest.

You can also go down here. This is our docstring, and you can see exactly how these parameters work. It gives you a description of each one and walks you through how to use it. Again, most of these you'll probably never use, but a separator can actually be useful, and a header can be useful too, because it's possible that you want to rename your headers, or that you don't have a header in your CSV and you don't want pandas to auto-populate one. That is something that you can specify.

So for example, take this header parameter. The docstring says the default behavior is to infer the column names, and if no names are passed this behavior is identical to header=0. So it's saying that the first row, the one at position zero, is going to be read in as the header. But we can come right over here and add comma header=None, and as you can see there are no headers now. Instead the columns are numbered as well. So we have an index running down the rows and numbered labels running across the columns, and right now 0 and 1 indicate the first column and the second column.
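A small sketch of the header behavior just described, using made-up rows in an in-memory string instead of the real file. The last call also passes a names list alongside header=None, purely for comparison.

```python
import io

import pandas as pd

csv_text = "Country,Region\nAlgeria,Africa\nBrazil,South America\n"  # made-up rows

# Default: the header is inferred, which behaves like header=0.
inferred = pd.read_csv(io.StringIO(csv_text))
explicit = pd.read_csv(io.StringIO(csv_text), header=0)
print(inferred.equals(explicit))

# header=None: no row is treated as a header, so the columns are numbered 0 and 1
# and the file's own header text becomes the first row of data.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(no_header.columns.tolist())
print(no_header.iloc[0].tolist())

# header=None plus names: the columns get our labels,
# but the file's own header line is still kept as a data row.
named = pd.read_csv(io.StringIO(csv_text), header=None, names=["Country", "Region"])
print(named)
```

Note that with header=None the frame has one more row than the default read, because nothing is consumed as a header.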
If we want to specify those names, we can keep header=None and then say names= and give it a list. The first one was Country, and what's that second one? Oh, Region. So the file's own header is still sitting right here as the first row of data, but the columns are now labeled with our names, Country and Region. We're just pretending that our CSV doesn't have these values in it and we have to name the columns ourselves; that's how you do it. But let's get rid of all that, because we actually do want the file's own headers, and read it in as normal. And there we go.

Now typically when you're reading in a file you want to assign it to a variable. Almost always, in any tutorial or anybody online, or even when you're actually working, people will say df =. df stands for data frame. Again, this is a data frame; in the next video in the series I'm going to walk through what a series is as well as what a data frame is, because that's pretty important to know when you're working with them. So we'll assign it to this variable, and then we'll call it by saying df, and we'll run it. That's typically how you'll do things, because you want to save this data frame so that later on you can do things like df. and call different methods on it. It's not as easy to do that if you're calling read_csv and importing the entire file every time.

So let's copy this, because now we're going to import a different type of file. We've been doing read_csv for CSVs, but we can also import text files with read_csv. Let's look at this one. It's the same countries of the world data, except now it's a text file, because I just converted it for this video. I'll copy that as a path and paste it in, making sure the quotes are in there, so the path now ends in world.txt. It will still run, but as you can
see, this did not import properly. We have this Country\tRegion as a single column, and all of our values have that same \t in them. That's because we need to specify a separator. I'll show you in just a little bit how we can do this a different way, but with read_csv this is how we do it: we'll just say sep='\t'. Now let's try running this, and as you can see it's now broken out into Country and Region.

We could also do it the more proper way, and this is the way you should do it. I'll leave the previous cells there in case you want to see them. You can use read_table instead, so let's get rid of this separator. Now we have no separator, because read_table uses a tab as its default. Let's run this, and it reads it in properly the first time. read_table can be used for a lot of different file types, but typically I've been using it for text files.

We can also read in that CSV with it, so let's change this right here to .csv. Just like in the last one, where we read in the text file using read_csv, with read_table you're going to need to specify the separator. So I'll just copy this and set sep=',', and now it reads it in properly. Again, you can use read_table for a ton of different file types; you just need to specify a few more things if you don't want to use the more specific read_ function in pandas.

Now let's copy this again and go right down here, and let's do JSON files. JSON files usually hold semi-structured data, which is definitely different from very structured data like a CSV, where it has columns and rows. So let's go to our file explorer. We have this JSON sample; we'll copy it as a path, paste it right here, and use read_json. Again, these different functions were built specifically for these file types. That's why each one has a different name. So now we're
reading this in as JSON. Let's run it, and it read it in properly.

Now let's copy this and take a look at Excel files, because Excel files are a little bit different from the other ones that we've looked at. So let's use read_excel. Let's go down to our file explorer and actually open up this workbook. As you can see, we have Sheet1 right here, but we also have this world population sheet, which has a lot more data. Let's say we just wanted to read in Sheet1. We can do that; otherwise, by default, it's going to read in the world population sheet, because that's the first sheet in the Excel file. Let's take a look at that. Let's get out of here, and, oops, I forgot to copy the file path. Let's copy as path and put it right here, and let's just read it in with no other arguments or parameters. When we read it in, it's reading in that very first sheet, so this is the one that has all of the data.

Now let's say we wanted to read in that second sheet. We'll just add comma sheet_name= and then we can specify the sheet. Was it Sheet1 like this? Yes, it was. So we just had to specify the sheet name right here, and then it brought in that sheet instead of the default, which is the very first sheet in the workbook.

That covers a lot of how you read in those files. Again, you can come in here, hit Shift+Tab and this plus sign, and take a look at all the documentation. You can specify a lot of different things that I didn't think were very important for you to know, especially if you're just starting out. The ones that we looked at today are the ones that I use almost all the time, so I wanted to show you those. But if you're interested in any of the others, or you have very unique data and you need them, it's worth really getting in here and figuring things out. There are a few other things that I wanted
to show you in this intro video on how to read in files. One thing that you may have noticed, especially in this file right here, is that we're only looking at the first five rows and then the last five. The rest of the data is hidden behind these three little dots right here. We want to be able to see that data, but right now we can't, and that's because of some display settings within pandas. All we need to do is change them.

This one has 234 rows and four columns, so obviously we can see all the columns. Let's just change the rows. All we'll say is pd.set_option, and since we're changing the rows, not the columns, at least not on this one, we'll pass in 'display.max_rows'. Then we give it the number of rows we want it to be able to show. We'll say 235, to cover all 234 rows; I'm just going to be safe. Let's run this, and now it has changed the setting. So let's read in this file again and you'll see how it's changed. Now we have all the rows, and we have this little bar on the right that allows us to scroll all the way to the bottom and all the way to the top. So now we can actually skim and see our values. I like that better than just having the shortened version.

We can do the exact same thing on columns as well. If we look at this one, our JSON file, it has the same thing right here. We have, what was it, 38 columns, but we can only see maybe 15 or 20 of them; I can't remember exactly. We'll do the exact same thing and say pd.set_option with 'display.max_columns', and we'll set that to 40 for this one. When we run this one again, we can now scroll over and see every single one of our columns. Now that one is, in my opinion, a lot more useful. I like being able to
see every single column, so it's definitely something you should be using, especially when you have these really large files. You want to be able to see a lot of the data and a lot of the columns, so that when you're slicing and dicing and doing all the things you're about to learn in this pandas series, you know what you're looking at.

I also want to show you how to look at your data in these data frames, because that's also pretty important. So let's go right down here. The very last one that we imported was this one right here, the read_excel. Because we've been assigning every file to the same variable, df only holds whichever one was run last, so let's run this one. It won't apply to all these other ones, although we can always go back and change that. Typically you'll give each data frame its own name, something like df2, so let's call this one df2.

So we're going to bring df2 right down here. We want to take a look at some of this data and know a little bit more about it. Something that you can do is df2.info(), with the parentheses, and when we run this it's going to give us a really quick breakdown of our data. We have our columns right here: Rank, CCA3, Country, and Capital. It's saying we have 234 non-null values in each of those columns, and if we scroll up here, there are 234 rows. That tells me there's no missing data in here, at least not completely missing like null values; there is something in each of those rows. It also tells me the data type, so these are reading in as an integer, an object, an object, and an object. And it tells us how much memory it's using, which is also pretty neat, because when you get really, really large datasets, memory usage and knowing how to work around it does become more important than when you're working with
these really small sample sizes that we're looking at.

You can also do df2.shape, and for this one we do not need the parentheses. All this is going to tell us is that we have 234 rows and four columns. We're also able to look at the first few rows in each of these data frames. We can just say df2.head(), and that's going to give us the first five rows, but we can specify how many we want: say head(10) and it'll give us the first 10 rows right here. We can do the exact same thing with tail, so let's go right down here and say df2.tail(10), and it'll give us the last 10 rows within our data frame.

Now let's copy this, and let's say we don't actually want to look at all of these columns. We can specify a column by saying df2, then square brackets, and in quotes we'll say Rank. So df2['Rank'], and now we can take a look at just the Rank data.

Now we can't pull out a row by its index like that, or at least not with this syntax. If we want to use this index that is right here, we can, but there are two special indexers for that, called loc and iloc. I'm going to have an entire video on these because it does get a little bit more complex, but there's df2.loc and there's df2.iloc. loc and iloc stand for location and integer location. They work on the indexes, whether that's the row index or the column labels. loc looks for the actual label, the actual text or string of the index. So if we come up here with df2, we can specify 224 and it'll give us this row's information in a slightly different format. Let's say df2.loc, open a square bracket, and put 224. When we run this it gives us our Rank, CCA3, Country, and Capital with the values over here, kind of like a dictionary almost.

Now let's copy this and say df2.iloc with the same 224. Right now these look exactly the same, but we haven't really talked about changing the index, and you can change the index to a string or a different column or
something like that, and we'll look at that in future videos. iloc looks at the integer location. So let's go right up here: even if this index had been changed to, let's say, Rank or CCA3 or Country or whatever you make the index, iloc will still look at the integer position. That 224 would still be 224 even if the index label were Uzbekistan. So when we look at this right now, it's exactly the same. But if we had changed the index, loc is the one that we could search by label, and we could search for Uzbekistan. Is that how you spell Uzbekistan? Hey, I nailed it.

So that is how you use loc and iloc. Again, I just wanted to show you a little bit about how you can look at your data frame or search within it. In future videos I'm going to dive a lot deeper into a lot of the concepts that we just looked at, because I only touched on them here. I wanted you to have a brief introduction so that in future videos I'm not dropping everything on you all at once.

Hopefully this was a good, quick introduction to those topics. You should now be able to read in a file, see your data frame, and look at it in a few of the ways we covered. I hope that was helpful, and if it was, be sure to check out my other videos on Python and pandas. I'll see you in the next video.
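As a recap of the readers and inspection tools covered in this video, here is a rough sketch that runs without any of the original files. Everything in it is an assumption made for illustration: the tab-separated and JSON text are made up, the small df2 frame only imitates the shape of the world population sheet (its values are invented, not real figures), and set_index is used to change the index even though the video leaves that for later. Reading Excel is left out because it needs an extra Excel engine installed.

```python
import io

import pandas as pd

# Made-up sample data standing in for the files used in the video.
tab_text = "Country\tRegion\nAlgeria\tAfrica\nBrazil\tSouth America\n"
json_text = '[{"Country": "Algeria", "Rank": 2}, {"Country": "Brazil", "Rank": 1}]'

# Tab-separated text: either tell read_csv about the tab, or use read_table,
# whose default separator is already a tab.
from_csv_reader = pd.read_csv(io.StringIO(tab_text), sep="\t")
from_table_reader = pd.read_table(io.StringIO(tab_text))
print(from_csv_reader.equals(from_table_reader))

# JSON records become rows.
from_json = pd.read_json(io.StringIO(json_text))
print(from_json)

# Raise the limits at which pandas starts truncating the display.
pd.set_option("display.max_rows", 235)
pd.set_option("display.max_columns", 40)

# A tiny frame with the same column names as the sheet (invented values).
df2 = pd.DataFrame(
    {
        "Rank": [1, 2, 3],
        "CCA3": ["URY", "UZB", "VUT"],
        "Country": ["Uruguay", "Uzbekistan", "Vanuatu"],
        "Capital": ["Montevideo", "Tashkent", "Port Vila"],
    }
)

df2.info()           # column names, non-null counts, dtypes, memory usage
print(df2.shape)     # (rows, columns); an attribute, so no parentheses
print(df2.head(2))   # first 2 rows (the default is 5)
print(df2.tail(2))   # last 2 rows
print(df2["Rank"])   # a single column

# With the default index, loc (label) and iloc (integer position) agree.
print(df2.loc[1].equals(df2.iloc[1]))

# After making Country the index, loc searches by label; iloc still uses position.
by_country = df2.set_index("Country")
print(by_country.loc["Uzbekistan"])
print(by_country.iloc[1])
```

The last two prints show the same row reached two ways, which is the loc versus iloc distinction in miniature.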