The next step in our introduction to accessing data is importing data, which will probably be the most common way of getting data into R. The goal here is to make it easy: get the data in, get a large amount of it, get it in quickly, and start processing as soon as you can.

There are a few kinds of data files you might want to import. There are CSV files; that stands for comma-separated values, and it's sort of the plain-text version of a spreadsheet. Any spreadsheet program can export data as a CSV, and nearly any data program at all can read them. There are also straight text files, .txt. Those can be opened in text editors and word processors. Then there are .xlsx files, which are Excel spreadsheets, as well as the older .xls version. And finally, if you're going to get fancy, you can import JSON, that's JavaScript Object Notation. If you're using web data, you might be dealing with that kind of data.

Now, R, together with its packages, has functions for importing data in many formats, including the ones I just mentioned. But if you really want to make your life easy, you can use just one. A package that I load every time I use R is rio, which is short for R import/output. What rio does is combine all of R's import functions into one simple utility with consistent syntax and functionality. It makes life so much easier.

Let's see how this all works in R. Just open up this script, and we'll run through the examples all the way through. But there is one thing you're going to want to do first: go to the course files that we downloaded at the beginning of this course. These are the individual R scripts, but it's this folder right here that's significant. This is a collection of three data sets; I'm going to click on that. They're all called MBB.
And the reason they're called that is because they contain Google Trends information about searches for Mozart, Beethoven, and Bach, three major classical music composers. It's all about the relative popularity of these three search terms over a period of several years. I have it here in CSV, or comma-separated values, format; as a text file, .txt; and even as an Excel spreadsheet.

Now let's go to R and open up each one of these. The first thing we need to do is make sure that you have rio. I've done this before; rio is one of the things I load every time. So I'm going to use pacman and do my standard loading of packages. So rio is available now.

I do want to tell you one significant thing about Excel files, and we're going to go to the official R documentation for this. If you click on this, it'll open up your web browser; this is a shortcut web page to the R documentation. Here's what it says, and I'm going to read it verbatim:

"Reading Excel spreadsheets. The most common R data import/export question seems to be 'how do I read an Excel spreadsheet?' This chapter collects together advice and options given earlier. Note that most of the advice is for pre-Excel 2007 spreadsheets and not the later .xlsx format. The first piece of advice is to avoid doing so if possible! If you have access to Excel, export the data you want from Excel in tab-delimited or comma-separated form, and use read.delim or read.csv to import it into R. (You may need to use read.delim2 or read.csv2 in a locale that uses comma as the decimal point.) Exporting a DIF file and reading it using read.DIF is another possibility."

Okay, so really what they're saying is: don't do it. Well, let's go back to R. I'm just going to say right here, you have been warned. But let's make life easy by using rio.
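For reference, the package-loading step described above might look something like the sketch below. It assumes the pacman package and an internet connection for any first-time install. The full list of packages loaded in the course script isn't shown here, so only rio is included.

```r
# Install pacman itself only if it is not already available.
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

# p_load() installs any listed package that is missing, then attaches it.
# Only rio is listed here; the course script may load more packages.
pacman::p_load(rio)
```

After this runs, import and the other rio functions can be called directly without the rio:: prefix.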
Now, if you've saved these three files to your desktop, it's really easy to import them this way. We'll start with the CSV. The object I'm going to import the data into is called rio_csv. All we need is the command import; we don't have to specify that it's a CSV or say that it has headers or anything. We just use import and then, in parentheses and in quotes, the name and location of the file. On a Mac, the path to your desktop shows up this way. I'm going to run that, and you can see that it just showed up in my environment on the top right. I'll expand that a little bit: I now have a data frame. I'll come back out. Let's take a look at the first few rows of that data frame. I'll zoom in, and you can see we have months listed, and then the relative popularity of searches for Mozart, Beethoven, and Bach during those months.

Now, if I want to read the text file, what's really nice is that I can use the exact same command, import, and just give the location and the name of the file, with .txt on the end. I run that, we look at the head, and you'll see it's exactly the same. No difference, piece of cake. What's nice about rio is that I can even do the .xlsx file. Now, it helps that there's only one tab in that file and that it's set up to look exactly the same as the others. But when I run it, you see that once again it's the same thing. So rio was able to read all of these automatically, which makes life very easy.

Another neat thing is that R has something called a data viewer. We'll get a little bit of information on that through help. You invoke the data viewer with View, with a capital V, and then we say what it is we want to see; we'll do rio_csv. When we run that command, it opens up a new tab here, and it's like a spreadsheet. In fact, it's sortable: we can click on this and go from the lowest to the highest and vice versa.
And you see that Mozart is actually setting the range here. That's one way to do it. You can also come over here and just click on this little icon; it looks like a calendar, but it does the same thing. We can double click on that, and now we get a viewer of that file as well. I'm going to close both of those.

Now I'm just going to show you the built-in R commands for reading files. With rio, this kind of thing is taken care of for us, so we don't have to go through all this, but you may encounter these commands in a lot of existing code, because not everybody uses rio, and I want you to see how they work. If you have a text file saved in tab-delimited format, you need the complete address of the file. You might try something like this: read.table is normally the command, and you need to say that you have a header, that there are variable names across the top. But when you run this, you're going to get an error message, and, you know, it's frustrating. That's because there are missing values in there, in the top left corner. So we just need to be a little more specific about what the separator is. I do the same thing: I say read.table, there's the name of the file and its location, we have a header, and this is where I say the separator is a tab; the backslash t indicates a tab. If I run that one, it reads the file properly.

We can also do CSV with read.csv. The nice thing here is that you don't have to specify the delimiter, because CSV means it's comma separated, so we know what it is. I can read that one in the exact same way. And if I want to, I can come over here, click on the viewer, and see the data that way too.

So it's really easy to import data, especially if you use the package rio, which is able to read the format automatically, get the data in properly, and get you started on your analysis as soon as possible.
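To make the comparison concrete, here is a minimal sketch of both approaches. It does not use the course's MBB files. Instead it writes a tiny stand-in data set to a temporary folder (the file names, column names, and values are all made up for illustration), then reads it back with read.table, read.csv, and, if rio happens to be installed, import.

```r
# Stand-in data: hypothetical column names and made-up values,
# NOT the real Google Trends numbers. One value is missing on purpose.
demo <- data.frame(
  Month     = c("2004-01", "2004-02", "2004-03"),
  Mozart    = c(12, NA, 12),
  Beethoven = c(8, 9, 9),
  Bach      = c(15, 15, 14)
)

# Write the same data as a CSV and as a tab-delimited text file.
# na = "" leaves the missing cell empty, as a spreadsheet export would.
csv_path <- file.path(tempdir(), "mbb_demo.csv")
txt_path <- file.path(tempdir(), "mbb_demo.txt")
write.csv(demo, csv_path, row.names = FALSE, na = "")
write.table(demo, txt_path, sep = "\t", row.names = FALSE,
            quote = FALSE, na = "")

# Without a separator, read.table() splits on any run of whitespace,
# so the row with the empty cell looks one field short and R stops
# with an error. try() keeps the script running so we can inspect it.
attempt <- try(read.table(txt_path, header = TRUE), silent = TRUE)
failed  <- inherits(attempt, "try-error")

# Naming the tab explicitly keeps the empty cell as its own field (NA).
base_txt <- read.table(txt_path, header = TRUE, sep = "\t")

# read.csv() already assumes a comma separator and a header row.
base_csv <- read.csv(csv_path)

# With rio, one function covers both files; the format is taken
# from the file extension. Skipped if rio is not installed.
if (requireNamespace("rio", quietly = TRUE)) {
  rio_csv <- rio::import(csv_path)
  rio_txt <- rio::import(txt_path)
  print(head(rio_csv))
}

# The data viewer only makes sense in an interactive session.
if (interactive()) View(base_csv)
```

The two base R calls need different arguments for the two files, while the rio calls are identical apart from the path, which is the convenience being described.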