When it comes to data sourcing, a really good way of getting data is to use what are called APIs. Now, I like to think of these as the digital version of Prufrock's mermaids, if you're familiar with "The Love Song of J. Alfred Prufrock" by T.S. Eliot. He says, "I have heard the mermaids singing, each to each." That's T.S. Eliot. And I like to adapt that to say, "APIs have heard apps singing, each to each." And that's by me.

Now, more specifically, when we talk about an API, what we're talking about is something called an application programming interface. And this is something that allows programs to talk to each other. Its most important use in terms of data science is that it allows you to get web data. It allows your program to go directly to the web on its own, grab the data, and bring it back in, almost as though it were local data. And that's a really wonderful thing.

Now, the most common version of APIs for data science are called REST APIs. That stands for representational state transfer. That's the software architectural style of the World Wide Web. And it allows you to access data on web pages via HTTP. That's the Hypertext Transfer Protocol that, you know, runs the web as we know it. And when you download the data, you usually get it in JSON format, which stands for JavaScript Object Notation. The nice thing about that is that it's human readable, but it's even better for machines. Then you can take that information and you can send it directly to other programs. And the nice thing about REST APIs is that they're what's called language agnostic, meaning any programming language can call a REST API, get data from the web, and do whatever it needs to with it.

Now, there are a few kinds of APIs that are really common. The first is what are called social APIs. These are ways of interfacing with social networks.
So, for instance, the most common is Facebook. There's also Twitter, Google Talk has been a big one, and Foursquare as well, and then SoundCloud. These are on lists of the most popular ones. And then there are also what are called visual APIs, which are for getting visual data. So, for instance, Google Maps is the most common. But there's also YouTube, for something that accesses YouTube on a particular website, or AccuWeather, which is for getting weather information, Pinterest for photos, and Flickr for photos as well. So these are some really common APIs, and you can program your computer to pull in data from any of these services and sites and integrate it into your own website or, here, into your own data analysis.

Now, there are a few different ways you can do this. You can program it in R, the statistical programming language. You can do it in Python. You can even do it in the very basic Bash command line interface. And there's a ton of other applications. Basically, anything can access an API one way or another.

Now I'd like to show you how this works in R. So I'm going to open up a script in RStudio. And then I'm going to use it to get some very basic information from a web page. Let me go to RStudio and show you how this works.

I've opened up a script in RStudio that allows me to do some data sourcing here. Now, I'm just going to use a package called jsonlite. I'm going to load that one up. And then I'm going to go to a couple of websites. I'm going to be getting historical data from Formula 1 car races. And I'm going to be getting it from ergast.com. Now, if we go to this page right here, I can just go straight to my browser right now. And this is what it looks like. It gives you the API documentation. So what you're doing for an API is you're just entering a web address, and that web address includes the information that you want. I'll go back to R here for a second. And if I want to get information about 1957 races in JSON format, I go to this address.
I can skip over to that for a second. And what you see is that it's kind of a big long mess here, but it is all labeled, and it's clear to the computer what's going on here.

I'll go back to R. And so what I'm going to do is save that URL into an object here in R. And then I'm going to use the command fromJSON to read that URL and save it into R, which it has now done. And I'm going to zoom in on that so you can see what's happened. I've got this sort of mess of text. This is actually a list object in R. And then I'm going to get just the structure of that object. So I'm going to do this one right here. And you can see that it's a list, and it gives you the names of all the variables within each one of the lists.

And what I'm going to do is convert that list to a data frame. To do that, I went through the list and found exactly where the information I wanted was located. You have to use this big long statement here. That'll give me the names of the drivers. Let me zoom in on that again. There they are. And then I'm going to get just the column names for that bit of the data frame. And so what I have here is six different variables. And then what I'm going to do is pick just the first five cases, and I'm going to select some variables and put them in a different order. And when I do that, this is what I get. I'll zoom in on that again. And the first five people listed in this data set that I pulled in from 1957 include Juan Fangio, which makes sense, as he's one of the greatest drivers ever, and other people who competed in that year. And so what I've done is use this API call in R, and it's a very simple thing to do. I was able to pull data off that web page in a structured format and do a very simple analysis with it.

And let's sum up what we've learned from all of this. First off, APIs make it really easy to work with web data. They structure it, they call it for you, and then they feed it straight into the programs for you to analyze.
And they're one of the best ways of getting data and getting started in data science.
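
For anyone following along without the video, here is a minimal sketch of the steps in the demo: read JSON, drill into the nested list, check the column names, then keep the first few cases with selected variables in a new order. It's written in Python, which the talk mentions as an option, rather than the R script shown on screen. The sample text is hand-written for illustration and is not a captured API response. The key names (MRData, DriverTable, Drivers and the six driver fields) and the helper names fetch_json and select_rows are assumptions made for this sketch. fetch_json shows how a live call could look, but it isn't called here because the exact address from the demo isn't given in the transcript.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Send an HTTP GET request to `url` and parse the JSON body.

    Hypothetical helper: not called below, since the exact address
    used in the demo is not reproduced in the transcript.
    """
    with urlopen(url) as response:
        return json.load(response)


# Hand-written sample shaped like a nested driver listing.
# The key names are assumptions for illustration, not a captured response.
SAMPLE = """
{
  "MRData": {
    "DriverTable": {
      "season": "1957",
      "Drivers": [
        {"driverId": "fangio",
         "url": "http://en.wikipedia.org/wiki/Juan_Manuel_Fangio",
         "givenName": "Juan", "familyName": "Fangio",
         "dateOfBirth": "1911-06-24", "nationality": "Argentine"},
        {"driverId": "moss",
         "url": "http://en.wikipedia.org/wiki/Stirling_Moss",
         "givenName": "Stirling", "familyName": "Moss",
         "dateOfBirth": "1929-09-17", "nationality": "British"}
      ]
    }
  }
}
"""


def select_rows(records, columns, n=5):
    """Keep the first `n` records, with only `columns`, in the given order."""
    return [{col: rec[col] for col in columns} for rec in records[:n]]


# Parse the JSON text into nested dicts and lists.
data = json.loads(SAMPLE)

# Drill down to where the driver records live (the "big long statement").
drivers = data["MRData"]["DriverTable"]["Drivers"]

# The variable names available for each driver.
column_names = list(drivers[0].keys())

# First five cases (only two exist in this sample), selected and reordered.
rows = select_rows(
    drivers, ["givenName", "familyName", "dateOfBirth", "nationality"]
)

print(column_names)
for row in rows:
    print(row)
```

With a live API, the parsed result of fetch_json would take the place of json.loads(SAMPLE), and the rest of the steps would stay the same as long as the response has the same shape.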