The first step in discussing data science methods is to look at the methods of sourcing, or getting, the data that's used in data science. You can think of this as getting the raw materials that go into your analyses. Now, you've got a few different choices when it comes to this in data science: you can use existing data, you can use something called data APIs, you can scrape web data, or you can make data. We'll talk about each of those very briefly, in a non-technical manner.

But right now, let me say something about existing data. This is data that's already at hand. It might be in-house data: if you work for a company, it might be your company records. Or you might have open data; for instance, many governments and many scientific organizations make their data available to the public. And then there's also third-party data, which is usually data that you buy from a vendor. In every case, the data already exists, and it's very easy to plug it in and go.

You can also use APIs. Now, that stands for application programming interface, and it's something that allows various computer applications to communicate directly with each other. It's like phones for your computer programs. It's the most common way of getting web data, and the beautiful thing about it is that it allows you to import that data directly into whatever program or application you're using to analyze the data.

Next is scraping data. This is where you want to use data that's on the web, but there's no existing API. That usually means data that's in HTML web tables and pages, or maybe PDFs. You can do this either by using specialized applications for scraping data, or you can do it in a programming language like R or Python and write the code to do the data scraping yourself.

Or another option is to make data. This lets you get exactly what you need; you can be very specific, and you can get what you need.
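To make the API idea a bit more concrete, here is a minimal Python sketch. Most web APIs hand back JSON text; the payload below is a made-up example of a hypothetical weather service (in practice you would fetch it over the network with a library such as urllib.request or requests), but the "import directly into your program" step looks the same either way.

```python
import json

# A (hypothetical, hard-coded) JSON response of the kind a web API returns.
# In real use, this string would come from an HTTP request to the API.
response_text = (
    '{"city": "Oslo", "readings": ['
    '{"date": "2024-01-01", "temp_c": -3.5}, '
    '{"date": "2024-01-02", "temp_c": -1.0}]}'
)

# json.loads turns the raw text into Python dictionaries and lists,
# ready for immediate analysis -- no manual copying or reformatting.
data = json.loads(response_text)
temps = [r["temp_c"] for r in data["readings"]]
print(data["city"], temps)   # -> Oslo [-3.5, -1.0]
```

The point is simply that an API delivers structured data straight into your analysis environment, which is why it is the preferred route whenever one exists.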
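And here is what scraping looks like when there is no API: a short sketch using Python's standard-library HTMLParser to pull the cell values out of a small HTML table. The table itself is invented for illustration; real projects usually reach for a dedicated library such as Beautiful Soup, but the basic idea is the same.

```python
from html.parser import HTMLParser

# A tiny, made-up HTML table of the kind you might find on a web page
# that has no API.
page = """
<table>
  <tr><td>2019</td><td>1.4</td></tr>
  <tr><td>2020</td><td>2.1</td></tr>
</table>
"""

class TableScraper(HTMLParser):
    """Collect the text of every <td> cell in the page."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.cells.append(data.strip())

scraper = TableScraper()
scraper.feed(page)
print(scraper.cells)   # -> ['2019', '1.4', '2020', '2.1']
```

Either way, scraping means writing (or configuring) something that walks through the page markup and pulls out just the values you care about.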
You can do something like interviews, or you can do surveys, or you can do experiments; there are a lot of approaches. Most of them require some specialized training in how to gather quality data. And that's actually important to remember, because no matter what method you use for getting or making new data, you need to remember one little aphorism you may have heard from computer science. It goes by the name of GIGO, which stands for "garbage in, garbage out." It means that if you're feeding bad data into your system, you're not going to get anything worthwhile, any real insights, out of it.

Consequently, it's important to pay attention to metrics, or methods for measuring, and to their meaning: exactly what it is that they tell you. There are a few ways you can do this. For instance, you can talk about business metrics; you can talk about KPIs, which means key performance indicators, also used in business settings; or SMART goals, which is a way of describing goals that are actionable and timely and so on. You can also talk, in a measurement sense, about classification accuracy. I'll discuss each of those in a little more detail in a later movie.

But for right now, in sum, we can say this: data sourcing is important because you need to get the raw materials for your analysis. The nice thing is that there are many possible methods, many ways that you can use to get the data for data science. But no matter what you do, it's important to check the quality and the meaning of the data, so you can get the most insight possible out of your project.
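The GIGO idea can be sketched in a few lines of Python: before analyzing anything, screen each record against some basic quality rules so that garbage never makes it in. The field names and valid ranges below are invented for illustration, not a standard recipe.

```python
# Made-up survey records; two of them contain "garbage".
records = [
    {"age": 34, "income": 52000},
    {"age": -5, "income": 48000},    # impossible age -- garbage in
    {"age": 41, "income": None},     # missing value -- also garbage
]

def is_valid(record):
    """Keep only records with a plausible age and a present income.

    The 0-120 age range is an assumed rule for this example.
    """
    return (
        record["income"] is not None
        and isinstance(record["age"], (int, float))
        and 0 <= record["age"] <= 120
    )

clean = [r for r in records if is_valid(r)]
print(len(clean))   # -> 1
```

Even a crude filter like this is better than none: whatever insights come out of the analysis are only as good as the records that go in.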
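As for classification accuracy, the measurement itself is simple: the fraction of predictions that match the true labels. The labels below are made up for illustration.

```python
# Hypothetical true labels and a classifier's predictions for five emails.
actual    = ["spam", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham",  "ham", "ham", "spam"]

# Accuracy = number of correct predictions / total predictions.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)
print(accuracy)   # -> 0.8
```

Here four of the five predictions match, giving 80% accuracy; later movies discuss when accuracy alone is, and is not, a good enough metric.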