Welcome to the third video in the series, Data Science for Beginners. In this one, you'll get some tips for formulating a question you can answer with data. You might get more out of this video if you first watch the two earlier videos in this series: "The five questions data science can answer" and "Is your data ready for data science?"

We've talked about how data science is the process of using names and numbers to predict an answer to a question. You can also call names categories or labels. Now the trick here is that it can't be just any question. It has to be a sharp question. A vague question doesn't have to be answered with a name or a number, but a sharp question must.

Imagine you found a magic lamp with a genie who will truthfully answer any question you ask. But it's a mischievous genie, and he'll try to make his answer as vague and confusing as he can get away with. You want to pin him down with a question so airtight that he can't help but tell you what you want to know. If you were to ask a vague question like, "What's going to happen with my stock?", the genie might answer, "The price will change." It's a truthful answer, but it's not very helpful. But if you were to ask a sharp question like, "What will my stock's sale price be next week?", the genie can't help but give you a specific answer and predict a sale price.

Now once you formulate your question, check to see whether you have examples of the answer in your data. So if your question is, "What will my stock's sale price be next week?", then we have to make sure that our data includes the stock price history. If our question is, "Which car in my fleet is going to fail first?", then we need to make sure that our data includes information about previous failures.

These examples of answers are called a target. A target is what we are trying to predict about future data points, whether it's a category, a name, or a number. If you don't have any target data, you'll need to get some.
You won't be able to answer your question without it.

Now to take this to the next level, sometimes you can reword your question and get a more useful answer. The question "Is this data point A or B?" predicts the category of something, and to answer it we use a classification algorithm. The question "How much?" or "How many?" predicts an amount, and to answer that we use a regression algorithm.

To see how we can transform one of these into the other, let's look at the question, "Which news story is the most interesting to this reader?" It asks for a prediction of a single choice from many possibilities. In other words, is this A or B or C or D? Is it this story or that one or that one or that one? And it would use a classification algorithm. But this question may be easier to answer if you reword it as, "How interesting is each story on this list to this reader?" Now you can give each article a numerical score, and then it's easy to identify the highest-scoring article. This is a rephrasing of a classification question into a regression question.

How you ask a question is a clue to which algorithm can give you an answer. You'll find that certain families of algorithms, like the ones in our news story example, are closely related. You can reformulate your question to use the algorithm that gives you the most useful answer. But most importantly, ask the sharp question, the question you can answer with data, and be sure you have the right data to answer it.

We talked about some basic principles for asking a question that you can answer with data. Be sure to check out the other videos in Data Science for Beginners, from Microsoft Azure Machine Learning.
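
If you'd like to see the news story rewording in code, here is a minimal sketch in Python. It is only an illustration: the reader interests, the story list, and the functions `score_stories` and `most_interesting` are all made up for this example, and the simple scoring rule stands in for whatever trained regression model you would really use. The point is just the shape of the idea: give every story a number, then pick the highest one.

```python
# Hypothetical data: how much this reader cares about each topic (0 to 1).
reader_interests = {"sports": 0.2, "science": 0.9, "politics": 0.4}

# Hypothetical stories, each tagged with the topics it covers.
stories = {
    "Local team wins final": ["sports"],
    "New telescope images released": ["science"],
    "Election debate recap": ["politics"],
}


def score_stories(interests, candidates):
    """Answer "How interesting is each story?" with one number per story.

    This hand-written rule is a stand-in for a regression model's prediction.
    """
    scores = {}
    for title, topics in candidates.items():
        scores[title] = sum(interests.get(topic, 0.0) for topic in topics)
    return scores


def most_interesting(scores):
    """Answer "Which story is the most interesting?" by taking the top score."""
    return max(scores, key=scores.get)


scores = score_stories(reader_interests, stories)
best = most_interesting(scores)

for title, score in scores.items():
    print(f"{score:.1f}  {title}")
print("Most interesting:", best)
```

Notice that once every story has a score, you get more than the single top pick: you could also rank the whole list, which is part of why the reworded question can be more useful.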