Hello students, welcome to module number 16 of your Introduction to Data Science course. In this module we are going to cover the fundamentals of inferential statistics. Inferential statistics is the study of data collected from a sample in order to make generalizations about the overall population. For this we use different techniques: probability theory and probability distributions, then estimation or forecasting techniques, then the central limit theorem, which is a very interesting theorem about the data we collect from samples, and then hypothesis testing.

Hypothesis testing is, I would say, the basis of overall inferential statistics. We collect the data and we have certain assumptions. The assumption we set about the data is called the null hypothesis, normally denoted H0. H1 is your alternative hypothesis, the claim the data may support once you apply your statistical model or statistical distribution to the data.

So in this picture you see that there is a population and there is a sample. It is best to select a random sample. Sampling, or random sampling, is a complete subject within statistics, but here we assume that we are only talking about basic sampling techniques. So we select a random sample from a population, and when we apply a data collection method, such as a survey, we have to plan it so that the sample is a true representative of the population. Then, when you generalize an estimate to the population, it should be close to the real values.

Now, we talked about probability. Probability is about the chance of something happening. In the picture you can see a die. When we roll it, there are six different possibilities.
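As a rough illustration of the six possibilities of a die, here is a minimal Python sketch. It assumes a fair six-sided die, so each face has probability 1/6, and it simulates rolls to compare the observed share of each face with that value. The function name `roll_frequencies`, the seed, and the roll counts are assumptions made for this example, not something from the lecture.

```python
import random
from collections import Counter


def roll_frequencies(n_rolls, seed=0):
    """Simulate n_rolls of a fair six-sided die and return the share of each face."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    # Report every face from 1 to 6, even one that never came up
    return {face: counts[face] / n_rolls for face in range(1, 7)}


if __name__ == "__main__":
    print("theoretical probability of each face:", round(1 / 6, 3))
    for n in (100, 500, 60000):  # illustrative roll counts
        shares = roll_frequencies(n)
        print(n, {face: round(share, 3) for face, share in shares.items()})
```

With only a few rolls the observed shares can sit noticeably away from 1/6; with many rolls they tend to settle close to it.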
A one may come up, or a two, three, four, five or six. Everybody likes a six, but it does not come when you want it. You have to wait. This is something which is not in your control, and it depends on something which I think nobody knows. Anyway, this is just to explain probability theory to you. The chance of anything happening is called probability. For example, will it rain or will it not? Probability and estimation are used a lot in weather forecasting, and there are many complicated things in it. We will discuss those when we apply these ideas to our different statistical models. Here it is important to understand the logic of probability theory, and this is one of the simplest examples.

Now combine probability with a lot of samples. If you roll one die a hundred times or five hundred times, you count the results and record them. Or if you roll two dice, you record that the first shows a three and the second a five, or that the second shows a six, and so on. When the entire data is collected this way and we plot it according to how often each result occurs, your probability distribution is formed. In a world where there is a lot of betting, on games, in casinos and so on, the theory of probability is very applicable.

The fundamental of our inferential statistics is statistical estimation: I have collected data, I have built a model on it, and then I have made an estimate from it. As we talked about cricket in one of the previous modules, when you play a 50-over or a 20-over game, the team management plans it: first I have to reach this in 10 overs, then this in 20, then this in 30.
And if it does not happen that way, then they manage accordingly: in the next overs you have to increase the average or hold the average down, depending on whether you are batting or bowling. So this is the use of estimation.

The central limit theorem is very interesting. It is sometimes loosely taken to mean that everything in this world is normally distributed, which is in fact not true. What the theorem says is that, under mild conditions, as the sample size increases, the distribution of the sample mean becomes close to a normal distribution, whatever the shape of the population distribution. It is also statistically proven that as the sample size increases, the estimate from the sample comes close to the population value. When you study different statistical distributions, you will get an idea of this: as you increase the sample size, the distribution of the sample mean can be approximated by a normal distribution.

The final piece of inferential statistics here is hypothesis testing. As we said earlier, there is the null hypothesis and then the alternative hypothesis. Based on that, you can study three areas of your distribution: the left side, which is the lower tail, the center, or the right side, which is the upper tail. The bell-shaped curve does not mean that something is small or big. Its height represents the number of occurrences, and its spread represents the standard deviation or variance.
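To make the central limit theorem and the tails used in hypothesis testing a little more concrete, here is a minimal Python sketch. It assumes a fair six-sided die as the population (mean 3.5, variance 35/12), simulates many sample means, and then runs a simple z-test of H0: "the mean roll is 3.5", reporting the lower-tail, two-sided and upper-tail p-values. The function names, the seed, the sample sizes and the observed mean of 3.9 are all hypothetical values chosen for illustration, and the z-test is one common choice rather than the only way to test this.

```python
import math
import random
import statistics

DIE_MEAN = 3.5                    # mean of one fair-die roll
DIE_SIGMA = math.sqrt(35 / 12)    # standard deviation of one fair-die roll


def sample_means(sample_size, n_samples, seed=0):
    """Return the means of n_samples samples, each made of sample_size die rolls."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def normal_cdf(z):
    """Standard normal cumulative distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_test(sample_mean, n, mu0, sigma):
    """Return z and the lower-tail, two-sided and upper-tail p-values for H0: mean == mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    lower = normal_cdf(z)          # area to the left of z
    upper = 1.0 - lower            # area to the right of z
    two_sided = 2.0 * min(lower, upper)
    return z, lower, two_sided, upper


if __name__ == "__main__":
    # Central limit theorem: the spread of the sample mean shrinks like sigma / sqrt(n)
    for size in (2, 10, 50):
        means = sample_means(size, n_samples=5000)
        print(
            "n =", size,
            "| mean of sample means:", round(statistics.fmean(means), 3),
            "| observed spread:", round(statistics.stdev(means), 3),
            "| sigma / sqrt(n):", round(DIE_SIGMA / math.sqrt(size), 3),
        )

    # Hypothesis test on hypothetical data: 100 rolls with an average of 3.9
    z, lower, two_sided, upper = z_test(3.9, 100, DIE_MEAN, DIE_SIGMA)
    print("z =", round(z, 3))
    print("lower-tail p:", round(lower, 4))
    print("two-sided p:", round(two_sided, 4))
    print("upper-tail p:", round(upper, 4))
```

A small p-value in the chosen tail counts as evidence against H0. Which tail to look at depends on the alternative hypothesis H1 you set before collecting the data.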