Hello guys, welcome back to class. Today we'll be dealing with probability sampling. We have already covered sampling designs, so you can have a look at the previous session on sampling designs, probability and non-probability. That was just the basics, the names of the designs, without much detail. Let's now go into the details of probability sampling, and in the next video I'll cover non-probability sampling.
In probability sampling, as the name suggests, all are equal: every participant has the same chance of being in the study. So probability sampling is always the better one compared to non-probability. Let's see what is inside probability sampling. We already know the techniques: random sampling, systematic, stratified, cluster and multi-stage.
In random sampling, as the name suggests, we just take samples randomly from the population, so each person in the sampling frame has an equal chance. Sampling frame and sample are different. The sampling frame is the population from which the sample is drawn. Take an example: a dental college, and we are doing a study on dental students. All dental students will be our reference population, which could mean all the dental students in the state or in the country. The sampling frame will be the college or colleges where the study is happening. And the sample will be the actual number of students on whom the study is done. So it gets smaller and smaller: the target population is the biggest, the sampling frame is a little smaller, and the sample is the smallest. In random sampling we take people from the sampling frame using one of two methods.
Those methods are the lottery method and the table of random numbers. It will be like this: there are these 12 people, and we take four of them randomly, say 2, 5, 8 and so on. All 12 participants had a chance of being among these four.
First, the lottery method. We know how a lottery works: everyone who takes a ticket has a chance of getting the prize, and that is why they all take one. In the same way we put all the participants' names in a big bin or container and draw as many as the sample we need. All the names are in that box, so all have an equal chance of being in the sample.
The lottery method can be done when the population is small. If we have 100 people and need to take 10, we can use it. Suppose we have 10,000 people and need to take 100. That will be very difficult, because we need to make 10,000 chits or numbers and put them in the box. So at that point we use the table of random numbers technique. This is a very age-old technique. Nowadays it is not used any more because computers have taken over this method. In a table of random numbers, the numbers are arranged in rows and columns, usually as five-digit series. Each digit can be anything from 0 to 9, in any random sequence. Some scientists have prepared such tables. I hope that is clear: there are rows and columns, each series has five digits, and the digits can be in any order between 0 and 9.
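The lottery method described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a hypothetical frame of 100 numbered students; the function name `lottery_sample` and the student labels are made up for the example.

```python
import random

def lottery_sample(names, sample_size, seed=None):
    """Draw `sample_size` names without replacement, like pulling chits from a box."""
    rng = random.Random(seed)      # a seed only makes the draw repeatable
    box = list(names)              # every name goes into the box exactly once
    rng.shuffle(box)               # mix up the chits
    return box[:sample_size]       # draw as many chits as the sample we need

# Hypothetical sampling frame: 100 students, from which we need 10.
participants = [f"student_{i}" for i in range(1, 101)]
chosen = lottery_sample(participants, 10, seed=1)
print(chosen)
```

Every name is in the box once, so each student has the same chance of being drawn, which is the equality principle of probability sampling.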
Here it is 61424 first, then 20419, so it can be any number. Usually it is a five-digit series, put in columns and rows, and it can go up to four pages. What you see is just part of a big data set that is commonly used.
Coming back to the lottery method: if you want to take 5 students out of 50, we make chits for all 50, mix them up and take 5. All 50 had a chance to be in the study. In the end only 5 people get selected, but all 50 had a chance. That is probability: all are equal at the beginning of the technique.
The table of random numbers I have shown is another part of that big series. Many scientists, such as Tippett, Yates and Fisher, made these random number charts, which go up to four or five pages. We can take any part of the table and use it accordingly. I will give an example so we get the idea. We want to take 10 units, so 10 is our sample size, from 5000. With 5000 people the lottery method is not practical. These 10 should be taken between 3001 and 8000; in total we may have some 20,000 to 25,000 people. So we take this part of the random number table and look for the numbers that fall between 3001 and 8000. We take a number: this one is less than 3000, so we skip it. This one is fine. This one is fine. This one is more than 8000, so we skip it. This one is less than 8000, so we take it. We go on selecting accordingly.
In the same way, if we have to select 10 people from 300 to 800, we can use the first three digits of each series: 664, 399, 979 and so on, keeping only those that fall between 300 and 800. If it is 30 to 80, we take the first two digits. Usually it is a five-digit series; here we have taken four digits.
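The way we read along a random number table can be sketched in Python as follows. This is a rough illustration, not a published table: the stand-in table is generated by the computer, the function names are hypothetical, and skipping repeated numbers is an assumption (sampling without replacement) that the lecture does not spell out.

```python
import random

def make_stand_in_table(rows, cols, seed=0):
    """Build a stand-in table of five-digit series (NOT a real published table)."""
    rng = random.Random(seed)
    return [[f"{rng.randrange(100000):05d}" for _ in range(cols)]
            for _ in range(rows)]

def pick_from_table(table, low, high, sample_size, digits):
    """Read the table row by row, left to right, keeping numbers in [low, high]."""
    chosen = []
    for row in table:
        for entry in row:
            number = int(entry[:digits])          # use only the first `digits` digits
            if low <= number <= high and number not in chosen:
                chosen.append(number)             # in range and not a repeat: keep it
                if len(chosen) == sample_size:
                    return chosen                 # stop once the sample is complete
    raise ValueError("table exhausted before the sample was complete")

table = make_stand_in_table(rows=20, cols=10)
units = pick_from_table(table, low=3001, high=8000, sample_size=10, digits=4)
print(units)
```

For the 300 to 800 case the same function would be called with `digits=3`, and for 30 to 80 with `digits=2`.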
How many digits we use depends on the range we are sampling from. If we need, say, 100 people and they have to come from between 3000 and 8000, or between any two numbers, we use the number series with that many digits. Hope that is clear. From 3001 to 8000 we needed 10 people, so we took the numbers that fall between 3001 and 8000, and you can see that all the selected numbers are in that range. Once we reach 10 we can stop. There are more numbers further on that are also between 3000 and 8000, but we started from here and went along like this, and once we reached here we had our 10 units, so we stop there. That is the table of random numbers. It is a little bit confusing, but nowadays it is not used; computers are in action, and the computer automatically selects random numbers.
The next one is systematic random sampling. Here we go in a systematic fashion. Suppose we need 100 people and have to take them from 1200. We find the sampling interval, that is, the total population divided by the sample size, and we get 12. So our sampling interval is 12, and we select every 12th person. The first person will be number 12, then 24, 36, 48, 60, and it goes on until it reaches the end of the list, around 1200. So we are giving a chance to everyone, but we are going in a systematic way. Another example, taking every third person: we start from 2, then 5, then 8, then 11, then 14, 17. Like that we go in a systematic way. But first we need to find the sampling interval: total population divided by sample size. It is very easy, but it still maintains the equality.
The next one is stratified random sampling. Simple random sampling gives every student or participant an equal chance of being in the study, but sometimes it may not turn out as we planned.
Suppose we take a student population with around 50 boys and 50 girls. When we do random sampling with a sample size of 10, sometimes we get all 10 from the boys' side, and we lose the representation of girls. In such a scenario, where we have a heterogeneous group, that is, a mix of several homogeneous groups, we need to go for stratified sampling. Each subpopulation is known as a stratum. Our classroom is heterogeneous; boys and girls are the homogeneous groups. If you are thinking about age, the different age groups are the homogeneous groups. If you are thinking about a college and its classrooms, the class itself becomes a homogeneous group. So what is heterogeneous and what is homogeneous changes with context, based on the objective of the study. Urban and rural residents are another pair of homogeneous groups.
So we go to strata. Say we have 12 people and need a sample of four. We make the first three one stratum, then the next three, then the next three, then the last three, and we take one from each stratum. With simple random sampling we might get all four from the ends, say 1, 10, 12 and 2, and miss the groups in the middle. In a heterogeneous group that shouldn't happen. So we stratify the heterogeneous group into homogeneous strata and sample randomly from each one: from these three I take one, from these three I take one, and so on. So random sampling is there at every stage.
Even in systematic sampling, random sampling was there after we found our sampling interval. Our sampling interval was 12, so we need to pick the first person from 1 to 12. We can take 1, we can take 12, but it should be chosen at random. So suppose we apply the lottery method to 1 to 12 and get 6.
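Systematic sampling with a random start can be sketched in Python like this. It is a minimal sketch, assuming the population size divides evenly by the sample size (as with 1200 and 100); the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return the positions selected by systematic sampling with a random start."""
    interval = population_size // sample_size   # sampling interval k = N / n
    rng = random.Random(seed)
    start = rng.randint(1, interval)            # random start between 1 and k (lottery on 1..k)
    # From the start, take every k-th person until the sample is complete.
    return [start + i * interval for i in range(sample_size)]

picked = systematic_sample(1200, 100, seed=3)
print(picked[:6])   # the first six selected positions
```

The only random step is the choice of the starting position; everything after that follows from the sampling interval.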
With a random start of 6 and a sampling interval of 12, the selected people will be 6, 18, 30, 42, 54, 66 and so on. So random sampling is there in every design; only the way we implement it is a little different. That is about systematic and stratified sampling.
Next is the last one, cluster sampling. When we go for a bigger geographical area, we might need to do clustering. It is clustering, not stratification. Usually we have one-stage, two-stage and multi-stage sampling. They are almost the same, but a little different. In one-stage, we make a list of all the clusters in the population. Suppose we are doing a study in our district. We take the panchayats as clusters and put all of them on our list. Then we do simple random sampling on that list. We get some five panchayats, and we take all the elements from these five panchayats. That is one-stage: first we make clusters, here the panchayats in one district, then we randomly select a few panchayats, and all the members of those panchayats are included in the study. If we have a smaller geographical area, we can go for one-stage.
In two-stage, we list all the clusters in the population and first select clusters, usually by simple random sampling. The units within the clusters selected in the first stage are then sampled in the second stage, again by simple random sampling. Suppose we are doing the study in Kerala state. First we take the districts as clusters, so we have 14 districts. We do simple random sampling and select some five or six districts. That is our first stage, and every district had a chance. If this were one-stage sampling, we would now take all the members from these five or six districts.
That is what we did in one-stage: we took some panchayats from one district and took all the members from those panchayats. But here it is a bigger population, and we cannot take such a big sample. So we select units, that is, participants, from the first-stage clusters in a second stage. Suppose we need 10,000 or one lakh people: we select these people from the first-stage districts. The first-stage districts are not taken entirely; rather, we do random sampling again within each district, which is the first-stage unit. So the second-stage units are randomly sampled from the first-stage units. In one-stage cluster sampling the entire unit is taken after the first random sampling, whereas here, after the first random sampling, there is one more random sampling within the first-stage units, and then we get the sample we need.
If it is a very big population, like a country, it is not possible to do one-stage or two-stage. There we have to do multi-stage random sampling. First we randomly select states at level one. Then we randomly select districts to represent each state. From those districts we sample again and select some panchayats, and then some towns or municipalities. At every stage we make sure we keep a random sampling technique so that it maintains the equality principle. So multi-stage is for a very big population, like crores of people; two-stage and one-stage are for smaller populations, and one-stage is for a very small population such as a district.
That's all about probability sampling. Let's have a recap. The basic principle of probability sampling is equality: all have an equal chance to be in the study. The designs are random sampling, systematic, stratified and cluster; multi-stage comes under cluster.
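The difference between one-stage and two-stage cluster sampling explained above can be sketched in Python. This is only an illustration under assumptions: the 14 districts come from the Kerala example, but the 200 people per district, the sample sizes and the function names are hypothetical.

```python
import random

def one_stage_cluster(clusters, n_clusters, rng):
    """Randomly select clusters, then take EVERY member of the selected clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def two_stage_cluster(clusters, n_clusters, n_per_cluster, rng):
    """Randomly select clusters, then randomly sample members WITHIN each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)        # first stage
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))  # second stage
    return sample

rng = random.Random(7)
# Hypothetical frame: 14 districts with 200 people each.
districts = {
    f"district_{d}": [f"d{d}_person_{p}" for p in range(1, 201)]
    for d in range(1, 15)
}
one_stage = one_stage_cluster(districts, 5, rng)       # 5 districts, everyone in them
two_stage = two_stage_cluster(districts, 5, 20, rng)   # 5 districts, 20 people from each
print(len(one_stage), len(two_stage))
```

A multi-stage design would repeat the second step at further levels (state, district, panchayat, town), with random sampling at each level.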
Random sampling is like putting names on chits, so anyone can become part of the sample. If it is a very small number we can use the lottery method. If it is a large number we have to go for the random number technique, using random number tables such as those of Tippett, Yates and Fisher. We go through such a table in sequence: we start from the left and go on until we reach the right, then come back to the left and go on again until we reach the desired sample. If it is 10, we stop there. Which numbers we take depends on the range the sample has to come from. Here it was 3001 to 8000, so we took 10 numbers in that range. If we need 20, we start from the same place and go on selecting. The table is a series of numbers, sometimes four digits, mostly five, with each digit between 0 and 9, laid out in rows and columns. How many digits we read depends on the sampling frame: if 10 have to be taken from a frame in the hundreds we read the first three digits, if in the thousands the first four, and if in the ten thousands all five. It is not used any more; it has been replaced by computers.
Systematic means there is a sampling interval, and the sampling interval is repeated. Random sampling is done within the first interval: if 12 is our sampling interval, we do random sampling on the numbers 1 to 12 and get, say, 6, 7 or 8. If it is 6, we go in that sequence, adding the interval of 12 each time: 18, 30, 42. So systematic means we proceed in a systematic way.
Stratified means we divide the heterogeneous population into homogeneous groups, that is, we make strata. How the strata are made depends on your objective: it can be gender, age, class or college, depending on the objective of your study.
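Stratified sampling can be sketched in Python as simple random sampling repeated inside each stratum. This is a minimal sketch using the classroom of 50 boys and 50 girls as a hypothetical frame; taking the same number from every stratum is an assumption made for the example, and the function name is made up.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Do simple random sampling separately inside each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)   # random draw within this stratum
        for name, members in strata.items()
    }

# Hypothetical classroom: two homogeneous strata inside a heterogeneous class.
classroom = {
    "boys": [f"boy_{i}" for i in range(1, 51)],
    "girls": [f"girl_{i}" for i in range(1, 51)],
}
sample = stratified_sample(classroom, 5, seed=2)
print(sample)
```

With a sample of 10 drawn this way, 5 come from each stratum, so neither group can lose its representation.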
After forming the strata, we do simple random sampling in each stratum. Cluster sampling I explained in detail. It is based on area, and it can be one-stage, two-stage or multi-stage; at every stage we follow the random sampling technique until we get the sample. That's about the probability sampling technique. It is way better than non-probability sampling because it follows the principle of equality. Thank you. I will come up with non-probability sampling in the next class.