As-Salaam-Alaikum. Welcome to lecture number 38 of the course on statistics and probability. Students, you will recall that in the last lecture I discussed with you the very important and interesting concept of hypothesis testing. I discussed in detail some of the fundamental concepts involved, and we applied this method to testing hypotheses regarding mu. In today's lecture, we will be discussing hypothesis testing regarding mu 1 minus mu 2, and later we will be talking about the population proportion p and also p 1 minus p 2. But before I talk about any of these, students, I would like to revise with you the six basic steps that are involved in any hypothesis testing procedure. As you now see on the screen, the hypothesis testing procedure begins with the formulation of the null and the alternative hypotheses, which are denoted by H naught and H 1 respectively. It may be a two-tailed test or it may be a one-tailed test. If it is a two-tailed test, then in the case of mu, our null hypothesis would be mu is equal to mu naught and the alternative would be that mu is unequal to mu naught. But if it is a right-tailed test, then the null hypothesis says mu is less than or equal to mu naught, whereas the alternative hypothesis says mu is greater than mu naught. Here, mu naught simply means some numerical value that you will be assigning to mu according to your hypothesis. We put the subscript naught to denote this number: mu naught is the value that we are giving to mu under the null hypothesis. Students, just as the test is right-tailed when the alternative says that mu is greater than mu naught, it may also be a left-tailed test, and in that case your null hypothesis says mu is greater than or equal to mu naught, but your alternative will say that mu is less than mu naught.
This discussion is in accordance with the previous lecture, where I told you that if the alternative hypothesis contains the greater-than sign, then your critical region lies entirely in the right tail. This is called a right-tailed test. If your alternative hypothesis contains the less-than sign, then your critical region lies entirely in the left tail. The second step in the hypothesis testing procedure is the decision regarding the level of significance. As you now see on the screen, the level of significance represents the probability of rejecting the null hypothesis when it is true. And as I mentioned in the last lecture, students, we always begin by assuming that H naught is true and then we proceed with all the steps. If in the end we see that our computed value is very different from the hypothesized value, then we say that we are going to reject this hypothesis. But as far as the beginning of the procedure is concerned, we always begin by assuming that H naught is true. And since the level of significance represents the probability that we are going to reject a hypothesis which is actually true, of course we want this probability to be small, and it is our own decision how small we are going to keep it. Usually it is taken as 5 percent, but sometimes we would like to have it as small as 1 percent. The third step is the determination of the test statistic, that is, the formula which will enable us to test our hypothesis. The fourth step is to compute the value of our test statistic based on the sample data that we have collected. And the fifth and very important step is the determination of the critical region. As you now see on the slide, we need to determine the critical region in such a way that the probability of rejecting H naught, if it is actually true, is equal to the level of significance alpha that we have already set.
And the other very important point is that the location of the critical region depends upon whether we are conducting a one-tailed test or a two-tailed test. If we are conducting a two-tailed test at the 5 percent level, then we take half of that 5 percent area in the left tail and half in the right tail. And if we are conducting a one-tailed test, say a right-tailed test, then we will not divide the level of significance into two parts; rather, the entire area corresponding to the level of significance will be taken in the right tail. In the first case, if we are working with the z statistic, our critical values will be minus 1.96 and 1.96. But in the second case, when it is a right-tailed test, we will have only one critical value, and that is z is equal to 1.645. After all of this, the last step is very simple, and that is to draw a conclusion. If my computed value lies in the acceptance region, then of course I will accept the null hypothesis, and if my computed value lies in the rejection region, then I will reject the null hypothesis.
Alright, let us now proceed to hypothesis testing regarding mu 1 minus mu 2. You will remember that in the lecture on sampling distributions, after discussing the sampling distribution of x bar, we proceeded to the sampling distribution of x 1 bar minus x 2 bar. You will recall that the discussion then was that we do not have just one population under consideration; rather, we now have two populations, and we are interested in their mean values.
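The critical values quoted above for the 5 percent level, and the decision rule of the last step, can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `critical_region` and its `tail` argument are assumptions made here, not something from the lecture, and the standard library's normal distribution stands in for the printed area table.

```python
from statistics import NormalDist


def critical_region(alpha, tail):
    """Return (critical value, rule), where rule(z) is True when z lies in the critical region.

    `tail` is "two", "right" or "left" (hypothetical labels chosen for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        cut = std_normal.inv_cdf(1 - alpha / 2)  # alpha / 2 in each tail
        return cut, lambda z: abs(z) >= cut
    if tail == "right":
        cut = std_normal.inv_cdf(1 - alpha)      # whole of alpha in the right tail
        return cut, lambda z: z >= cut
    if tail == "left":
        cut = std_normal.inv_cdf(alpha)          # whole of alpha in the left tail
        return cut, lambda z: z <= cut
    raise ValueError("tail must be 'two', 'right' or 'left'")


two_cut, two_rule = critical_region(0.05, "two")
right_cut, right_rule = critical_region(0.05, "right")
print(round(two_cut, 2))     # 1.96
print(round(right_cut, 3))   # 1.645
```

Calling `two_rule(z)` or `right_rule(z)` with a computed z then gives the reject or do-not-reject decision for that kind of test.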
So we drew all possible samples of size n 1 from the first population and all possible samples of size n 2 from the other population; then we found x 1 bar corresponding to every sample that we had drawn from the first population and x 2 bar corresponding to every sample that we had drawn from the second population. In this way we obtained a great many x 1 bars and a great many x 2 bars. But the matter goes further than this: we found all possible differences x 1 bar minus x 2 bar, and the probability distribution that we formed of all these differences is, of course, called the sampling distribution of x 1 bar minus x 2 bar. It has two very important properties. The first is that the mean of the sampling distribution of x 1 bar minus x 2 bar is equal to mu 1 minus mu 2, the difference between the population means. The second is that the standard deviation, or the standard error, of x 1 bar minus x 2 bar came out to be the square root of sigma 1 square over n 1 plus sigma 2 square over n 2, because we are assuming that the samples were drawn independently. Students, you may ask why we are going over that old lecture again at such length. The reason is that whether you do interval estimation or hypothesis testing, as I have said before, both these concepts depend on the concept of the sampling distribution. You will remember that for the confidence interval I went through the entire derivation with you, and you saw that the derivation was based on this sampling distribution. At this point I only want to say that if you want to carry out testing about mu 1 minus mu 2, the relevant sampling distribution is the sampling distribution of x 1 bar minus x 2 bar, whose properties I have just shared with you, or rather revised, because we have already discussed them. Now, we do not want to carry out derivations of this type again and again.
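The two properties recalled above can be checked with a small simulation. The sketch below is an assumption-laden illustration, not the lecture's derivation: instead of enumerating all possible samples it draws a few thousand random samples from two normal populations, and the population means, standard deviations and sample sizes are made-up values.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical populations (values chosen only for illustration)
MU1, SIGMA1, N1 = 50.0, 3.0, 10
MU2, SIGMA2, N2 = 46.0, 4.0, 15
REPS = 5000  # number of independent sample pairs drawn


def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))


# One difference x1_bar - x2_bar per pair of independent samples
diffs = [sample_mean(MU1, SIGMA1, N1) - sample_mean(MU2, SIGMA2, N2)
         for _ in range(REPS)]

sim_mean = statistics.fmean(diffs)   # should be close to mu1 - mu2
sim_se = statistics.stdev(diffs)     # should be close to the standard error
theory_mean = MU1 - MU2
theory_se = sqrt(SIGMA1**2 / N1 + SIGMA2**2 / N2)

print(f"mean of differences: simulated {sim_mean:.3f}, theory {theory_mean:.3f}")
print(f"standard error:      simulated {sim_se:.3f}, theory {theory_se:.3f}")
```

With more replications the simulated figures settle closer to the theoretical ones, which is the sense in which the simulation approximates the full sampling distribution.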
So I will only be conveying to you the result, that is, the test statistic, but I want you to keep in mind all the time that the basic logic of the formula that is about to come before us is exactly the same as what we discussed last time in the case of mu.
So let us try to understand hypothesis testing regarding mu 1 minus mu 2 with the help of an example. A survey conducted by a market research organization five years ago showed that the estimated hourly wage for temporary computer analysts was essentially the same as the hourly wage for registered nurses. This year a random sample of 32 temporary computer analysts from across the country is taken. The analysts are contacted by telephone and asked what rates they are currently able to obtain in the marketplace. A similar random sample of 34 registered nurses is taken, and the resulting wage figures are listed in the table that you now see on the screen. For the computer analysts the wages are 24.10 dollars, 23.75 and so on, and for the registered nurses we have the figures 20.75 dollars, 23.80 and so on. Conduct a hypothesis test at the 2 percent level of significance to determine whether the hourly wages of the computer analysts are still the same as those of registered nurses. Students, in order to solve this problem, first of all note that according to the record or information that we have from five years ago, their wages at that time were exactly equal, and now, five years later, this survey is being carried out in order to see whether any difference has arisen. So, according to the six steps that I have discussed with you so many times, what would be the first step in this particular problem? You now see it on the screen.
If we let subscript 1 denote the information pertaining to computer analysts and subscript 2 the information pertaining to registered nurses, then the null hypothesis will be mu 1 is equal to mu 2 against the alternative that mu 1 is not equal to mu 2. Now the first statement, mu 1 equal to mu 2, can also be written as mu 1 minus mu 2 equal to 0, and the alternative hypothesis can be written as mu 1 minus mu 2 is unequal to 0. Students, I hope you realize that the mu being talked about in this particular problem is the mean wage of the analysts or of the nurses. So our null hypothesis is that the mean wages are equal, and likewise the alternative is that they are not equal. This means that this is a two-tailed test: in the alternative hypothesis we are not saying that mu 1 is less than mu 2 or that mu 1 is greater than mu 2. We are simply making the statement that the null is that the wages are equal on the average, and the alternative is that they are not equal. The next step is the level of significance, and as you see on the screen, in this particular problem we are taking alpha equal to 0.02; that is, we are allowing only a 2 percent risk of committing a type 1 error. The third step is the test statistic, which in this case is the standardized version of the variable x 1 bar minus x 2 bar. According to the discussion that I had with you a short while ago, the standardized version of x 1 bar minus x 2 bar is given by z is equal to the quantity x 1 bar minus x 2 bar, minus the quantity mu 1 minus mu 2, all over the square root of sigma 1 square over n 1 plus sigma 2 square over n 2. That is, from the variable x 1 bar minus x 2 bar we subtract its mean, mu 1 minus mu 2, and divide by its standard error, because that is what standardization means. So, this is our test statistic. Now, the question arises, students: if sigma 1 square and sigma 2 square are not known, then what do we do?
As I indicated last time, we will be replacing them by their estimators s 1 square and s 2 square. You will remember that I had told you that small s square is an unbiased estimator, whereas capital S square is a biased estimator. But today I would like to share another important point with you: if your sample size is large, then small s square is approximately equal to capital S square. The reason is, as you now see on the screen, that small s square is equal to summation x minus x bar whole square over n minus 1, implying that summation x minus x bar whole square is equal to n minus 1 multiplied by small s square. On the other hand, capital S square is equal to summation x minus x bar whole square over n, implying that summation x minus x bar whole square can also be written as n times capital S square. Now, if you equate these two expressions, we see that n minus 1 times small s square is equal to n times capital S square, which means that capital S square is equal to n minus 1 over n, times small s square. In other words, capital S square is equal to 1 minus 1 over n, times small s square. Students, from this relation between capital S square and small s square you can see that if small n, our sample size, is large, then 1 over n will be a very small quantity. And as you again see on the slide, we will have something like capital S square is equal to 1 minus 1 over 500, times small s square; that is what happens if our n is 500. But 1 over 500 is such a small quantity that we can say that capital S square is approximately equal to 1 times small s square. And this is why, mathematically speaking, small s square and capital S square can be used interchangeably whenever our sample size is large. You will also remember that I told you that, as a rule of thumb, if our sample size is greater than or equal to 30, we consider it large enough for us to apply the Z statistic.
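The relation between the two variance formulas can be verified numerically. In the sketch below the data values are made up purely for illustration; `statistics.variance` uses the divisor n minus 1 (small s square) and `statistics.pvariance` uses the divisor n (capital S square).

```python
import math
import statistics

data = [4, 7, 9, 10, 15]  # hypothetical observations, for illustration only
n = len(data)

small_s2 = statistics.variance(data)     # sum of squared deviations over n - 1
capital_s2 = statistics.pvariance(data)  # sum of squared deviations over n

# Capital S square equals (n - 1) / n times small s square
print(math.isclose(capital_s2, (n - 1) / n * small_s2))  # True


def shrink_factor(sample_size):
    """The multiplier 1 - 1/n that links small s square to capital S square."""
    return 1 - 1 / sample_size


for size in (5, 30, 500):
    print(size, round(shrink_factor(size), 3))  # 0.8, 0.967, 0.998
```

The multiplier approaches 1 as the sample size grows, which is the point made above for n equal to 500.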
And we do not have to think about the t statistic, the one which we use in the case of a small sample size and which I will be discussing with you in a forthcoming lecture. Going back to our example, students, we had reached the point that our test statistic is z is equal to the quantity x 1 bar minus x 2 bar, minus the quantity mu 1 minus mu 2, all over the square root of s 1 square over n 1 plus s 2 square over n 2. Notice that I have said s here, and the reason is that in this problem sigma 1 square and sigma 2 square are not available. The fourth step is to compute the value of z, and as you now see on the slide, for the computer analysts n 1 is equal to 32, x 1 bar comes out to be 23.14 dollars and s 1 square comes out to be 1.854. Similarly, for the registered nurses n 2 is equal to 34, x 2 bar comes out to be 21.99 dollars and s 2 square is equal to 1.845. Substituting all these values in our formula, we obtain z is equal to 23.14 minus 21.99 minus 0, divided by the square root of 1.854 over 32 plus 1.845 over 34. And upon solving this expression, our z comes out to be 3.43. You will have noted that in the calculation we just carried out, the numerator contains the value of x 1 bar, the value of x 2 bar, and after that minus 0. Why? Because we always begin by assuming that H naught is true, and according to H naught, mu 1 is equal to mu 2, that is, mu 1 minus mu 2 is equal to 0. So it is obvious that in place of mu 1 minus mu 2 we will write 0. Now that we have computed the z value, the next step is the critical region. As you now see on the slide, we have already stated that our level of significance in this particular problem is 2 percent. Therefore, students, because this is a two-tailed test, half of the 2 percent, that is 1 percent, will be taken in the right tail and 1 percent in the left tail.
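The arithmetic of the fourth step, and the z value that cuts off 1 percent in the upper tail, can be reproduced as follows. This is a minimal sketch: the function name `two_sample_z` and its parameter names are assumptions made here, and the figures are the summary statistics quoted above.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, var1, n1, xbar2, var2, n2, delta0=0.0):
    """Large-sample z for mu1 - mu2; delta0 is the difference assumed under H naught."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - delta0) / standard_error


# Subscript 1: computer analysts, subscript 2: registered nurses
z = two_sample_z(23.14, 1.854, 32, 21.99, 1.845, 34)
print(round(z, 2))          # 3.43

# Two-tailed test at alpha = 0.02: 1 percent of the area in each tail
alpha = 0.02
upper_cut = NormalDist().inv_cdf(1 - alpha / 2)
print(round(upper_cut, 2))  # 2.33
```

The lower cut-off is simply the negative of `upper_cut`, by the symmetry of the standard normal distribution.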
And then, if we look at the area table of the standard normal distribution, we find that the z values corresponding to these tail areas are plus and minus 2.33. Hence, the critical region is given by the absolute value of z greater than or equal to 2.33. Students, the absolute value should be at least 2.33, which means that our actual value may either be greater than 2.33 or less than minus 2.33. If my value is, say, minus 3.14, then its absolute value is 3.14, which is greater than 2.33, and therefore my value is falling in the critical region. After all, minus 3 point something does lie in the critical region, in the left tail. This is why we state the critical region in terms of the absolute value. The last step is the conclusion. In this problem, our z value has come out to be 3.43. This means that it is lying in the right tail, to the right of 2.33, which is one of the two critical values that we have, and therefore our conclusion is that we reject H naught. Students, this means that, with only a 2 percent risk of committing a type 1 error, we conclude that the mean wage of the computer analysts is not the same as that of the registered nurses. We have rejected the null hypothesis: there is a difference between the wages of the two categories of professionals. Now, the next question is, which of these two categories of workers is earning more? Students, here you just need to use your common sense. H naught has been rejected. This means that the difference between the two mean wages is significant; the difference between the x bars that came from the sample data is significant. But in which direction is the difference? It is a matter of common sense to see which sample mean is larger, and obviously it is larger for the computer analysts. This is why our z value has fallen in the right tail. If the situation had been the reverse, it would have fallen in the left tail.
If our value had fallen in the left tail, in the critical region, then what would we have said? That they are not equal, and then the conclusion would have been that mu 1 is less than mu 2, i.e., that the wage of the computer analysts is less than that of the nurses. But that is a later step. The first step was simply that either we are going to accept that they are equal or we are going to reject that. And students, from the fact that I have said this afterwards, please do not infer that at the outset we had some idea about whose wage would be higher and whose lower. Here we are assuming that at the outset we had no idea; we were just trying to explore whether they are equal or different. It is only now, afterwards, that we can say this. If we had had the idea right at the start that the computer analysts' wage has gone up, then we would not have formulated our hypotheses in a two-tailed manner in the first place. We would have formulated them in a one-tailed manner: we would have kept the null hypothesis as mu 1 is less than or equal to mu 2, and the alternative would have been that mu 1 is greater than mu 2. Note what mu 1 is: the wage of the computer analysts on the average. If it had been in our minds from the very beginning that their wage is higher, then the hypothesis that I have just stated as the alternative would have been our hypothesis: mu 1 is greater than mu 2. That is what we would really have wanted to test, and its complement would have been placed in the null: that mu 1 is not greater than mu 2, and 'not greater than' can also be stated as mu 1 is less than or equal to mu 2.
Let us apply this concept to another example; with it, inshallah, all these ideas will be consolidated. As you now see on the screen, suppose that the workers of factory B believe that the average income of the workers of factory A exceeds their average income.
A random sample of workers is drawn from each of the two factories, and the two samples yield the following information. For factory A, the sample size is 160, the mean wage is 12.80 and the variance is 64, and for factory B, the sample size is 220, the mean is 11.25 and the variance is 47. Now, in order to test this hypothesis, what is the first step? Of course, it is the formulation of the hypotheses: H naught, that mu 1 is less than or equal to mu 2, and H 1, that mu 1 is greater than mu 2. You see, this is exactly the same kind of thing that I said to you just now for the previous example. mu 1, of course, stands for the mean wage of the workers of factory A and mu 2 for that of the workers of factory B. The workers of factory B have the complaint that the wage of the workers of factory A is higher than theirs, even though both factories are of the same kind and the workers are doing the same type of work. Now, students, in the alternative we are writing mu 1 is greater than mu 2. You can also read this as mu 2 is less than mu 1. This is exactly what the workers of factory B are saying: that our wage is less than that of the workers of factory A on the average. So, mu 2 is less than mu 1 is what the workers of factory B are saying. We have placed this in the alternative because, due to the mathematical rationale of this whole procedure, it is essential that we keep in the null that hypothesis which contains the equal sign. What the workers of factory B are saying does not involve the equal sign. The reason is that they are saying that our wage is less than theirs; they are not saying that our wage is less than theirs or at most equal to theirs. They are saying nothing of that kind. Therefore, the hypothesis which has the less-than or greater-than sign without the equal sign will always have to be in the alternative, and the other one, which carries the equal sign, has to be in the null.
The next step is the level of significance, and in this problem let us suppose that we want alpha to be equal to 5 percent. Next is the test statistic, similar to the last example: z is equal to the quantity x 1 bar minus x 2 bar, minus the quantity mu 1 minus mu 2, all over the square root of s 1 square over n 1 plus s 2 square over n 2, and substituting all the values in the formula we obtain z is equal to 1.99. The fifth step is the critical region, and students, this time it is a right-tailed test. Why did I say that this time it is a right-tailed test? Because our null hypothesis was mu 1 is less than or equal to mu 2, which can also be stated as mu 1 minus mu 2 is less than or equal to 0, and the alternative hypothesis, students, was mu 1 is greater than mu 2, which can also be stated as mu 1 minus mu 2 is greater than 0. Now you see that the greater-than sign has appeared in the alternative hypothesis; therefore the right tail of our sampling distribution is going to be involved, and the critical region is going to be in the right tail. Now, since the level of significance is 5 percent, the entire 5 percent area has to be taken in the right tail of my sampling distribution, and by now you must have memorized that z is equal to 1.645 if we want to have 5 percent area to the right.
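The substitution for the factory example, together with the right-tail critical value, can be sketched as below. The function name `two_sample_z` is an assumption made for this illustration. Note that, computed without intermediate rounding, z comes out at about 1.98; the 1.99 quoted in the lecture is consistent with rounding the standard error to 0.78 before dividing, and the difference does not affect the position of z relative to 1.645.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, var1, n1, xbar2, var2, n2, delta0=0.0):
    """Large-sample z for mu1 - mu2; delta0 is the difference assumed under H naught."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - delta0) / standard_error


# Subscript 1: factory A, subscript 2: factory B
z = two_sample_z(12.80, 64, 160, 11.25, 47, 220)
print(round(z, 2))          # 1.98 (the lecture quotes 1.99)

# Right-tailed test: the whole of alpha = 0.05 lies in the right tail
right_cut = NormalDist().inv_cdf(1 - 0.05)
print(round(right_cut, 3))  # 1.645
print(z >= right_cut)       # True: z lies to the right of the critical value
```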
So that is the critical region. In this problem we will not say that the modulus of z is greater than 1.645; please do not do that. You write the modulus only when you are talking about a two-tailed test: half of the critical region is on the left side and half on the right side, and combining the two we say that the modulus of z should be greater than or equal to something. When it is a one-tailed test, you will simply say that z is greater than or equal to 1.645; this constitutes my critical region. That is what we say if we are talking about a right-tailed test, and if this had been a left-tailed test, students, then I would simply have said that the critical region is given by z is less than or equal to minus 1.645; that is, if my z falls to the left of minus 1.645, if it is even less than that, then of course I will reject H naught. I am repeating these things, and repeating them again and again, the purpose being that they should be instilled in your mind. Once you have understood this basic point, then whether it is the chi-square test or the F test or any other test, you will feel at home and comfortable with this whole process. Alright, the last step in this particular problem is the conclusion. What was our z? 1.99. So is it falling in the critical region? Yes, it is greater than 1.645, and hence we reject H naught. What was H naught, students? That mu 1 is less than or equal to mu 2. We have rejected that, so what is the alternative that we are accepting? That mu 1 is greater than mu 2, which we can also state as mu 2 is less than mu 1. In other words, this whole test has supported what the workers of factory B were saying; they too were saying that our wages are lower. The mathematical and scientific procedure that we adopted has consolidated their belief that mu 2 is less than mu 1. This means that they can now demand with even greater force that their wages be raised, because they are doing similar work to the workers in factory A.
You see how interesting the area of inferential statistics is. This is the most interesting and fascinating part of this subject; in fact I would say that everything you did earlier was a preparation to reach what we are doing now: to be able to draw conclusions, in a scientific way and based on real data, about real-life phenomena that may affect policy decisions. Students, note one more very important point. In all of today's discussion regarding the testing of mu 1 minus mu 2, why did I never say that we want to do testing regarding mu 2 minus mu 1? I mean, you could also have reversed it: you could call 1 as 2 and 2 as 1, and that makes no difference, because only the names are changing. As the poet said: reason has been named madness and madness reason; your wonder-working beauty may do whatever it pleases. You can do that, and then you may worry that everything will get muddled. Actually there is no need to panic. You will do everything in the same way; only the roles of 1 and 2 will interchange. But just keep one thing in mind, students: if you say mu 2 minus mu 1, then you are talking about the sampling distribution of x 2 bar minus x 1 bar, and everything that follows has to be done accordingly. In fact I would like to challenge you and encourage you to attempt the same questions twice: first taking x 1 bar minus x 2 bar, and afterwards using x 2 bar minus x 1 bar, that is, with the notation reversed. And see whether it makes some difference to your result. In what way? Not in the numerical value, but in the sign. Try it and see. In this problem you had 1.99 as your answer. Perhaps, if you reverse it, your answer will be minus 1.99?
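The effect of interchanging the roles of 1 and 2 on the test statistic can be seen directly. The sketch below uses the factory figures from this example; the function name `two_sample_z` is an assumption made for illustration, and the unrounded z is about 1.98 rather than the 1.99 quoted above.

```python
from math import sqrt, isclose


def two_sample_z(xbar1, var1, n1, xbar2, var2, n2):
    """Large-sample z for the difference of two means under H naught: difference = 0."""
    return (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)


# Factory A first, then factory B (x1 bar minus x2 bar)
z_a_minus_b = two_sample_z(12.80, 64, 160, 11.25, 47, 220)

# Same data with the labels interchanged (x2 bar minus x1 bar)
z_b_minus_a = two_sample_z(11.25, 47, 220, 12.80, 64, 160)

print(round(z_a_minus_b, 2), round(z_b_minus_a, 2))
print(isclose(z_b_minus_a, -z_a_minus_b))  # True: same magnitude, opposite sign
```

The standard error is unchanged by the swap, so only the sign of the numerator, and hence of z, is reversed.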
But students, even if that happens, keep in mind that if in the previous situation your 1.99 was falling in the right tail, to the right of the critical value 1.645, then in this new situation your value minus 1.99 will fall in the left tail, to the left of the critical value minus 1.645. So then what is the outcome, students? Has your conclusion changed, or is it absolutely the same? I throw this to you as a challenge. Work on it yourself and draw your own conclusion.
What I would like to do next is hypothesis testing regarding p, the proportion of successes in a binomial population. Let us do this with the help of an example. As you now see on the slide, a sociologist has a hunch that not more than 50 percent of the children who appear in a particular juvenile court three times or more are orphans. To test this hypothesis, a sample of 634 such children is taken, and it is found that 341 of them are orphans (one or both parents dead). Test the hypothesis using the 1 percent level of significance. Students, we are now dealing with a binomial population, because we are saying that each of these children or teenagers either is an orphan or is not an orphan. If he or she is an orphan, that is a success, and if not, that is a failure. A population of this kind is what we call a binomial population. The testing that we now want to do is, of course, regarding p, the proportion of orphans in this category of teenagers or children. So what will be the procedure? Exactly the same as before: the six steps of any hypothesis testing procedure will prevail. As you now see on the screen, the null hypothesis says that p is less than or equal to 0.50, and the alternative is that p is greater than 0.50. You remember that the sociologist has the hunch that not more than 50 percent of them are orphans. Now, students, these words 'not more than' include both less than and equal to.
Therefore, the sociologist's hypothesis in this particular problem falls in the null, not in the alternative. The second step is the level of significance, and in this particular problem we want it to be as small as 1 percent. The third step is the test statistic, the formula that will enable us to test this hypothesis. As you now see on the slide, in this situation the test statistic is z is equal to x plus or minus half, minus n p naught, divided by the square root of n into p naught into 1 minus p naught, and in this formula the plus or minus half stands for the continuity correction. Students, let us now consider this formula step by step. First of all, recall that in a binomial situation x represents the number of successes in n trials. In other words, x represents the number of successes in my sample of size n. Always remember that ordinarily, when we are discussing some other type of distribution, for example the normal distribution or the uniform distribution, with a continuous variable, x represents the particular variable that we are interested in, for example height, weight, blood pressure, income, and so on. But in a binomial situation, x as written in the formula that you just saw always represents the number of successes in my sample of size n. x is the binomial random variable and it goes from 0 to n. It is possible that there is not a single success in your sample, and it is possible that your sample consists entirely of successes. The probability of that may be very small, but it is not impossible. So x is the random variable going from 0 to n, representing the number of successes in my sample of size n. When you studied the binomial distribution, students, you found that the mean of this binomial random variable is equal to n p and the standard deviation is equal to the square root of n p q, where q, of course, is equal to 1 minus p.
So, on this basis, suppose we want to standardize this binomial random variable x, to convert it into z, on the grounds that if the sample size is large then the binomial distribution tends to the normal distribution. We can then use the normal approximation and talk about z: if we standardize x, we obtain z. So what will be the formula? z is equal to any variable minus its mean, divided by its standard deviation, and in this situation z will be equal to x, our variable, minus n p, over the square root of n p q. So is the formula that you saw just now not very similar to what I have just said? Let us have another look at the formula: z is equal to x plus or minus half, minus n p naught, over the square root of n into p naught into 1 minus p naught. Now you see that if we remove the plus or minus half from this, we are left with x minus n p naught over the square root of n into p naught into 1 minus p naught. As for p naught, as I told you earlier, the value that we assign to p under the null hypothesis can be called p naught, and since we always begin by assuming that the null hypothesis is true, this is the value that we are going to have in the formula. Now, as for what this plus or minus half is about: students, you do recall the continuity correction. You remember that when we said that the binomial distribution can be approximated by the normal if n is large, the points are replaced by intervals, and this is called the continuity correction. It is by forming intervals that we obtain continuity. We will not go into much detail about it here, but that is its basic logic. Now, the question is when to use plus half and when to use minus half. Students, the answer is that if x is greater than n p naught, then you will subtract half, but if x is less than n p naught, then you should add half.
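The rule for choosing between plus half and minus half can be written as a short function and applied to the sample described in the problem statement (341 orphans among 634 children, with p naught equal to 0.50). This is a sketch under stated assumptions: the name `proportion_z` is made up here, and the case where x is exactly equal to n p naught, which the rule above does not cover, is handled by applying no correction.

```python
from math import sqrt


def proportion_z(x, n, p0):
    """z for a single proportion, using the normal approximation with continuity correction."""
    expected = n * p0  # mean of the binomial variable under H naught
    if x > expected:
        corrected = x - 0.5   # x above n * p0: subtract half
    elif x < expected:
        corrected = x + 0.5   # x below n * p0: add half
    else:
        corrected = x         # assumption: no correction when x equals n * p0
    return (corrected - expected) / sqrt(n * p0 * (1 - p0))


z = proportion_z(341, 634, 0.50)
print(round(z, 2))  # 1.87
```

In both branches the correction moves x half a unit towards n p naught, so it always makes the absolute value of z slightly smaller than the uncorrected one.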
As you now see on the slide, we have p naught equal to 0.5 according to our null hypothesis, and n is equal to 634. Therefore, n p naught is equal to 317. Now, x is equal to 341, and therefore x is greater than n p naught. Hence, we will be subtracting half. Our z value is therefore computed as follows: 341 minus half minus 317, divided by the square root of 634 multiplied by 0.50 multiplied again by 0.50, and upon solving this entire expression z comes out to be 1.87. Now, the next step is the critical region. Our alpha is equal to 0.01, and this is a one-tailed test because our alternative hypothesis says that p is greater than 0.50. Since this sign is the greater-than sign, the entire 1 percent area has to be taken in the right tail of the standard normal distribution, and if we look at the area table, we find that the critical value of z comes out to be 2.33. The last step is the conclusion. Our z value has come out to be 1.87, and since it is less than 2.33, it does not fall in the critical region. Hence we can accept H naught and conclude that the sociologist's hunch is acceptable. You remember that the sociologist thought that not more than 50 percent of the children or teenagers of this particular type are orphans, and since our z value fell in the acceptance region, we can accept this hypothesis and say that the data do not provide sufficient evidence for us to conclude that more than 50 percent of such children are orphans. Students, in the next lecture, I will be discussing with you the hypothesis testing procedure regarding p 1 minus p 2. Of course, we can talk about p 1 minus p 2, and our test could have the null hypothesis that p 1 is equal to p 2, for example that the proportion of smokers in Karachi is the same as the proportion of smokers in Lahore, against the alternative that p 1 is unequal to p 2, or we can have a one-tailed test.
In the meantime, I would like to encourage you to attempt as many questions as you can pertaining to hypothesis testing regarding mu 1 minus mu 2 and hypothesis testing regarding a single population proportion p. My best wishes to you, and until next time, Allah Hafiz.