In the previous session we discussed why we need parameter estimation. Typically we are interested in an entire production quantity, or in some parametric value of the product. For example, suppose we produce a certain flat product, and we say that this flat product has a certain strength property. We cannot test each and every product to find what strength property it has; the entire set of products produced in an industry is the population. What we do instead is draw a small sample at random, test its strength properties, and then declare that this is the kind of strength property our population, our production, will have. How to analyze the data, the small sample realizations of the strength properties that we get, is the question we are trying to answer through parameter estimation.

We said in the previous session that there are two types of estimator. One is called a point estimator, which would, for example, give you the exact value of, say, the yield strength of the product. The other is an interval estimator: instead of saying that the yield strength will be exactly 1300 MPa, we say that it will lie in a certain interval ninety-five percent of the time, that is, 95% of the product coming out of this industry will have yield strength falling in this interval.

In the previous session we also began discussing point estimators, and in particular the maximum likelihood estimator. There we assumed that we know the form of the population distribution, but we do not know the parameter of the distribution. If that parameter is theta, we take the joint density function of all the sample realizations that we have got, which are independent and identically distributed, so it is simply the product of their individual densities. This function contains all the information about theta that you can get from the sample. This is what we called the likelihood function of the unknown parameter theta, and then we maximized this likelihood; the point that gave you the maximum value we called the maximum likelihood point estimator of the parameter theta. We took three examples. Two of them, Bernoulli trials and the normal distribution, were very simple and straightforward: you take the derivative of the log likelihood function, equate it to zero, and find the values of the one or two unknown parameters. But then we also took the case of the Weibull distribution, in which we found that the likelihood equations cannot be solved in closed form and have to be solved numerically.
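As an illustration of that last point, a minimal Python sketch of such a numerical maximization for a two-parameter Weibull might look like the following; the data, starting values and parameter names here are hypothetical, and SciPy's optimizer and Weibull density are assumed to be available.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

# Hypothetical yield-strength measurements (MPa); in practice these are the
# observed sample realizations x1, ..., xn.
data = weibull_min.rvs(c=2.0, scale=1300.0, size=50, random_state=42)

def neg_log_likelihood(params, x):
    shape, scale = params
    if shape <= 0 or scale <= 0:        # keep the search inside the valid region
        return np.inf
    # log of the joint density of an i.i.d. sample = sum of individual log densities
    return -np.sum(weibull_min.logpdf(x, c=shape, scale=scale))

# The likelihood equations have no closed-form solution, so maximize numerically.
result = minimize(neg_log_likelihood, x0=[1.0, float(np.mean(data))],
                  args=(data,), method="Nelder-Mead")
shape_hat, scale_hat = result.x
print("MLE of shape:", shape_hat, "MLE of scale:", scale_hat)
```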
That difficulty is one motivation for looking at another point estimator, the method of moments. Let us denote mu1, mu2, mu3, and so on as the raw moments of the distribution function f theta, and let m1, m2, m3, and so on be the corresponding sample raw moments. Recall that the kth raw moment of a population is defined as the expected value of x to the power k, while the kth sample raw moment is defined as 1 over n times the summation of xi to the power k. What we do in the method of moments is equate the population raw moments with the sample raw moments: the first population raw moment with the first sample raw moment, the second population raw moment with the second sample raw moment, and so on. If there are q unknown parameters, we take the first q raw moments of the population and the first q raw moments of the sample and equate them. On one side, the population moments contain the unknown parameter theta; on the other side, the sample raw moments are not unknown, because we have a realization of the data and can calculate them from it. So we have q equations in q unknowns, and we solve them simultaneously.

Let us take the example of a Bernoulli trial, which is the simplest of its kind. x1, x2, x3, ..., xn come from a Bernoulli population where the probability of success is p; xi is 1 if the trial is a success and 0 if it is not. Then the probability of xi is p to the power x times 1 minus p to the power 1 minus x, where x takes the value 0 or 1, and there is only one unknown parameter, p. So we need to compare only the first raw moment of the Bernoulli population with the first raw moment of the sample, which is the sample mean. We set the expected value of x equal to the sample mean; the expected value of x in a Bernoulli trial is p, and therefore p curl (this is the notation I have used for the method of moments estimator) of the probability of success p is equal to x bar. Remember we got the same result with the maximum likelihood estimator. This is not a rule; it is an exception.

Anyway, let us continue with another case that will also look like a rule. Take a normal distribution and a sample x1, x2, x3, ..., xn. Instead of calling it a random sample every time, I have now chosen to call it an independent and identically distributed, or IID, sample from a normal distribution with mean mu and variance sigma square, where mu and sigma square are the unknown parameters. Now there are two parameters, so we need two population raw moments and two sample raw moments. The first population raw moment is mu and the second raw moment is sigma square plus mu square, while the first sample raw moment is the sample mean and the second is the summation of xi square over n. If you equate the first population moment with the first sample raw moment, you get that mu curl, the MME or method of moments estimator of the mean of the normal population, is the same as the sample mean or sample average. Next you equate the second population moment with the second sample raw moment, and if you simplify it you will find that the method of moments estimator of the variance of the normal distribution, sigma square curl, is equal to the MLE of sigma square, which we called sigma square hat. Please confirm this yourself. This is also a good case. But now let us consider a case where this may not hold true.
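For illustration, a minimal Python sketch of these two method-of-moments calculations could look like this; the function names and the data below are made up for the example.

```python
import numpy as np

def mme_bernoulli(x):
    """Method-of-moments estimate of p: equate E[X] = p to the sample mean."""
    return float(np.mean(x))

def mme_normal(x):
    """Method-of-moments estimates of mu and sigma^2 for a normal sample.

    Equations solved:
        first raw moments:   mu             = m1
        second raw moments:  sigma^2 + mu^2 = m2
    """
    x = np.asarray(x, dtype=float)
    m1 = x.mean()
    m2 = np.mean(x ** 2)
    mu_tilde = m1
    sigma2_tilde = m2 - m1 ** 2   # same value as the MLE (1/n) * sum((xi - x_bar)^2)
    return mu_tilde, sigma2_tilde

# Example usage with made-up data
print(mme_bernoulli([1, 0, 1, 1, 0, 1]))                     # p~ = x bar
print(mme_normal([1280.0, 1310.0, 1295.0, 1305.0, 1290.0]))  # (mu~, sigma^2~)
```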
And this is why I have decided to discuss this method here. The method of moments is a very attractive method, and as you saw in the two simple examples, it gives us, in a much simpler manner, the same estimators as maximum likelihood for the unknown parameters in the case of Bernoulli trials as well as the normal distribution. Remember that for the maximum likelihood estimator you have to find the likelihood function, take the log likelihood, take its derivative, equate it to zero and then solve the equation. Compared to this, in these two examples you must have seen that finding the method of moments estimator, or MME, is much simpler. But at times there is a disadvantage connected with the method of moments estimator, which is also called the moment matching estimator. Sometimes the estimates are inconsistent with the data, and a number of times they tend to be biased estimators. Bias we are going to define later, but here I would like to show that the estimate may not be consistent with the data you have got, and one example will suffice for that. Please note that I have picked this example from Wikipedia and I have given the reference at the end; you are welcome to read through it, as it gives a very good description of the method of moments.

So let us consider a uniform distribution with two unknown parameters a and b. I hope you recall that when x is distributed uniformly with parameters a and b, the density function f of x is 1 over b minus a if a is less than x is less than b, and 0 otherwise. This is the continuous uniform distribution over the interval a, b. So let x be a uniform random variable where a and b are unknown parameters. There are two parameters, so the first two raw moments of the data and of the population need to be compared. The first raw moment of the population is the average, which is a plus b by 2. The second raw moment of the population is a square plus a b plus b square by 3. You are welcome to work this out; it is very simple because the density itself is very simple. Now consider a specific sample from this uniform distribution. Remember we usually write the sample as x1, x2, x3, ..., xn and its realization as small x1, x2, x3, ..., xn. Here the realization is 0, 0, 0, 0, 1: a sample of size 5, so my n is 5. Then the first raw moment of the sample is the sample mean, which is 1 over 5, and the second sample raw moment is the summation of xi square divided by n, which is also 1 over 5. Now equate m1 with mu1, the first raw moment of the population, and m2 with mu2, the second raw moment of the population. Simplifying these two equations, a curl, the MME estimator of a, can be written as m1 plus or minus a term involving m1 and m2, while b curl can then be expressed in terms of only m1 and a. Simplifying further, you will find that a curl equals 1 over 5 minus 2 root 3 over 5 and b curl equals 1 over 5 plus 2 root 3 over 5, that is, a curl is about minus 0.49 and b curl is about plus 0.89. Now you see we are saying that the data has come from this interval. Can the observed values actually fall in this interval? The zeroes are fine, but the one is not: the observation 1 cannot fall in the interval from minus 0.49 to 0.89, and this is the inconsistency.
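To see this inconsistency concretely, here is a minimal Python sketch that solves the two moment equations for the uniform case on the sample 0, 0, 0, 0, 1; the function name is hypothetical and the closed-form solution shown in the docstring follows from the two equations above.

```python
import numpy as np

def mme_uniform(x):
    """Method-of-moments estimates of (a, b) for a Uniform(a, b) sample.

    Equations solved:
        m1 = (a + b) / 2
        m2 = (a^2 + a*b + b^2) / 3
    which give  a~ = m1 - sqrt(3 * (m2 - m1^2))  and  b~ = 2*m1 - a~.
    """
    x = np.asarray(x, dtype=float)
    m1 = x.mean()
    m2 = np.mean(x ** 2)
    half_width = np.sqrt(3.0 * (m2 - m1 ** 2))
    return m1 - half_width, m1 + half_width

sample = [0, 0, 0, 0, 1]               # the sample discussed above
a_tilde, b_tilde = mme_uniform(sample)
print(a_tilde, b_tilde)                # roughly -0.49 and 0.89
print(max(sample) <= b_tilde)          # False: the observation 1 lies outside [a~, b~]
```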
And this is what I wished to bring out to you. Method of moments estimators are easy to calculate and very attractive, and these days they are easily available in a variety of software, including R. But be careful when you use them. It is much better to use the maximum likelihood estimator than the moment matching, or method of moments, estimator. Then why do we have it at all? It is a natural question. Well, sometimes finding the maximum likelihood estimator is difficult; finding any other estimator may involve a lot of numerical calculation or very complicated equations. At such times we fall back on the method of moments estimator. But what this example tells us is that we have to be very careful: we cannot say that these are good estimators unless we put them through certain tests and certain observations.

So with this, let us summarize what we discussed just now. We discussed the method of moments estimator and said that it is based on comparing the distribution raw moments with the respective sample raw moments. If you have q unknown parameters, you take q distribution raw moments and equate them with the corresponding q sample raw moments; you get a set of q equations and you have to solve them simultaneously. We gave two examples in which this actually resulted in the maximum likelihood estimator. One was the very simple example of Bernoulli trials, where we estimated the probability of success; then we took the normal distribution, where we estimated its mean and variance. But through the uniform distribution, with one very specific sample that we may observe, we found that there is a disadvantage connected with the method of moments estimator: it can give inconsistent results. Of course I have not shown you how these estimators become biased, but they also tend to be biased. I have given you the reference to the Wikipedia article from which this example is taken; you are welcome to go through it. Thank you.