[The lecture opens with several sentences in Urdu that are too garbled in the transcript to reconstruct.] ... and in this course, for a long time now, I've stopped teaching proofs, equations, and formulas; I've just started working on meaning and use. But this class is a class in econometrics, for econometrics students; otherwise I would not be teaching these materials. As econometricians, you have to know what lies behind the formulas. It's like going into the engine of the car: to learn how to drive, you don't need to know what is in the engine, but for the next three lectures I'm going to be teaching you something about the engine. First of all, it gives you some confidence. If you're using something but you don't know what it is, you say, okay, here's the formula; and when some student asks you how you get it, you answer, "because my teacher said so." That is not so good for an econometrician. For someone who is just a user, "the computer says so" is good enough: this is the answer the computer gave me; how it got it, you don't know and you don't care. But as an econometrician, you have to know how the computer got
this answer, and be able to actually replicate it. [A long passage in Urdu follows that is garbled in the transcript. The lecturer recites the invocation "A'udhu billahi min ash-Shaytan ir-rajim, Bismillah ir-Rahman ir-Rahim" and, as far as can be recovered, introduces the normal distribution as an idealized curve running over the whole real line,] but there's no such thing in the real world.
You cannot find a line which runs from minus infinity to plus infinity, but this is something which exists in this imaginary world, and there are many constructs in the imaginary world which are useful to learn about; this is one of them. So this is the first definition: the normal distribution density is defined like this, with parameters mu and sigma. Another way to write it is to extend the square root over the sigma and put sigma squared inside, so it's the square root of 2 pi sigma squared. Both of these are the same thing, because sigma is always positive; the variance is always a positive number, and if it were negative there would be a problem. And then you have this thing in the exponent. Now, what are the properties of this density? This is the shape of the density for different values of mu and sigma: mu is a translation parameter and sigma is the shape parameter. By changing mu, all that happens is that we shift the density from one place to another. [A few sentences in Urdu here are garbled in the transcript.]
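The definition just described can be written out explicitly (standard notation; the slide itself is not reproduced in the transcript):

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
     = \frac{1}{\sqrt{2\pi\sigma^2}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad -\infty < x < \infty,\quad \sigma > 0 .
```

The two forms differ only in whether the square root is extended over sigma squared, exactly as described.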
I wanted to look at another density to show how this works. So here's another density with the same formula, except that it's using E1 and F1 as the parameters. Now, in the second chart, what we have done is: okay, the orange line is the N(0, 1), which is the standard normal density; it is very important to understand the standard normal density. The other one has mean one, so the whole density has been shifted by one unit. Now if I change this mean to two, what will happen? It will shift; you see, it shifted a little more. And if I shift it to four, it will shift a little more. So that's all: this is the shift parameter. Now I want to experiment with what happens when we change the variance. As the variance increases it will become... okay, I put in one, zero, enter; okay, so now both of them are coinciding, one on top of the other. Now if I increase the variance, what will happen? It will spread out. Yes. So let's put the variance at four: you see, it becomes very spread out. If I put the variance at two, it becomes sharper. Now suppose I put it at 0.5: see, now it becomes sharper still, and as I move it towards zero it will start to become fully concentrated. There is the famous delta function in mathematics, which has mass one at a single point, and that is what it converges to: all of the mass becomes concentrated at a single point. So this is what the density looks like, and here you can see that the first three densities are just changing the variance, and the other one is shifting the mean. Now, one more thing that is very important to learn about the normal distribution is the probabilities of one, two, three, four, five, and six standard deviations. You don't need to know all of them, just one, two, and three. The probability that x is between minus one and plus one, that is, the probability of x being within one standard deviation of its mean, is sixty-eight point three percent.
The point three doesn't matter; sixty-eight percent is good enough. Similarly, within two SDs there's 95 percent probability, and within plus or minus three SDs there's 99.7 percent. Also, you can remember that plus or minus 2.6 gives 99 percent. The normal distribution is so central that these numbers should be memorized, because many times you can calculate probabilities without any paper and pencil if you know them; this is part of the experience of a statistician. Now, the numbers get very small as soon as you go up. The probability of x beyond four SDs is about seven times ten to the minus five, which is less than one in ten thousand; at five SDs you are at about one in a million; and at six SDs you are at around two in a billion. There is something well known in business called six sigma quality control, which means that basically your probability of defect should be vanishingly small. Now, these numbers have meaning. If you are talking about something like one in ten thousand at four SDs, that means that if you run a simulation of a normal variable with five or six thousand trials, which is what we are often doing (we do one thousand trials to get the p-values in our simulations, or two thousand, or four thousand), you will never see a normal random variable at four. So a normal variable does not have outliers; this is a very important property to understand. At five it's about one in a million, so if you run fifty thousand or a hundred thousand simulations, you will never see a value of five. Conversely, if you're running a random variable which is supposed to be standard normal under the null hypothesis and you see a five, you can reject the null hypothesis that this is a normal; it can't be a normal. And six, that's just impossible. These are very, very low p-values for a normal variable. This is something one should know, because again, this is practical experience with normal variables which people usually don't have.
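The one-two-three (and four-five-six) sigma numbers quoted above can be checked directly from the error function. A minimal sketch in Python (standard library only; not part of the lecture):

```python
import math

def prob_within(k):
    # P(|Z| <= k) for a standard normal Z: P(|Z| <= k) = erf(k / sqrt(2))
    return math.erf(k / math.sqrt(2))

for k in range(1, 7):
    p = prob_within(k)
    print(f"within {k} sd: {p:.10f}   tail: {1 - p:.2e}")
# 1 sd -> about 68.3%, 2 sd -> 95.4%, 3 sd -> 99.7%;
# tails: 4 sd ~ 6.3e-05, 5 sd ~ 5.7e-07, 6 sd ~ 2.0e-09
```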
All right, I just want to generate some normal variables to show you. Oh yes, there's an experiment: Data Analysis, Random Number Generation; I want to generate two variables, 100 each. Okay, so there are our 200 normal variables. You will see that these numbers don't fluctuate very much. Let's look at this: =MAX. You see, I'm trying to teach you a hands-on understanding of normal variables, not a theoretical understanding, which is actually what these lectures are about. So this 2.8 is the largest number, and if you remember, 99.7 percent goes up to 3... oh, sorry, that's too much; 99 percent is at 2.6, so in a hundred trials we should see about 2.6 as the maximum, and that's what we are getting. Let's look at what happens in the other column: 2.19. So this is much smaller than 2.6; again, this is so you can get an idea. Now, one thing to understand is that everything has a distribution, and this distribution is important to know. For example, I want to find out: what is the distribution of the max of a hundred normal variables? This is no longer going to be normal. Now, by simulation, I created one, and I created another one, and if I create a lot of them I will get a random sample of maxima of normals, and so I can study what the maximum of a hundred normal numbers looks like. I can take a sample of a hundred of those by creating thousands, and then I can create a density function for those, to understand what the density function is; I can calculate the mean, the variance, everything I want to know about the maximum of a hundred normal random variables. And just like that for any kind of random variable, no matter how complicated it is. This would be difficult to study by analytical methods, by the formulas, by the methods which we are going to study now, but it's easy to study by simulations.
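The experiment just described, studying the distribution of the maximum of a hundred normals by repeated simulation, can be sketched in Python (a hypothetical stand-in for the lecturer's spreadsheet, with an arbitrary seed for reproducibility):

```python
import random
import statistics

random.seed(0)  # arbitrary seed, so the run is reproducible

def max_of_normals(n=100):
    # maximum of n independent standard normal draws
    return max(random.gauss(0, 1) for _ in range(n))

# 2000 such maxima form a random sample from the (non-normal)
# distribution of the max of 100 standard normals
maxima = [max_of_normals() for _ in range(2000)]
print("mean of the maxima:", statistics.mean(maxima))  # around 2.5
print("sd of the maxima:  ", statistics.stdev(maxima))
```

From such a sample one can estimate the density, the mean, and the variance of the maximum, exactly as described, without any analytical work.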
Now, what I wanted to show you here is that the stability of the normal is not something that is true of all random variables; it's a very specific, special property of the normal. So what I'm going to do is show you what happens when you take the ratio of two normals. It's a very simple variable: I take A1 and divide by B1, and I've got a ratio; that's 2.23, no problem with that. Let's see what happens... ah, you see, unlike the normal, you have 0.23, 1.96, 9, and then 9.32. Yes, this is a Cauchy distribution. Now, one of the things about the Cauchy distribution is that it has no expected value; the expected value is actually plus infinity minus infinity, so it can be anywhere. And this shows up in the numbers, because the numbers become wild. Unlike the first case, where you look at the first three numbers and say, okay, I'm going to predict what the next number is going to be, and you're not going to run into serious trouble: you can say, I've seen three numbers, they range from minus 1.2 to 1.7, so the next one will also be within this range; your experience counts. Here we look at 0.3, 0.23, 0.19, 0.69: there is no way you're going to say that the next number can be 9. This goes outside of experience. This is called the black swan phenomenon: all your experience is that all the swans you see are white, and then suddenly you see a black swan; it's outside the range of your experience. So, just like this, the Cauchy distribution has a tendency to generate such values. Okay, so we have nine as the max; then most of the numbers come out very nice and reasonable, two, three, minus one, five, nine... oh, minus 22. You see, again, this goes outside the range of your experience: the whole range was going from minus three to plus nine, and 22 is nowhere within the realm of possibility; you would never think that the next number could be 22. So this is the kind of thing that can happen very easily in the real world.
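The ratio experiment can be replayed in a few lines. This is a sketch (assuming Python rather than the lecturer's spreadsheet): the ratio of two independent standard normals is Cauchy-distributed, and unlike the normals themselves it routinely produces wild outliers:

```python
import random

random.seed(1)  # arbitrary seed for reproducibility

normals = [random.gauss(0, 1) for _ in range(1000)]
# ratio of two independent standard normals: a Cauchy variable
ratios = [random.gauss(0, 1) / random.gauss(0, 1) for _ in range(1000)]

print("largest |normal| in 1000 draws:", max(abs(x) for x in normals))
print("largest |ratio| in 1000 draws: ", max(abs(r) for r in ratios))
```

The largest normal draw stays near 3, while the largest ratio is typically enormous: the Cauchy has no expected value and no outlier-free range of experience.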
And Nassim Nicholas Taleb has been doing a lot of work on this, the black swan phenomenon. He says that most of the models we are using are junk, because they are based on the assumption that black swans will not occur, that our experience counts; but in many situations the world is a place where experience doesn't count, where experience is actually misleading, like the global financial crisis: things happened which were totally unrelated to the past, and no model could predict that. So if you want to study a world in which there are black swans, you have to use different techniques. The normal distribution is exactly the opposite of the black swan, and normality holds control over this thinking in the sense that it's fixed in people's minds. There was a famous financial hedge fund; anyway, two Nobel Prize winners in finance started this hedge fund (the story is told in the book "When Genius Failed"), and they used the theory to calculate the optimal strategies for the fund, and they made a lot of money for two or three years, much more than everybody else, and everybody thought this was great. But then they had a massive collapse: they not only collapsed the whole fund, they almost collapsed the whole economy; the Federal Reserve had to go in and bail out the economy. And when somebody asked the authors what happened, you know, you were doing all these formulas, they said, well, a six sigma event happened, something which under normality has a probability of a few parts in a billion. It couldn't happen according to their theory, but the real world didn't follow the theory, that's all: they were assuming normality for something, and these were not normal distributions. So that's the thing that is important to understand. Okay, now we come to the mathematics. The expected value of any random variable: we've already defined it in the discrete case; in the continuous case it's the integral of x times f(x) dx, and we can calculate for the normal density, which we have already
defined, that this is going to equal mu, and I will show a little bit about how this can be done. Then the expected value of x squared is the integral of x squared times f(x) dx. Both of these integrals can actually be done analytically, unlike the integral of f(x) dx itself, which you cannot calculate by any easy method; there is a difficult method which is used, and I have given a link to a YouTube demonstration which shows how to do it. Basically, you cannot do it directly, because f(x) is not analytically integrable: there is no function which you can differentiate that will give you f(x), and if you don't have that, then you cannot get an analytical solution. But there are some mathematical tricks you can use to get the whole integral from minus infinity to plus infinity. You cannot get the partial integral from a to b; if you could, there would be no need of tables, but everywhere you see the normal is tabulated; you don't see any formula for it, because there is no formula for it. But the integral of x times f(x), that you can integrate exactly, because there is an integration by parts you can do: the x times the exponential can be written as the derivative of something. Similarly, for x squared you have to do two integrations by parts, and you can get the answer. So the first integration gives you mu, and the second gives you sigma squared plus mu squared. This is called the second moment; it is also called the second raw moment, to differentiate it from the second central moment, which is another important concept that we will introduce. Basically, when you subtract the mean from a variable, that's called centering: if x has mean mu, then x minus mu is the centred variable, and the mean of the centred variable is zero. So the moments of the centred variable are the central moments, and the moments of the original uncentred variable are the raw moments.
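The two raw moments just derived, in symbols:

```latex
E\,X = \int_{-\infty}^{\infty} x\,f(x)\,dx = \mu ,
\qquad
E\,X^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \mu^2 + \sigma^2 .
```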
So the variance of x is the second moment of the centred variable; it's equal to the expected value of (x minus mu) squared. This kind of calculation is something that you need to learn how to do. When you have (x minus mu) squared, you can expand it by the standard method: it's x squared minus 2 x mu plus mu squared; and expectation is linear, so you can split it up. You get the expected value of x squared, minus 2 times mu times the expected value of x; mu is a constant, so it can be taken out, and E(x) is also mu, so this becomes minus 2 mu squared; plus mu squared. So it becomes E(x squared) minus mu squared, and I just showed on the previous slide that E(x squared), the second raw moment, is mu squared plus sigma squared; the mu squared cancels, and you are left with sigma squared, which is the variance of the normal. And the standard error of x is, by definition, the square root of the variance, and that's just sigma. Now, one very important thing to understand here is that these are all pre-experimental numbers. That means we are talking about the imaginary x which has all possible outcomes, and the outcomes have probabilities; these are not things which are happening now. Post-experimentally, x takes a particular value, and if you take a sequence of observations on x and then take the average of those observations, that average will correspond to the expected value: the expected value is the pre-experimental concept, and the average of the observations is the post-experimental concept. Similarly, the standard deviation is a post-experimental concept: you take the observations, the list of numbers, and you calculate a standard deviation. The standard deviation corresponds to the theoretical, pre-experimental standard error; they will be similar to each other. The law of large numbers, which we will study later in this lecture, shows that as the sample size n goes to infinity, the sample standard deviation converges to the theoretical standard error.
Similarly, the sample average will converge to the theoretical expectation, which is also called the population mean, if you are taking the random variable as a random sample from a population. Okay, so these are some critical theoretical mathematical concepts, and here I have skipped the derivations, but these are derivations that you should learn as econometricians, and they are available in the notes. Now, the most important concept for dealing with the normal is the moment generating function. First of all, this is the expected value of e to the theta x. But to understand why this is important, you have to expand the e to the theta x in its Taylor series: e to the theta x is 1, plus theta x, plus theta squared x squared over 2 factorial, plus theta cubed x cubed over 3 factorial, and so on. Now, when you take the expected value of this, there are some very fancy formal mathematical issues involved. This is an absolutely convergent series, which means that, for most random variables, the tail terms become very, very small, so if you take an approximation by keeping a few terms, you get a good approximation; as opposed to badly behaved series, where the tail terms become very bad, so that if you ignore them you get into trouble. This is a very well-behaved series: if you just take a few terms, you get a good idea of what the whole series is doing, because the later terms are very small; even the sum of all the tail terms is very small, so they don't affect things. So one of the important things to note is that when you take the expected value, since the expected value is linear, you get 1, plus theta times the expected value of x, plus theta squared over 2 factorial times the expected value of x squared, and so on. So all of the moments are sitting in this function, and that's exactly why it's called a moment generating function.
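The expansion being described, in symbols:

```latex
M_X(\theta) = E\,e^{\theta X}
 = 1 + \theta\,E\,X + \frac{\theta^2}{2!}\,E\,X^2 + \frac{\theta^3}{3!}\,E\,X^3 + \cdots,
\qquad
E\,X^k = \left.\frac{d^k}{d\theta^k}\,M_X(\theta)\right|_{\theta=0} .
```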
All of the moments are embedded in this function, and you can get them out; that's the beautiful thing. The way you get them out is you differentiate. Suppose I differentiate this with respect to theta; then what happens? The one disappears, the theta times E(x) becomes just E(x), and all the other terms still have a theta in them: the theta squared becomes theta, the theta cubed becomes 3 theta squared, and so on. So if I differentiate once and set theta equal to zero, I get out exactly E(x). This process can be continued, because the factorials match everything that comes down: when you differentiate the theta squared term again, the two comes down and cancels with the 2 factorial, so the second time you differentiate, the E(x squared) term is there, the E(x) term disappears because it's a constant, and when you set theta equal to zero, all the higher-order terms disappear. So it's a very simple thing: take the MGF, differentiate once, set theta equal to zero, and you get E(x); differentiate twice, set theta equal to zero, and you get E(x squared); whichever power you want, for E(x to the k) you differentiate k times and set theta equal to zero. So this is the very nice, pretty, and elegant thing about the MGF, and it is not the only thing; the MGF has many other wonderful properties. That's why it belongs to the Platonic world, where everything is perfect. Now, the MGF of the normal can be computed fairly easily. I have a short set of notes attached to the website in which there's a two-page calculation, a very simple, nice, and elegant calculation. When I was young, I would have gone through this calculation in class and explained it all, but now I leave it to you to go through it and check how it is done. It's basically an integration.
Actually, the pretty thing about this is that you don't have to do any integration at all, because you start out by knowing that the integral of f(x) is equal to one. So when you add the e to the theta x, you can rearrange the terms; you just play with the algebra. You have an integral of e to the theta x times f(x) dx; the e to the theta x goes into the exponent, and you can rearrange so that it looks like the integral of a normal density times something which doesn't involve x. The integral of the normal density is automatically one, and the part that doesn't involve x is the moment generating function; that part is exactly what's written on the second line. So this comes out after a lot of algebra, and it's a very elegant calculation, because you don't actually have to do any integration: you just arrange things into a density function and then recognize that the integral is one. It's done in that set of notes, and if you understand what the trick is, you will be able to follow the calculation easily. This is a trick of integration which is not a standard trick; it's not one that is taught in the calculus textbooks. It's a special trick in probability, and it's used quite often when you are doing integrals in probability theory: you try to change things into density functions, and then you say, okay, I know this integral already, so I don't actually have to do any real integration. Anyway, here is the integral of x e to the theta x, and I've already explained this point, that the moments occur in the expansion of the MGF, and so there is a technique to get the moments out: if you want E(x to the k), you differentiate the moment generating function k times and set theta equal to zero. And here is the moment generating function of the normal, and we can do that. So, the first time you differentiate, what happens? With e to the something, you have to differentiate what's inside the
exponent. So what will happen? What's the derivative of what's inside the curly braces? Yes, yes, all of you are mathematicians. The exponent is just going to reappear, and then you're going to have the derivative of what's inside the exponent. What's that derivative? No, no, we just need the derivative: the derivative of e to the something is e to the something again, times the derivative of the something. So what is the derivative of the something? It's mu plus sigma squared theta, right? Now, when you set theta equal to zero, e to the whatever becomes one, because theta is zero and e to the zero is one; the sigma squared theta disappears, and all you're left with is mu. So we have proven that the integral of x f(x) dx is equal to mu; that is what I claimed, that the mean is mu, and the MGF is one way to prove it. The other way is direct integration, and you should also know that method. Okay, so the non-central moments, these are the raw moments. The first moment is just mu (it's wrongly written as m over here; I don't know how that came about; I just copied it from Wikipedia, actually). The second moment is mu squared plus sigma squared. Again, now you have a complication; this is not as easy. You have e to the something, times mu plus sigma squared theta, and when you differentiate that product you get u dv plus v du. One term is e to the something times the derivative of (mu plus sigma squared theta), which is sigma squared (the theta disappears), so sigma squared is one term; the other term is (mu plus sigma squared theta) times the derivative of e to the something, which gives (mu plus sigma squared theta) times itself again, times e to the something; and that gives you the mu squared term after you set theta equal to zero. As you go on, this becomes more and more complicated: first you have one term, then two terms, then four terms, then eight terms, and so on; but it's all routine.
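The differentiate-and-set-theta-to-zero recipe can be checked numerically. This sketch uses the normal MGF, M(theta) = exp(mu theta + sigma squared theta squared / 2), with finite differences standing in for the symbolic derivatives; the values mu = 1.5 and sigma = 2 are arbitrary illustration choices, not from the lecture:

```python
import math

MU, SIGMA = 1.5, 2.0  # arbitrary illustrative parameters

def mgf(theta):
    # moment generating function of N(MU, SIGMA^2)
    return math.exp(MU * theta + SIGMA ** 2 * theta ** 2 / 2)

h = 1e-5
# first derivative at 0 approximates E[X] = mu
first = (mgf(h) - mgf(-h)) / (2 * h)
# second derivative at 0 approximates E[X^2] = mu^2 + sigma^2
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2
print(first, second)  # about 1.5 and 6.25
```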
So these are the numbers that come out. Now, from the non-central moments you can get to the central moments, and that's what I would like to show you. How do you calculate the central moments from the raw moments? I already showed you how to calculate the variance by taking the quadratic, expanding it, and then calculating; so I'll show you how to do the third power, and that will show you how to do the general case also. I want to calculate the expected value of (x minus mu) cubed. I can get the raw moments by differentiating the moment generating function; it's a bit painful, and you have to be very careful to keep track of the numbers, otherwise you make mistakes, but it's a routine calculation, and now there are programs like Maple and Mathematica which will actually do the symbolic differentiation: you feed them the function and they produce the derivative, so you can do it without errors. Okay, so I can get all the raw moments. Now how can I calculate the central moments? Well, this is one way; this is just to show you the manipulation. The other way is to set mu equal to zero in the raw moments: when mu is zero, the raw moments are automatically the central moments. But I want you to learn this way also, because this kind of calculation is needed in some places. So, the expected value of (x minus mu) cubed: this is the third central moment. What we do is expand (x minus mu) cubed: it becomes x cubed, minus three x squared mu, plus three x mu squared, minus mu cubed. We write that as the expectation, and then we break it up into the four terms. The first one is E(x cubed); we can go back to our table of raw moments and write down what that is, and it turns out to be mu cubed plus three mu sigma squared. Then we have minus three mu times E(x squared); I substitute for the second raw moment, mu squared plus sigma squared.
So that term is three mu times (mu squared plus sigma squared). Then I have the third term, which is three mu squared times E(x); the expected value of x is just mu, so that becomes three mu cubed. And then I have minus mu cubed, which doesn't require any calculation. After you look at all of these terms and arrange them, you find that everything cancels and you get zero, which is not surprising, because all the odd central moments of x are zero. We can also confirm that from the table: if I look at the third moment and set mu equal to zero, I get zero, and the same is true for all the odd moments. So that's how you calculate the moments. Now, some very important properties of the normal distribution. The crucial property is that linear transforms of normals are normals: if x is normal with mean mu and variance sigma squared, and y is equal to ax plus b, then y is normal. Once you know that y is normal, you can calculate the parameters that it must have, because a normal is characterized by two parameters, the mean and the variance, and knowing that y equals ax plus b is enough to calculate the mean and the variance of y. The mean of y, E(y), is just equal to a times E(x) plus b, and E(x) is mu, so the mean of y must be a mu plus b; the normal distribution of y must have mean a mu plus b. The next thing is to calculate the variance. The variance of y equals the variance of ax plus b. Adding or subtracting a constant doesn't cause any change in the variance, so the b drops out of the calculation; and the variance of ax is a squared times the variance of x, a very important formula that you must know. The variance of x is known, it's sigma squared, so the variance of y is a squared sigma squared. It follows that y has a normal distribution with mean a mu plus b and variance a squared sigma squared.
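The linear-transform property just derived, in one display:

```latex
X \sim N(\mu, \sigma^2),\quad Y = aX + b
\ \Longrightarrow\
E\,Y = a\mu + b,\quad
\operatorname{Var}(Y) = a^2\,\operatorname{Var}(X) = a^2\sigma^2,
\quad Y \sim N(a\mu + b,\ a^2\sigma^2) .
```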
Okay, now this has two very important implications. One is standardization: starting with any x, I can get to the standard normal by subtracting the mean and dividing by sigma. So if I take z equals (x minus mu) divided by sigma, then z must be normal; I just apply the formula, since this is a special case with a equal to one over sigma and b equal to minus mu over sigma. The mean becomes mu over sigma minus mu over sigma, which is zero, and the variance becomes one over sigma squared times sigma squared, which is one. So this is standardization. Actually, this is not just for normal variables: if you take any variable, subtract the mean, and divide by the standard error, that's called the standardized x, and the standardized x always has mean zero and variance one, by definition; that's what it means to standardize. There are related operations: centering a variable means subtracting the mean from it, and the centred variable always has mean zero; scaling means dividing by the standard error, and the scaled variable always has standard error one, even though it may not have mean zero. If you do both centering and scaling, you get mean zero and variance one. So basically you can go from any normal variable to a standard normal by subtracting the mean and dividing by sigma; and conversely, if you start with a normal (0, 1), you can generate any normal variable by taking mu plus sigma z, where z is the standard symbol for the standard normal, the normal (0, 1). If you multiply by sigma, the variance becomes sigma squared, because when you multiply by a constant the variance gets multiplied by the square of the constant, while the mean stays zero; and when you add mu, the mean becomes mu. It is very important to be able to move back and forth between the standard normal and the general normal, and this is used routinely, because we use the standard normal tables to calculate normal probabilities for any normal.
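A quick numerical illustration of standardization and its inverse; a sketch assuming Python, with mu = 3 and sigma = 2 chosen arbitrarily (not from the lecture):

```python
import random
import statistics

random.seed(2)  # arbitrary seed for reproducibility
MU, SIGMA = 3.0, 2.0  # arbitrary illustrative parameters

xs = [random.gauss(MU, SIGMA) for _ in range(100_000)]
# standardize: subtract the mean, divide by the standard deviation
zs = [(x - MU) / SIGMA for x in xs]
print(statistics.mean(zs))    # close to 0
print(statistics.pstdev(zs))  # close to 1

# the reverse map mu + sigma * z recovers an N(MU, SIGMA^2) variable
ys = [MU + SIGMA * z for z in zs]
print(statistics.mean(ys))    # close to MU
```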
you start with any normal, standardize it, and then compute the probabilities for the standardized version of the variable. Now we come to the most important part, which is also the most difficult part: basically, all random variables converge to normals. This is the central limit theorem. Actually there are two theorems I want to cover. One is the law of large numbers, which says that if you take the average of a sequence of variables, it will converge to the mean of one of them. That statement is for identically distributed variables, where all of them have the same mean. The same theorem works if they all have different means, but then you have to do it slightly differently: first you must center the variables. So if you have (1/n)·Σᵢ₌₁ⁿ xᵢ, you replace it by (1/n)·Σᵢ₌₁ⁿ (xᵢ − muᵢ), where muᵢ is the mean of xᵢ. Now every variable in this set has mean zero, so by the law of large numbers this average converges to zero. Once you have that, you have that (1/n)·Σ xᵢ − (1/n)·Σ muᵢ converges to zero, so the limit of the average of the random variables is the same as the limit of the average of their constant means. Even if those constant means don't converge, you can approximate the random average by a fixed number, and that's very important to be able to do, because the random numbers fluctuate but the constant means do not. So in large samples the average of any random variables will be close to the average of the means, and the average of the means is a constant. If you want to approximate the average of random variables in large samples, you can do so by one number, the average of the means. That's the meaning of the law of large numbers. Similarly, you can also approximate the distribution of a sum of random
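The centering argument for non-identical means can be checked by simulation; the particular means and noise level below are made up for illustration.

```python
import numpy as np

# Each x_i has a different mean mu_i, but after centering, the average
# (1/n) * sum(x_i - mu_i) converges to zero by the law of large numbers.
rng = np.random.default_rng(1)
n = 200_000
mu = np.linspace(0.0, 10.0, n)           # a different mean for each variable
x = mu + rng.normal(scale=3.0, size=n)   # x_i with mean mu_i, sd 3

avg_x = x.mean()                 # the random average
avg_mu = mu.mean()               # the constant average of the means
centered_avg = (x - mu).mean()   # should be close to zero for large n
```

So `avg_x` is well approximated by the single constant `avg_mu`, which is exactly the point made above.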
numbers by a normal distribution, if you do it correctly. Both of these results come out of something that is, well, not simple, it's quite involved, but the key to it is the cumulant generating function; that's the key to understanding how this works, and I'm going to try to show you how. This is only a sketch, but the real proof is not all that difficult, and as econometricians you shouldn't be scared of the real proofs. So what happens is this: take the sum of n random variables, call it S of n, that's the first line on this slide, Sₙ = Σᵢ₌₁ⁿ xᵢ. The key to understanding asymptotics is, again, the moment generating function. The MGF of the sum is the expected value of exp(theta·Sₙ), and the exponential of a sum is the product of the exponentials. Now a crucial assumption here is independence: if the random variables are independent, then the expected value of a product of random variables is the product of the expected values, so you can write this as the product over i of E[exp(theta·xᵢ)]. What this says is that the moment generating function of the sum is the product of the moment generating functions of the individual variables. That's a very convenient result, and it's what allows you to do asymptotics; most of the derivations you will see are based on the moment generating function. But that approach is clumsy and awkward and makes the proofs very non-transparent: if you look at any proof in the conventional books, you will not find it illuminating, it is just a sequence of steps, one after the other, which doesn't lead to much insight into what's happening. What really makes things nice and easy is to take just one more step: instead of the MGF, move to the CGF. The CGF is the cumulant generating function, and it's just the log of the MGF. The point of taking the log is that when you take the log of
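The additivity of the CGF over independent variables can be verified by Monte Carlo; this is a rough numerical sketch with arbitrarily chosen distributions, not something from the slides.

```python
import numpy as np

# Check that K(theta) = log E[exp(theta * X)] is additive over independent
# variables: K_{X+Y}(theta) = K_X(theta) + K_Y(theta).
rng = np.random.default_rng(2)
n, theta = 1_000_000, 0.3
x = rng.exponential(scale=1.0, size=n)        # X ~ Exp(1)
y = rng.normal(loc=0.0, scale=1.0, size=n)    # Y ~ N(0, 1), independent of X

def cgf(sample, theta):
    """Empirical cumulant generating function at theta."""
    return np.log(np.mean(np.exp(theta * sample)))

lhs = cgf(x + y, theta)                 # CGF of the sum
rhs = cgf(x, theta) + cgf(y, theta)     # sum of the individual CGFs
# lhs and rhs agree up to Monte Carlo error
```

The same check fails if `x` and `y` are dependent, which is exactly why independence is flagged as the crucial assumption.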
the product, it becomes the sum of the logs of the MGFs. The original thing was a sum, and now the cumulant generating function is also a sum, and a very nice sum: it's the log of the expected value of exp(theta·x). So there are three operations: first you take the exponential of theta·x, then you take the expected value, and then you take the log to undo the effect of having taken the exponential. So basically, the cumulant generating function of a sum of independent random variables is the sum of the cumulant generating functions. This is what makes things nice and easy. Now the other important property of the CGF. First of all, what are cumulants? Just as we can expand the Taylor series of e^(theta·x) as 1 + theta·x + (theta²/2!)·x² and so on, which we have just done, you can write the Taylor series of K(theta), which is the log of that. This expansion has no constant term, because at theta = 0 it has to be zero: at theta = 0 the MGF is always one, since e^(theta·x) at theta = 0 is just one, the expected value of one is one, and when you take the log of one, that's zero. So there is no constant term in the expansion of the cumulant generating function. Then it's theta times something; whatever that something is, we call it kappa one, and that's the first cumulant, and it turns out that this is just the expected value of x, you can work that out from the MGF. The second term is (theta²/2)·kappa two, and it turns out that kappa two is the variance. Similarly you go on to (theta³/3!)·kappa three, the third cumulant, and so on. These higher order cumulants have complicated expressions in terms of the moments of x, but there is something very nice about them, which I'll show you soon. Okay, so you get the cumulants out of the cumulant generating function exactly like you get the moments
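The claim that kappa one is the mean and kappa two is the variance can be verified symbolically; here is a sketch using a Bernoulli(p) variable as the example (my choice, not the lecture's), whose MGF is 1 − p + p·e^theta.

```python
import sympy as sp

theta, p = sp.symbols('theta p', positive=True)
M = 1 - p + p * sp.exp(theta)   # moment generating function of Bernoulli(p)
K = sp.log(M)                   # cumulant generating function

# differentiate the CGF and set theta = 0 to read off the cumulants
kappa1 = sp.diff(K, theta, 1).subs(theta, 0)   # equals p, the mean
kappa2 = sp.diff(K, theta, 2).subs(theta, 0)   # equals p*(1 - p), the variance
```

The same two derivatives applied to the MGF instead of the CGF would give the raw moments E[x] and E[x²], which is the parallel drawn in the lecture.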
out of the moment generating function: you differentiate and set theta equal to zero to get the first cumulant, you differentiate twice and set theta equal to zero to get the second cumulant, and so on. The first important thing is that the CGF of the sum is the sum of the CGFs; that's almost obvious now. The other important thing is this: suppose I have K(theta) of Σ aᵢ·xᵢ. The summation can be taken out, and the aᵢ behave in a very interesting way: instead of multiplying xᵢ, you can take them to multiply theta, because basically you're taking exp(theta·a·x), so if a is multiplying x you can equally make it multiply theta; it's the same thing. And the cumulants have a very interesting scaling property: kappa one of a·x is a times the expected value of x, kappa two of a·x is a² times kappa two (remember, the variance gets multiplied by the square), the third cumulant gets multiplied by the cube, the fourth cumulant by the fourth power, and so on. I don't want to establish that here; there's a simple way to see it, which I'll let you think about, it's not difficult. All right, so now we have this cumulant generating function in hand. You see, I'm just giving you a guided tour and the highlights; the mathematical work I've left for you to do on your own, and even if you don't do it, you can still follow the idea.
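The "simple way to see it" left as an exercise follows from K_{aX}(theta) = K_X(a·theta): each theta^r coefficient picks up a factor a^r. A symbolic sketch, using a truncated CGF with generic cumulants:

```python
import sympy as sp

theta, a, k1, k2, k3 = sp.symbols('theta a k1 k2 k3')

# truncated CGF of X: K(theta) = k1*theta + k2*theta^2/2 + k3*theta^3/6
K = k1*theta + k2*theta**2/2 + k3*theta**3/6

# the CGF of a*X is K evaluated at a*theta
K_scaled = sp.expand(K.subs(theta, a*theta))

# read off the cumulants of a*X from the coefficients
c1 = K_scaled.coeff(theta, 1)        # a * k1
c2 = K_scaled.coeff(theta, 2) * 2    # a**2 * k2
c3 = K_scaled.coeff(theta, 3) * 6    # a**3 * k3
```

So the r-th cumulant of a·X is a^r times the r-th cumulant of X, exactly the scaling rule stated above.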
What you have to do is decide how to normalize: you write the summation with the xᵢ centered by their means muᵢ, and then choose the right scaling for it. So now, what happens to the variance of Sₙ? First we standardize: we center, we get rid of the means, so everything has mean zero. Now the variance of Sₙ is the sum of n terms, and the variance of a sum of independent variables is the sum of the variances, so the variance is n·sigma². If we leave it like this, the variance just keeps growing. Now look at the variance of Sₙ divided by n; this is the average value of the random variables. What happens is that the 1/n comes out of the variance as 1/n², and the variance itself is n·sigma², so when we divide by n² the variance goes to zero. This is one very, very simple result, but what kind of variable has variance zero? The variance is the integral of (x − mu)² against the density, so a variable with variance zero must sit at its mean, which here is zero.
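The collapse of Var(Sₙ/n) = sigma²/n can be seen directly by simulation; sample sizes and sigma below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 2.0
var_of_mean = {}
for n in (10, 100, 10_000):
    # 2000 independent sample means, each built from n draws with variance sigma**2
    means = rng.normal(scale=sigma, size=(2_000, n)).mean(axis=1)
    var_of_mean[n] = means.var()   # close to sigma**2 / n, shrinking with n
```

Each tenfold increase in n cuts the variance of the mean by a factor of ten, which is the "variance form" of the law of large numbers described above.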
Now, if x has any chance of being anything different from 0, the variance is going to be non-zero, positive; the only thing that has variance 0 is something with 100% probability of being equal to 0 and nothing else. So basically, if the variance of Sₙ/n goes to 0, that means Sₙ/n equals 0 with probability approaching 1 in large samples, and that's exactly what the law of large numbers says: the probability that Sₙ/n is within plus or minus epsilon of zero goes to 1, for any epsilon. That's proven by the Chebyshev inequality, but it can also be proven by this moment generating function method. So if we leave Sₙ alone, it diverges, its variance goes to infinity; if we divide it by n, it converges to 0. Now there is one exact rate at which it will neither diverge nor converge, it will remain random, and that is the square root of n. If I take Sₙ divided by the square root of n, then in the numerator of the variance I have n·sigma², in the denominator the square root of n squares to become n, the n cancels, and I have sigma². The variance converges to sigma²: this is going to be a stable random variable, the variance is not going to infinity and it's not going to 0, and this is the only scaling which will do that. And now comes the central limit theorem, beautiful, magical, which shows that every such sum eventually becomes normal: it will be normal with mean 0 and variance sigma², regardless of what the individual variables are. You can put anything you like into that sum, and ultimately, as long as they are all random, the sum will come out to be normal. That's why the normal has such an important role in statistical calculations. Now, this can also come out of the cumulant generating function. Look at this cumulant expansion; this first part is the law of large numbers, and it just takes a little bit more work to convert this into a proof. So K of theta
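The sqrt(n) scaling can be illustrated numerically: even starting from very non-normal variables (centered exponentials, my choice here), Sₙ/√n keeps a stable variance and sheds its skewness.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 2_000, 5_000
# centered exponential draws: mean 0, variance 1, but heavily skewed
x = rng.exponential(scale=1.0, size=(reps, n)) - 1.0
s = x.sum(axis=1) / np.sqrt(n)   # the CLT scaling: S_n / sqrt(n)

# the scaled sum has mean ~0 and variance ~sigma**2 = 1 ...
m, v = s.mean(), s.var()
# ... and its skewness (third standardized cumulant) is dying out like 2/sqrt(n)
skew = np.mean((s - m)**3) / v**1.5
```

Dividing by n instead of sqrt(n) would send `v` to zero, and not dividing at all would blow it up; sqrt(n) is the unique scaling that leaves a genuine random limit.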
of the average: this is going to be the sum over i of K(theta/n) for the xᵢ. On the slide I forgot to put the n with the theta, but in that expansion the n goes into the theta, so it's the summation of (theta/n)·kappa one plus (theta²/(2n²))·kappa two plus (theta³/(6n³))·kappa three and so on, because theta has been replaced by theta/n. Remember, these kappas are now just for one random variable, so whatever they are, they are just numbers. Now, because we have centered all the variables, kappa one is zero, so the first term is zero. Actually, the thing is that there are n terms here: the first term is (theta/n) times some constant, and we take n of those, so it would become theta times that constant; fortunately that constant is zero, so the first term disappears, and we chose the centering precisely to make it disappear. Now the second term has an n² in the denominator, and there are a total of n terms. They are all the same, but even if they weren't, if they varied a little bit it wouldn't matter, as long as kappa two is bounded: n·theta²·kappa two divided by n² is an order 1/n term, so it goes to zero. Everything here goes to zero, and that's what shows the law of large numbers. But that's not so pretty; the pretty part is the central limit theorem. Okay, now we get to the beautiful property of the normal. Remember that the normal MGF is exp(mu·theta + ½·theta²·sigma²), so when I take the log, the CGF is mu·theta + ½·theta²·sigma², plus nothing. This is very important. What does it mean? It means that if you calculate the cumulants, the first derivative is mu + theta·sigma² (when you differentiate theta², the two cancels the one half and you get theta·sigma²), so when you set theta equal
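The "everything goes to zero" step of the CGF proof of the LLN can be done symbolically; this is a sketch with a truncated CGF and generic cumulants, taking kappa one = 0 because the variables are centered.

```python
import sympy as sp

theta, n = sp.symbols('theta n', positive=True)
k2, k3 = sp.symbols('k2 k3', positive=True)

# truncated CGF of one centered variable (kappa1 = 0)
K = k2*theta**2/2 + k3*theta**3/6

# CGF of the average (1/n)*S_n: n copies of K evaluated at theta/n
K_avg = sp.expand(n * K.subs(theta, theta / n))

# every term carries a leftover power of 1/n, so the whole thing vanishes:
# the limiting CGF is 0, which is the CGF of the constant 0
lim = sp.limit(K_avg, n, sp.oo)
```

A CGF that is identically zero belongs to a degenerate variable sitting at 0, which is exactly the law of large numbers for the centered average.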
to zero, you get exactly mu; that proves that the first cumulant of the normal distribution is mu. What about the second? Now you have theta·sigma²; you differentiate that and you just get sigma², and there is no theta left inside, so setting theta equal to zero doesn't do anything: the second cumulant is sigma², exactly as I claimed. And all higher cumulants are zero, because theta has disappeared from the formula. That is the unique feature of the normal distribution, and it is what makes the normal distribution central: all higher order cumulants are zero. Any other distribution has a non-zero third or fourth cumulant, and that's what makes it different from the normal. So basically, tests for normality are based on testing kappa three and kappa four. A large number of tests work this way: do you have a third cumulant? If you have a third cumulant you're not normal; if you have a fourth cumulant you're not normal; if both of those are zero, then you're normal. That's the standard, well-known approach; the Jarque-Bera test, exactly, tests for kappa three and kappa four being non-zero, and it's not the only one, there are many other tests like that. So this is the property, and now we will see why this gives the central limit theorem: basically, when you take the sum and divide by the square root of n, all cumulants from kappa three upward disappear, so automatically you get the normal distribution. That's the heart of the central limit theorem. So here is the calculation: K of theta applied to Sₙ over root n. We are doing the same calculation as before, but now taking the cumulant generating function of the sum divided by the square root of n, the right scaling, the one that will neither make the variance go to infinity nor make it go to zero, the one that stabilizes the variance. Again the square root of n can be merged with the theta, so I get a summation of theta
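The kappa-three/kappa-four idea behind such normality tests can be sketched by computing the Jarque-Bera-style statistic JB = (n/6)·(S² + (K − 3)²/4) by hand, where S is the sample skewness and K the sample kurtosis (the hand-rolled helper below is mine, not the lecture's):

```python
import numpy as np

def jarque_bera(x):
    """JB statistic from sample skewness and kurtosis (kappa3/kappa4 proxies)."""
    n = len(x)
    z = (x - x.mean()) / x.std()
    S = np.mean(z**3)   # sample skewness, driven by kappa three
    K = np.mean(z**4)   # sample kurtosis, equal to 3 when kappa four is zero
    return n / 6.0 * (S**2 + (K - 3.0)**2 / 4.0)

rng = np.random.default_rng(5)
jb_normal = jarque_bera(rng.normal(size=5_000))       # small: both cumulants ~0
jb_skewed = jarque_bera(rng.exponential(size=5_000))  # large: kappa three != 0
```

For normal data the statistic stays small (asymptotically chi-squared with 2 degrees of freedom), while any non-zero third or fourth cumulant blows it up, which is precisely the test logic described above.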
divided by the square root of n, times kappa one. Now kappa one has been centered away, so this term is going to be zero. The second term is perfectly balanced: each term is (theta²/n)·kappa two, kappa two is exactly the variance, and there are n terms each carrying a 1/n, so the n of the summation cancels the 1/n and you get theta² times kappa two, theta² times the variance. The third term has n terms of theta³·kappa three divided by n to the three halves, the square root of n raised to the cube; the n terms give you a factor of n in the numerator against n^(3/2) in the denominator, leaving a 1/√n which drives this term to zero. All higher order terms are driven to zero in the same way. So basically you get convergence of the cumulant generating function to just theta² times kappa two, and that's exactly the normal cumulant generating function. Remember what the normal CGF is: mu·theta plus one half theta²·sigma²; mu is zero here, so it's zero plus one half sigma²·theta². Is a half missing here? Oh yes, it's missing because I didn't put it in: the formula on the slide is wrong, there should be a 2 next to the n, and theta cubed should be divided by 3 factorial, not that it matters. So with the two in place, this is (theta²/2)·kappa two, which is exactly the cumulant generating function of a normal variable with mean zero and variance sigma². So here we have the central limit theorem, very simply, and nothing complicated about it. And that's the end of the slide show and the end of the lecture. Okay, some people say that in some of my lectures I cover the whole course, and I think this is one of them.
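The whole CLT calculation just described fits in a few symbolic lines; as with the LLN sketch, the truncated CGF and generic cumulants below are illustrative assumptions, with kappa one = 0 from centering.

```python
import sympy as sp

theta, n = sp.symbols('theta n', positive=True)
k2, k3, k4 = sp.symbols('k2 k3 k4', positive=True)

# truncated CGF of one centered variable (kappa1 = 0)
K = k2*theta**2/2 + k3*theta**3/6 + k4*theta**4/24

# CGF of S_n / sqrt(n): n copies of K evaluated at theta / sqrt(n)
K_clt = sp.expand(n * K.subs(theta, theta / sp.sqrt(n)))

# the kappa2 term is perfectly balanced and survives; kappa3 is left with a
# 1/sqrt(n), kappa4 with a 1/n, and both are driven to zero in the limit
lim = sp.limit(K_clt, n, sp.oo)   # theta**2 * k2 / 2, the CGF of N(0, k2)
```

The surviving limit is exactly (theta²/2)·kappa two, the cumulant generating function of a mean-zero normal with variance kappa two, which is the statement of the central limit theorem reached in the lecture.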