Heteroskedasticity is one of the problems you may have run into when doing regression analysis. Regression analysis assumes that the error term is homoskedastic, which means that the variance of the error term does not vary systematically across observations. The Deephouse paper is a good example of how this should be handled in a write-up: the authors state that they had a heteroskedasticity problem, explain what the problem is, and explain what they did about it. Heteroskedasticity is not a feature of the regression line itself; it is a feature of the error term. If the variance of the error term is constant, we do not have a heteroskedasticity problem; if the variance changes systematically, for example with the predicted values, then we do. So why would that be a problem? Let's look at an example of how heteroskedasticity affects regression analysis. Here we have two populations. Both populations have the same regression line, but in one the variance of the error term is the same everywhere, while in the other the observations are close to the line at one end and then spread out as we move to the right: the variance increases to the right, whereas in the first case it is constant. When we estimate a single regression model, the results look pretty much the same either way. So we take a sample of 50 observations from the first population and a sample of 50 observations from the second population, and we run a regression on each sample.
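The two-population setup described above can be sketched in a small simulation. This is a minimal illustrative sketch with made-up parameters (slope 0.5, intercept 1, error spread chosen by me), not the lecture's actual data; the point is only that repeated sampling from the heteroskedastic population produces a wider spread of slope estimates.

```python
import random
import statistics

def ols_slope(xs, ys):
    # Closed-form OLS slope for a simple (one-regressor) model.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def draw_sample(n, rng, hetero):
    xs = [rng.uniform(0, 10) for _ in range(n)]
    # Error s.d. is constant when hetero=False, grows with x when True;
    # the two settings have roughly comparable average error variance.
    ys = [1.0 + 0.5 * x + rng.gauss(0, 0.3 * x if hetero else 1.5)
          for x in xs]
    return xs, ys

rng = random.Random(42)
slopes_homo = [ols_slope(*draw_sample(50, rng, hetero=False))
               for _ in range(1000)]
slopes_het = [ols_slope(*draw_sample(50, rng, hetero=True))
              for _ in range(1000)]

# The slope estimates vary more over repeated samples when the
# error variance increases with x.
print(statistics.stdev(slopes_homo))
print(statistics.stdev(slopes_het))
```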
We get regression lines that look like this. If we repeat the sampling, we get a slightly different regression line from each new sample, and with the heteroskedastic population the lines vary much more from sample to sample. So the slope of the regression lines over repeated samples varies a lot more here than it does in the homoskedastic case; there is a lot less variation there than here. Part of this is related to where the regression lines cross: in the first plot they cross a bit more to the right, but that explains only a very small part of the difference. So why would that be a problem? It is a problem for two reasons. First, the variance of the estimates increases by about 30% going from the homoskedastic case to the heteroskedastic one, which means that the OLS estimates are efficient in the first case but inefficient in the second. To call OLS estimates inefficient, we have to consider alternatives, and an alternative estimator called weighted least squares would be more efficient than OLS here; that is what makes OLS inefficient. The reason is that where the variance of the error term is small, the observations are close to the regression line, so those observations tell us quite precisely where the line goes. Where the variance is larger, the observations are a lot less capable of telling us where the regression line goes. The idea of weighted least squares estimation is therefore to weight the high-variance observations a bit less than the low-variance observations; that way we get more precise estimates and increased efficiency. But that is a minor problem. There is a bigger problem.
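The weighting idea can be written out directly from the weighted normal equations for a simple regression. This is a hypothetical sketch on simulated data: the weights 1/sd² assume we know exactly how the error variance grows with x, which in practice would have to be modeled or estimated.

```python
import random

def wls_fit(xs, ys, ws):
    # Weighted least squares for y = a + b*x: each squared residual is
    # multiplied by its weight, so low-variance observations count for
    # more and high-variance observations count for less.
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    b = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    a = (swy - b * swx) / sw
    return a, b

rng = random.Random(1)
xs = [rng.uniform(1, 10) for _ in range(50)]
# Error s.d. grows with x (made-up rate 0.4*x), so the ideal weight
# for each observation is 1 / sd^2 = 1 / (0.4*x)^2.
ys = [2.0 + 0.5 * x + rng.gauss(0, 0.4 * x) for x in xs]
weights = [1.0 / (0.4 * x) ** 2 for x in xs]
a, b = wls_fit(xs, ys, weights)
```

Setting every weight to 1 in `wls_fit` reproduces ordinary least squares, which is one way to see WLS as a generalization of OLS.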
The bigger problem is that the standard errors become biased. It is okay that our estimates are imprecise, but we should not be overstating the precision. Lack of efficiency is something we can live with, but a biased standard error is bad, and an inconsistent one is worse. The reason the standard errors are biased is that the formula for the standard error depends on three different things. In a simple regression, with just one independent variable, there is no R-squared term, so the variance of the slope estimate is simply the variance of the error term divided by the variation of the independent variable; in symbols, var(b) = var(u) / sum of (x − x̄)². The independent variable is the same in both plots, so the only thing that could matter is the variation of the error term, and the overall variance of the error term is the same for both plots; it is just that in the second plot the errors vary a lot more on the right-hand side than on the left-hand side. This equation therefore gives the exact same standard error estimates, on average, for both cases. But because the cases differ, the estimates actually vary a lot more in the second case than in the first, so the equation is unbiased only for the first plot, not for the second. The consequence is that our standard errors become biased. What do we do about that? We have a few options. The simplest way to deal with the inefficiency side of heteroskedasticity is just to live with it; the efficiency difference may not be substantial anyway. The standard error bias is the real problem. Fortunately, we can apply something called heteroskedasticity-robust standard errors, and those produce consistent standard errors even under heteroskedasticity. Some researchers go as far as saying that you should always use these robust standard errors. For example, Antonakis and colleagues, in their 2010 paper on causal claims, say that you should always, as a rule, apply robust standard errors. But not everyone agrees, and there are reasons why you shouldn't always be using robust standard errors.
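The two variance formulas can be put side by side in code: the conventional standard error pools all squared residuals into one common s², while the heteroskedasticity-robust version (the White/HC0 form for a simple regression) lets each observation keep its own squared residual. This is a minimal sketch on simulated data with parameters I made up, not any particular package's implementation.

```python
import math
import random

def slope_ses(xs, ys):
    """Conventional and HC0 robust standard errors for the OLS slope."""
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * y for x, y in zip(xs, ys)) / sxx
    a = sum(ys) / n - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Conventional: one pooled error variance s^2 over the variation of x.
    s2 = sum(e * e for e in resid) / (n - 2)
    conventional = math.sqrt(s2 / sxx)
    # Robust (HC0): each observation contributes its own squared residual,
    # so observations where errors are wilder get a bigger say.
    robust = math.sqrt(sum(((x - mx) ** 2) * e * e
                           for x, e in zip(xs, resid)) / sxx ** 2)
    return conventional, robust

rng = random.Random(7)
xs = [rng.uniform(0, 10) for _ in range(500)]
# Error s.d. grows sharply with x, so the pooled formula understates
# the true sampling variance of the slope here.
ys = [1.0 + 0.5 * x + rng.gauss(0, 0.1 * x * x) for x in xs]
conv, rob = slope_ses(xs, ys)
print(conv, rob)
```

With homoskedastic errors the two numbers would agree closely; under strong heteroskedasticity of this fanning kind the robust standard error comes out clearly larger.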
For example, John Fox, who has written most of the R packages that I use on this course, has said that we shouldn't really be using robust standard errors unless we need them. The reason is that relaxing an assumption always comes with a cost. In regression analysis the conventional standard errors are unbiased in all sample sizes, but the robust standard errors, while eliminating the homoskedasticity assumption, come with the cost that they are only asymptotically unbiased, which means they can be biased in small samples. So if you don't have a heteroskedasticity problem, or the problem is very mild, then using the conventional standard errors is probably the better choice in a small sample. If you have a large sample, thousands of observations, then using robust standard errors as a rule is probably a good idea. The Deephouse paper presents a good example of how to report heteroskedasticity. As with any other regression assumption, you state whether you had a problem with the assumption, and then you explain how you identified the problem: they plotted the residuals against the predicted values and identified a fan shape there, which is an indication of heteroskedasticity. Then you explain what you did about the problem: they used weighted least squares estimation. That is not very common anymore, because heteroskedasticity-robust standard errors are now implemented as a standard feature in the leading statistical packages, so just using robust standard errors is a lot easier, and it produces almost as good a result as weighted least squares.
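The residuals-versus-fitted check described above can also be approximated numerically: correlate the absolute residuals with the fitted values, and a clearly positive correlation plays the role of the fan shape you would see in the plot. This is a crude illustrative stand-in of my own, on simulated data, not a formal diagnostic such as the Breusch-Pagan test.

```python
import random

def fan_shape_score(xs, ys):
    # Fit OLS, then correlate |residual| with the fitted value: a clearly
    # positive correlation is a rough numeric stand-in for the fan shape
    # you would look for in a residuals-vs-fitted plot.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    fitted = [a + b * x for x in xs]
    absres = [abs(y - f) for y, f in zip(ys, fitted)]
    mf, mr = sum(fitted) / n, sum(absres) / n
    cov = sum((f - mf) * (r - mr) for f, r in zip(fitted, absres))
    var_f = sum((f - mf) ** 2 for f in fitted)
    var_r = sum((r - mr) ** 2 for r in absres)
    return cov / (var_f * var_r) ** 0.5

rng = random.Random(3)
xs = [rng.uniform(0, 10) for _ in range(500)]
ys_het = [1.0 + 0.5 * x + rng.gauss(0, 0.3 * x) for x in xs]  # fanning errors
ys_hom = [1.0 + 0.5 * x + rng.gauss(0, 1.5) for x in xs]      # constant errors
print(fan_shape_score(xs, ys_het))
print(fan_shape_score(xs, ys_hom))
```

The score is clearly positive for the fanning data and close to zero for the constant-variance data, mirroring what the eyeball check on the plot would tell you.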