Okay, thank you very much. Just to acknowledge Murray's role in this paper too, so this could be relabeled the Murray session a little bit.

Okay, so I have one idea, and if this is all that you remember from this paper, then you've got the central idea. It is for those of you that actually work with asset indices, which have become more and more popular as people work on demographic and health surveys. What these asset indices do is look for correlations between the different kinds of assets that people hold: the car, the TV, the fridge. And the idea is that what is common to those assets, what is captured by that correlation, is often referred to as wealth.

The problem with this approach is that it breaks down if there is a subgroup within the population. In South Africa, and in fact in a lot of Southern African countries, it breaks down in the rural areas, where people hold assets, particularly livestock, that are not well correlated with the fridges, the cars, the TVs. What typically happens is that those assets get a negative score on these asset indices. And what that does is that people who actually have more stuff, who have livestock, end up ranking lower on these scores than people who have nothing at all. So that's the one idea which is new.

This is integrated into a sort of bigger argument, and this is the summary of the argument
I'm trying to make. I've already made the point that asset indices can have these anomalous rankings, and that in our context what that tends to do is exaggerate urban-rural differences. What I'll argue is that one doesn't have to construct asset indices in the way that this has been done thus far, that we can get around this particular problem, and that one of the nice side effects of constructing asset indices in a different way is that we can give a cardinal interpretation to the indices, which means we can use inequality measures on them.

I apply this to the South African data, and we'll find that on that measure asset inequality decreases between 1993 and 2008. That is quite a different picture from income inequality, which has been fairly static over that period. It turns out that the reason for this difference between the asset and the income view of inequality is that, in order to do the comparisons on the assets, I effectively have to use the same asset table for the two periods. So if the holdings of assets increase over time, as I show that they do, the static schedule will then definitely show that inequality has gone down, whereas that is not the case with incomes, which can basically continue to rise over the period.

And the final point that I want to make in this paper is that, with all of these methods of calculating asset indices, whether it's principal components or this new index that I'm going to propose, you have to be very careful when you construct them, and actually interrogate them to make sure that the coefficients and the indices make sense. So the automated procedure that is typically used
I have a problem with. So that's essentially everything, said up front, so that if I run out of time you already know what it is all about.

So now I'm going to tell you how I'm going to go through this. I'm going to give a motivation for why this is interesting to look at. I'm going to look a little bit at the standard approach for creating asset indices, and think about what it does when we use it on the typical asset schedules in these surveys, which are typically binary variables, and do that fairly systematically. Once I've sketched out the approach, I will apply it to the Demographic and Health Survey data from 1998 and then to the cross-time comparisons.

Okay, so my motivation for looking at this is that asset indices have become very widely used in the development literature, and that has been a function mainly of the fact that for a lot of countries we have demographic and health surveys but very poor information of other kinds. So a lot of people who want to work in the area of development have used these indices as a proxy for incomes, and if you just search on Google Scholar you get thousands and thousands of papers that have used these DHS wealth indices, or indices on other data sets.

There's been some external validation of the indices. People have looked at surveys in which you have both asset indices and incomes, to see how they stack up against each other, and typically the asset indices do fairly well. But what I will show is that there are these internal inconsistencies, anomalies, which are a little bit troubling.

So what's nice about the asset indices?
They've actually proved, in the empirical work, really useful for separating out the poor from the rich. But one big limitation is that inequality work has thus far been impossible on them, because of the way in which they are created, and it would be nice to be able to say something about inequality in those cases where all that we have are these asset indices.

Okay, so the main objective of the paper is to question how these asset indices are created, to argue for an alternative method of doing that, and to show that this method works, but then to warn that maybe it doesn't work quite as well as one would like. So again, a sort of cautious conclusion at the end.

Okay, to start off on the literature. The literature really goes back to the Filmer and Pritchett paper of 2001, which argued that if you only had these asset schedules, then by simply running a principal components analysis over the schedule you could extract the first component, and whatever is common to these assets should be thought of as wealth. That intuitive justification got seized on by the people who ran the demographic and health surveys, and they then put it in as a default approach for creating a wealth index, which is released with every demographic and health survey. Which is why these things are used so commonly.

So just to make it clear what these things do. The idea is that we have a set of K assets, and we think of them as being generated by a bunch of unobserved factors, A1 through to AK. In principal components these are thought to be uncorrelated with each other, orthogonal to each other, and A1, the first principal component, is the thing underlying these assets which explains most of their common variance. That interpretation is why people think of it as wealth. Now, that sort of latent variable formulation comes in different versions, factor analysis, multiple correspondence analysis (MCA), but they
all really rely on this idea that there's a sort of latent common factor that is driving the assets that we see.

The mechanics, in principal components, are that you first standardize the variables, and then the scoring coefficients come off the correlation matrix as the first eigenvector. There are a couple of consequences that follow from doing it that way. The first is that the asset indices are constructed to have a mean of zero, and that immediately means that you can't do standard inequality analysis on them, because you have these negative values. But what also follows is that, because you first divide through by the standard deviation, even when people report these scores, those are not the underlying weights on the original assets. The weights are a combination of the score and the inverse of the standard deviation. Typically this stuff is never actually reported, and certainly if you look at the DHS, they don't tell you what the weights on the assets are. That means it's really like a black box, which people then end up using.

So they have been validated, as I mentioned earlier. Filmer and Scott, in a 2010 article, looked at how these different asset indices compare against each other, and how they compare to per capita expenditure where you've got it. They basically argue that they all pretty much do the same things, but that's because they were essentially comparing asset indices that come out of the same underlying stable, whether principal components, factor analysis, or multiple correspondence analysis. And they also take a view on the cases where the index is not that well correlated with per capita expenditure.
Their argument is that it may actually be that the assets are picking up longer-run well-being, whereas per capita expenditure is picking up much shorter-run things, and therefore it's not clear that this is a problem for the asset index.

There've been a bunch of criticisms aimed at these asset indices, mainly because the variables that go into them are discrete, so the underlying categorical structure can influence how the indices work. There are questions around whether infrastructure variables should be in there, or whether one should just use the durable goods. But what hasn't been shown thus far is that you get these sort of anomalous rankings, which I'm going to talk about in a moment.

So before we go on to rethinking how one should go about asset indices, I think it's useful to think about what one would want of an asset index. For me, one of the underlying desirable properties is this. Say I have a vector of asset holdings for each of two people, K assets here and K assets there. If each one of mine is bigger than the other person's, I have a TV and this person doesn't, I have a car and that person doesn't, and so on, then my asset score for this holding should preferably be bigger than that one.
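As a rough illustration of that requirement, here is a minimal sketch of how one might test a linear asset index for it. It is not from the paper: the function names, the three-asset layout (TV, car, livestock) and the weights are made up for the example, and "bigger" is read as "at least as much of every asset and more of at least one".

```python
import numpy as np

def dominates(a, b):
    """True if holding a has at least as much of every asset as b, and more of one."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a >= b) and np.any(a > b))

def violates_monotonicity(weights, a, b):
    """True if a dominates b but a gets a strictly lower score than b."""
    w = np.asarray(weights, dtype=float)
    score_a = float(w @ np.asarray(a))
    score_b = float(w @ np.asarray(b))
    return dominates(a, b) and score_a < score_b

# Hypothetical weights for (TV, car, livestock); the negative one is invented.
weights = [0.5, 0.7, -0.4]
mine = [1, 1, 1]     # TV, car and livestock
theirs = [1, 1, 0]   # TV and car only

print(violates_monotonicity(weights, mine, theirs))  # True
```

With any negative weight on a good, a check like this can be made to fail by comparing two holdings that differ only in that good.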
That property is the monotonicity requirement, and of course it works only if these are goods rather than bads. But it is in fact, as I'll show in a minute, violated by some of the current indices. If I think about inequality, it would be nice that somebody who has nothing, so these are all zeros, that absolute zero, actually gets a score of zero, because then I know I can construct measures that respect the typical inequality axioms. And I would like a measure which is fairly robust, in the sense that it shouldn't be that sensitive to whether the variables are continuous or binary.

Okay, so let's start off by thinking about what happens if I only had one binary variable and I wanted to do inequality analysis on it. Well, that's a little bit of a problem, because typically none of the axioms work in this case. I can't do a transfer from a richer to a poorer person while keeping their ranks constant. I can't scale everything up. But even though I can't do all of these things, I can still see what happens if I plot a Lorenz curve and calculate a Gini on it. The Lorenz curve in this case would just look like that: this is the fraction of the population that doesn't have the asset, and this is the fraction that does. And the Gini coefficient, if I measure it on that, is just 1 - p, where p is the fraction of the population that has the asset. So that kind of inequality measure does sort of make sense when I apply a Gini to this simple binary variable, and that gives me at least some confidence that it's not completely stupid to think about inequality in the context of assets, even binary assets.

But the moment that I add a second asset, I have to start thinking about what happens. Let's say I've got a car and a TV. So how do I compare them?
How does the car and no TV compare to the TV and no car? I have these four possible outcomes, and if I want to think about an inequality measure on them, I have to think not just about how I rate these things but about what happens if I redistribute them. If initially I have two people, one who has the TV and one who has the car, I have the same aggregate outcomes if one of them has nothing and the other has everything. I want to somehow penalize that type of redistribution. So I want to make sure that increasing the concentration of the assets in one of the individuals makes my inequality measure go up.

I can see what happens if I take these two binary variables and do a principal components index on them. It turns out I can work out precisely what the scores are: they depend on the proportions who hold each of the assets and the proportion that holds both. But one of the things that can happen in this case is precisely this possibility. Here I've given you a possible scatter diagram: a bunch of people who have nothing, a bunch of people who have both, but most people have one or the other and not both. This is a classic case in which, if I do a standard principal components analysis, the principal component is going to pick up that negative correlation between the assets, and what it will do mathematically is score one of the assets as negative. And why does it do that?
Well, if you really believe this latent variable approach, which is built into all of these assumptions, the only way it can make sense of that negative correlation is if it views one of the assets as a bad. So if there's a negative correlation, one of the assets is a good and the other is a bad. Then, in this binary case where you have two assets only, it's going to score somebody who doesn't have the first asset higher than somebody who has both, and somebody who has nothing will score higher than somebody who has the asset. In fact, we can show cases where somebody who has both assets will score even lower than somebody who has nothing. The question is, well, is this just a mathematical curiosity that comes out of thinking about this? It turns out that there are cases where this matters.

One of the ways in which one can think about these two-asset, three-asset, or four-asset cases is by looking at the literature on multidimensional inequality indices. But the problem is that that literature really all assumes that the underlying variables are continuous. The measures don't really work if you have binary variables. There is one approach, in a paper by Banerjee called the multidimensional Gini, where he essentially thinks of them as continuous variables. His procedure is basically like principal components, except that he does not standardize the variables first.
He takes the uncentered variables and divides each variable by its mean. Running principal components on these, it turns out, you can calculate an index to which you can apply a Gini coefficient. And it turns out that this procedure is guaranteed to give you only positive or zero values, and in fact you get penalized for concentrating the assets. So that's basically what it does. Applying this Banerjee procedure to these typical asset schedules, you are guaranteed to get an asset index that obeys the principle of monotonicity, that has an absolute zero, and that can be used to calculate a Gini coefficient even when everything is just binary.

Okay, so let's apply that to the DHS. The first thing to note is that if you look at the South African DHS, the weights implicit in the scores (I backed them out by reverse engineering, regressing the scores on the assets) are strongly negative on the livestock variables, which is what I was expecting. The same thing is true of multiple correspondence analysis and factor analysis, if you were to redo the DHS wealth index that way.

I'm going to skip through a couple of slides. When you use this uncentered version, the Banerjee index, and calculate Ginis on the assets in 1998, this is what the Ginis look like: the Gini for South Africa in 1998 on assets was 0.623, and it goes up from the cities through to the rural areas. The other thing to note is that if we use this uncentered approach to rank the bottom 40 percent, you get more urban poverty than you do on the DHS index. Partially, that's because the DHS index really penalizes the rural assets, so it makes rural areas look poorer than they do under the uncentered version of the principal components.

This is the story of South African income inequality.
We already saw that it didn't shift. But if you look at what happens to assets over this time, asset holdings have increased across the board. So if you now do the Lorenz curves on this asset index, they show a major decrease in inequality, and that is basically because the asset register is fixed. So they're really two different questions. My asset inequality measure really looks at the gap between the haves and the have-nots, which has decreased on that asset scale, whereas the income inequality measures are really independent of that very stark have/have-not correspondence. That's it.
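The contrast between the two constructions can be reproduced on a toy example. What follows is a minimal sketch under stated assumptions, not the paper's code. The eleven households and the two assets (fridge, livestock) are invented. The "uncentered" index is one reading of the procedure as described in the talk: divide each asset by its mean, do not center, and take the first eigenvector of the cross-product matrix. The function names are hypothetical.

```python
import numpy as np

def first_eigvec(m):
    """Eigenvector belonging to the largest eigenvalue of a symmetric matrix."""
    _, vecs = np.linalg.eigh(m)          # eigh returns eigenvalues in ascending order
    return vecs[:, -1]

def standard_pca_scores(x):
    """Standardize, then score with the first eigenvector of the correlation matrix."""
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    w = first_eigvec(np.corrcoef(x, rowvar=False))
    w = w if w[0] > 0 else -w            # sign is arbitrary; treat the first asset as a good
    return z @ w, w

def uncentered_scores(x):
    """Divide each asset by its mean (no centering), then score with the first
    eigenvector of the cross-product matrix. Assumed reading of the talk."""
    z = x / x.mean(axis=0)
    w = first_eigvec(z.T @ z / len(z))
    w = w if w.sum() > 0 else -w         # fix the arbitrary sign
    return z @ w, w

def gini(v):
    """Gini coefficient of non-negative values, from the sorted-rank formula."""
    v = np.sort(np.asarray(v, dtype=float))
    n = len(v)
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1) / n

# Invented households, columns = (fridge, livestock).
# Rows 0-1 hold nothing, 2-5 fridge only, 6-9 livestock only, 10 holds both.
x = np.array([[0, 0]] * 2 + [[1, 0]] * 4 + [[0, 1]] * 4 + [[1, 1]] * 1, dtype=float)

std_scores, std_w = standard_pca_scores(x)
unc_scores, unc_w = uncentered_scores(x)

print("standard PCA weights:", std_w)
print("livestock-only score vs nothing score:", std_scores[6], std_scores[0])
print("uncentered weights:", unc_w)
print("Gini of uncentered index:", gini(unc_scores))
print("Gini of one binary asset with p = 0.5:", gini([0, 0, 1, 1]))
```

Because the two assets are negatively correlated here, the standard index gives them weights of opposite sign, so the livestock-only households score below those with nothing. The uncentered index gives both assets a non-negative weight and scores the empty holding at zero, which is what allows a Gini to be computed on it. The last line checks the single-binary-asset case, where the Gini equals 1 - p.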