Good morning. My name is Rachel Gisselquist. I'm a senior research fellow at UNU-WIDER, and I'm incredibly pleased to chair this first keynote session of our conference. Our keynote speaker is Francisco "Chico" Ferreira. I think he needs no introduction in this group, but I should introduce him. He is the Amartya Sen Professor of Inequality Studies at the London School of Economics, where he also serves as director of the International Inequalities Institute, and he is currently the president of the Latin American and Caribbean Economic Association. He will be speaking today on inequality of opportunity and mobility in Latin America. This is fascinating new work, highlighting two new data-driven approaches to addressing important gaps in the literature. So we're in for a treat, I think, this morning. Immediately following the presentation, we'll have comments from our discussant, Dr. Forhad Shilpi. She is a senior economist in the Sustainability and Infrastructure Team at the World Bank Development Research Group. She works in several different areas, but in particular she works on intergenerational mobility in developing countries, and she'll be able to bring in insights, I think, also from outside of Latin America. Before joining the World Bank, she served as a research associate with the Bangladesh Institute of Development Studies. Following the discussant's comments, we'll give Chico a chance to respond, and then we'll also open the floor for comments and questions, especially from you, the audience. So without further ado, let me turn over to Professor Ferreira.

Buenos días a todos. That was my Spanish for the day. Good morning, everyone. It's a great pleasure to be here. Let me start by thanking Kunal Sen and Carlos Gradín and everyone at WIDER, as well as Marcela Eslava and Leopoldo Fergusson and everyone at Uniandes, for the invitation. Great pleasure to be here.
Always very nice to be at Universidad de los Andes, which is really one of the premier centers for the study of economics in our region. So what I'm going to talk about today is, as was said, inequality of opportunity, and what I'm now calling intergenerational persistence, the opposite of mobility, in Latin America. It's joint work with François Bourguignon, Paolo Brunori and Guido Neidhöfer. It's part of a broader project, which I hope you'll hear more about in the course of some sessions to follow. There are three sessions in this conference that are presenting work from this project, which we have called the Latin American and Caribbean Inequality Review, where we have these five themes you see there: levels and trends of inequality; inequality of opportunity; inequality in markets; the role of the state, taxation and redistribution; and politics, the interaction between inequality and political power. And this paper is one of 27 papers that are being written for that project. So the outline of what I want to do today is just a little bit of motivation. Then I'll do a very brief, an unfairly brief, review of the literature on intergenerational mobility and inequality of opportunity in Latin America, which is much richer than what I'll have time to discuss. I'll suggest some of what we've learned, but also some of the shortcomings, and then, based on those shortcomings, suggest a new approach that we've developed with my co-authors. I'll talk a little bit about the data we have and then give some results for ex ante and ex post inequality of opportunity before concluding. So the motivation is probably not needed that much in this audience. But it is that socioeconomic advantage, in much of the world but particularly in this region, is extremely persistent. And I'm going to show you just four figures, which may be known to you already, about some of that.
What we have here are measures of cognitive development using the Peabody Vocabulary Recognition Test for a sample of Ecuadorian children. It's from work by Christina Paxson and Norbert Schady in the JHR some time ago now. And they divided their sample by attributes that had nothing to do with these children's own performance, but were inherited circumstances that were shaping their opportunities. These are three-to-five-year-olds. These are little kids before they go to school. You should think of this as a measure of cognitive development. And those in the wealthiest households in the sample kind of stay around this level of 100, which is the norm for their age. So this is, again, a measure of cognitive development. It's normed. It reflects their ability to recognize words in Spanish. And the wealthiest 25%, or those whose mothers had the most schooling, stay around that level. But those in the same sample who belong to the poorest households, or whose mothers, not them, but their mothers, had very low levels of schooling, are falling behind. It's not that they're forgetting words. It's just that they're falling behind relative to the norm. Okay. And this is before they go to school. So unsurprisingly, when we look at the distribution of schooling achievement and learning in our region, we also find big differences when we look at distributions that are conditional, not on the kids' own efforts, but on things that these kids inherited, on things that persist through generations. So here are density functions of test scores in reading in Spanish from PISA 2006, conditional on father's occupation. We just divided the occupations arbitrarily into a low-ranked group and a high-ranked group. And for these five countries here, you can see very big differences in learning achievement. So it starts with cognitive development before school.
It continues into your schooling process, with learning being quite different by these different characteristics. And unsurprisingly, then, it manifests itself in differences in standard of living. When people are already adults, if you look at their distributions conditioned on their parents, this is the sort of picture you observe. These are cumulative distribution functions of per capita household consumption, so standard of living, if you like. But instead of having a single one for each of these five countries, Colombia, Ecuador, Guatemala, Panama, and Peru, we have three, conditional on mother's education. So we divide the population of these countries in these surveys by whether their mothers had no formal education, incomplete primary, or complete primary. The dates for these range from the late 90s to the early 2000s. They're not the most recent surveys. But you can see that the adults living in households today whose head's mother had no education have much lower standards of living than the other two groups. So these are just illustrations, prima facie evidence of the existence of very substantial inequality of opportunity and of the persistence of socioeconomic status across generations. I've shown you just correlations and conditional distributions of outcomes in one generation, conditional on parental variables and parental outcomes. And it goes beyond economics. Here's a slide that I kind of stole from Leopoldo Fergusson, who's here, and his co-authors. This gentleman here is Don Juan Vázquez de Coronado, who was born in 1523 in Salamanca, in Spain. He was one of the conquistadores, specifically in the region of Costa Rica. And Samuel Stone, in a 1975 book, traced various branches of descendants of Juan Vázquez de Coronado and found that they accounted for 31 presidents and 285 deputies in the history of Costa Rica.
So again, substantial evidence of persistence of political power and concentration across, in this case, many, many generations. Now, these are descriptives. Of course, there is an established literature, and there are some eminent contributors to that literature on mobility here, like Marco Sienti and many others. The economists have looked at this and tried to measure it basically in two ways, and they're related, as I'll discuss. One is a literature on intergenerational mobility, the converse of persistence. The other is a literature on inequality of opportunity per se. I'll describe them both. This has been done for many outcomes and for many variables, but I'll focus here on income and education, which have been the ones used most frequently in the world and in the region as well. So you should think of this as a survey of the literature on attempts to quantify, in a rigorous way, the extent of the intergenerational persistence that I've described through those very few slides a moment ago. Before I start on the review, let me make a brief remark on the two approaches, which I'll call IGM for mobility and IOP for opportunity. In very simple terms, IGM is about how strongly correlated or associated the same kind of variables are across two generations. You all know this sort of Galtonian regression of child income on parental income, or child education on parental education. That's what it is about: association between two variables. IOP, inequality of opportunity, can be defined and has been defined in many ways, but one of them is basically: what share of today's inequality, if you look across the distribution of the population today, across Bogotá, as Marcela was saying, what share of all the inequality that you see is attributable to inherited factors, to things that people have inherited, genetically or otherwise, from their families?
And these two things, if I describe them in that way, may seem different, but actually they're related in some fairly strong ways. To show that, let me just show you some very, very simple maths here. This is the standard Galtonian regression for intergenerational mobility, right? And people use this intergenerational regression coefficient, or elasticity if it's in logs, as a measure of persistence. Or you could use the correlation coefficient, which is a transformation of the beta that adjusts for differences in the marginal distributions of parents and children. And that's a measure of the association across generations. It's just a correlation coefficient. Okay, how do the IOP people try to measure inequality of opportunity? Well, again, there are many different ways, and I'm really simplifying here. But one way would be to just ask what share of the variance or the inequality in current income is explained, quote unquote, by circumstances. So the inequality in predicted incomes here, over the total, would be the share that's explained by circumstances. Now, of course, this looks like an R squared, right? If you use the variance here as the inequality measure, that would be the R squared of that regression. And of course, if you square the correlation coefficient, that's also what you get, the R squared. In fact, if you replaced C by YP, so if the only circumstance you looked at was parental income, and you used the variance as your measure of inequality, then the two measures would be identical in this case. Okay? So, although they are defined and explained in these different ways, they're very closely related: this idea of mobility and association, and the idea of the share of today's inequality accounted for statistically by predetermined factors. I'd like you to bear that in mind as I take you a little bit through the review.
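The equivalence just described can be checked numerically. The sketch below is an illustration only: the six parent-child income pairs are made up, and all variable names are hypothetical. It fits the Galtonian regression by ordinary least squares, computes the share of the variance in child income accounted for by predicted income, and compares that share with the squared correlation coefficient.

```python
from statistics import fmean

# Made-up (log) incomes for six parent-child pairs; illustrative only.
parent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
child = [1.5, 1.8, 3.2, 3.9, 4.1, 6.3]


def pvar(xs):
    """Population variance."""
    m = fmean(xs)
    return fmean([(x - m) ** 2 for x in xs])


def pcov(xs, ys):
    """Population covariance."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# IGM side: regression coefficient (beta) and correlation coefficient (rho).
beta = pcov(parent, child) / pvar(parent)
alpha = fmean(child) - beta * fmean(parent)
rho = pcov(parent, child) / (pvar(parent) * pvar(child)) ** 0.5

# IOP side: variance of predicted income over variance of actual income,
# with parental income as the only circumstance.
predicted = [alpha + beta * p for p in parent]
iop_share = pvar(predicted) / pvar(child)

print(f"rho squared = {rho ** 2:.6f}, IOP share = {iop_share:.6f}")
```

With a single circumstance and the variance as the inequality index, the two printed numbers coincide; with more circumstances or another index they generally differ.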
So, let me start with the literature on intergenerational mobility. Again, one slide on education, one slide on income. There are a number of papers here that I am listing, going all the way back to 1999 and ending now with some very nice, very granular new work by Muñoz in 2021. I don't have time to go into the contributions of each of these papers, but to highlight the main findings, all of these papers find correlation coefficients, that rho that I was showing you here, to be quite high in Latin America. We did it for our own data, which I'm going to use later here, and the numbers are between 0.45 and 0.60, which match a lot of the results in the literature. For this years-of-schooling variable, those would compare to about 0.33 or 0.35 for the U.S., and not too far away in Italy and Spain and other countries. So there is a greater degree of persistence, in this origin-independence sense, in Latin America than in the developed countries that we compare with. Looking at trends, there have been improvements in absolute mobility in the region. So if you look at measures of mobility which are not just relative, like the correlation coefficient, but look, for example, at the proportion of children that do better than their parents, or the proportion of the new generation whose parents didn't complete secondary school but who do complete secondary school themselves, these kinds of measures that we often now associate with Raj Chetty's work in the U.S., which are very standard measures of directional, absolute, upward mobility, those have been improving in the region, in large part because of big expansions in the schooling system, in the quantity of education provided in the region. But the relative mobility, the rank correlations, have been more stable, or at least have grown by much less. So that's kind of the upshot of this. There's also a literature on mobility in income.
So, what have we found about intergenerational mobility in income? Well, there are a number of studies, but all of them deal with a fundamental difficulty in the case of Latin America, which is that to really measure intergenerational mobility in income well, you'd like to observe the income of today's adults, the children's generation, and the income of their parents, matched one to one, over many different years in their lifetimes. That's what you would like to have. So you have to have that link at the individual level, and ideally many periods. This is extremely rare in Latin America, almost unheard of, with the exception of a recent paper that uses admin data for Uruguay, and I was told yesterday by my friend Sergio Firpo, I don't know if he's here, that there's one for Brazil as well by Breno Sampaio, which is still a working paper. So there are some new attempts to use admin data to look at this, but they are plagued by the fact that, unlike in Finland or Denmark, if you use most administrative data in Latin America, you are typically missing the informal sector, which is half the population. So that's a problem. These people know this and they try to correct for it, but it is an issue. The bulk of the work before that used two-sample two-stage least squares approaches, where basically you use information on pseudo-parents from an older survey. So you don't observe the parents and the parents' incomes of today's generation, but you estimate them for people who look like the parents of today's generation in terms of variables observed back then. And this is what could be done, but my discussant, Forhad, and her co-authors, along with other people, have a number of papers that show that two-sample two-stage least squares does have substantial shortcomings and may lead to biased results. So we have very high estimates of the betas here, but they are problematic.
Okay, so continuing with my review a little bit, this was mobility. What can we say about inequality of opportunity? There have been, again, papers on inequality of opportunity in income and in education. I mention some of the papers there. This big table, which I don't want you to focus on, but just to look a little bit at this column here, gives you older estimates from regressions of exactly this kind. So this is a very simple, what people call ex ante, approach, the first-generation estimates. And what we found is that if you look at these countries here, the share of income inequality or consumption inequality that is accounted for by some predetermined circumstances, basically place of birth, your parents' education, your father's occupation, and so on, ranges in the region from around 25% to 36% or so for income, and goes all the way up to about half in Guatemala. But for income it is between 25 and 36% or so, with Colombia actually, interestingly, being one of the least unequal, though there are reasons to be suspicious of that, because they don't have some of the data on circumstances. So the comparisons are difficult to make. Now, because of the new approach I'm going to introduce in a moment, let me say that when we did this, and this is from an old paper with my co-author Jérémie Gignoux, we had to take a bunch of circumstances and partition the population into these groups that have identical circumstances, which people call types. And we did this more or less arbitrarily. So, again, a big table here, but this is what we had for each country. We had variables on people's ethnicity or race, on father's occupation, on mother's and father's education, on birth region. And we divided them into these little groups, these cells. We said, well, birth region, I don't know, what do we do?
Well, let's say São Paulo and Brasília, which are rich cities, and then these regions, and then these regions. Or if I go to Colombia: okay, departments of the periphery, central departments, and then Bogotá, San Andrés, and Providencia. We looked at the means of variables and we said, let's group them into richer and poorer, and more or less come up with a partition of the population into these types, these groups of people who share identical circumstances. The inequality between them is basically the inequality of opportunity that we were measuring. In the table that I showed you before, you have these non-parametric estimates, which are just exact decompositions of overall inequality into inequality between these groups. The parametric version is just the same, assuming a linear relationship. Okay, so I mention this because it parallels the way the absence of good linked data across generations is a problem for the intergenerational mobility literature on income. So for understanding transmission and persistence of inequality looking at income, the problems in Latin America are that, for the IGM literature, there just isn't the ideal data in terms of linking the income of parents and children. And in terms of this literature, well, there are many problems, but one of them is that we were using these relatively arbitrary partitions. So what we want to do is to propose a new approach, a very data-driven approach, to try and remedy the shortcoming of the arbitrary partition for the inequality of opportunity approach. And before I show you the method, let me very briefly introduce, for those of you who may not be familiar with them, these two different views of inequality of opportunity. Inequality of opportunity is all about dividing the population into groups that have the same circumstances.
So, if you only have gender and race, you'd have, I don't know, black men, white men, black women, white women. If you have parental education, if you have all these other variables, you end up with potentially hundreds or even thousands of groups, as I'll show you. The idea is that these groups are all defined by circumstances, by things that you had no influence on. It's not your own education, it's your parents' education. So inequality between these groups reflects, in some sense, inequality of opportunity. Now, there are two ways of looking at that. One is basically to assume that you have equality of opportunity if the means across all these groups are the same. So this is the partitioned pie, these are the types, the subgroups. If the means across all these groups are the same, then in some sense the values of the opportunity sets across these groups are the same, and this is one definition of equality of opportunity. If you use that, then inequality of opportunity is just going to be differences in means. Some sort of inequality between means is going to tell you what inequality of opportunity is. This is the ex ante approach. The other approach is more demanding. It says, no, equality across means is not enough. I want the full distribution of income in each type to be the same. So whatever quantile of the distribution you are in, you would actually earn the same amount across types. This is known as the ex post measure of inequality of opportunity. And then what you need as a measure of inequality of opportunity is some aggregation of inequality across the quantiles. You take the distribution function and invert it, so you have a quantile function in each type, a conditional quantile for a given set of circumstances, and you aggregate inequality across them, and that would be your inequality of opportunity measure. This was first proposed by Vito Peragine and Daniele Checchi.
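A toy contrast between the two views may help. The sketch below is a deliberately simplified illustration, not the estimator used in the paper: it assumes two hypothetical types of equal size, uses the variance as the inequality index, and aggregates across quantiles with a plain average. The two types have the same mean but different spreads, so the means-based measure sees no inequality of opportunity while the quantile-based measure does.

```python
from statistics import fmean, pvariance

# Two hypothetical types with three people each (made-up incomes).
# After sorting, position k is the same quantile in both types.
types = {
    "type_A": [4.0, 5.0, 6.0],
    "type_B": [2.0, 5.0, 8.0],
}

# Means-based view: inequality between the type means.
type_means = [fmean(incomes) for incomes in types.values()]
ex_ante = pvariance(type_means)

# Quantile-based view: inequality across types at each quantile,
# then a simple (assumed) average over quantiles.
quantile_rows = list(zip(*(sorted(incomes) for incomes in types.values())))
ex_post = fmean([pvariance(row) for row in quantile_rows])

print(f"between means: {ex_ante:.3f}, across quantiles: {ex_post:.3f}")
```

Because both type means equal 5, the first number is zero; the second is positive because the bottom and top quantiles differ across types.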
For that, you need estimates of the type-specific quantile functions. Now, the key thing is that in both approaches, the first thing you need is a partition. You need to agree, given a bunch of circumstance variables, how you're going to divide your population into these types. Okay? But how should the population be partitioned? I mentioned to you a moment ago, looking at my own work with Jérémie, that we partitioned it fairly arbitrarily. And the reason we were grouping all those regions, Bogotá and Providencia and São Paulo and Brasília, was that we were vaguely aware of this problem here. Okay? So, in the data set that I'm going to show you results from today, take a country, Bolivia. After all our restrictions and taking away the missing observations, and so on, we have 6,000 observations in Bolivia. For them, we observe information on sex, two categories; ethnicity, seven categories; occupation of father and mother, nine categories each; education of father and mother, four categories each. So if I did the fine partition, if I identified types so that everyone in a type has exactly the same circumstances using all of these categories, I would have 18,000 groups. If I try to put 6,000 observations into 18,000 groups, I am overfitting the data massively, right? On the other hand, if I don't use all of that information, I am in some sense missing information on circumstances that is important for accounting for inequality of opportunity. These are problems that would not arise if you had a census or Scandinavian registry-type data. They arise from the sampling nature of the data. But it does mean that when we are using a sample to measure inequality of opportunity, as we typically are, and certainly almost always in Latin America, we are effectively facing two competing biases: a downward bias from omitted circumstances and an upward bias from overfitting. What is this downward bias from omitted circumstances? The little drawing here just says the following.
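The arithmetic behind the Bolivian example is easy to reproduce. In the short sketch below, the category counts are the ones quoted in the talk; the dictionary keys are hypothetical labels, and the sample size is the rounded figure of 6,000.

```python
from math import prod

# Number of categories per circumstance variable, as quoted in the talk.
categories = {
    "sex": 2,
    "ethnicity": 7,
    "father_occupation": 9,
    "mother_occupation": 9,
    "father_education": 4,
    "mother_education": 4,
}

n_obs = 6000  # rounded sample size mentioned for Bolivia

# The finest partition has one cell per combination of categories.
n_cells = prod(categories.values())
obs_per_cell = n_obs / n_cells

print(f"{n_cells} cells, about {obs_per_cell:.2f} observations per cell")
```

The product is 18,144, the "18,000 groups" of the talk, which leaves roughly one observation for every three cells.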
Suppose you had two circumstances, mother's education and father's education. So you partition the population into these nine groups. You look at the inequality between the means or between the distribution functions. That's your measure of inequality of opportunity. What happens if you add a circumstance you didn't have? Suppose you now have race, black and white, and you put that across here. You can see that inequality between types, as I add another partition, can only go up. It's like an R-squared type thing, right? It can only go up. So every time I don't use information on the partition, every time I don't include information on a circumstance variable or on finer categories, I may be underestimating inequality of opportunity. By the way, in case you were thinking, gosh, this is complicated, why don't we just stay with the mobility stuff: the same thing is true there. That's why, when they put grandparents' income in the regression, they suddenly find that grandparents' income is significant. The parents' income is not a sufficient statistic. So if you're interested in what the parents' income explains of today's inequality, I mean, it's a measure of association, but it suffers from exactly the same omitted variable problems as this would, okay? So you'd like to include as many circumstances as possible, but as I've just shown you in the case of Bolivia, if you use all of the possible partitions, you may overfit the data. So it's a sampling, statistical problem. Given a data set, how do we choose the partition? What we did in the very beginning is we eyeballed it and we said, well, this looks reasonable, right? Now, we've got to do a bit better than that. And that's what we're trying to do in this work, which really owes a lot particularly to my co-author, Paolo Brunori, who's been doing a lot of this. And that is to use a data-driven approach to actually select the partition.
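The "it can only go up" property can be seen in a small made-up example. The sketch below assumes the variance as the inequality index and uses eight hypothetical people; it compares between-type inequality when types are defined by mother's education alone with between-type inequality when race is added.

```python
from collections import defaultdict
from statistics import fmean

# Eight hypothetical people: (mother's education, race, income).
people = [
    ("low", "A", 2.0), ("low", "A", 4.0), ("low", "B", 1.0), ("low", "B", 1.0),
    ("high", "A", 8.0), ("high", "A", 6.0), ("high", "B", 5.0), ("high", "B", 5.0),
]


def between_type_variance(rows, type_of):
    """Population-weighted variance of type means around the overall mean."""
    groups = defaultdict(list)
    for row in rows:
        groups[type_of(row)].append(row[2])
    overall = fmean([row[2] for row in rows])
    weighted = sum(len(g) * (fmean(g) - overall) ** 2 for g in groups.values())
    return weighted / len(rows)


# Coarse partition: mother's education only (2 types).
coarse = between_type_variance(people, lambda r: r[0])
# Finer partition: mother's education crossed with race (4 types).
fine = between_type_variance(people, lambda r: (r[0], r[1]))

print(f"coarse partition: {coarse}, finer partition: {fine}")
```

Refining a partition can never lower the between-type variance, which is why omitting a circumstance biases the estimate downward, and also why a very fine partition in a small sample overstates it.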
And also, particularly in the ex post case, not only to select the partition, but to estimate features of the conditional distribution within groups that we're going to use for our estimate of inequality of opportunity. For the ex ante approach, because equality is equality of means and inequality is inequality between means, we're going to focus on the means between types. But for the ex post approach, we will actually have estimates of the quantile functions. And one of the neat things about this approach is that, for those of us who've worked on inequality of opportunity for a long time, there are these two approaches that are different conceptually, and this kind of machine learning approach we're going to use here is particularly well suited to estimating the information that we need for both of them. So the spirit is: given a data set, don't pick a random partition the way that we did before, as I showed you, but use a statistical approach to inform what the optimal partition is, in a well-defined statistical sense. So, this is a very tedious slide, and I'm going to just tell you what's actually done by these conditional inference trees. I'm trying to make this lively. So, these guys did these conditional inference trees. It's basically the following. You have a bunch of circumstance variables and you have an outcome, say, income. The algorithm tests the correlation between the outcome and the circumstances. If there's no significant correlation at some p-value threshold, then one exits the algorithm: no inequality of opportunity. If you reject the null hypothesis, you pick the variable with the smallest p-value, that is, the variable that is most statistically significantly correlated with income. So, the most important one, and you use that to partition the population.
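The variable-selection step just described can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code or the algorithm as implemented in any package: it uses a plain permutation test on the between-group sum of squares, ignores multiple-testing adjustments, handles only one selection step, and runs on made-up data. The function names, the variable names and the `alpha` threshold are all hypothetical.

```python
import random
from statistics import fmean


def between_group_ss(values, labels):
    """Between-group sum of squares of `values` grouped by `labels`."""
    overall = fmean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(len(g) * (fmean(g) - overall) ** 2 for g in groups.values())


def permutation_p_value(values, labels, n_perm=199, seed=0):
    """Share of label shufflings whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = between_group_ss(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_ss(values, shuffled) >= observed - 1e-12:
            hits += 1
    return (1 + hits) / (1 + n_perm)


def select_split_variable(income, circumstances, alpha=0.05):
    """Return the circumstance with the smallest p-value, or None if none passes."""
    p_values = {
        name: permutation_p_value(income, labels)
        for name, labels in circumstances.items()
    }
    best = min(p_values, key=p_values.get)
    if p_values[best] > alpha:
        return None, p_values  # stop: no significant association
    return best, p_values


# Made-up data: income depends only on whether the father has a college degree.
father_college = [0] * 20 + [1] * 20
sex = [0, 1] * 20
income = [1.0 + 9.0 * f for f in father_college]

best, p_values = select_split_variable(
    income, {"father_college": father_college, "sex": sex}
)
print(best, p_values)
```

In this toy data the father's-education indicator is selected and sex is not; a full tree would then repeat the same test within each of the two resulting subgroups until no circumstance passes the threshold.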
You use that variable to partition the population, and then you ask: okay, you had those birth regions that I used for Colombia, but what's the right breakdown? We guessed it, right? What the machine does is it tries every possible breakdown and chooses the one with the lowest p-value, the one that is most significant. So it draws from the data the partition that is most significant in explaining differences. That's the spirit of the thing. It's not using machine learning because machine learning sounds sexy; it's using machine learning because it does exactly what you want it to do. It tests all the possible hypotheses and picks the one that maximizes the explanatory power. So this is what this does, okay? It does so for means. All the tests are about means. Now, in the next two slides, I'm basically going to say how this is done for the ex post approach. The next two slides have some complicated equations which I want you to abstract from, but they are there for a reason, which I'll tell you about in a moment. For the ex post approach, when you're looking at quantile functions, the solution can be written, and indeed is written by Hothorn and Zeileis (2021), as the solution to a local adaptive maximum likelihood problem, okay? So you first have to assume that, when you're interested in estimating the quantile function of each type, there is a family of parametric functions that can do a good job of doing that, okay? You have to assume that you can estimate these parameters theta that will approximate this thing. Theta is then chosen to maximize the likelihood given the data. So L i of theta is the likelihood of each observation i, given theta. You are basically trying to fit a bunch of conditional distributions to the data set, to minimize the distances between your data set and those conditional distribution functions, all right?
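Schematically, and in notation that is an assumption here rather than a transcription of the slide, the local adaptive maximum likelihood problem can be written as:

```latex
\hat{\theta}(c) \;=\; \arg\max_{\theta \in \Theta} \; \sum_{i=1}^{N} w_i(c)\, \ell_i(\theta)
```

Here $c$ is a vector of circumstances, $\ell_i(\theta)$ is the log-likelihood contribution of observation $i$ under the parametric family indexed by $\theta$, and $w_i(c)$ is the weight given to observation $i$ when estimating the distribution for circumstances $c$.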
Here you're using the parameters to define the distribution function, and you are also choosing the parameters so as to partition the population. The weights here are basically going to be an indicator function that tells you into which cell each observation fits. So it looks fancy. The reason I put it there is because I said to you before that this is the optimal solution to a problem in a well-defined statistical sense. This is the well-defined statistical sense. So it's not some magic machine that tells you what the partition is. The partition is basically maximizing the likelihood that you can explain the data by breaking down the population into types and estimating their distribution functions, okay? And this basically says how you do that. It's very similar to the ex ante case, but here you're estimating not the differences in means but the differences in the shapes of the distributions, if you like, by testing for structural breaks in the parameters. So, right, so much for the methodological stuff. All I would like you to remember from that as you leave this room is the idea of this method: instead of the researcher picking a partition more or less at random, to try and guess where the optimal point is between the downward omitted variable bias and the upward bias that comes from overfitting, this is a specific, well-defined statistical way to choose that optimal point, okay? And it derives a partition through that. So I'm going to show you some results now. We've done the hard part; now, as Philippe Aghion used to say, we can begin to harvest. We've done the planting, we can begin to harvest. This is the data we use. There are 28 household surveys that cover nine countries. You can see them there: Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guatemala, Panama, and Peru. Very different numbers of waves.
For Peru, we're going to be able to look at a bit of a time series; in Colombia, Brazil, and Argentina, we're not, okay? The surveys cover this period. They are all from SEDLAC, and big thanks to Leo Gasparini and his team at CEDLAS at the Universidad Nacional de La Plata for sharing the data with us. The data were chosen only from surveys that contain information on parental background, and in particular on things like your own sex and race or ethnicity, your place of birth, but also your father's and mother's education and father's and mother's occupation, wherever possible, okay? And in no case do we use fewer than five of these seven circumstance variables. There are some details on age ranges and so on and so forth, and I can address those maybe in Q&A, but we won't have time for them today. So that's the data to which we are going to apply this technique. Okay, so what do we do? This is the first result. This is a conditional inference tree. So if you go to the R code that exists out there, and you say, do this, break down the Bolivian population in this way, so as to always, at each stage, have these binary breaks that maximize the statistically significant difference, you end up with a tree that says: well, first we partition by father's education, okay? So the first partition in Bolivia, the most salient partition in Bolivia, was between people whose parents had college degrees and people whose parents had any other level of education. Now those guys there, they get divided up only once more, by mother's occupation, and they yield these two types at the end, okay, which are the two richest types in Bolivia. And the population shares are there: they account for seven percent of the Bolivian population, people whose parents had college degrees. The other 93 percent, what about them?
Well, they next get divided by birth area, between urban and rural. Okay, there are migrants there, don't worry too much about them, but urban and rural. And if you've been to Bolivia or if you're from Bolivia, it starts making sense to you, that's a big, big cleavage in society. And then the machine tells you, for the rural people, the next thing you want to do is split some indigenous groups over here, the Quechuas, Aymaras, and so on, from the non-indigenous, i.e. whites, and a few other indigenous groups on the other side. And so the tree provides you with a partition in the end, which in this case is a very economical partition, only 10 types. Remember I had 108 and 54 in my arbitrary partitions before. Here the machine is telling you, these are the 10 key types, and given the data, at the level of significance that you've set and the restrictions you've set, I will not decompose them further. If you change the alphas and so on, you could go on, or if you had much more data you could go on. Given the data, those are the 10 types. In each case, you can always say something about the richest and the poorest. So here, for example, what's the richest type? It has 2.7 times the mean income in Bolivia. The poorest type has 0.4, so it's a difference of about 7 times between these two groups. And these guys: their parents had college degrees, and their mothers were in certain groups of occupations here. The poorest people are people who were born in rural areas and who are indigenous, with uneducated fathers, which makes a lot of sense. So it tells you something. But if you wanted a measure like those indexes that we have, we have to then just take the population-weighted inequality between these 10 groups. And if you do that for all of our 27 countries, because that's too many, I just show you here the comparison for the latest wave for each of our nine countries.
So there's Bolivia that we were looking at earlier, okay? That's 25, 0.25. These are Gini coefficients. So here's what I want you to do, okay? Take the 10 subgroups in Bolivia. Give everyone in those subgroups the mean income of that subgroup. And then take the Gini of the whole population, where everybody has the same income within each group. That's what James Foster calls the smoothed distribution. So only 10 different numbers, but different numbers of people in each. You get a Gini of 0.25, which happens to be bigger, believe it or not, than the Gini for the whole Slovak Republic, where everybody has their own income. The highest level of inequality in this sample is Guatemala 2006 with a Gini of 0.3, which is again just amongst those types, okay? You've eliminated all the inequality within types. You've got a Gini of 0.3, which is equal to the overall Gini in the Netherlands, okay? So that's the sort of thing you do. There are two bars. The conditional inference tree is the orange one, the one I've just shown you. Then, without getting into the techniques too much, trees are estimators that have properties. One of their properties is the variance of the estimator. Trees, conditional inference trees, have high variances. One way of correcting for that is to do a lot of trees, which is called a forest, and the forests are supposed to be lower-variance estimators. Again, if you want more details on that, you have to ask my co-author Paolo, because it's really not my specialty here, but that's what they do, and you can see in this case the trees and the forests don't actually do that differently, okay? So you can look at that as a share. Those are the Ginis. As a share of the overall inequality in the population, I mean, I just showed you it was bigger than inequality in the Netherlands, but what share is it of the overall inequality in Guatemala? Well, it's almost 60 percent. Guatemala, I don't remember how many types it has.
It's more than 10, but if you go to Bolivia, just 10, just those 10 types that you and I could never have guessed by ourselves, but the machine told us: the inequality between those 10 types is half of all the inequality that you observe in the data for Bolivia, okay? You could do a little bit of time series. You can see there, Peru, you know, this is just stuff you can do, okay? I should have taken that out. You know, some of you who are inequality decomposition experts might be saying, why is he showing us the Gini? Doesn't the literature typically use decomposable measures like the mean log deviation? If there are questions about that, I can address them at question and answer time, but here I just show you the correlation between these measures using the Gini and the mean log deviation. There are big differences in levels, but the correlation is 0.98, okay? So, they really pick up the same information and rank countries in the same way. So, the other thing you can do, and I'm just kind of showing you, you know, what this thing generates, is you can decompose the relative importance of each of the circumstances, which is obviously not going to be a causal estimate. Again, the beta in the IGM regression is not a causal estimate either, because every other determinant of the outcome is missing in that regression. So, these are not causal estimates. This is just the decomposition, and those are the averages for Latin America. On this side, we have all the countries for which we observe all circumstances. I don't want them compared to that side, because there, some countries, for example Colombia, don't have information on father's occupation and mother's occupation. So, clearly, some of what mother's education and father's education are picking up here is that those things aren't there, okay? These have all of the circumstances; those have only some. But in the end, you know, this is three-quarters.
Three-quarters of the variation, on average, in Latin America, is explained by parental background: the occupation and education of father and mother. Then there are lots of fun things you can do. I mean, you can see place of birth. In Ecuador, it matters not at all. In Argentina, it's the biggest determinant, right? Ethnicity. Ethnicity is going to be big in Guatemala and in Peru, but almost nothing in Argentina. And, you know, it tells you something about, again, the structure of opportunity in those countries. So, I have only five minutes left or so. So, I'm going to kind of just give you a flavor of the ex post stuff. So, in the ex post stuff, okay, you have a tree like this, sorry. The tree in the ex post case doesn't just generate the mean at the end. It generates an estimate of the whole distribution. Remember, using the parametric function. So, now I have inequality within the type estimated as well, using these parameters, okay? For each of those, so it's different. So, Bolivia had 10 types in the ex ante case and has 14 types in the ex post case. They're a little bit different. You can see the density functions there. I don't have time to go into the partition, but you can plot, for each of those 14 types, their cumulative distribution functions. So, these are the cumulative distribution functions of the types. All the green ones are urban. All the blue and gold ones are a mix of urban and migrant. And the inequality of opportunity estimates are going to come from the horizontal distances at these quantiles. You know, these are CDFs. Inverted, you've got a quantile function. So, these horizontal distances are going to give you the measure of inequality of opportunity. You can do little mountain graphs where you can see where the rural population is, where the urban population is. You could do this for many different ones. These are the 14 types. This is the density function of Bolivia as a mixture of the 14 distributions that underlie it, one for each type.
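The horizontal-distance idea just described can be illustrated with a small sketch. It is only a rough illustration under stated assumptions, not the estimator used in the paper: it uses raw empirical quantiles rather than the parametric distribution estimates mentioned in the talk, the type labels are taken as given rather than learned by a tree, and the function names and toy incomes are hypothetical.

```python
import math


def empirical_quantile(sorted_values, p):
    """Inverse of the empirical CDF at probability p (nearest-rank rule)."""
    n = len(sorted_values)
    index = max(0, math.ceil(p * n) - 1)
    return sorted_values[index]


def type_quantile_functions(incomes, types, grid):
    """Return {type: [quantile at each p in grid]} from individual incomes."""
    by_type = {}
    for income, type_label in zip(incomes, types):
        by_type.setdefault(type_label, []).append(income)
    return {
        type_label: [empirical_quantile(sorted(values), p) for p in grid]
        for type_label, values in by_type.items()
    }


def horizontal_gaps(quantile_functions, type_a, type_b):
    """Income gap between two types at each quantile of the grid."""
    return [
        a - b
        for a, b in zip(quantile_functions[type_a], quantile_functions[type_b])
    ]


# Toy data (hypothetical): four urban-born and four rural-born individuals.
incomes = [2, 4, 6, 8, 1, 2, 3, 4]
types = ["urban"] * 4 + ["rural"] * 4
grid = [0.25, 0.5, 0.75, 1.0]

qf = type_quantile_functions(incomes, types, grid)
print(horizontal_gaps(qf, "urban", "rural"))  # [1, 2, 3, 4]
```

In this toy case the gap widens along the distribution, which is exactly the kind of information a comparison of type means alone would hide.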
Same kinds of results in terms of comparisons. Again, the shares, you know, are above 50 percent in most cases, in some cases going into the 60s, in this ex post decomposition. Now, I want to use the last bit of time to say something about the comparison between the ex ante and the ex post. So, you might be saying, okay, so this guy is using this inequality of opportunity stuff. One, the ex ante, is looking at means. The other is looking at the whole distribution. Does it make a difference? Why is it different? One tree gives me 10. The other tree gives me 14. Well, so the takeaway for me is the following. They are positively correlated at 0.85. So, the ex ante and ex post measures give you more or less the same picture in terms of the ranking of inequality of opportunity across these nine countries. But it's not perfect. Now, there are two reasons why it's not perfect. One is, these are estimators. These are statistical apparatus. So, there are sampling errors. That could be part of it. But the other reason, which is what I want to end with, is that they are trying to pick up different things. Okay? So, to show that, let me show you some excerpts of trees. These are not complete trees. I've chopped the trees to show you the specific bits. And I want you to focus here on the case of Panama 2003. The two poorest types in Panama 2003 are divided by ethnicity. This is one, and then this one here is subdivided by father's education. But basically, this is one ethnic group. There are few people. Here's the distribution function. This is another ethnic group. Okay? Those are the two poorest in the ex post analysis for Panama. All of them are in this group, I'll show you why in a moment, which is a single group, okay, in the ex ante tree. Okay? Why is that? Because the two groups are these guys. Okay?
And you can see they have very different levels of inequality but not very different means. Their expected cumulative distribution functions cross close to the median. Their means are not very far apart. The ex ante machine is going through and saying, these guys are not that different at the mean, if that's what you care about. But if you care about the whole CDF, they are different. And so I'm going to split those two groups. And here you see this is a mapping of the ex post partition to the ex ante partition. And here are those two groups, the four and the six, these two guys here. They were partitioned in the ex post, but they are all in the five in the ex ante. In my beautiful alluvial diagram. I didn't know this was called an alluvial diagram, but Paolo told me the other day. Okay? Last thing and I'll conclude. One thing I want to say is to compare the shares of inequality explained by this method with the shares of inequality explained by other things. For example, here I have the squares of those correlation coefficients. As I showed you in a slide earlier, the squares of the correlation coefficients are the R-squareds of the regression of child's education on parental education. So it's the share of explained inequality in some sense. And as you can see, they are much, much lower than they are for this. Ideally you'd do this with income. There are some issues. We're still working on it. It's work in progress. We'll do it with income. But just to show that the choice of partitions and techniques here gives you a lot of insight into the structure of the persistence of inequality that you don't necessarily get by looking at a single outcome like education. And I think these are my conclusions. I mean, everything that's there I have said to you. I've run out of time, so let me not repeat it, and let me end here by thanking you very much for your attention. It doesn't show there. Sorry, it's not showing there. Thank you very much.
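The ex ante calculation described in the talk (give everyone the mean income of their type, then take the Gini of that smoothed distribution, and express it as a share of the overall Gini) can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names and toy data are assumptions, survey weights are ignored, and the types are taken as given instead of being derived by a conditional inference tree.

```python
from collections import defaultdict


def gini(incomes):
    """Gini coefficient of an unweighted list of non-negative incomes."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


def smoothed_distribution(incomes, types):
    """Replace each person's income with the mean income of their type."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for income, type_label in zip(incomes, types):
        sums[type_label] += income
        counts[type_label] += 1
    return [sums[t] / counts[t] for t in types]


def ex_ante_iop(incomes, types):
    """Return (between-type Gini, its share of the overall Gini)."""
    absolute = gini(smoothed_distribution(incomes, types))
    overall = gini(incomes)
    relative = absolute / overall if overall > 0 else 0.0
    return absolute, relative


# Toy data (hypothetical): two types with two people each.
incomes = [1, 3, 2, 6]
types = ["a", "a", "b", "b"]
absolute, relative = ex_ante_iop(incomes, types)
print(round(absolute, 3), round(relative, 3))  # 0.167 0.5
```

In the toy example the two types account for half of the overall Gini, which mirrors the kind of statement made in the talk for Bolivia's 10 types.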
I am going to start by saying that I'm really delighted to be at this conference, and thank you very much for inviting me. And I'm also very delighted to be commenting on Chico's presentation. It's a very difficult task. He does such a great job that it's really difficult for the commenters to follow or match his level of presentation. So, a couple of years ago Chico gave a presentation titled Inequality as Cholesterol, and that has been my go-to example of making a distinction between fair and unfair inequality. And in that talk he also talked about how you can actually use this idea of inequality of opportunity to give a measure of unfair inequality. And this time, I think, you know, he took a much bigger step forward in terms of the empirics: rather than relying on a subjective choice of partition, here he and his co-authors came up with an objective and parsimonious partition of the population, which is actually, as you can see from the results, very effective in explaining the variation in inequality that you see there. I do think that this is going to become a frontier benchmark for this literature, and probably every paper that comes later on has to follow this approach. So I am actually that bullish about the methodology that's being used here. And I also like that Chico related inequality of opportunity to the IGM literature. These two literatures are parallel to each other; there are a lot of similarities between the two. In terms of the evidence, I'm not going to repeat it. I don't see anything to disagree with in the evidence that he has presented, and I think this is something that's coming out from other recent papers on the Latin America region too.
So what I am going to do in these comments is to see if we can actually broaden the research agenda to, you know, make it more informative for policy discussions, and I'm going to focus on a couple of things that Chico has mentioned in the paper but did not have time to mention in the presentation today. The very first thing (okay, let me skip this slide), which is actually in the abstract, is that because the partition is very parsimonious, you can almost tell that the estimates are very conservative, somewhat of an underestimate. So I am going to start by asking, can we actually broaden this and come up with a much broader estimate of inequality? Fortunately, in the literature there is something there already, which is known as sibling correlation. The idea there is that siblings growing up together face many things in common, not just parents' education or parents' ethnicity, but you can think of family network, social network, the neighborhood that they are growing up in, the schools they are going to, all of this. And the advantage of this sibling correlation is that you are not just focusing on some observables. Both the IOP and IGM literatures are focusing only on the observables; this actually captures the unobservable influences as well. My co-author and I have done this estimation for 53 developing countries, and of course we have about seven LAC countries in there too, and in the graph here you can see that these estimates are much, much bigger, whether we look at IOP estimates or at the IGM estimates. But I also want to point out that there is a bit of good news here: over time, if you look at the age cohorts, the estimates have gone down a bit, so at least for the LAC region that's good news. Okay, the second observation that Chico had, and this is the graph that I actually want: the left-hand side figure is actually taken from Chico's presentation, an earlier version of it. One piece of evidence there was
that parents' education explains quite a bit, and that it is one of the major drivers of inequality of opportunity or IGM, whichever you look at. But when you look at it, it actually, you know, explains about 25 percent of the variation in children's education. That's not a huge lot, and that's exactly what we found in our sample too, and that explains about 30 percent, about a third, of the variation in sibling correlation. But these estimates are done under many restrictive assumptions. I pointed out some of them here. For example, you know, for that beta coefficient that was in Chico's regression, the assumption is that the coefficient does not vary, it's the same for the whole population, but you can think of it varying across households. Similarly, you know, there is also the assumption that parents' education is not affecting the variance of the residuals from the children's equation. These are restrictive assumptions. Here I made the same estimation after relaxing these assumptions, and you can see that the father's education alone can explain more than 70 percent of the variation in children's education. So in other words, you know, under all those assumptions, whatever we are estimating is actually grossly underestimating parents' influence. You may ask why you see this, why parents' education should be influencing the variance of children's education. To give you a sense of this, consider a developing country with two families, one poor, one rich. If there is a shock, it is possible that the poor family has to, you know, withdraw the child from school, and that tends to increase the variance at the lower end of the parental education distribution. And that is an intuition that we find, you know, evidence in support of, using data from three different countries: in all of these the variance is higher at the lower end. We also develop a, you know, measure which includes not just the influence of parents on
the mean but also on the variance, and once you do that, you can see here that the relative persistence, which was the beta coefficient in Chico's equation, is no longer constant across the distribution of parents' education. More importantly, what you can see is that we are majorly underestimating the, you know, persistence for children who are born in the most disadvantaged situations. So in other words, you know, just to conclude, what I'm saying is that the traditional estimates that we are using to look at inequality are, you know, underestimates. We are underestimating parents' influence, and we are underestimating it in a big way for children coming from more disadvantaged backgrounds. And so I think the urgency for, you know, policy action is quite clear. And I see that I've run out of time, so I'll skip my final point and stop here. Thank you.
I've been given strict instructions about time, so I am going to ask the audience to pose their questions, and then I'll give Professor Ferreira a chance to respond to all of them together, including the discussant's comments. Let's open it to you, the audience. Who would like to pose a question? I have lots of questions, but I think you should have an opportunity to ask a question. Kunal.
Well, let me start the questions, I hope, as more people think about questions. Chico, this was a very interesting presentation, and, you know, I agree with Forhad that it seems to take a path-breaking approach to thinking about how to estimate IOP, both ex ante and ex post. I have two questions on the method itself; I don't understand exactly what it is. First, would it not be possible that you have more than one equilibrium? Because you have an algorithm, it's non-linear, you're using it to maximize the maximum likelihood function that you're getting. It's quite possible to get more than one optimal solution. If that's the case, would it not be, sorry, yeah, so would it not be possible to get more than one optimal solution when
you're looking at all the partitions, right? And if that's the case, how do you choose across all the optimal solutions? Second is that it seems that it's a Bayesian approach, because you're using a Bayesian prior here about the partitions and then you're revising the priors using this adaptive method. So how much is it going to be a Bayesian approach? And I was wondering whether that's a way to think about this approach. Thanks.
So let's collect a few questions. There's a question in the fourth row there.
Thank you very much for the very interesting presentation. I know Professor Fepera has done some work on inequality of opportunity using data from sub-Saharan Africa. I was just wondering if he's tried this new approach using the same data and how the results fare compared to what he did in the past. Thank you.
Thanks. Can I just, I should have mentioned, would you introduce yourself too?
My name is Monica Lambon-Quayefio from the University of Ghana. Thank you.
And then here on the right.
Hello, hello. My name is Brinalini, I'm from Delhi. My question was regarding the new method. So, from what I understand, this new method will use a machine algorithm to really decide which circumstances we are going to keep. The estimates that we get for IOP using these machine-determined circumstances, if you will, will also be lower-bound estimates, is what I've understood. If it is so, and we do it across countries or across states, and we find that different countries or different states have different numbers of circumstances, and then we use the Shapley decomposition to get the share for each of these circumstances across states or across countries, my question is, will it be wise to then compare the shares across different units, if they are really lower bounds and they would vary depending on the total number of circumstances that we are considering across spaces? I don't know if that made sense, but thank you.
I think I'll give Chico a chance to
respond. We have discussant comments to respond to and also three questions, and then, if time permits, we'll open again for another round of questions.
That's me? Okay. So, thank you, these were great questions, and thanks very much to Forhad for the comments. Let me start from there. I think the sibling correlation results that you showed are really interesting, and it's something that's missing, I think, in the review that we're doing in this, so I'll follow up with you later to get those. And I think they're definitely interesting. Again, as with many of these different approaches to the question of persistence, they're complementary, right? Because I think one of the nice things about these trees, and I have to emphasize that really Brunori, Hufe and Mahler were the first people to apply that to inequality of opportunity, the ex ante methods, so it's an innovation that we are adopting here in this paper. You know, one thing I like a lot about them is the structure, right, telling you what the first partitions were and so on and so forth. But definitely looking at sibling correlations, I think, is very important. Your point in the final slide about measurement as a starting point, and that what we really want is to understand how policies affect these issues, is completely well taken, and that's a set of other literatures on policy evaluation and so on, which I think again are complementary to this. Hopefully, though, the measurement stuff is important in setting certain benchmarks. I think, hopefully, if the policy debate in a country is informed by the fact that 10 or 14 or 18 groups across the population, chosen on the basis not of what they are doing, of their education, of their performance in the labor market or of their entrepreneurship, but defined exclusively in terms of their family background, account for 60 percent of inequality in that country, with a parsimonious partition, as you said, hopefully that sort of is, you know, part of the
political and policy discussion. But I agree that policies are more important. On the questions, let me just say that I was always hard of hearing, and I'm even more hard of hearing now after certain things this year, so I heard most of what you said, but please bear with me. So, Kunal's question on multiple optima. As far as I understand, the problem that I outlined has a single-peaked solution, so I don't think there are multiple optima in that maximum likelihood problem, formally. However, each optimum is sample specific, and relatively slight variations in the sample can lead to a different optimum, and this is what the forests are meant to address. So we're basically using a sample, through effectively bootstrapping and bagging and various other kinds of, you know, take a circumstance out, take a subset of the sample out, so various tinkerings around with the sample, to try and shed some light on what the sampling error is doing. Each of those can have different optima, and then, you know, they are combined in some way through these forests. For a single sample, I think there's always a unique solution. I also think, again, I wish my co-authors were here, but I'm fairly sure this is not a Bayesian method, in the sense that in fact you don't start from a prior at all, other than that you start by testing the first null hypothesis, that there is no inequality of opportunity, there are no differences, and you test that from the data, and then the data simply tell you what the partitions are. But there is no sort of prior about what circumstances are important or are not, in that sense. There was a question about sub-Saharan Africa. Can I just, could the person who asked that question, thank you very much, and did you say that, you mentioned, I was running to get my pen, you said someone had done a lot of work on that. Did you say who? Did you say Paolo Brunori? Yes. Well, Paolo Brunori did that work with Vito Peragine and Flaviana Palmisano, who are there, so if
you don't mind, I'll defer to them, and you can ask them at the break; they would know much more than me. And then, I was taking notes, but then what was the other? About the method, and is it wise to use the lower bound. Yes, so on the lower bound, let me say, you know, I'm probably partly to blame, in that when I first wrote about these things I was focused very much on this missing circumstance variable problem. And, you know, if you had a population, that is the only bias there is. But with samples there is this overfitting issue, so in effect with samples there's a trade-off between the two. So these estimates that we provide here are not a lower bound in that sense. Given the parsimony of the information that we have, given that there are lots of other circumstances that we don't measure, like wealth, or like the quality of early childcare, or, you know, genetic things, lots of things that we don't measure, in all probability they are underestimates. But it's different to say in all probability they are underestimates than to say they're a lower bound. They're not a formal lower bound; you know, I cannot prove that there isn't an estimate below it. They are likely to be underestimates because the partitions are so parsimonious. One thing that could be done, and would be interesting in terms of extensions as well, is to take a very large sample or an administrative data set where we had information on covariates, then run this on a sample and then run it on the quote-unquote population, or a much larger sample. The trees do get deeper as the sample size increases. There's an issue of the power of the test that's going on here. So again, you know, much like with any other method, sibling correlations or anything else, you are constrained by the data that you have, by the sample that you have and by the sample size that you have. This is very much in the foreground here, but it affects any method. So I don't think that answers your full
question, but if you want to just shout again what the main point I may have missed was. But you have to really shout, because of my hearing, I'm sorry.
Am I audible? Yeah, okay. So mainly my question was, is it fair, or is it okay, to compare the circumstance shares that we get using the Shapley decomposition across two units, say states, if the number of circumstances that I've considered to get that inequality of opportunity is different?
Very good, very good question. No, it's not fair, which is why, for example, I separated the ones where I had all the circumstances from the ones where I had only some. So yes, it's not fair if you don't start from the same set of circumstances to begin with. If the algorithm excludes a circumstance completely, I think it's okay to make the comparison: the algorithm is telling you, on the basis of this data, that circumstance doesn't matter. But the algorithm must have been given the same set of circumstances. So the initial set of circumstances you feed to the algorithm must be the same, otherwise the comparisons are not appropriate, which is, you know, why we separated that graph into two sides.
So, I've been under very strict orders to keep to time, so I'm unfortunately not going to open up for another round of questions, but I think this has been a fantastic start to the conference. I know that there are a number of other questions, and I hope you'll grab our speakers in the coffee break to raise your questions. In the last few moments, I wonder if our speakers want to make any final remarks. I can start with Forhad.
Not really, other than maybe I can talk a little bit about, you know, the policy point that Chico already answered, which is that in the end, you know, we do need these measures to know where to start, but we also need to know, you know, what policies drive down this inequality and how to evaluate those policies. I work a lot more on education, and in the context of education I can tell you that there are many, many studies which
look at, you know, how, for example, a certain education intervention will lift average scores, but not much about, you know, how it's going to lift the scores of children coming from different sections of the population. I think that's where we need to go, and I think the inequality of opportunity approach that Chico has talked about at length today would be one way to go. There are other ways too. So that would be my comment.
Well, I wasn't going to have anything to say, but given what Forhad just said, which I think is very important, I want to give you a one-minute summary of another talk I've been giving this summer, which is on the role of human capital investments in reducing inequality of opportunity, which I think is related to what Forhad was just saying. And just to give you the upshot of that: you know, I review much of the excellent recent literature in development economics, you know, identifying the size of effects of early childhood interventions, of teacher training interventions, of health interventions of various kinds, including payment for providers and so on and so forth. And all of these have the following, in my view. My reading of the literature in the end is, it matters a lot, you know, how you do it. Not all inputs are the same. Dealing with teachers seems to be much more effective than, say, giving people iPads or uniforms. So there's a whole range of insights you can get from that literature. But when you compare that literature, and, you know, the kinds of effects that they have, to Mr.
Juan Coronado and his descendants in Costa Rica, who are presidents and so on, or to the fact that if I look at the opportunity profile in Brazil today, 200 years after the end of slavery, almost all of the bottom groups are still black and Afro-Brazilian, I think we have to come to terms with the fact that these incremental human capital interventions are incredibly important but are probably not sufficient on their own to systematically transform what was happening. One of the upshots of a conference in Córdoba (there is at least one other person here who was there at that conference), given by Juan Mauricio Cardenas, who's a professor of this university, was: the revolution will not be nudged. And I will end on that, I think.
So that's a good place to end and kick off the rest of our conference. Thank you very much, and thank you to our speakers.