of our conference. Our keynote speaker is Francisco "Chico" Ferreira. I think he needs no introduction in this group, but I should introduce him. He is the Amartya Sen Professor of Inequality Studies at the London School of Economics, where he also serves as Director of the International Inequalities Institute. He is also currently the President of the Latin American and Caribbean Economic Association. He will be speaking today on inequality of opportunity and mobility in Latin America. This is fascinating new work, highlighting two new data-driven approaches to addressing important gaps in the literature. So we're in for a treat, I think, this morning. Immediately following the presentation, we'll have comments from our discussant, Dr. Forhad Shilpi. She is a senior economist in the Sustainability and Infrastructure team at the World Bank Development Research Group. She works in several different areas, but in particular, she works on intergenerational mobility in developing countries, and she'll be able to bring in insights, I think, also from outside of Latin America. Before joining the World Bank, she served as a research associate with the Bangladesh Institute of Development Studies. Following the discussant's comments, we'll give Chico a chance to respond and then we'll also open the floor for comments and questions, especially from you, the audience. So without further ado, let me turn over to Professor Ferreira. Buenos días a todos. That was my Spanish for the day. Good morning, everyone. It's a great pleasure to be here. Let me start by thanking Kunal Sen and Carlos Gradín and everyone at WIDER, as well as Marcela Eslava and Leopoldo Fergusson and everyone at Uniandes for the invitations. Great pleasure to be here. Always very nice to be at Universidad de los Andes, which is really one of the premier centers for the study of economics in our region.
So what I'm gonna talk about today is, as was said, inequality of opportunity and, I'm calling it now, intergenerational persistence, the opposite of mobility, in Latin America. It's joint work with François Bourguignon, Paolo Brunori, and Guido Neidhöfer. It's part of a broader project, which I hope you'll hear more about in the course of some sessions to follow. There are three sessions in this conference that are presenting work from this project we have called the Latin American and Caribbean Inequality Review, where we have these five themes you see there: levels and trends of inequality; inequality of opportunity; inequality in markets; the role of the state, taxation and redistribution; and politics, the interaction between inequality and political power. And this paper is one of 27 papers that are being written for that project. So the outline of what I wanna do today is just a little bit of motivation. Then I'll do a very brief, an unfairly brief review of the literature, which is much richer than what I'll have time to discuss, but a little bit of a review of the literature on intergenerational mobility and inequality of opportunity in Latin America. I'll suggest some of what we've learned, but also some of the shortcomings. And then, based on those shortcomings, suggest a new approach that I've developed with my co-authors. I'll talk a little bit about the data we have and then give some results for ex ante and ex post inequality of opportunity, before concluding. So the motivation probably is really not needed that much in this audience, but it is that socioeconomic advantage in this region, I mean in much of the world, but particularly in this region, is extremely persistent. And I'm gonna show you just four figures that may be known to you already about some of that. What we have here are measures of cognitive development using the Peabody Vocabulary Recognition Test for a sample of Ecuadorian children.
It's from work by Chris Paxson and Norbert Schady in the JHR some time ago now. And they divided their sample by attributes that have nothing to do with these children's own performance, but are inherited circumstances that were shaping their opportunities. These are three to five year olds. These are young little kids before they go to school. And this is how they do. You know, you should think of this as a measure of cognitive development. And those in the wealthiest households in the sample kind of stay around this level of 100, which is the norm for their age. So this is, you know, a measure again of cognitive development, it's normed. It reflects their ability to recognize words in Spanish. And you know, the wealthiest 25%, or those whose mothers had the most schooling, stay around that level. But those in the same sample who belong to the poorest households, or whose mothers, not them, but their mothers, had very low levels of schooling, are falling behind. It's not that they're forgetting words. It's just that they're falling behind relative to the norm. Okay, and this is before they go to school. So unsurprisingly, when we look at the distribution of schooling achievement and learning in our region, we also find big differences when we look at distributions that are conditional not on the kids' own efforts, but on things that these kids inherited, on things that persist through generations. So here are five density functions of test scores in reading in Spanish from PISA 2006, conditional on father's occupation. So we just divided the occupations arbitrarily into a low-ranked group and a high-ranked group. And for these five countries here, you can see very big differences in learning achievement. So it starts with cognitive development before school. It continues into your schooling process, with learning being quite different by these different characteristics.
And unsurprisingly then, it manifests itself in differences in standard of living. When people are already adults, if you look at their distributions conditional on their parents, this is the sort of picture you observe. These are cumulative distribution functions of household consumption. So standard of living, if you like, per capita household consumption. But instead of having a single one for each of these five countries, Colombia, Ecuador, Guatemala, Panama, and Peru, we have three, conditional on mother's education. So we divide the population of these countries in these surveys by whether their mothers had no formal education, incomplete primary, or complete primary. The dates for this range from the late 90s to the early 2000s. They're not the most recent surveys. But you can see that the adults living in households today whose head's mother had no education have much lower standards of living than the other two groups. So these are just kind of illustrations, prima facie evidence of the existence of very substantial inequality of opportunity and of the persistence of socioeconomic status across generations. I've shown you just correlations and conditional distributions of outcomes in one generation, conditional on parental variables and parental outcomes. And it goes beyond economics. Here's a slide that I kind of stole from Leopoldo Fergusson, who's here, and his co-authors. This gentleman here is Don Juan Vázquez de Coronado, who was born in 1523 in Spain, in Salamanca, and he was one of the conquistadores, specifically in the region of Costa Rica. And Samuel Stone, in a 1975 book, traced various branches of descendants from Juan Vázquez de Coronado and found that they accounted for 31 presidents and 285 deputies in the history of Costa Rica. So again, substantial evidence of persistence across, in this case, many, many, many generations of political power and concentration.
Now, these are descriptives. Of course, there is an established literature, and there are some eminent contributors to that literature on mobility here, like Markus Jäntti and many others. Economists have looked basically at this and tried to measure it in two ways. And they're related, as I'll discuss. The two ways are a literature on intergenerational mobility, the converse of opportunity. And the other literature is a literature on inequality of opportunity per se, and I'll describe them both. And this has been done for many outcomes and for many variables, but I'll focus here on income and education, which have been the ones used most frequently in the world and in the region as well. So you should think of this as a survey of the literature on attempts to quantify, in a rigorous way, the extent of intergenerational persistence that I've described through those very few slides a moment ago. So before I start on the review, let me make a brief remark on the two approaches, which I'll call IGM for mobility and IOP for opportunity. You know, in very simple terms, IGM is about how strongly correlated or associated the same kinds of variables are across two generations. So you all know this sort of Galtonian regression of child income on parental income, or child education on parental education. That's what it is, about association between two variables. IOP, inequality of opportunity, can be defined and has been defined in many ways, but one of them is basically, okay, what share of today's inequality, if you look across the distribution, across the population today, across Bogotá, as Marcela was saying, what share of all the inequality that you see is attributable to inherited factors, the things that people have inherited genetically or otherwise from their families. And these two things, if I describe them in that way, may seem different, but actually they're related in some fairly strong ways. And so to show that, let me just show you some very, very simple maths here.
This is the standard Galtonian regression for intergenerational mobility, right? And people use this intergenerational regression coefficient, or elasticity if it's in logs, as a measure of persistence. Or you could use the correlation coefficient, which is a transformation of the beta that adjusts for differences in the marginal distributions of parents and children. And that's a measure of the association across generations, it's just a correlation coefficient. Okay, how do these IOP people try to measure inequality of opportunity? Well, again, there are many different ways and I'm really simplifying here, but one way would be to just say, okay, what share of the variance or the inequality in current income is explained, quote, unquote, by circumstances. So the inequality in predicted incomes here, over the total, would be the share that's explained by circumstances, right? Now, of course, this looks like an R squared, right? If you use the variance here as the inequality measure, that would be the R squared of that regression. And of course, if you square the correlation coefficient, that's also what you get, the R squared. In fact, if you replaced C by YP, so if the only circumstance you looked at was parental income and you used variance as your measure of inequality, then the two measures would be identical in this case. So although they are defined and explained in these different ways, they're very closely related: this idea of mobility and association, and the idea of the share of today's inequality accounted for statistically by predetermined factors. So I'd like you to bear that in mind as I take you a little bit through the review. So let me start with the literature on intergenerational mobility. Again, one slide on education, one slide on income.
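As a reading aid, the relationship described here can be sketched in a few lines of maths. The notation is an assumption, not taken from the slides: y_c and y_p stand for child and parental (log) income, C for the vector of circumstances, and I(.) for an inequality index.

```latex
\begin{align*}
y_c &= \alpha + \beta\, y_p + \varepsilon
  && \text{(Galtonian regression; } \beta \text{ is the elasticity if in logs)} \\
\rho &= \beta\,\frac{\sigma_{y_p}}{\sigma_{y_c}}
  && \text{(intergenerational correlation)} \\
\mathrm{IOp} &= \frac{I(\hat{y})}{I(y_c)}, \quad \hat{y} = \widehat{E}[\,y_c \mid C\,]
  && \text{(share of inequality accounted for by circumstances)}
\end{align*}
% With I = variance and a linear prediction, IOp is the R^2 of the regression of y_c on C.
% With C = y_p as the only circumstance, that R^2 equals \rho^2.
```

So under these two simplifications (variance as the index, parental income as the only circumstance), the inequality-of-opportunity share is just the squared intergenerational correlation.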
So there are a number of papers here that I am listing, going all the way back to 1999 and ending now with some very nice new work, very granular new work, by Muñoz in 2021. I don't have time to go into the contributions of each of these papers, but to highlight the main findings, all of these papers find correlation coefficients, that rho that I was showing you here, to be quite high in Latin America. We did it for our own data, which I'm gonna use later here, and the numbers are between 0.45 and 0.60, which match a lot of the results in the literature. Those would compare, for this years-of-schooling variable, to about 0.33, 0.35 for the US, and not too far away in Italy and Spain and other countries. So a greater degree of persistence, in this origin-independence sense, in Latin America than in the developed countries that we compare with. Looking at trends, there have been improvements in absolute mobility in the region. So if you look at measures of mobility which are not just the relative one, the correlation coefficient, but if you look for example at the proportion of children that do better than their parents, or the proportion of the new generation whose parents didn't complete secondary school but who do complete secondary school themselves, these kinds of measures that we often now associate with Raj Chetty's work in the US, which are very standard measures of directional absolute mobility, upward mobility, those have been improving in the region. In large part because of just big expansions in the schooling system and the quantity of education provided in the region. But relative mobility, the rank correlations, have been stabler, or at least have grown by much less. So that's kind of the upshot of this. There's also a literature on mobility in income. So what have we found about intergenerational mobility in income?
Well, there are a number of studies, but all of them deal with a fundamental difficulty in the case of Latin America, which is that to really measure intergenerational mobility in income well, you'd like to observe the income of the children's generation, today's adults, and the income of their parents, matched one to one, over many different years in their lifetimes. That's what you would like to have. So you have to have that link at the individual level and ideally many periods. This is extremely rare in Latin America, almost unheard of, with the exception of a recent paper that uses admin data for Uruguay. And I was told yesterday by my friend Sergio Firpo, I don't know if he's here, that there's one for Brazil as well by Breno Sampaio, which is still a working paper. So there are some new attempts to use admin data to look at this, but it's plagued by the fact that, unlike in Finland or Denmark, if you use administrative data in Latin America, you are typically missing the informal sector, which is half the population. So that's a problem. These people know this and they try to correct for it, but it is an issue. The bulk of the work before that used two-sample two-stage least squares approaches, where basically you use information on pseudo-parents from an older survey. So you don't observe the parents and the parents' incomes of today's generation, but you kind of estimate them from people who look like the parents of today's generation in terms of observed variables back there, okay? And this is what could be done, but my discussant, Forhad, and her co-authors, along with other people, have a number of papers that show that two-sample two-stage least squares does have substantial shortcomings and may lead to biased results. So we have very high estimates of the betas here, but they are problematic, okay? So continuing with my review a little bit, this was mobility.
What can we say about inequality of opportunity? So there have been, again, papers on inequality of opportunity in income and in education. I mention some of the papers there. This big table, which I don't want you to focus on, but just look a little bit at this column here, gives you older estimates for that, okay? Exactly from regressions of this kind. So this is a very simple, what people call ex ante, approach, you know, the first-generation estimates. And what we found is that if you look at these countries here, you know, the share of income inequality or consumption inequality that is accounted for by some predetermined circumstances, basically place of birth, your parents' education, your father's occupation and so on, ranges in the region from around 25% or 26% to 36% or so for income, and all the way up to about half in Guatemala. But for income, between 25 and 36% or so, with Colombia actually, interestingly, being one of the least unequal, though, you know, there are reasons to be suspicious of that, because they don't have some of the data on circumstances, so the comparisons are difficult to make. Now, because of the new approach that I'm going to introduce in a moment, let me say that when we did this, and this is from an old paper with my co-author Jérémie Gignoux, we had to take a bunch of circumstances and partition the population into these groups that have identical circumstances, which people call types. And we did this more or less arbitrarily. So again, a big table here, but this is what we had: for each country, we had variables on people's ethnicity or race, on father's occupation, on mother's and father's education, on birth region. And we divided them into these little groups, these cells. We said, well, birth region, I don't know, what do we do?
Well, let's say São Paulo and Brasília, which are rich cities, and then these regions, and then these regions. Or, if I go to Colombia, I get, okay, departments of the periphery, central departments, and then Bogotá, San Andrés and Providencia. You know, we looked at the means of variables and we said, let's group them into richer and poorer and more or less come up with a partition of the population into these types, these groups of people who share identical circumstances. The inequality between them is basically the inequality of opportunity that we were measuring. In the table that I showed you before, you have these nonparametric estimates, which are just exact decompositions of overall inequality, with between-group inequality being the inequality between these groups. The parametric version is just the same, assuming a linear relationship. Okay, so I mention this because the absence of good linked data across generations is a problem for the intergenerational mobility literature on income. So for understanding transmission and persistence of inequality looking at income, the problem in Latin America for the IGM literature is that there just isn't the ideal data in terms of linking the income of parents and children. And in terms of this literature, well, there are many problems, but one of them is that we were using these relatively arbitrary partitions. So what we want to do is to propose a new approach, which is a very data-driven approach, to try and remedy the shortcoming of the arbitrary partition for the inequality of opportunity approach. And before I show you the method, let me very briefly introduce, for those of you who may not be familiar with this, these two different views of inequality of opportunity. So inequality of opportunity is all about dividing the population into groups that have the same circumstances. So if you only have gender and race, you'd have, I don't know, black men, white men, black women, white women.
But if you have parental education, if you have all these other variables, you end up with potentially hundreds or even thousands of groups, as I'll show you. The idea is that these groups are all defined by circumstances, by things that you had no influence on. It's not your own education, it's your parents' education. So inequality between these groups reflects, in some sense, inequality of opportunity. Now there are two ways of looking at that. One is basically to assume that you have equality of opportunity if the means across all these groups are the same. So this is the partition, Π, and these are the types, the subgroups. If the means across all these groups are the same, then in some sense the values of the opportunity sets across these groups are the same, and this is one definition of equality of opportunity. And if you use that, then inequality of opportunity is just gonna be differences in means. So some sort of inequality between means is gonna tell you what inequality of opportunity is. Another approach is more demanding. It says no, equality across means is not enough. I want the distribution, the full distribution of income in each type, to be the same. So that whatever quantile of the distribution you are in, you would actually earn the same amount across types. This is known as the ex post measure of inequality of opportunity. And then what you need as a measure of inequality of opportunity is some aggregation of inequality across the quantiles. So you take this distribution function and invert it, you have a quantile function in each type, so a conditional quantile for a given set of circumstances, and you aggregate inequality across them, and that would be your inequality of opportunity measure. This was first proposed by Vito Peragine and Daniele Checchi. For that you need estimates of the type-specific quantile functions. Now the key thing is that in both approaches, the first thing you need is a partition.
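To keep the two views apart, here is a compact sketch in assumed notation (none of these symbols come from the slides): types k = 1, ..., K, type means mu_k, and type-specific quantile functions q_k(p). The ex post aggregation shown is only one possible choice, included for illustration.

```latex
\begin{align*}
\text{Ex ante equality of opportunity:} \quad & \mu_k = \mu_l \quad \forall\, k, l \\
\text{Ex ante measure:} \quad & \mathrm{IOp}^{EA} = I(\tilde{y}), \quad
  \tilde{y}_i = \mu_k \ \text{for every individual } i \text{ in type } k \\
\text{Ex post equality of opportunity:} \quad & q_k(p) = q_l(p) \quad \forall\, k, l
  \ \text{and}\ \forall\, p \in [0,1] \\
\text{Ex post measure (one possible aggregation):} \quad &
  \mathrm{IOp}^{EP} = \int_0^1 I\big(q_1(p), \dots, q_K(p)\big)\, dp
\end{align*}
```

Both definitions take the partition into types as given, which is why the choice of partition matters for either measure.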
You need to agree, given a bunch of circumstance variables, how you're gonna divide your population into these types. But how should the population be partitioned? I just mentioned to you a moment ago, looking at my own work with Jérémie, that we partitioned it fairly arbitrarily. And the reason we were grouping all those regions, Bogotá and Providencia and São Paulo and Brasília, was that we were vaguely aware of this problem here. So in the data set that I'm gonna use and show you results from today, take a country, Bolivia. After all our restrictions and taking away the missing observations and so on, we have 6,000 observations in Bolivia. For them, we observe information on sex, two categories; ethnicity, seven categories; occupation of father and mother, nine categories each; education of father and mother, four categories each. So if I did the fine partition, if I identified types so that everyone in a type has exactly the same circumstances using all of these categories, I would have 18,000 groups. So if I try to put 6,000 observations into 18,000 groups, I am overfitting the data massively. On the other hand, if I don't use all of that information, I am in some sense missing information on circumstances that is important for accounting for inequality of opportunity. These are problems that would not arise if you had a census or a Scandinavian registry-type thing. These are problems that arise from the sampling nature of the data. But it does mean that when we're using a sample, as we typically are, and certainly almost always in Latin America, to measure inequality of opportunity, we are facing effectively two competing biases. One is this downward bias from omitted circumstances. The little drawing here just says the following. Suppose you had two circumstances, mother's education and father's education. So you partition the population into these nine groups, you look at the inequality between the means or between the distribution functions, and that's your measure of inequality of opportunity.
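The Bolivian arithmetic can be checked in a few lines. The category counts and the sample size are the ones quoted in the talk; the variable names are made up for the illustration.

```python
from math import prod

# Category counts as quoted in the talk for the Bolivian sample.
categories = {
    "sex": 2,
    "ethnicity": 7,
    "father_occupation": 9,
    "mother_occupation": 9,
    "father_education": 4,
    "mother_education": 4,
}
n_obs = 6000  # approximate sample size quoted in the talk

# The finest partition has one cell per combination of categories.
n_cells = prod(categories.values())
obs_per_cell = n_obs / n_cells

print(n_cells)                 # 18144, the "18,000 groups" in the talk
print(round(obs_per_cell, 2))  # 0.33 observations per cell on average
```

With fewer than one observation per cell on average, most type means in the finest partition would rest on a single person or on nobody at all, which is the overfitting problem described above.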
What happens if you add a circumstance you didn't have? Suppose you now have race, black and white, and you put that across here. You can see that inequality between types, as I add another partition, can only go up. It's like an R squared type thing, right? It can only go up. So every time I don't use information on the partition, every time I don't include information on a circumstance variable or on finer categories, I may be underestimating inequality of opportunity. By the way, in case you were thinking, gosh, this is complicated, why don't we just stay with the mobility stuff? The same thing is true there. That's why when they put grandparents' income in the regression, they suddenly find that grandparents' income is significant. The parents' income is not a sufficient statistic. So if you're interested in what the parents' income explains of today's inequality, I mean, it's a measure of association, but it suffers from exactly the same omitted variable problems as this would, okay? So you'd like to include as many circumstances as possible for this, but as I've just shown you in the case of Bolivia, if you use all of the possible partitions, you may overfit the data. So it's a sampling, statistical problem. So given a data set, how do we choose the partition? So what we did in the very beginning is we eyeballed it and we said, oh, this looks reasonable, right? Now we've got to do a bit better than that. And so that's what we're trying to do in this work, which really owes a lot particularly to my co-author, Paolo Brunori, who's been doing a lot of this, and that is to use a data-driven approach to actually select the partition. And also, particularly in the ex post case, not only to select the partition, but to estimate features of the conditional distribution within groups that we're gonna use for our estimate of inequality of opportunity.
For the ex ante approach, because equality is equality of means, and inequality is inequality between means, we're gonna focus on the means between types. But for the ex post approach, we will actually have estimates of the quantile functions. And one of the neat things about this approach, for those of us who've worked on inequality of opportunity for a long time, is that there are these two approaches that are different conceptually, and this kind of machine learning approach we're gonna use here is particularly well suited to estimate the information that we need for both of them, okay? So the spirit is, given a data set, don't pick a random partition the way that we did before, that I showed you, but use a statistical approach to inform what the optimal partition is in a well-defined statistical sense, okay? So this is a very tedious slide. So I'm gonna just tell you what's actually done by these conditional inference trees. These are the conditional inference trees. It's basically the following. I'm trying to make this lively, okay? You have a bunch of circumstance variables, and you have income, say. The algorithm tests the correlation between the outcome, say income, and the circumstances. Okay, if there's no significant correlation at some p-value threshold, then the algorithm exits: no inequality of opportunity. If you reject the null hypothesis, you pick the variable with the smallest p-value, that is, the variable that is most statistically significantly correlated with income. So the most important one, and you use that to partition the population. And then you ask, okay, you had those birth regions that I used for Colombia, but what's the right breakdown? So we guessed it, right?
What the machine does is it tries every possible breakdown and chooses the one with the lowest p-value, the one that is most significant. So it draws from the data the partition that is most significant in explaining differences. That's the spirit of the thing. It's not using machine learning because machine learning sounds sexy. It's using machine learning because it does exactly what you want it to do. It tests all the possible hypotheses, and picks the one that maximizes the explanatory power. So this is what this does, okay? It does so for means. All the tests are about means. So now, in the next two slides, I'm basically going to say how this is done for the ex post approach. The next two slides have some complicated equations which I wanted to abstract from, but they are there for a reason which I'll tell you about in a moment. So the solution to this for the ex post approach, when you're looking at quantile functions, can be written, and it is written by Hothorn and Zeileis, 2021, as the solution to a locally adaptive maximum likelihood problem, okay? So you first have to assume that when you're interested in estimating the quantile function of each type, there is a family of parametric functions that can do a good job in doing that, okay? So you have to assume that you can estimate these parameters theta that will approximate this thing. Theta then is chosen to maximize the likelihood given the data. So L_i(theta) is the likelihood of each observation i, right? Given theta. So you basically are trying to fit a bunch of conditional distributions to the data set to minimize the distances between your data set and those conditional distribution functions, all right? Where you're using the parameters to define the distribution function, and you are also choosing the parameters so as to partition the population.
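The split-selection step just described can be illustrated with a small, self-contained sketch. It is a simplification under stated assumptions, not the conditional inference tree algorithm itself: it uses a two-sample difference-in-means test with a normal approximation and a Bonferroni adjustment, whereas the actual method relies on a permutation-test framework. All function names and the toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, variance


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if mean(a) != mean(b) else 1.0
    z = abs(mean(a) - mean(b)) / se
    return math.erfc(z / math.sqrt(2))


def binary_splits(levels):
    """All ways to divide a set of categories into two non-empty groups."""
    levels = sorted(levels)
    first, rest = levels[0], levels[1:]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {first, *combo}
            if len(left) < len(levels):  # skip the split that leaves one side empty
                yield left


def best_split(rows, outcome, circumstances, alpha=0.01, min_size=2):
    """Return (adjusted p-value, variable, left-hand categories), or None if nothing is significant."""
    candidates = []
    for var in circumstances:
        for left in binary_splits({r[var] for r in rows}):
            a = [r[outcome] for r in rows if r[var] in left]
            b = [r[outcome] for r in rows if r[var] not in left]
            if len(a) >= min_size and len(b) >= min_size:
                candidates.append((two_sample_p(a, b), var, left))
    if not candidates:
        return None
    p, var, left = min(candidates, key=lambda c: c[0])
    p_adj = min(1.0, p * len(candidates))  # Bonferroni adjustment for multiple tests
    return (p_adj, var, left) if p_adj <= alpha else None


# Toy data (made up): income depends strongly on father's education.
rows = []
for i in range(40):
    edu = "college" if i % 4 == 0 else "other"
    sex = "f" if i % 2 == 0 else "m"
    rows.append({"father_edu": edu, "sex": sex,
                 "income": (300 if edu == "college" else 100) + (i % 5)})

result = best_split(rows, "income", ["father_edu", "sex"])
print(result[1], sorted(result[2]))
```

A full tree would apply `best_split` recursively to each of the two resulting subsamples until no split passes the significance threshold; the terminal nodes are then the types.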
The weights here are basically gonna be an indicator function that tells you into which cell each observation fits. So it looks fancy. The reason I put it there is because I said to you before that this is the optimal solution to a problem in a well-defined statistical sense. This is the well-defined statistical sense. So it's not some magic machine that tells you what the partition is. The partition is basically maximizing the likelihood that you can explain the data by breaking down the population into types and estimating their distribution functions, okay? And this basically says how you do that. It's very similar to the ex ante case, but here you're estimating not the differences in means, but the differences in shapes of the distribution, if you like, by testing for structural breaks in the parameters. So much for the methodological stuff. All I would like you to remember from that as you leave this room is that the idea of this method is, instead of the researcher picking a partition more or less at random to try and guess where the optimal point is between the downward omitted variable bias and the upward bias that comes from overfitting, this is a specific, well-defined statistical way to choose that optimal point, okay? And it derives a partition through that. So I'm gonna show you some results now. Now we've done the hard part. Now, as Philippe Aghion used to say, we can begin to harvest, okay? We've done the planting, we can begin to harvest. This is the data we use. There are 28 household surveys that cover nine countries. You can see them there: Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guatemala, Panama, and Peru. You know, very different numbers of waves. For Peru we're gonna be able to look at a bit of a time series; in Colombia, Brazil, and Argentina, we're not, okay? The surveys cover this period.
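For readers without the slide, the estimation problem described in words can be sketched as follows. The notation is assumed rather than copied from the slides: c_i are the circumstances of observation i, ell_i(theta) is its log-likelihood contribution under the chosen parametric family, and w_i(c) are the weights.

```latex
\begin{align*}
\hat{\theta}(c) &= \arg\max_{\theta \in \Theta} \; \sum_{i=1}^{N} w_i(c)\, \ell_i(\theta), \\
w_i(c) &= \mathbf{1}\{\, c_i \text{ falls in the same terminal node (type) as } c \,\}.
\end{align*}
```

In words: within each type, the parameters of the income distribution are fitted only to the observations that the tree assigns to that type, and the type-specific quantile function is read off the fitted distribution.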
They are all from SEDLAC, big thanks to Leo Gasparini and his team at CEDLAS, at the Universidad de La Plata, for sharing the data with us. The data were chosen only from surveys that contain information on parental background, and in particular on things like, well, your own sex and race and ethnicity, your place of birth, but also your father's and mother's education and father's and mother's occupation wherever possible, okay? And in no case do we use fewer than five of these seven circumstance variables. There are some details on age ranges and so on and so forth, and I can address those maybe in Q and A, but we won't have time for them today. So that's the data where we're gonna apply this technique. Okay, so what do we do? This is the first result. So this is a conditional inference tree. So if you go to the R code that exists there and you say, do this, break down the Bolivian population in this way, always at each stage with these binary breaks that maximize the statistically significant difference, you end up with a tree that says, first we partition by father's education, okay? So the first partition in Bolivia, the most salient partition in Bolivia, was between people whose parents had college degrees and people whose parents had any other level of education. Now those guys there, they get divided up only once more, by mother's occupation, and they yield these two types at the end, okay? Which are the two richest types in Bolivia. And the population shares there, they account for 7% of the Bolivian population, people whose parents had college degrees. The other 93%, what about them? Well, they next get divided by birth area, between urban and rural. There are these migrants there, don't worry too much about them, but urban and rural.
And then in rural, and if you've been to Bolivia or if you're from Bolivia, it starts making sense to you, that's a big cleavage in society. And then the machine tells you, for the rural people, the next thing you wanna do is split some indigenous groups over here, the Quechuas, Aymaras and so on, from the non-indigenous, say whites and a few other indigenous groups, on the other side. And so the trees will provide you with a partition in the end, which in this case is a very economical partition: it's only 10 types. Remember I had 108 and 54 in my arbitrary partitions before. Here the machine is telling you these are the 10 key types, and given the data, the level of significance that you've set and the restrictions you've set, I will not decompose them further. If you change the alphas and so on, you could go on, or if you had much more data, you could go on. Given the data, those are the 10 types. In each case, you can always say something about the richest and the poorest. So here, for example, what's the richest type? It has 2.7 times the mean income in Bolivia. The poorest type has 0.4. So it's a difference of seven times between these two groups. And these guys, their parents had college degrees, and their mothers had certain groups of occupations here. The poorest people are people who were born in rural areas and who are indigenous with uneducated fathers, which makes a lot of sense. So it tells you something. But if you wanted a measure like those indexes that we have, we have to then just take the population-weighted inequality between these 10 groups. And if you do that for all of our surveys, well, that's too many, so I just show you here the comparison for the latest wave for each of our nine countries. So there's Bolivia that we were looking at earlier. Okay, that's 0.25. These are Gini coefficients. So here's what I want you to do, okay? Take the 10 subgroups in Bolivia.
Give everyone in those subgroups the mean income of that subgroup. And then take the Gini of the whole population, where everybody in a group has the same income. That's what James Foster calls the smoothed distribution. So only 10 different numbers, but different numbers of people in each one. You get a Gini of 0.25, which happens to be bigger than the Gini for the whole Slovak Republic, where everybody has their own income. The highest level of inequality in this sample is Guatemala 2006 with a Gini of 0.3, which is, again, just amongst those types, okay? You've eliminated all the inequality within types. You get a Gini of 0.3, which is equal to the overall Gini in the Netherlands, okay? So that's the sort of thing you do. There are two bars. The conditional inference tree is the orange one, the one I've just shown you. Then, without getting into the techniques too much, trees are estimators that have properties. One of their properties is the variance of the estimator. Conditional inference trees have high variances. One way of correcting for that is to do a lot of trees, which is called a forest. And the forests are supposed to be lower-variance estimators. Again, if you want more details on that, you have to ask my co-author Paolo, because it's really not my specialty here, but that's what they do. And you can see in this case, the trees and the forests don't actually do that differently. Okay? So you can look at that as a share. Those are the Ginis. As a share of the overall inequality in the population, I mean, I just showed you it was bigger than inequality in the Netherlands, but what share is it of the overall inequality in Guatemala? Well, it's almost 60%, okay? Guatemala, I don't remember how many types it has.
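As a rough illustration of the recipe just described (give everyone the mean income of their type, then take the Gini of that smoothed distribution and express it as a share of the overall Gini), here is a minimal Python sketch. The function names, the three type labels, and the ten incomes are all made up for illustration; they are not the survey data or the authors' code, and sampling weights are ignored.

```python
from collections import defaultdict


def gini(values):
    """Gini coefficient of a list of non-negative incomes (equal weights)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum of the sorted incomes (ranks start at 1).
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def smoothed_distribution(incomes, types):
    """Replace each person's income with the mean income of their type."""
    sums, counts = defaultdict(float), defaultdict(int)
    for y, t in zip(incomes, types):
        sums[t] += y
        counts[t] += 1
    return [sums[t] / counts[t] for t in types]


def ex_ante_iop(incomes, types):
    """Return (Gini between types, that Gini as a share of the overall Gini)."""
    between = gini(smoothed_distribution(incomes, types))
    total = gini(incomes)
    return between, (between / total if total > 0 else 0.0)


# Hypothetical toy data: ten people in three invented types.
incomes = [1, 2, 3, 4, 5, 9, 10, 11, 30, 50]
types = ["rural", "rural", "rural",
         "urban", "urban", "urban", "urban", "urban",
         "college", "college"]

between, share = ex_ante_iop(incomes, types)
print(f"between-type Gini: {between:.3f}, share of total: {share:.3f}")
```

In this sketch the partition into types is simply taken as given; in the talk, it is the tree that chooses it.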
Guatemala has more than 10, but if you go to Bolivia, just 10, just those 10 types that you and I could never have guessed by ourselves, but the machine told us, the inequality between those 10 types is half of all the inequality that you observe in the data for Bolivia. Okay? You could do a little bit of time series. You can see there, Peru, you know, this is just stuff you can do. Okay? I should have taken that out there. You know, some of you who are inequality decomposition experts might be saying, why is he showing us the Gini? Doesn't the literature typically use decomposable measures like the mean log deviation? If there are questions about that, I can address them at the question and answer time, but here I just show you the correlation between these measures using the Gini and the mean log deviation. There are big differences in levels, but the correlation is 0.98, okay? So they really pick up the same information and rank countries in the same way. So the other thing you can do, and I'm just kind of showing you, you know, what this thing generates, is you can decompose the relative importance of each of the circumstances, which is obviously not gonna be a causal estimate. Again, the beta in the IGM regression is not a causal estimate either, because every other determinant of outcome is missing in that regression. So these are not causal estimates, this is just the decomposition, and those are the averages for Latin America. On this side, we have all the countries for which we observe all circumstances. I don't want them compared to that side, because there, some countries, for example Colombia, don't have information on father's occupation and mother's occupation. So clearly some of what mother's education and father's education are picking up here is that those things aren't there, okay? These have all, these have some of the circumstances. But in the end, you know, this is three quarters.
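The talk does not spell out how the relative importance of each circumstance is computed; a Shapley decomposition is what one audience member later refers to. Purely as an assumption-laden sketch of that idea, the Python below averages each circumstance's marginal contribution to the between-type Gini over every ordering of the circumstances. Here types are the full cross-classification of whichever circumstances are included, not a tree or forest, and the variable names and data are invented.

```python
from collections import defaultdict
from itertools import permutations


def gini(values):
    """Gini coefficient of a list of non-negative incomes (equal weights)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def between_gini(rows, incomes, circumstances):
    """Gini of the smoothed distribution; types are the cells formed by
    crossing the listed circumstances (an empty list means one single type)."""
    keys = [tuple(row[c] for c in circumstances) for row in rows]
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(keys, incomes):
        sums[key] += y
        counts[key] += 1
    return gini([sums[key] / counts[key] for key in keys])


def shapley_contributions(rows, incomes, circumstances):
    """Average marginal contribution of each circumstance over all orderings."""
    contrib = {c: 0.0 for c in circumstances}
    orderings = list(permutations(circumstances))
    for order in orderings:
        included = []
        previous = between_gini(rows, incomes, included)
        for c in order:
            included.append(c)
            current = between_gini(rows, incomes, included)
            contrib[c] += current - previous
            previous = current
    return {c: value / len(orderings) for c, value in contrib.items()}


# Hypothetical toy data: eight people, three invented circumstances.
rows = [
    {"father_edu": "college", "birth_area": "urban", "ethnicity": "a"},
    {"father_edu": "college", "birth_area": "urban", "ethnicity": "b"},
    {"father_edu": "other", "birth_area": "urban", "ethnicity": "a"},
    {"father_edu": "other", "birth_area": "urban", "ethnicity": "b"},
    {"father_edu": "other", "birth_area": "rural", "ethnicity": "a"},
    {"father_edu": "other", "birth_area": "rural", "ethnicity": "b"},
    {"father_edu": "other", "birth_area": "rural", "ethnicity": "b"},
    {"father_edu": "college", "birth_area": "rural", "ethnicity": "a"},
]
incomes = [40, 30, 12, 9, 6, 3, 2, 20]
names = ["father_edu", "birth_area", "ethnicity"]

contributions = shapley_contributions(rows, incomes, names)
full = between_gini(rows, incomes, names)
for name in names:
    print(name, round(contributions[name] / full, 3))
```

By construction the contributions add up to the between-type Gini for the full set of circumstances, so dividing by that total gives shares. Like the numbers in the talk, these are descriptive, not causal.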
Three quarters of the variation on average in Latin America is explained by parental background, the occupation and education of father and mother. Then there are lots of fun things you can do. I mean, you can see place of birth. In Ecuador, it matters nothing. In Argentina, it's the biggest determinant, right? Ethnicity is gonna be big in Guatemala and in Peru, but almost nothing in Argentina. And, you know, that tells you something about, again, the structure of opportunity in those countries. So I have only five minutes left or so, so I'm gonna kind of just give you a flavor of the ex post stuff. So in the ex post stuff, okay, you have a tree like this, sorry. The tree in the ex post doesn't generate just the mean at the end. It generates an estimate of the whole distribution. Remember, using the parametric function. So now I have inequality within the type estimated as well, using these parameters, okay? For each of those. So for Bolivia, it's different, okay? Bolivia had 10 types in the ex ante case and has 14 types in the ex post case. They are a little bit different. You can see the density functions there. I don't have time to go into the partition, but you can plot, for each of those 14 types, their cumulative distribution functions. So these are the cumulative distribution functions of the types. All the green ones are urban. All the blue and gold ones are a mix of urban and migrant. And the inequality of opportunity estimates are gonna come from the horizontal distances in these quantiles. These are CDFs. Inverted, we've got a quantile function. So these horizontal distances are gonna give you the measure of inequality of opportunity. You can do little mountain graphs where you can see where the rural population is, where the urban population is. You could do this for many different... These are the 14 types. This is the density function of Bolivia as a mixture of the 14 distributions that underlie it for each type.
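To make the "horizontal distances" idea concrete, here is a small Python sketch that inverts two empirical CDFs into quantile functions and reads off the income gap between two types at a few percentiles. It is only an illustration under simplifying assumptions: the paper estimates parametric distributions within each type, whereas this uses raw empirical quantiles with linear interpolation, and the two types and their incomes are invented.

```python
def quantile(sorted_values, p):
    """Empirical quantile (inverse CDF) by linear interpolation, 0 <= p <= 1."""
    if not sorted_values:
        raise ValueError("need at least one observation")
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def quantile_gaps(type_a, type_b, probabilities):
    """Horizontal distance between two types' CDFs at each probability,
    i.e. the income of type_b minus the income of type_a at that quantile."""
    a, b = sorted(type_a), sorted(type_b)
    return {p: quantile(b, p) - quantile(a, p) for p in probabilities}


# Hypothetical toy data: two types with the same mean but different spread.
compressed = [8, 9, 10, 11, 12]
dispersed = [2, 6, 10, 14, 18]

gaps = quantile_gaps(compressed, dispersed, [0.1, 0.5, 0.9])
mean_gap = sum(dispersed) / len(dispersed) - sum(compressed) / len(compressed)

print("difference in means:", mean_gap)
for p, gap in gaps.items():
    print(f"gap at quantile {p}: {gap:.2f}")
```

In this toy case the two types have identical means, so a comparison of means sees no difference between them, while the quantile gaps are negative at the bottom and positive at the top.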
Same kinds of results in terms of comparisons. Again, the shares of inequality are above 50 percent in most cases, in some cases going into the 60s, in this ex post decomposition. Now I wanna use the last bit of time to say something about the comparison between the ex ante and the ex post. So you might be saying, okay, so this guy is using this inequality of opportunity stuff. One, the ex ante, is looking at means. The other is looking at the whole distribution. Does it make a difference? Why is it different? One tree gives me 10. The other tree gives me 14. Well, so the takeaway for me is the following. They are positively correlated at 0.85. So the ex ante and ex post measures give you more or less the same picture in terms of ranking of inequality of opportunity across these nine countries. But it's not perfect. Now there are two reasons why it's not perfect. One is these are estimators. These are statistical apparatus. So there are sampling errors. That could be part of it. But the other reason, which is what I want to end with, is that they are trying to pick up different things. So to do that, let me show you some excerpts of trees. These are not complete trees. I've chopped the trees to show you the specific bits. And I want you to focus here on the case of Panama 2003. The two poorest types in Panama 2003 are divided by ethnicity. This is one, and then this one here is subdivided by father's education. But basically this is one ethnic group. There are few people. Here's a distribution function. This is another ethnic group. Those are the two poorest in the ex post analysis for Panama. All of them are in this group, I'll show you why in a moment, which is a single group in the ex ante tree. Why is that? Because the two groups are these guys. And you can see they have very different levels of inequality, but not very different means.
Their expected cumulative distribution functions cross close to the median. Their means are not very far apart. The ex ante machine is going through and saying these guys are not that different at the mean, if that's what you care about. But if you care about the whole CDF, they are different. And so I'm going to split those two groups. And here you see, this is a mapping of the ex post partition to the ex ante partition. And here are those two groups, the four and the six. These two guys here. They were partitioned in the ex post, but they are all in the five in the ex ante. In my beautiful alluvial diagram. I didn't know this was called an alluvial diagram, but Paolo told me the other day. Okay, last thing and I'll conclude. One thing I want to say is that you can compare the shares of inequality explained by this method with the share of inequality explained by other things. For example, here I have the squares of those correlation coefficients, which, as I showed you in a slide earlier, are the R-squareds of the regression of child's education on parental education. So it's the share of explained inequality in some sense. And as you can see, they are much, much lower than they are for this. Ideally, you'd do this with income. There are some issues. We're still working on it, it's just work in progress. We'll do it with income. But this is just to show that the choice of partitions and techniques here gives you a lot of insight into the structure of the persistence of inequality that you don't necessarily get by looking at a single outcome like education. And I think these are my conclusions. I mean, everything that's there I have said to you. I've run out of time, so let me not repeat it. And let me end here by thanking you very much for your attention.
In this conference, and thank you very much for inviting. And I'm also very delighted to be commenting on Chico's presentation. It's a very difficult task.
He does such a great job that it's really difficult for the commenters to follow or match his level of presentation. So a couple of years ago, Chico gave a presentation titled Inequality as Cholesterol. And that has been my go-to example of making a distinction between fair and unfair inequality. And in that talk, he also talked about how you can actually use this idea of inequality of opportunity to give a measure of unfair inequality. And this time, I think he took a much bigger step forward in terms of the empirics. Rather than relying on a subjective choice of partition, here he and his co-authors came up with an objective and parsimonious partition of the population, which is actually, as you can see from the results, very effective in explaining the variation in inequality that you see there. I do think that this is going to become a frontier benchmark for this literature, and probably every paper that comes later on has to follow this approach. So I am actually that bullish about the methodology that's being used here. And I also like that Chico related inequality of opportunity with the IGM literature. These two literatures are parallel to each other. There are a lot of similarities between the two. In terms of looking at the evidence, I'm not going to repeat it. I don't see anything to disagree with in the evidence that he has presented. And I think this is something that's coming out from other recent papers on the Latin America region too. So what I am going to do in these comments is to see if we can actually broaden the research agenda to make it more informative for policy discussions. And I'm going to focus on a couple of things that Chico has mentioned in the paper but did not have time to mention in the presentation today. The very first thing that's actually in the, okay, let me skip this slide.
That is actually in the abstract: because the partition is very parsimonious, you can almost tell that the estimates are very conservative, somewhat of an underestimate. So I am going to start by asking, can we actually broaden this and come up with a much broader estimate of inequality? Fortunately, in the literature, there is something there already, which is known as the sibling correlation. The idea there is that siblings growing up together face many things in common, not just parents' education or parents' ethnicity, but you can think of the family network, the social network, the neighborhood that they are growing up in, the schools they are going to, all of this. And the advantage of this sibling correlation is that you are not just focusing on some observables. Both the IOP and IGM literatures are focusing only on the observables. This actually captures the unobservable influences as well. And my co-author and I have done this estimation for 53 developing countries, and of course we have about seven LAC countries in there too. And in the graph here, you can see that these estimates are much, much bigger than either the IOP estimates or the IGM estimates. But I also wanna point out that there is a bit of good news here. Over time, if you look at the age cohorts, the estimates have gone down a bit. So, at least for the LAC region, that's good news. Okay, the second observation that Chico had, and the left-hand side figure here is actually taken from an earlier version of Chico's presentation. One piece of evidence there was that parents' education explains quite a bit, and that it is one of the major drivers of inequality of opportunity, or IGM, whichever way you look at it. But when you look at it, it actually explains about 25% of the variation of children's education.
That's not a huge lot, and that's exactly what we found in our sample too, and that explains about a third of the variation in the sibling correlation. But these estimates are done under many restrictive assumptions. I pointed out some of them here. For example, for that beta coefficient that was in Chico's regression, the assumption is that the coefficient does not vary. It's the same for the whole population. But you can think of it varying across households. Similarly, there is also the assumption that parents' education is not affecting the variance of the residuals from the children's regression. These are restrictive assumptions. Here I made the same estimation after relaxing these assumptions. Here you can see that the father's education alone can explain more than 70% of the variation in children's education. So in other words, under all these assumptions, whatever we are estimating is actually grossly underestimating parents' influence. You may ask why parents' education should be influencing the variance of children's education. To give you a sense of this, consider a developing country with two families, one poor, one rich. If there is a shock, it is possible that the poor family has to withdraw the child from school, and that tends to increase the variance at the lower end of the parental education distribution. And that is an intuition that we find evidence in support of, using data from three different countries. In all of these, the variance is higher at the lower end. We also developed a measure which includes not just the influence of parents on the mean, but also on the variance. And once you do that, you can see here that the relative persistence, which was the beta coefficient in Chico's equation, is no longer constant across the distribution of parents' education.
More importantly, what you can see is that we are majorly underestimating the persistence for children who are born in the most disadvantaged situations. So in other words, just to conclude, what I'm saying is that the traditional estimates that we are using to look at inequality are underestimates. We are underestimating parents' influence, and we are underestimating it in a big way for children coming from more disadvantaged backgrounds. And so I think the urgency for policy action is quite clear, and I see that I ran out of time. So I will skip my final point and stop here, thank you.
Audience, to pose their questions, and then I'll give Professor Ferreira a chance to respond to all together, including the discussant comments. Let's open it to you, the audience. Who would like to pose a question? I have lots of questions, but I think you should have an opportunity to ask a question. Kunal.
And, you know, I can agree with Forhad that this seems like a path-breaking approach to thinking about how to estimate IOP, both ex ante and ex post. I have two questions on the method itself. I don't understand exactly what it is. First, would it not be possible that you have more than one equilibrium? Because you have an algorithm, it's nonlinear, you're maximizing the likelihood function that you're getting. It's quite possible to get more than one optimal solution. If that's the case, would it not be, sorry? Yeah, so would it not be possible to get more than one optimal solution when you're looking at all the partitions, right? And if that's the case, how do you choose across all the optimal solutions? Second is that it seems that it's a Bayesian approach, because you're using a Bayesian prior here about the partitions and then you're revising the priors using this adaptive method. So how much is it kind of mimicking a Bayesian approach? And I was wondering whether that's a way to think about this approach, thanks.
So let's collect a few questions. There's a question in the fourth row there.
Thank you very much for the very interesting presentation. I know Professor Fepera has done some work on inequality of opportunities using data from Sub-Saharan Africa. I was just wondering if he's tried this new approach using the same data and how the results fare compared to what he did in the past. Thank you.
Can I just, I should have mentioned, would you introduce yourself too?
My name is Monica Lambon-Quayefio from the University of Ghana. Thank you.
Hello, my name is Brinalini, I'm from Delhi. My question was regarding the new method. So from what I understand, this new method will use a machine algorithm to really decide which circumstances to introduce, which to keep, and those estimates that we get for IOP using these machine-determined circumstances, if you will, will also be a lower bound estimate, is what I've understood. If it is so, and we do it across countries or across states, and we find that different countries or different states have different numbers of circumstances, and then we use the Shapley decomposition to get the share for each of these circumstances across states or across countries, my question is, will it be wise to then compare the shares across different units if they are really lower bounds, and they would vary depending on the total number of circumstances that we are considering across spaces? I don't know if that made sense, but thank you.
I think I'll give Chico a chance to respond. We have discussant comments to respond to, and also three questions, and then if time permits, we'll open again for another round of questions.
It's me, okay. So, thank you. These were great questions, and thanks very much to Forhad for the comments. Let me start from there.
I think the sibling correlation results that you showed are really interesting, and it's something that's missing, I think, in the review that we're doing in this, so I'll follow up with you later to get those. And I think they're definitely interesting. Again, as with many of these different approaches to the question of persistence, they're complementary, right? Because I think one of the nice things about these trees, and I have to emphasize that really Brunori, Hufe, and Mahler were the first people to apply them to inequality of opportunity, the ex ante methods. So it's an innovation that we are adopting here in this paper. One thing I like a lot about them is the structure, right? Telling you what the first partitions were, and so on and so forth. But definitely looking at sibling correlations, I think, is very important. Your point in the final slide, about measurement being a starting point, and that we really want to understand how policies affect these issues, is completely well taken. And that's a set of other literatures on policy evaluation and so on, which I think, again, are complementary to this. Hopefully though, the measurement stuff is important in setting certain benchmarks. I think, hopefully, the policy debate in a country is informed by the fact that 10 or 14 or 18 groups across the population, chosen on the basis not of what they are doing, of their education, of their performance in the labor market, or of their entrepreneurship, but defined exclusively in terms of their family background, account for 60% of inequality in that country with a parsimonious partition, as you said. Hopefully that is part of the political and policy discussion, but I agree that policies are more important. On the questions, let me just say that I was always hard of hearing, and I'm even more hard of hearing now after certain things this year. So I heard most of what you said, but please bear with me if I haven't. So, Kunal's questions on multiple optima.
So as far as I understand, that problem that I outlined has a single-peaked solution. So I don't think there are multiple optima in that maximum likelihood problem, formally. However, each optimum is sample specific, and relatively slight variations in the sample can lead to a different optimum. And this is what the forests are meant to address. So we're basically reworking the sample through, effectively, bootstrapping and bagging and various other things. Take a circumstance out, take a subset of the sample out, various kinds of tinkering around with the sample to try and shed some light on what the sampling error is doing. Each of those can have a different optimum, and then they are combined in some way through these forests. For a single sample, I think there's always a unique solution. I also think, again, I wish my co-authors were here, but I'm fairly sure this is not a Bayesian method, in the sense that, in fact, you don't start from a prior at all. Other than that your prior, as you start, is that you test the first null hypothesis that there is no inequality of opportunity. There are no differences. And you test that from the data, and then the data simply tells you where the partitions are, but there is no sort of prior about what circumstances are important or are not in that sense. There was a question about sub-Saharan Africa. Can I just, could the person who asked that question? Thank you very much. And did you say that, you mentioned, I was running to get my pen. You said someone had done a lot of work on that. Who did you say? Paolo Brunori, yes. Well, Paolo Brunori did that work with Vito Peragine and Flaviano Palmisano, who are there. So if you don't mind, I'll defer to them and you can ask them at the break. They would know much more than me. And then, I was taking notes, but then what was the other? About the method, and is it wise to use the lower bound? Yes.
So on the lower bound, let me say, I'm probably partly to blame, in that when I first wrote about these things, I was focused very much on this missing circumstance variable problem. And if you had the population, that is the only bias there is. But with samples, there is this overfitting issue. So in effect, with samples, there's a trade-off between the two. So these estimates that we provide here are not a lower bound in that sense. Given the parsimony of the information that we have, given that there are lots of other circumstances that we don't measure, like wealth, or the quality of early childcare, or genetic things, lots of things that we don't measure, in all probability they're underestimates. But it's different to say that in all probability they're underestimates than to say they're a lower bound. They're not a formal lower bound. I cannot prove that there isn't an estimate below it. They're likely to be underestimates because the partitions are so parsimonious. One thing that could be done, and would be interesting in terms of extensions as well, is to take a very large sample, or an administrative data set where we had information on covariates, then run this on a sample, and then run it on the quote unquote population, or a much larger sample. The trees do get deeper as the sample size increases. This is an issue of the power of the test that's going on here. So again, much like with any other methods, sibling correlations or anything else, you are constrained by the data that you have, by the sample that you have, and by the sample size that you have. This is very much in the foreground here, but it affects any method. So I don't think that answers your full question, but if you want, just shout again what the main point I may have missed was. But you have to really shout because of my hearing, I'm sorry.
Am I audible?
Yeah.
Okay, so mainly my question was, is it fair, or is it okay, to compare the circumstance shares that we get using the Shapley decomposition across two units, say states, if the number of circumstances that I've considered to get that inequality of opportunity is different?
Very good, very good question. No, it's not fair, which is why, for example, I separated the ones where I had all the circumstances from the ones where I had only some. So yes, it matters if you don't start from the same set of circumstances to begin with. If the algorithm excludes a circumstance completely, I think it's okay to make the comparison. The algorithm is telling you, on the basis of this data, that that circumstance doesn't matter, but the algorithm must have been given the same set of circumstances. So the initial set of circumstances you feed to the algorithm must be the same, otherwise the comparisons are not appropriate, which is why we separated that graph into two sides. Did I get everything? Yep.
Okay, so I've been under very strict orders to keep to time, so I'm unfortunately not gonna open up for another round of questions, but I think this has been a fantastic start to the conference. I know that there are a number of other questions, and I hope you'll grab our speakers in the coffee break to raise your questions. In the last few moments, I wonder if our speakers want to make any final remarks. I can start with Forhad.
Not really, other than maybe I can talk a little bit about the policy point that Chico already answered, which is that in the end, we do need these measures to know where to start, but we also need to know what policy would drive down this inequality, and how to evaluate those policies.
I work a lot more on education, and in the context of education, I can tell you that there are many, many studies which look at how, for example, a certain education intervention will lift the average boat, but not much about how it's gonna lift the boats coming from different sections of the population. I think that's where we need to go, and I think the inequality of opportunity approach that Chico has talked about at length today would be one way to go. There are other ways too, so that would be my comment.
Well, I wasn't gonna have anything to say, but given what Forhad just said, which I think is very important, I want to give you a one-minute summary of another talk I've been giving this summer, which is on the role of human capital investments in reducing inequality of opportunity, which I think is related to what Forhad was just saying. And just to give you the upshot of that, I review much of the excellent recent literature in development economics identifying the size of effects of early childhood interventions, of teacher training interventions, of health interventions of various kinds, including payment for providers, and so on and so forth. And my reading of the literature in the end is, it matters a lot how you do it, not all inputs are the same. Dealing with teachers seems to be much more effective than, say, giving people iPads or uniforms, so there's a whole range of insights you can get from that literature. But when you compare that literature, and the kinds of effects that they have, to Mr.
Juan Coronado and his descendants in Costa Rica who were presidents and so on, or to the fact that if I look at the opportunity profile in Brazil today, 200 years after the end of slavery, almost all of the bottom groups are still black and Afro-Brazilian, I think we have to come to terms with the fact that these incremental human-capital interventions are incredibly important, but are probably not sufficient on their own to systematically transform what was happening. One of the upshots of a conference in Córdoba, there is at least one other person who is here who was there at that conference, given by Juan Mauricio Cardenas, who's a professor of this university, was the revolution will not be nudged, and I will end on that, I think. So that's a good place to end and kick off the rest of our conference. Thank you very much and thank you to our speakers.