Thank you very much, Tony, and apologies for imposing a new French accent burden today in this room. Okay, so very much of the introduction to this presentation has been done by Nora, and she already said a lot about those databases, so let's go quickly through describing those databases and who is responsible for them, and then let's get into the comparison, into the detail of what they do, and into some of the problems that we find in trying to understand why they are in some cases yielding different estimates of what is going on in terms of inequality in Latin American countries.
CEPAL, or what I would call CEPALSTAT, which is the statistical office of the Economic Commission for Latin America and the Caribbean in Santiago, Chile, has been publishing for quite some time its own inequality measures on the basis of microdata that it gets directly from member countries. One of the problems with this database is that when you go to their website, there is no up-to-date methodological document which would describe precisely the way in which those data are being generated. I've been told by people responsible for this database that they are preparing a new document, and that at the same time they will take that opportunity to modify some of their methodologies. But at this stage, in some cases I really have to guess what they do, because when you go back in time, the people who were responsible for producing those data in the 90s are no longer there, so you are not completely sure about the methodology that was followed. They say that the methodology is based on a rather famous paper by Oscar Altimir in 1987, which was really a comparison between household surveys and national accounts. The main conclusion of that paper was that it was absolutely crucial to correct, to adjust, survey data in order to fit the national accounts, and we will see that this is the main difference between CEPALSTAT and the alternative database in the region.
Poverty data are, again, their own. They are based on poverty lines which were estimated a long time ago by CEPAL people from the cost of a minimum diet, and then expanded with a standard Orshansky coefficient. Again, this has nothing to do with international poverty lines like the 1.25 dollar a day, etc. Those poverty estimates also differ from the national poverty estimates available in the various countries, again, basically because poverty lines may be quite different.
The other database is SEDLAC. The acronym SEDLAC is for Socioeconomic Database for Latin America and the Caribbean. This is a joint venture between a research center at the Universidad de La Plata in Argentina and the World Bank Poverty and Gender Group for the Latin America and the Caribbean region in the Bank. They publish their own harmonized inequality measures on the basis of the microdata, the same microdata which is used by CEPALSTAT, basically for all those countries which are in the MECOVI program, which was a program intended to harmonize national surveys in Latin America. They have a very well documented and fully up-to-date methodology, and from that point of view they are reasonably close to what could be the best practice today, that is, LIS and PovcalNet in the World Bank. The database is very regularly updated: any time a new survey is available, immediately or within a few weeks you have the corresponding data in the database, and this is very well organized. Finally, the poverty estimates are those from PovcalNet. The center has an agreement with the World Bank: they provide the harmonized microdata to the World Bank for the World Bank to compute poverty measures based on $1.25 a day and $2 a day. But SEDLAC is also computing poverty with other lines which I think are better in the Latin American context, which are $2.5 and $4 a day.
Okay, the other data which are covering LAC countries: the PovcalNet database in the World Bank I already mentioned. LIS has some Latin American countries in its database, although most often only very few observations, a few data points. Mexico may be the only one for which they have a rather long series of points. And the OECD, of course: Mexico has been in the OECD for quite some time, and now that Chile is in the OECD, both countries will also be followed by the OECD. And of course the secondary databases that Nora mentioned.
So the questions I want to ask are: how close are the inequality measures and the poverty measures which are reported by both databases? I also want to ask about the differences in the treatment, basically, of missing data and under-reporting, and this issue of the gap between national accounts and household surveys, and I will end up with a few other methodological issues.
This is another view of a chart that Nora has shown before. This is the difference between the two databases in terms of levels of inequality; these are Gini coefficients. On the vertical axis, this is the SEDLAC estimate; on the horizontal axis, this is CEPALSTAT. You have the 45-degree line, and then, as Nora was saying a little earlier, you see that CEPALSTAT definitely tends to overestimate Gini coefficients in comparison with SEDLAC. In some cases, you can see that the difference is substantial, that's not me. And what is quite striking, I think, is the fact that the ranking of the countries is not the same. You see that for CEPALSTAT, Brazil, sorry, Marcelo, is the most unequal country in the region, but for SEDLAC, Guatemala, Colombia, and Honduras are more unequal than Brazil. So because very often you have papers which are mentioning this kind of ranking, I think that this is not something which is completely innocuous.
If instead of looking at the levels we look at the series, then we see, as Nora again was saying, that the trends are more or less the same. Here I have shown only a couple of countries. Argentina is on the left-hand side. The top series is CEPAL, and the bottom series is SEDLAC, and the PovcalNet data are also there, but they are very close to SEDLAC. The vertical lines correspond to years where we have an observation available in the two databases, which is not always the case. And the circles correspond to the cases where you have a discrepancy between the two databases: one database says inequality is increasing very much, whereas the other database says inequality is decreasing, or maybe remaining the same. And you see that in the case of Argentina, even in the recent past, in the mid-2000s, there are discrepancies which are quite important. This is really problematic, because as those data are scrutinized by politicians, by the press, by all observers, to have this kind of discrepancy for a few years is introducing a lot of noise in the debate.
Bolivia: we cannot say very much about this. I mean, there is consistency. The reason why I'm mentioning Bolivia here is that Bolivia is a country where, if we believe those databases, in a few years the Gini coefficient went down by 15 points. And I don't think this is something which is possible. I know that a lot has been going on in Bolivia. There has been a change of regime in Bolivia. But to say that inequality has gone down by 15 points is something absolutely enormous. And it is quite amazing to see that in neither SEDLAC nor CEPALSTAT is there any mention of what is going on there. And I think this is not a good job from databases of this type.
Here you have Brazil. So again, there is some discrepancy between the two databases on when inequality started to decrease in Brazil.
And the final one is Mexico, where we have almost the same estimates for all the years, except a few years, again the circles, where there is definitely one database saying that inequality is going down, whereas the other one is saying that nothing much is happening. So definitely there are discrepancies.
Here we have the same thing for poverty measures. Brazil over there is OK: the trends are more or less the same. On the right and at the bottom, we have Mexico, and we see that in at least three years there is a complete disagreement about the way in which poverty evolved. Costa Rica at the top is quite nice. And Colombia at the bottom here is showing that there is probably a kind of change of methodology taking place in CEPALSTAT, which has not been documented, and which means that for a while there is a complete opposition between the two databases.
So the overall evaluation is: frequent sizeable differences in levels and, as a matter of fact, to some extent systematic differences for some countries. Time evolution is in general consistent over a long period, but not infrequently there are big divergences. And one thing which is worth mentioning, and this comparison was really very heavy to do because it required reviewing all the papers in the literature showing some treatment of those microdata, is that in general SEDLAC is in agreement with research papers trying to analyze the way in which inequality is changing in one of these countries. This morning, listening to Marcelo, I was trying to see whether his picture was completely consistent with that. I didn't have time to do a proper job, so send me the slide and I will do it. But I'm not sure this is a good argument in favor of SEDLAC, in the sense that very often researchers are using data which are coming directly from SEDLAC.
And if they are using the same database, we have to hope that they are using Stata in a rather efficient way and that they will not find a very different result. So I'm not sure what to do with that, but this is a point that is worth mentioning.
There is an issue with updating, because it happened to me that by looking at the SEDLAC database at, I think, two different points in time, I didn't find the same result, because there was some updating taking place. And one of the problems when you have that is that, because you don't have access to archives, you don't really know what happened. So it may be the case that at some stage there was some series, and then a few months later or two years later the series has changed. And this is definitely something that should be documented.
Let's get to the adjustment for missing data and underreporting. In SEDLAC, there is practically no imputation made when data are missing, for instance when somebody is saying that he or she is working as a wage worker with this level of education, et cetera, and no wage income is being reported. In the case of SEDLAC, they would say, okay, fine: either this is a major component of the household income, and in that case we simply drop this household from the sample; or it is not, and we keep the household and there will be a zero for that income. In CEPALSTAT, they are imputing the value. They are imputing values simply by doing some matching, or what they call the hot deck, which is basically to draw randomly somebody with more or less the same kind of characteristics and to duplicate, to some extent, that observation. But the main correction which is done in CEPALSTAT is for underreporting by comparison with national accounts. So their procedure is the following, and I will maybe illustrate it with what they do in the case of Chile.
Here in the case of Chile, in this table, you have for various years the ratio of national accounts data, taken from the household account in the national accounts, to the corresponding mean obtained in the household survey. So for example, for wages and salaries, you see that surveys are more or less okay. We are very close to the national accounts estimate; we have a ratio which is oscillating around one. If we look at self-employment, it is not the case. Now we have a ratio which fluctuates around two. And what is problematic is that in some cases we have big changes. When you go from 1.85 in one year to 2.10 in another year, if self-employment incomes are distributed in a non-neutral way with respect to total income, then we know that we will be producing a big change in the distribution. And of course the worst is property income, where surveys capture only something like 30%, that is, national accounts represent in some cases three times what you find in household surveys. And again, what is problematic is the fact that there is a lot of irregularity in that series: look, 2009, 1.94; 2011, 3.51.
And when they do that for property income, instead of scaling up all those incomes in order to fit the national accounts, they only do it for the top quintile. So they impute to the top quintile all the difference between the survey and the national accounts. And this is, of course, quite problematic. You can try to do some simulation to see what is the implication of doing that, what kind of change we obtain simply through this adjustment. I've done some very simple simulations in the case of Chile. If you look at the columns on the right-hand side, for example, in Chile 2011, because the correction for property income is quite big, we see that the Gini coefficient goes from 44.8 without the correction to 46 with it, which is not negligible.
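To make the mechanics of this adjustment concrete, here is a minimal sketch, not CEPALSTAT's actual procedure. The function names and the synthetic data are hypothetical. In particular, the talk only says that the whole gap is imputed to the top quintile; distributing it in proportion to reported property income within that quintile is an assumption made here for illustration. The ratio 3.51 is the 2011 figure quoted above, but the incomes are simulated, so the printed Gini values say nothing about Chile.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient (between 0 and 1) of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n

def adjust_property_income_top_quintile(other_income, property_income, na_ratio):
    """Impute the whole national-accounts gap in property income to the top quintile.

    na_ratio: national-accounts mean divided by survey mean of property income.
    ASSUMPTION: within the top quintile the gap is shared in proportion to
    reported property income; the talk does not specify this.
    """
    other = np.asarray(other_income, dtype=float)
    prop = np.asarray(property_income, dtype=float)
    total = other + prop
    gap = (na_ratio - 1.0) * prop.sum()       # aggregate amount missing from the survey
    top = total >= np.quantile(total, 0.8)    # top quintile of total income
    adjusted_prop = prop.copy()
    top_prop = prop[top].sum()
    if top_prop > 0:
        # Scale up property income of top-quintile units only
        adjusted_prop[top] *= 1.0 + gap / top_prop
    return other + adjusted_prop

# Synthetic illustration (hypothetical data, not a real survey)
rng = np.random.default_rng(0)
other = rng.lognormal(mean=0.0, sigma=0.8, size=1000)
holds_assets = rng.random(1000) < 0.15
prop = np.where(holds_assets, rng.lognormal(mean=-1.0, sigma=1.0, size=1000), 0.0)

before = gini(other + prop)
after = gini(adjust_property_income_top_quintile(other, prop, na_ratio=3.51))
print(f"Gini before adjustment: {100 * before:.1f}")
print(f"Gini after adjustment:  {100 * after:.1f}")
```

Re-running the last lines with a smaller `na_ratio`, such as the 1.94 quoted for 2009, shows how the size of the Gini correction moves with the ratio, even when the underlying survey data are unchanged.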
But in 2009, because the discrepancy between national accounts and the survey was much smaller, the correction of the Gini is much smaller. If you were to do that in the case of Brazil, where in 2005 the underestimation of property income, according to national accounts, was much bigger, it would generate a much bigger change in the Gini. So, definitely, this is something important.
OK, because I'm short of time, I will go very quickly to the main points I want to make. The issue here is, what should we do? Should we indeed make an adjustment or not? There is definitely something in what we just heard about top incomes. We know that in many cases we are missing the top-income people, and because of that we are underestimating inequality. So, should we make this adjustment to national accounts, or should we simply stick to household surveys? I don't think that there is a definite answer to that question. But because there may be many ways to do the adjustment to national accounts, it would certainly be good, first, to make sure that people have access to the microdata, that somewhere there is information on what we get from the survey. And at the same time, it would be important to make some consistency check with national accounts. When we make this kind of comparison between the household accounts in the national accounts and the surveys, is it OK or not that we observe some big change in one of the ratios I've shown in the case of Chile? And if you don't have access to household accounts in national accounts, since some countries do not have a very detailed household account, then at least what we could do is look at a comparison between mean consumption expenditure in national accounts and mean income in household surveys. Presumably, there must be some kind of consistency, some kind of parallelism, between those two series. And this is what is done here.
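The consistency check described above can be sketched in a few lines. This is only an illustration: the function name, the 10% tolerance, and all the numbers are hypothetical, and a flagged year is a prompt to go and look at the survey and the national accounts, not a verdict on either.

```python
def flag_ratio_jumps(years, na_mean, survey_mean, tolerance=0.10):
    """Flag years where the national-accounts / survey ratio changes abruptly.

    Returns a list of (year, ratio, relative_change) for every year whose
    ratio differs from the previous observation by more than `tolerance`
    (a hypothetical threshold, expressed as a fraction).
    """
    ratios = [na / sv for na, sv in zip(na_mean, survey_mean)]
    flagged = []
    for i in range(1, len(ratios)):
        change = ratios[i] / ratios[i - 1] - 1.0
        if abs(change) > tolerance:
            flagged.append((years[i], ratios[i], change))
    return flagged

# Hypothetical series: mean consumption per capita in national accounts
# and mean income per capita in the household survey, same currency units.
years = [2005, 2007, 2009, 2011]
na_consumption = [100.0, 110.0, 120.0, 130.0]
survey_income = [50.0, 55.0, 60.0, 40.0]

for year, ratio, change in flag_ratio_jumps(years, na_consumption, survey_income):
    print(f"{year}: ratio {ratio:.2f}, change {100 * change:+.1f}% -> check survey and accounts")
```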
And if you do that, and if you look at where the inconsistencies are, you see that in many cases you can say, OK, 2011, in the case of Brazil, there is a big drop in this ratio; we should go to see the survey, go to see the national accounts, and try to understand what is going on. What is the source of that change? And simply by doing that, I think that we would be making big progress on the consistency of the distribution data or the consistency of the national accounts. And I think that this is something which is very simple and which could be done at a very low cost.
Three, I need two minutes, Tony. One remark on each of these points.
Non-response. We don't see the non-response. When you receive a survey, you have all the people who have responded. But the surveyors know that they got to a place and there was nobody in that place, or that somebody in that place said, I don't want to answer your questionnaire. We know that this non-response is not random. There is a selection. And some people have worked on this; in particular Martin Ravallion has written some papers on this in the case of the US. So at least the national statistical offices should give information on the non-response, and this would help. And this is something which applies to all household surveys, not specifically, of course, to SEDLAC or CEPALSTAT.
Equivalence scales. When you compare those databases with LIS or with the OECD, what you find to be different is that there is less inequality in LIS for Mexico, or in the OECD, basically because LIS and the OECD are using an equivalence scale, whereas SEDLAC and CEPALSTAT are using income per capita. And they are doing that simply because they want to be consistent with poverty estimates, and poverty lines are defined in terms of income per capita. So the point here is to say: try to make sure that you are reporting data with at least some equivalence scale as well as per capita. It's not a big cost to produce these multiple results.
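As a rough illustration of why the two conventions give different inequality figures, the sketch below computes person-level incomes from the same household data under both. The talk does not say which equivalence scale is used; the square-root-of-household-size scale here is an assumption chosen because it is a common one, and the function names and the two-household data set are hypothetical.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient (between 0 and 1) of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n

def person_level_incomes(household_income, household_size, scale_exponent):
    """Household income divided by size**scale_exponent, one entry per person.

    scale_exponent = 1.0 gives income per capita;
    scale_exponent = 0.5 gives the square-root equivalence scale (assumed here).
    """
    income = np.asarray(household_income, dtype=float)
    size = np.asarray(household_size, dtype=int)
    equivalised = income / size ** scale_exponent
    # Each household member is assigned the household's equivalised income
    return np.repeat(equivalised, size)

# Hypothetical data: a single person and a family of four with the same income
household_income = [100.0, 100.0]
household_size = [1, 4]

per_capita = person_level_incomes(household_income, household_size, 1.0)
equivalised = person_level_incomes(household_income, household_size, 0.5)
print("Gini, per capita:        ", round(gini(per_capita), 3))
print("Gini, square-root scale: ", round(gini(equivalised), 3))
```

Reporting both columns side by side, as suggested above, costs one extra function call per series.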
Imputed rents, maybe to start talking about this. This is an important topic. The one point I wanted to make on this: imputed rent is fine. I mean, when you have the owner of a house, for example, in SEDLAC, if there is no information except the fact that the household owns its dwelling, they say, OK, we add 10% of the income to that household. Why not? But if the household is acquiring the house, they have to repay the mortgage. Shouldn't we take this into account? So the way in which imputed rents are dealt with is not completely satisfactory.
Spatial differences in the cost of living: it doesn't cost the same thing to live in rural areas and to live in urban areas. SEDLAC simply says rural areas plus 15%. Why not? But of course, it would be much better to have some data on prices in the two areas in order to do that.
And finally, when I'm saying multiple poverty lines: all those people are using different poverty lines. This is fine. Why not? But why don't they publish everything they find with the various poverty lines? Instead of saying the poverty count is so much, they should say the poverty count is so much with $1.25 a day, it is so much with two, with four, with five, and maybe it is so much with the poverty line that we have chosen to use. And this would clarify things very much. I stop here.
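The last suggestion, publishing the headcount under every poverty line rather than a single one, is cheap to implement. Here is a minimal sketch, assuming incomes are already expressed per capita, per day, in the same dollars as the lines; the function name and the five incomes are hypothetical, while the four lines are the ones mentioned in the talk.

```python
def headcount_ratios(daily_income_per_capita, poverty_lines):
    """Share of people with income strictly below each poverty line."""
    incomes = list(daily_income_per_capita)
    n = len(incomes)
    return {
        line: sum(1 for income in incomes if income < line) / n
        for line in poverty_lines
    }

# Hypothetical per-capita daily incomes in dollars
incomes = [1.0, 1.5, 2.2, 3.0, 5.0]
lines = (1.25, 2.0, 2.5, 4.0)

for line, share in headcount_ratios(incomes, lines).items():
    print(f"Poverty headcount at ${line:.2f} a day: {100 * share:.0f}%")
```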