Hi, I'm Marcus Janti from Stockholm University, and I'm presenting joint work with two researchers at the Uganda Revenue Authority's research department, Millie Nalukwago and Ronald Weiswa, whom I don't blame for anything silly I say. This is part of what I hope is a somewhat longer project exploring the use of tax register data from Uganda for income distribution purposes. I start with top incomes because that is, in some sense, the least demanding thing to work with, so it seems to me a reasonable starting point.
The Uganda tax registers are potentially a very rich source of data for income distribution research, but they cover only formal activities, almost by definition, at least on the income tax side. The VAT part also covers formal activities, and those may involve some informal sector workers as well, but all the same, I'm looking only at the formal part. So the data work for top income inequality, with some reservations, but if one wants to look at the overall distribution, additional auxiliary sources are needed. I may say something about that at the end.
So here I'll first demonstrate that the formal versus informal divide really is quite striking in Uganda, especially for somebody like me who mostly works with rich-country data. I'll look very quickly at where Uganda places in inequality, then talk a little about using these tax data to study income inequality, in particular top income inequality, and then show you some results of what we find. There's a WIDER working paper that covers everything I'm saying here.
There is quite a bit of research on income inequality, or economic inequality, in Uganda. Paolo Brunori, Flaviana Palmisano and Vito Peragine have several papers that include Uganda, and they also look at mobility.
Tony Atkinson did a paper looking at multiple African countries in colonial times. It is published as a working paper, not in a journal, and it is interesting to compare with. The Paris School of Economics project has what I think is still a working paper looking at, again, many African countries, largely in the distributional national accounts framework. I have some comparisons to these in the paper. Then there's work done by the WIDER team. There are many papers that use these exact same data, of which I cite one in particular here, but they often look at distinct, more policy-relevant questions.
With some digging in the Uganda national accounts data, it's possible to look at the composition of nominal GDP from the formal and informal point of view. Here I show, within broad industry groupings, the composition of GDP by formal and informal sector in two years, chosen because they largely coincide with the start and the end of the period I'm looking at for income distribution. You see that in agriculture especially the informal sector totally dominates, but the informal sector is also very large in industry and in services. And note that this is GDP.
If we look at employment shares, here I have a number of comparison countries, with Uganda at the far right. In employment the informal sector again completely dominates. So with the tax register data we're looking at very few people, up here. What I'm effectively going to assume is that these people are at the top: that formal sector workers are, as far as income ranges go, a distinct subset, and that the people in the tax registers are the ones up here.
Unfortunately the most relevant label is being cut off here, but this is Uganda. These are taken from WIDER's WIID data. There's no observation on income inequality for Uganda; this is consumption inequality. Compared with its neighboring countries, Uganda has mid-level inequality. It has possibly been inching upwards a little, but it's mid-level, and nothing dramatic has taken place. Yes, this is the whole distribution, this is WIID; I'm just placing Uganda in the context of its neighboring countries.
Since about 2009 the Uganda Revenue Authority has collected tax assessments electronically from individuals and in particular from employers, using a highly structured spreadsheet file. I'll be using the pay-as-you-earn register, which is the bulk of the information. There are tax identification numbers, and of course it's the individuals' rather than the employers' tax identification numbers that I'm interested in here, but far from all employees have one. Many employers do report what they pay in wages and what they withhold in taxes from employees, but not all of those records come with identifying information. The fraction of the labor force that has a tax identification number also changes quite a bit across years, which is a problem.
So the main information I use is the monthly returns for each employee. The information is in principle quite rich: lots of relevant information is filed, on a monthly basis, for each employee within a firm that is a registered employer. There are also annual returns, but only a tiny fraction actually file one. So the monthly pay-as-you-earn data are my main source of information. So that's not particularly interesting.
One of the main limitations is that I use the monthly information but aggregate it to an annual level. For those people who don't have a tax identification number, I have to do a little guessing of who is who to get the aggregation approximately right within the year. I'm not claiming it's perfect, and a lot more work needs to be done on that score, but I do use an algorithm, and small variations of it don't change things a lot.
To get the top incomes I need some additional things, namely control totals. I need to decide what the relevant target population is, and in what I report here I take it to be the population of 15 to 74-year-olds. That's a little generous at the high end of who belongs to the labor force, but I've decided it's reasonable. We vary this a little and it doesn't seem to matter much. For the income control total, we take the household final consumption expenditure from the national accounts data published by the Uganda statistical agency. Again, we vary this to some extent and it doesn't seem to matter tremendously. Essentially I use these to construct the overall average, and then I use the tax register data to work out the average income of the not-covered population. Okay, that's the next equation here. This is not particularly enlightening, but all the same we need these things to be able to figure out what the top incomes are.
All numbers I'm showing are converted to purchasing-power-parity-adjusted United States dollars and expressed in 2017 prices using the Uganda consumer price index. I converted to dollars just to get some sense myself of what the numbers are, rather than using Uganda shillings directly.
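As a rough illustration of the control-total step described above, here is a minimal Python sketch. It is not the code used for the paper. The function names and all the numbers are made up for illustration, and it simply assumes that the income of the not-covered population is whatever is left of the income control total after subtracting the register incomes.

```python
def overall_mean(income_total, population):
    """Average income per person implied by the two control totals."""
    return income_total / population


def uncovered_mean(income_total, population, covered_incomes):
    """Average income of people who do not appear in the tax register.

    Assumption (illustrative): income not observed in the register is the
    residual of the income control total, spread over the people in the
    target population who are not in the register.
    """
    n_covered = len(covered_incomes)
    if n_covered >= population:
        raise ValueError("register covers the whole target population")
    residual_income = income_total - sum(covered_incomes)
    return residual_income / (population - n_covered)


# Toy numbers, not Ugandan data: 1,000 people aged 15 to 74, an income
# control total of 500,000, and 20 register records of 5,000 each.
toy_register = [5000.0] * 20
print(overall_mean(500_000.0, 1000))                    # 500.0
print(uncovered_mean(500_000.0, 1000, toy_register))    # about 408.16
```

In this toy case the not-covered average (about 408) sits below the overall average (500), which is the pattern one would expect when the register captures the higher incomes.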
Here I'm showing the aggregate gross income, which is the income concept I'm using, as aggregated from the tax register, as a fraction of formal sector GDP. At the beginning of the period, in 2011, it's a little less than 10%; already in 2014, and then in 2018, it was around 15%. I look at fiscal years 2010 to 2017. The fiscal year runs from July to the end of June, and it's indexed by the year in June. So this one is July 2016 to June 2017.
This is the overall average income per capita. This is the average for those who are not covered, essentially those who work mostly in the informal sector. This is the formal sector; it's considerably higher, but in fact it doesn't increase tremendously over the period. That reflects the fact that the scope of taxation is increasing across time, so more and more of the population is being covered. The proportion of the population that's actually in the tax registers goes from about 1.7% to 6.5% by the end of the period, so this is expanding quite rapidly.
These are the income shares. I've had to work with unusually small quantile groups because so little of the population is covered. The largest fraction that works over the whole period is the top 1%, and you see here that the share of the top 1% goes from around 7.5% to about 25% over the period. We also have a little bit of an increase for the top 0.1% across time.
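To make concrete why only small top groups can be estimated, here is a minimal Python sketch of a top income share computed against control totals. It is an illustration under stated assumptions, not the method of the paper. The function name and the toy numbers are invented, and it relies on the assumption made earlier in the talk that the people in the register are the ones at the top of the distribution.

```python
def top_share(covered_incomes, population, income_total, fraction):
    """Share of the income control total held by the top `fraction` of the
    target population, using register incomes only.

    Assumption (illustrative): everyone in the register has a higher income
    than everyone outside it, so the top group can be read off the register
    as long as the group is no larger than the register itself.
    """
    group_size = round(fraction * population)
    if group_size < 1:
        raise ValueError("top group is smaller than one person")
    if group_size > len(covered_incomes):
        raise ValueError("top group is larger than the covered population")
    richest = sorted(covered_incomes, reverse=True)[:group_size]
    return sum(richest) / income_total


# Toy numbers, not Ugandan data: 1,000 people, 20 of them in the register.
toy_register = [50000, 20000, 10000, 8000, 6000,
                5000, 4000, 3000, 2000, 2000] + [1000] * 10
print(top_share(toy_register, 1000, 500_000, 0.01))    # 0.22
print(top_share(toy_register, 1000, 500_000, 0.001))   # 0.1
```

With only 2% of this toy population in the register, asking for the top 5% raises an error, which mirrors the point that low coverage limits how large a top group can be studied.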
The very, very top also increases its share a little. But one of the striking things, which you can't see from these and which I'll show you at the end, is that if you look at the shares disjointly, it turns out that the poorer majority of the 1% is actually increasing its share. So inequality within the 1% seems to be declining a little, while this of course suggests that inequality is increasing, at least in the top-inequality sense.
These are the average incomes, measured here on the log scale. The choice of points is very strange, of course, but all the same we see very rapid growth early on, when the tax system starts expanding. Whether or not it's causal, while the electronic collection is expanding we get a big increase, but then income evolution is reasonably flat for the very top, as you see here in the red band, and somewhat better for the top 1%.
And these are the disjoint shares. The blue line here is the share of the top 1% less the top one-tenth of a percent, so this is the share of those who are in the top 1% but below the top per mille. That's increasing quite rapidly, while the others are not so much. So I'm interpreting this as essentially saying that while top income inequality has been increasing over the period, it's actually evening out: it's not the very, very richest who are doing all that well, but those somewhat less rich.
That brings me to the end of this. I'm going to largely ignore my concluding remarks and just say that Uganda has quite good survey data, in fact different options, both the panel survey and the household survey. In the panel survey they actually ask whether the employer pays withholding tax, so it's possible to combine the survey data with these register data to try to assess the overall inequality. The years don't always quite
coincide, so a little bit of fiddling will need to be done, but it has the added benefit that there are good consumption data and in fact a reasonably well-measured set of income data in the survey data. So I'm hopeful that these can be used to look at other questions that many of us also find interesting, like overall cross-sectional inequality, and also to look a little at income mobility by combining the two sources, the survey and the register data. That's all, thanks.