Thank you so much for inviting me and for having me here to talk about the South African tax data. Today I want to talk a little bit about the South African experience of tax data, where it started and where it's headed, and then give you some examples of research that I and others have done that we think has been policy relevant. I want to be clear that this is the South African experience. I think some of what Millie is going to talk about will mirror our experience as well, but this is perhaps still in its infancy.

The project initially started in 2014 under the WIDER regional growth project. It worked with the National Treasury and SARS to make tax data available at the National Treasury. The data was available within Treasury for its own researchers, and it was then also made available to researchers outside. As part of that project, more than 20 research papers were written. Some are better than others, and some spend a lot of time highlighting problems with the data, but it was definitely a good first shot at what could possibly come out of the data. What was really super about the previous project as well is that the tax data was used to create a matched employer-employee panel. This has been documented quite well, and it's being updated at the moment.

So when I talk about tax data, what exactly do I mean? In the South African scenario at the moment, I'm talking about company income tax data, value-added tax data, customs data, and then PAYE or individual tax data. There are other data sources; these are the ones that have been most requested and most interesting so far. Also, as the relationship develops, we see that researchers want to know, for example, about labor brokers, and I'll present some research about that.
And then we go back to the South African Revenue Service and we say: what information do you have about labor brokers, how do they report taxes, and what exactly do you know about them? And then they're able to meet some of our requests.

So, some of the advantages of tax data. It's much larger in size; in a lot of ways we think of it as the full population. If you think of everybody who's employed in South Africa in a company that is liable to pay tax, they have to be in this data set at some point. And because companies and people are required to pay taxes every year, the data set is naturally longitudinal. We like to think that the data set is sometimes more dependable, because it doesn't suffer from the usual survey issues of attrition or non-response. And you could say it's perhaps low-hanging fruit: the collection happens every year, the data exists, e-filing takes place, and since it's there, there are good ways in which it can be used.

Some of the disadvantages of tax data, and it's perhaps sometimes frustrating for researchers: not everything we want to know is actually collected in the tax data. In the South African context, for example, race is very important when you want to think about inequality and wages. This is just not recorded in the tax data, and there's no interest in recording it. So that's one big disadvantage we've seen so far. Then there's data quality. I think it's actually improved from what I've seen, but sometimes the data is just incomplete. In the earlier years of tax data, nobody can explain why there's so much missing data; nobody understands what was going on. And then sometimes it's out of date. Take your location, your address data: I don't think firms and even individuals are very responsive once they move.
I don't think the first thing they think is, I need to let the South African Revenue Service know that I've moved, I've changed my location. So there's a bit of a lag in that. And then, specifically in the tax data that we have at the moment, people and companies submit revisions to the Revenue Service. So you'll have Amina listed five or six times in the data set, but it's not actually clear which one was Amina's final submission, which is the one that SARS accepted eventually.

So, the matched employer-employee panel. It's actually a collection, and we sometimes call it the CIT IRP-5 panel. It's a collection of the CIT data and the individual, the IRP-5, data, but it also includes some information from the VAT and the customs data. This was created and it's been updated slowly with each extraction of data from SARS. And as Carl said, there's this two-year lag in getting the data. So even though we have data now, we think it is only the data for 2016 that is now starting to look complete for companies. Then there's always the challenge of changing forms: the company income tax form has changed radically in this period as well, and so the tax data looks quite different in the earlier period compared with the recent period. But, as I mentioned, there's a general improvement in the data that we've seen. Even the documentation and the metadata that we've received from SARS has been better and better over the years.

The nice thing about getting researchers in to come and work on the data is that they've actually fed back what they found in the data: industry variables not very well coded, or the customs numbers not adding up, and those sorts of things. And we've had very good responses from researchers saying, actually, this is the problem, but this is also how you sort it out. Or is this a natural problem?
This exists at SARS, and so we go back to SARS and say: this is what's going on, is this true? And it goes back and forth.

The data set that we're working on creating at the moment is what we want to call the employment panel. This is your anonymized individual-level data panel. It merges two sets of data: the company-issued IRP-5, so the tax certificate that you get for working, and also the returns of those who report their own income on the side, or income from property that you might own, or shares or dividends, all together. So it'll include your formal employment period and all the types of income that you see, and then it'll have your taxable income as well as your tax liability. We're hoping that this will be available for researchers to start working on in early October.

So why do we think tax data is important? SARS and Treasury know the value of this. They've been using it for a long time, and it's part of their process to produce their national tax statistics. But it's slowly starting to be used for evaluating public policy, and I want to tell you about some examples of how some of the research has fed into the policy-making process.

The first example is on the Employment Tax Incentive. This work started in 2016 and it's been ongoing. The Employment Tax Incentive is an employer-side, low-wage subsidy for youth, and it's for youth earning below 6,000 a year. It started at the beginning of 2014 and was supposed to run for about three years. It was quite popular. At the end of 2016, the government decided to extend it after some evaluation; the policy now ends in 2019, and they're busy evaluating it again. When you want to think about evaluating this policy, what you can't do is go out and survey youth. The reason is that they might not know that the company they're working for is claiming the subsidy for them.
And the way in which the subsidy is claimed is actually through the tax system: it becomes a tax credit to the company. So the best way to evaluate this policy is through tax data. This is an example of how it would be very difficult to evaluate a policy without access to this data. And then the question arises of whether the policy actually created any jobs. South Africa's got a very high youth unemployment rate, and this policy's aim was to create jobs for youth. So you require firm-level data, and that's what is presented here. I use both your individual tax data and some of the company-level information, for the years 2011 to 2015.

On the tax year in South Africa: this is one of those difficulties that you want to think about in tax data. Individuals are taxed in the period from the 1st of March each year until the end of February the following year. But companies are not taxed in that same period. Companies are actually taxed in very different periods; they are taxed according to when they decide the end of their financial year is. So for example, a company with a financial year-end at the end of December, let's say for 2016, only has to file its taxes within two years after that. So they may not even have filed their taxes yet. It becomes quite difficult trying to match up some of those timelines. Anyway, we do our best.

What I do in this research is look at firms that claim the ETI and firms that don't claim the ETI. I look at all their characteristics, I conduct a propensity score matching, and then I do a difference-in-differences analysis and compare what the employment is like before and after the policy. And so this is the first outcome of that. This is, I think, about 14,000 firms.
We look at what youth employment looks like before, and we see that it increases after. So we see positive results for youth, but interestingly we also see positive results for the non-targeted group. And in line with other studies that do similar work, we actually see increased employment for the whole firm.

We also think that small firms might be responding to the policy in different ways than large firms. If you think of a small firm with maybe five employees, you probably don't have an HR department or a tax or an accounting department that'll be able to claim the subsidy quite easily, whereas a firm with more than 200 employees probably has those departments and knows how to do this far better. So we look at the results within these different size groups and we see, let's see if this works. In the very small firms, you're seeing quite a big increase in employment for youth. In the small and the medium-sized firms, we're also still seeing an increase. Here the story is a little bit different; we think maybe jobs are being saved rather than being created. But in your very, very large firms, your firms with more than 200 employees, we actually see no trend break. We see no job creation happening. So this is just represented here, and you see no significance in your very big firms.

OK, so why do we think this is important for policy making? In 2016, this study and two others actually formed part of the evaluation of the policy that the government was required to do. And again, that process is now taking place in 2018. The results of all of the research that's been done have been made available to government, labor and business, and we've presented extensively to them to tell them, well, this is what we think is going on. Whether they listen, whether they take on our recommendations, that's a different story, perhaps.
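The before-and-after comparison of ETI-claiming and non-claiming firms described for this study can be sketched as a simple difference-in-differences calculation. This is only a minimal illustration, not the study's actual estimation: it assumes the matching step has already been done, and the field names (`claims_eti`, `post`, `youth_jobs`) and all the numbers are made up for the example rather than taken from the tax data.

```python
from statistics import mean

def diff_in_diff(rows):
    """Difference-in-differences on firm-period records.

    rows: list of dicts with hypothetical keys
      'claims_eti' (bool)  - firm claims the incentive (treated group)
      'post'       (bool)  - observation is after the policy started
      'youth_jobs' (number) - youth employment at the firm
    """
    def cell_mean(treated, post):
        # Average outcome for one group in one period.
        return mean(r["youth_jobs"] for r in rows
                    if r["claims_eti"] == treated and r["post"] == post)

    change_treated = cell_mean(True, True) - cell_mean(True, False)
    change_control = cell_mean(False, True) - cell_mean(False, False)
    # The estimate is the extra change in the treated group.
    return change_treated - change_control

# Made-up toy data: two claiming firms and two matched non-claiming firms.
toy = [
    {"claims_eti": True,  "post": False, "youth_jobs": 10},
    {"claims_eti": True,  "post": False, "youth_jobs": 12},
    {"claims_eti": True,  "post": True,  "youth_jobs": 14},
    {"claims_eti": True,  "post": True,  "youth_jobs": 16},
    {"claims_eti": False, "post": False, "youth_jobs": 9},
    {"claims_eti": False, "post": False, "youth_jobs": 11},
    {"claims_eti": False, "post": True,  "youth_jobs": 10},
    {"claims_eti": False, "post": True,  "youth_jobs": 12},
]
estimate = diff_in_diff(toy)
```

In the toy data the claiming firms grow by 4 youth jobs on average and the non-claiming firms by 1, so the sketch attributes the remaining 3 to the policy. Running the same function separately per firm-size group would mirror the size breakdown discussed in the talk.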
OK, so the other paper I wanted to touch on is on how small firms respond to tax schedule discontinuities. In the South African context, firms that are resident in South Africa are subject to a flat tax rate of 28%. Small firms, however, have a graduated, progressive corporate income tax schedule. The idea behind this was a lower tax rate in order to stimulate small businesses to grow, to generate economic activity and to create jobs. That was the intention, at least.

And so what does this look like? For the period they call 2012, if your taxable corporate income was below R60,000, the rate was 0%; up to R300,000, it was 10%; and above that you got back to your 28% tax rate. And they find that at these kink points there's an excess mass of firms located just before the kinks, so there's quite a bit of bunching.

OK, so this is the graph that they share that I want to point out. If you can just focus for a moment on the observed 2012, so the dotted line. If your first kink is over here, where you go from 10% to 28%, you see there's lots of bunching over here. And if you follow that line, it continues down, down, down, and there's no bunching over here. In 2013, they actually changed the tax schedule, and the threshold was no longer R300,000 but R350,000. And within the year, you see the bunching that happened around here go immediately to the next kink point. A large percentage of the firms who were bunching over here are the ones that are bunching over here as well.

So what's the story, and what does this mean exactly? It's suggestive that some of this response is driven by tax avoidance, or as South Africans like to call it, very good tax planning. But from a policy perspective, this is not really encouraging the sort of economic activity that the government is trying to make sure small businesses are doing.
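To make the kink points concrete, here is a minimal sketch of the 2012 small-business schedule as quoted in the talk, treated as marginal rates, together with a crude count-based check for bunching just below a kink. The thresholds and rates are the rounded figures from the talk; the function names are hypothetical, the sample incomes are made up, and the ratio is only an illustration, not the estimator used in the paper.

```python
# (upper bound of bracket, marginal rate), rounded figures as quoted in the talk.
BRACKETS_2012 = [(60_000, 0.00), (300_000, 0.10), (float("inf"), 0.28)]

def tax_liability(taxable_income, brackets=BRACKETS_2012):
    """Tax owed when each slice of income is taxed at its bracket's rate."""
    tax, lower = 0.0, 0.0
    for upper, rate in brackets:
        if taxable_income <= lower:
            break
        tax += (min(taxable_income, upper) - lower) * rate
        lower = upper
    return tax

def marginal_rate(taxable_income, brackets=BRACKETS_2012):
    """Rate applied to the next unit of income at this income level."""
    for upper, rate in brackets:
        if taxable_income <= upper:
            return rate
    return brackets[-1][1]

def bunching_ratio(incomes, kink, window):
    """Firms within `window` at or below the kink, relative to those just above."""
    below = sum(1 for y in incomes if kink - window <= y <= kink)
    above = sum(1 for y in incomes if kink < y <= kink + window)
    return below / above if above else float("inf")

# Made-up taxable incomes clustered just under the 300,000 kink.
sample = [290_000, 295_000, 299_000, 300_000, 305_000]
ratio = bunching_ratio(sample, kink=300_000, window=10_000)
```

The jump in the marginal rate from 10% to 28% at the upper threshold is what gives a firm a reason to report income just below it, and a ratio well above 1 in a narrow window is the pattern the graph shows as excess mass.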
And so there might need to be some rethinking on these incentives, if you want.

The last paper I want to talk about is on the wage penalty in the labor broker sector in South Africa. Temporary employment has grown quite a bit in South Africa and in other countries, and there is this expectation that there's some sort of wage differential between temporary and regular workers. This hasn't been well tested in South Africa so far, but there's been lots of public debate on decent work and on making sure that folks working for labor brokers are actually getting the benefits and the pay that they should. In 2015, they actually amended the Labor Relations Act, and this was meant to regulate the sector but also to offer more protection for your temporary employment workers.

So they do find this penalty for being a temporary employment worker, and they find it to be quite high. I think 30% is quite high. They try to use as many of the variables in the tax data as they can to tell the story. But like I said, you're short on race, for example, and you're short on education information in your tax data. They do the best they can, and 30% is what they think it is. But they also think that some of this is actually due to the benefit contributions, which are quite low for TES employment. This is your pension or your medical aid or your unemployment insurance that doesn't actually exist for these workers. And this is one of their first looks: you see the temporary employees, or their wages, sitting further to the left of your regular employees.

They don't actually get a chance to investigate the change in the Labor Relations Act, because they started the work last year, and the data was only available up to the 2015 tax year. The 2015 tax year actually only relates to the 2014 calendar year.
So they'll have to wait a little bit longer for the data to come out before they can figure out what the Labor Relations Act has actually done. But I think this highlights quite nicely what can still be done with the tax data.

So in summary, I think there's lots of potential for the tax administrative data in South Africa. We're working very hard on documenting the data far better and on doing a better job of cleaning the data. There are lots of issues. We've also started making geospatial data available, so locations of firms and individuals, and this opens up some new areas of research. Over the next three years, we're looking at 60 WIDER research projects coming out of the data, which I think is pretty phenomenal, and which we hope will feed into improved policymaking. And I think it's advantageous for this to be based within the National Treasury in South Africa, because then you have those interactions with policymakers while you're looking at this. Thank you.