You're watching NewsClick, and today we are going to focus on a new paper which has come out and which, in a way, brings to a closure all the questions that have been asked about India's statistics. Closure is an odd word to use here, because it actually opens up a can of worms. Someone I know said that it is quite damning, what this paper says about what has been happening in the last few years: the opacity around data that has been collected and then not shown to the public, the data that has been collected and then not released, and the way in which government agencies have put a lid on data collected by the state's own agencies and then changed it. So I have with me probably one of India's most celebrated statisticians right now. He was India's first chief statistician. Dr. Pronab Sen, thank you so much for joining us. If I'm not mistaken, you were originally an academic, right? You did your PhD at Johns Hopkins and then were an academic, and then sometime in the early 90s you joined government. And from what I remember reading about you, your first task was essentially to produce India's software policy. You were tasked with deciding what software policy should be, and we know what happened after that. Now, before we go to this new paper, which puts together a lot of evidence of how this kind of obfuscation has been done with data collected officially, I just want to explain to our viewers: why is it important for a government, or for anyone, to collect this data and to have it?
Well, you know, I think the simplest way of responding to that is that the flavor of the day, and it has now been here for a while, is what's called evidence-based policy making.
That is, you don't make policy on the basis of assumptions and gut feeling, but on the basis of hard evidence about what is happening on the ground, right? And to give you that evidence, data is absolutely central. So you can, of course, have policy making where you make policy on the basis of "I think this is the problem, I think this is the nature of the problem." Alternatively, you can actually use data to define both the problem and the nature of that problem. That's why data is important. The second thing that data does is it enables you to track the progress of your policy. Otherwise, you implement a policy and you don't have any idea whether it's working or not. If it's not working, then presumably you have to do something about it, but without data, you can't correct course.
So that's an interesting point you're raising: without that data, you can't implement policy, and you don't have any feedback as to whether that policy is working. Now, I think even before this government came to power, there had been questions raised about the data that was being created, and about the changes that were made to how GDP or GVA would be calculated. All those were questions. Now, my question is this. We live in a time where people rule in the name of the people. So there's a government which rules in the name of the people; they have to go back and get votes, and they implement policies to gain those votes as well. I'm being very cynical here. Then why do they not want the data?
Everything, I think, Aunindyo, depends upon how important the evidence is compared to the narrative. Now, if you have a narrative which is the driving force behind what you do, and if the data contradicts it, then having that data available opens you up to criticism and attack.
And in politics, and we should be very clear about this, this is not new. Narratives have been the stuff of political discourse; this has always been true, and not just in India but all over the world. In democratic societies, narratives are the way politics works. The thing is that in the Indian context, and now increasingly in other countries as well, what is happening is that the narratives have to be buttressed by facts. And this has been happening more and more in the Indian context. People then actually start looking for corroborative evidence. So now what happens? If the data contradicts the narrative that you wish to project, then one of two things must happen. Either the narrative must change or the data must vanish.
I mean, when you first came into government, you must have faced a little bit of pushback from the established bureaucracy, because they must have said: who are you? You're a technocrat. You've done a PhD. Who are you to decide? You don't know how things work. How did you find things at that time? And when you later became the chief statistician and were representing the government in a certain sense, what changes took place from that time to, let's say, the time when things were working relatively better?
When I joined government, remember, I was not a statistician by profession. I'm an economist. So when I joined the government, my essential problem was to be able to convince the bureaucracy that I could bring to the table something that they couldn't. And the main weapon that I had in my armory was that I understood the data. The data that was available, I could use to make an argument which they would have a hard time refuting. And so the establishment of my reputation was essentially driven by the fact that I could prove what I was saying.
So I'm just interrupting you.
Sorry, if you were to come into the current government right now, you wouldn't really have any data. I mean, you would have a problem with the data.
No, that's not strictly true. I mean, you know, a lot depends upon which specific area you're dealing with. So if you're talking about macro data, it's available. The national accounts estimates are available. There are other data sets, mostly those which pertain to the conditions of life and living in India, that have become problematic. So there is some data available, certainly. And we can't say that data has vanished.
But, you know, I have been tracking your views on this for several years. And I know that at times you have also been skeptical about the data which has come out, even in terms of national accounts. If one looks at the GDP data itself, there have been questions raised about the GVA portion. If one looks at industrial output, then there is a question to be raised: is there an overdependence, an overestimation made on the basis of what we would call the organized corporate sector? So even there, there is an issue. And has that issue always existed? Or do you think it has increased, even when it comes to macro data, which you're saying exists and is relatively reliable?
Well, you know, Aunindyo, you talked about Arvind Subramanian's criticism. And you're just parroting that criticism. The fact of the matter is that the problem always existed. Except, and here's the big if: earlier, we were not producing quarterly data. We were producing only annual estimates. Now annual estimates by their very nature give you more time to collect whatever data you can get your hands on. With quarterly estimates, you can't. So when you do quarterly estimates, you have to have a sense of what data you can get on a quarterly basis or more frequently than that, and how this data relates to other data that you're unable to measure.
Right? So there is a lot more estimation involved when you're producing quarterly GDP. That's where the problem comes up. So now, as you said, what is happening in the quarterly data is that, as far as the unorganized sector or, to be more accurate, the non-corporate sector is concerned, we have no quarterly data at all. Zero. The data that we do have are from corporates. Right? Now, so long as the relationship between corporate performance and non-corporate performance is stable, that's not a bad way of estimating what the non-corporates perhaps are up to. The problem comes at the points of stress, where we have reason to believe that the corporates and the non-corporates are not performing similarly.
All right. And I've seen that you have argued that there is this discrepancy between the MSME sector and the big corporate sector.
And again, this is not new. In 2008, when the global crisis hit us, the corporate sector was affected much worse than the non-corporate MSME sector. Right? So when we measured on the basis of the corporate sector, we were actually underestimating.
Yes.
Today, post-COVID, indeed post-demonetization, we have reason to believe that the MSME sector has been doing worse than the corporate sector. So when we use corporate sector data to project the MSME sector, we are overestimating.
So this brings me to one of the points that you've raised: as long as the ratio between what you would call the corporate sector and the non-corporate sector remains more or less stable, one can use the corporate sector data as a good proxy for the entire economy. Now, would you agree that one of the things that has happened is that the coming in of GST has increased the market share of the corporate sector compared to the MSMEs? Not just COVID, I'd say demonetization, GST.
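As an editorial illustration of the proxy argument above, here is a minimal sketch in Python. It assumes, purely for illustration, that quarterly growth for the whole economy is set equal to corporate growth, and compares that with a share-weighted average of the two sectors. The function names, the 40% corporate share, and all growth figures are hypothetical; they do not come from the interview or from any official methodology.

```python
def true_growth(corp_growth, non_corp_growth, corp_share):
    """Share-weighted growth of the whole economy, in percent."""
    return corp_share * corp_growth + (1.0 - corp_share) * non_corp_growth


def proxy_growth(corp_growth):
    """Proxy estimate: assume non-corporates grow exactly like corporates."""
    return corp_growth


def proxy_bias(corp_growth, non_corp_growth, corp_share):
    """Proxy minus truth. Positive = overestimate, negative = underestimate."""
    return proxy_growth(corp_growth) - true_growth(
        corp_growth, non_corp_growth, corp_share
    )


# Hypothetical scenarios (illustrative numbers only):
# (corporate growth %, non-corporate growth %)
scenarios = {
    "stable relationship": (5.0, 5.0),
    "corporates hit harder": (2.0, 6.0),
    "non-corporates hit harder": (6.0, 1.0),
}
for name, (corp, non_corp) in scenarios.items():
    bias = proxy_bias(corp, non_corp, corp_share=0.4)
    print(f"{name}: bias = {bias:+.1f} percentage points")
```

Under these assumed numbers the proxy is unbiased only while the two sectors move together; it understates growth when corporates are hit harder and overstates it when non-corporates are hit harder, which is the pattern described for 2008 and for the recent period.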
And in a sense, we're seeing a double overestimation. One is that we continue to use the corporate sector and assume that it is a good proxy. And the second is that we look at GST collection and say that this is also an example of how well the economy is doing, as you said about a narrative. So if the corporate sector's weight in the overall economy has increased relatively, GST collections would also go up in a certain sense.
Yes, they would. And the point is that GST collection can go up in at least two ways, right? One is that everybody is growing, including the corporates, and therefore GST collections are growing. The other way is if the corporates are cannibalizing the market share of the non-corporates, in which case what you have on the ground is corporates increasing and the non-corporates actually going down. And so you'd get a higher GST collection, and then you assume that everybody is growing.
And effectively, you'll also get a higher GDP, right? Because if there were many people who were not paying taxes earlier and now tax collection from them has increased, you're adding that in terms of your net taxes and getting a higher GDP from GVA, right?
No, no, no, statisticians are not that stupid. An adjustment is made for the number of reporting units when this is done; otherwise that would be a completely unacceptable way of doing things.
No, I'm saying that when you look at the total tax collection of the government, it is possible for tax revenues to go up even when the economy is overall not doing well, because those who pay taxes may be getting a higher share of total income.
That's exactly correct.
So then in that case, we'll not see great GVA growth, but we'll see reasonable GDP growth, right?
Well, in reality, yes. But since GVA would be measured in a manner weighted towards corporates, the GST collection actually gets explained by a higher GVA, because the GDP is itself based on corporate data.
Oh, corporate, yeah, you're absolutely right. In the recent period, we've seen that quite often, especially in the quarterly data, the GDP growth rate has been higher than the GVA growth rate, right? Several times we've seen that. Now, as a layperson, I look at it and I see that the economy is actually not producing more, but the government is extracting more taxes or reducing net subsidies at a time when it should be doing the opposite. So are we seeing that happen as well?
No, I think there are two different things going on out here. First of all, taxes are certainly going up, I think, faster than GDP at the moment. And that's not entirely the outcome of the fact that the corporate sector is growing at the expense of the MSMEs. That's one factor, which we've already talked about. There is a second factor, which is that if the consumption basket is moving towards items which have higher GST rates, that is, you're moving from necessities to luxuries, your GST collection will go up even if income doesn't go up at all.
Yes.
Right, because then the average GST rate becomes higher.
Which in a sense would also be a reflection of increased inequality, because one would assume that...
It is a very good reflection of increased inequality.
So then it would make sense for government to actually massage the data and present it in a particular way. It's exactly what you're saying. It is for the narrative that one would have to do that.
No, for the narrative, you don't have to break it down, right? You just look at the raw data and the narrative comes out automatically. The economy is doing well. I mean, you know, GST is growing strongly, much higher than anybody had expected. Every month we are breaking a new GST collection record. So it fits in very well with the narrative. You don't have to massage a thing.
And of course, no one talks about the fact that GST is always in current prices. So, you don't have to adjust it for inflation.
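The two factors mentioned in this exchange can be put into a small worked example. The sketch below is an editorial illustration in Python, not anything presented in the interview: total spending is held fixed at a hypothetical 100 units, the helper `gst_collected` is an invented name, and the rates and shares are illustrative. It also makes the loud simplifying assumption that sales by firms outside the tax net attract no GST at all.

```python
def gst_collected(spending_by_rate):
    """Total GST for a mapping of {GST rate: taxable spending}."""
    return sum(rate * amount for rate, amount in spending_by_rate.items())


TOTAL_SPENDING = 100.0  # held constant: no growth in income or spending

# Factor 1: tax-paying firms gain market share from firms outside the tax net.
# Simplifying assumption: only the tax-paying share of sales attracts GST.
share_before = gst_collected({0.18: 0.50 * TOTAL_SPENDING})
share_after = gst_collected({0.18: 0.60 * TOTAL_SPENDING})

# Factor 2: the consumption basket shifts from low-rate to high-rate items.
basket_before = gst_collected({0.05: 70.0, 0.28: 30.0})
basket_after = gst_collected({0.05: 60.0, 0.28: 40.0})

print(f"Market-share shift: {share_before:.1f} -> {share_after:.1f}")
print(f"Basket shift: {basket_before:.1f} -> {basket_after:.1f}")
print(
    f"Average rate on basket: {basket_before / TOTAL_SPENDING:.1%} -> "
    f"{basket_after / TOTAL_SPENDING:.1%}"
)
```

In both cases collections rise while total spending stays at 100, so a rising GST figure on its own cannot distinguish broad-based growth from a shift in who sells or in what is bought.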
It's still growing at a very fast rate.
So, again, I come back to your experience. When you joined the government in the early 90s, I'm assuming there was a sense that something big was happening. There was a sense that people in academia, who were talking about things and not being heard, could come in and now be part of a process. At that time, a lot of people came in from various institutions to be part of the government. And I assume you were part of that as well. How did that change? And I want to know about that in the sense of the system of collecting data to implement policies, and of thinking of new things to collect data about. How had that changed by the time you finally left?
Well, a lot. Remember, earlier the principal user of data in the Government of India was the Planning Commission. The finance ministry used some data, but it was mainly very aggregated data such as the national accounts estimates. But the Planning Commission was the one entity which used practically all the data that was produced in the system, because the five-year plans were so complex. Number two, that was a time when India had a fairly restrained corporate sector. The bulk of the economy was non-corporate. Non-corporates didn't need nationwide data. The kind of information they needed was much more localized, because they were dealing with more local markets.
That's a very interesting point. Sorry to interrupt you, but one would have assumed that when the 90s liberalization took place, one would say: well, the market will take care of things, why do you need data? But this is a very important thing that you're saying: as the corporate sector expands, it needs more data, because the corporate sector addresses the national economy, whereas the non-corporate sector is very localized, at best regional.
So what happened during the 90s? If you think about the 1991 reforms, it was basically unshackling the corporate sector.
And so the real growth that you get in the 90s is corporate India. And as the corporates grew, their demand for data grew very rapidly.
And just to be clear, you're distinguishing between the corporate sector and the MSME sector, right?
Yeah, but a lot of MSMEs became corporate sector.
Also moved, became corporate.
And then they behaved like corporates. I mean, being born an MSME doesn't mean that you will always be an MSME. It doesn't work that way. So what happened was that demands were then being placed on the statistical system which it had never encountered before, because the planning system didn't need that data.
Yes.
So in fact, what happened is that during the 90s and 2000s, the range of data that was collected actually went up a lot. People don't realize this. I mean, you see, what happens is that a lot of the criticism gets restricted to the data that existed before and continues to exist today. What people don't realize is that the range of data that you have available today is far, far greater than what it used to be.
And it's pretty easily available, actually. It's not that difficult. If you go to the internet, it might be old, but it's not...
It was a struggle, mind you. Because a lot of this data, in fact I would say the bulk of this data, actually resides with the line ministries. And they were, what should I say, less than enthusiastic about putting it up on their websites. So it's taken a lot of persuasion, a lot of arm-twisting, and frankly, it hasn't happened entirely yet. There's still a lot of data that's available that's not out on the websites.
There has been a flowering of just the variety of data that's available today. In fact, quite a surprising amount of data is available. You can actually get district-level data on the internet, going to the websites of government ministries. It's really quite amazing how much data is available.
Yeah.
So now think of what happens when all this new data is coming in and you're getting access to all of this. It then becomes possible for analysts to start looking at the data much more closely.
Yes.
Right. And when you look at it more closely, you of course can criticize it. Now the point is, the volume of criticism you hear about the statistical system is partly about the quality of the data that was collected before and continues to be collected now. But a lot of it is about data that never existed before.
Yes. Right. That's true.
Now when you conflate the two, then the level of criticism becomes much louder than if this additional data had not been provided.
And then it's understandable that any political party in power would want to not have that data out, because you can actually use any amount. So I've taken a lot of your time. I'll just take five, six minutes more. I wanted to get a sense of where you think the current state of data collection is, because some of it always comes from the government. But there have also been a large number of private data collectors whom people are turning to. For instance, I do these weekly videos, and my go-to now for current data is CMIE. It has been around for a very long time, but earlier people only looked at government data. Now there is ICE 360. A lot of this is coming out. A lot of data is collected at the booth level by political parties, which probably is the only data they want. So where do you see this going?
It's a good thing. But at the end of the day, the fact is that government data continues to be more transparent than any private source of data and can therefore be subjected to more criticism, so the hope of reform in government data is higher. The problem with private data is that its provenance is unknown. CMIE, in terms of the Consumer Pyramids for instance, has been very open. But it is less so about other data sources.
And it was certainly not very transparent earlier, which is why CMIE had a loss of reputation some time back. But government data is much more transparent. Now because of that, there is naturally, I think, more trust in government data, because you can actually analyze it and ask questions. Which is why, if you look at what is happening in India, the data collection and the data that is put out is actually fairly good in terms of the technicality of the data. The problem, as you mentioned in your opening, is that a lot of data is not put in the public domain; it is what you would call suppressed.
So would you say that... sorry, go ahead.
It is the suppression which is really the issue. The data put in the public domain is open for people to inspect.
So what is the solution? I mean, there are arguments that one has to give statutory status to an independent statistical body within the government system. Now, is that a solution?
But you know, that's a contradiction in terms. If an institution is a part of the government, no matter what, at the end of the day it is going to be under pressure from the powers that be.
The same criticism is made about the RBI, which supposedly is independent.
That's true, and the model that was being proposed is in fact the RBI model. We know how the RBI functions. It is independent up to a point, but when push comes to shove, the RBI has to toe the line. It would be true here as well. Why would we expect anything different?
So what is the solution to the current problem of the deterioration of the data that is being put out? The big data. I agree entirely with you that now more data is available from the government than earlier. But the big data that everyone talks about is the macro data, and various other kinds like unemployment and so on. How does one solve that problem?
No, the employment-unemployment data is fine. I mean, I don't think there's been any real criticism of that.
No, the fact that it is held back and then released much later.
The data that are held back are those which describe the life of the common Indian. Consumption, etc. There are lots of things on which data is collected. Now, those are held back. What that does is have a secondary effect: if these are not being put out into the public domain and are getting suppressed, there are certain other kinds of data which get adversely affected. For instance, in the absence of the consumer expenditure data, we cannot revise the consumer price index. So we are working with a consumer price index that is now 11 years old. It's totally outdated. We know that. We are under criticism for the fact that the consumer price index is outdated. But there is nothing we can do until we get the consumer expenditure survey data. So they are all linked up. The bigger problem that we are going to face in the immediate future is the absence of the census. The census is the essential foundation of any statistical system. All surveys that are done are ultimately based either on the census or on the economic census, neither of which has been done. So what we are doing is conducting sample surveys based on frames which are completely outdated, which means that the samples that we are drawing are becoming less and less representative. So the quality of the data that you're going to get is going to get worse.
And on that note, thank you so much for joining me. I've taken a lot of your time. I mean, one can only hope that things become better in the area of data collection, and that sooner or later someone knocks on my door to ask how many people live here and what they do, as the census does take place. I think the only time this had not happened was during the war. And this is probably the first time since then that we've not had the census. Thank you so much, Dr. Sen, for giving us your time. Thanks a lot for explaining it so easily.
It can only come from knowing this stuff from inside. Thank you very much.