Welcome everybody. We have a fantastic panel to discuss metadata and HVDs. Before we get into the details, this is an event hosted by Hasgeek, your favorite community platform for the tech community. My name is Venkat Pingali and I am from a company called Scribble Data; we are a machine learning product firm. This week has been busy. This is the fifth day of the NPD week. The purpose of this entire week is to discuss the report of the non-personal data committee that has been constituted to look at data sharing and the data economy overall. Over the last five days we have been discussing various aspects of it. We had several committee members join us to discuss the evolution of their own thinking and the larger international context around this proposal, and of course the community framing and data trustees. Today we are going to focus a little more on the mechanistic part of this. Organizations within this NPD framework are supposed to share their metadata with a non-personal data authority, a body that will be created as part of this law, so that organizations can discover datasets and also source them. There is a particular structure, the data trustees, who are authorized to do this. The discussion today revolves around the metadata: what will get shared, how will it be shared, can we do it in a robust manner, can we do it in a manner that keeps the trust of the entire data ecosystem. And also the particular construct that the NPD committee has introduced, which is the HVD, a high-value dataset. So we'll explore these HVDs as well. To discuss all of this, we have a fantastic panel with us. We have Prasanto Roy, who is a technology writer and a public policy consultant. He will open by framing the overall issue within the narrow scope that we are looking at. We will then have respondents share their perspective.
We have Naresh from Zendrive, Mangalam from ThoughtWorks and Subhashish from Omidyar Network. I will give a little more background about them and some of the questions that they could respond to. But we will open first with Prasanto.
Thanks Venkat. Delighted to be on this panel, and I'll jump in quickly and try to keep this short. We've probably all been through lots of discussions on NPD and metadata and all of that, but I think it's worth reiterating once at the outset how valuable metadata is; it's often more valuable than data. And it really is worthwhile reiterating that with multiple examples, even simply with phone call records. Supposing you were hypothetically able to tap into phone calls randomly and listen in, there's very little value to that compared with what you can do with aggregated live metadata of all those people who are calling: who are they calling, how frequently is a group of people calling, let's say, a neighboring country, Pakistan, how many people are calling particular clusters of shops which sell meat. So you can do a lot of stuff with that metadata, which makes it really worrying. And I think we've heard many, many examples of how valuable that metadata is across the various discussions. So clearly, it is frequently underrated and it's held to a lower bar, a lower kind of standard than data, which is unfortunate, even for law enforcement access. Access to metadata is considered pretty much okay, whereas a court order is required for a phone tap. So therefore, it really is worthwhile looking at how the abuse can happen, and therefore what are the concerns and questions which would potentially be settled by maybe a non-personal data authority. But then we will come in a while to that issue of regulatory capacity, because I expect to see a fair bit of challenges and questions actually landing up with the authority. And let's kind of see how.
So clearly, from some of the examples we've heard, there are many potential public good causes for demanding metadata from different entities. There's also a bunch of reasons which could be, for example, competition issues, antitrust-related things. Frequent examples are of the large platforms which have a significant degree of dominance. Let's say an Amazon or Flipkart has insights from so much sales metadata, the aggregated metadata telling it exactly what is selling, how it is selling, and so on. And therefore they can use that to launch their own brand-name products. And therefore they have an unfair edge, and they actually will edge out other partners and others who are trying to sell those products. And you'll get to see Amazon Basics and Solimo and all these products showing up, and we've seen that happening. But clearly, and this I've said before, I think in an opening session of this same NPD week, you have various ways of tackling that. So that's something just to park on the side. You've got the Competition Commission, the CCI. You've got the other frequently used tool and instrument for that, which is basically the body DPIIT, or DIPP as it was known earlier, in the Ministry of Commerce, which issues these press notes which define the restrictions or conditions by which e-commerce entities who take FDI will be bound. One of those actually tackles this issue of platforms being invested in their sellers.
Okay, so my point is, without going too deep into that, there are many ways of tackling the antitrust and other competition challenges without having to resort to a forced sharing of data, which the company being forced to share it will obviously consider very, very valuable, and which will obviously bring up legal challenges in every way with the NPDA, and maybe cases will land up in court and so on. There is the question of the willingness of companies to part with metadata which is exceptionally valuable to them. And if it becomes an open access metadata directory, that is going to be really troubling for companies. I think that's really one big question, while saying that, okay, sure, there are public good causes which you can argue would be a good thing, for public health or for a bunch of other things. But you've got to look at the potential hit on businesses, which will actually cause challenges, and since we are here to discuss practical challenges, this is one clearly to put out there on the table. Then there are the proxy issues of companies setting up another entity which is a Section 8 company, which then sets up a high-value database, ostensibly for a public good cause or for helping with the community or whatever it is, but essentially it seeks data from other data businesses. And that's not hypothetical, because you could already have Section 8 companies out there which can actually demand such data, some of which maybe they already have access to, and there are concerns about that. In fintech, for example, NPCI is one example, which is actually a competitor in certain areas: it is competing with the card companies, it's competing with payment system providers, with PPIs, with wallets and so on, with the entire UPI platform.
And that's why RBI has now pitched for an alternative framework, the NUE framework, so that other platforms can also emerge which are not bound by this one dominant player. My point is this is not the only example, but fintech, which I work a lot with, is fairly obsessed with this example, and for possibly good reason. Finally, maybe a couple of quick points for startups. They would be concerned about the emerging compliance burden. Now startups, ironically, are very quick and nimble on their feet on lots of things but they're very, very slow to recognize policy challenges. And one of the reasons is they don't really believe they have a lot of role to play in terms of engaging with policy and trying to shape policy; they feel they're too small to be able to make an impact. But I think all startups and small businesses have recognized the kind of compliance burden they faced with a change like GST. And you ain't seen nothing yet, if they have to comply with a bunch of new policies around data, which include the PDP bill related issues and the liabilities that will emerge from that, and then the compliance burden of NPD. Now, there are obviously some benefits that they'll potentially derive, because they'd probably be looking forward to data from the biggies, from the Paytms and Ubers and Zomatos and Swiggys and so on, and to be able to leverage that. But they too will have a compliance burden and they need to be really aware of that. I mentioned this before and I'll just repeat that all of this will end up with quite a few challenges which keep going up to the regulatory authority, and we've seen regulatory capacity bottlenecks in multiple areas, whether it's telecom, where there is very significant capacity out there and it still keeps getting challenged, or the RBI itself, again a huge organization.
Okay, we really need to see how quickly an NPDA in the coming few years would be able to set up and really ramp up capacity and scale. And here I have not even looked at the potential social harms, and I did mention the possibility of abuse of metadata. So clearly there's a lot that can happen with that metadata, and initially it could be sought for a public good high-value dataset. And this again is, I think, an example that we have discussed in perhaps earlier discussions of this type. An example for public health: you look at identifying clusters where there are red meat eaters and you map them to disease, to clusters where there are high chronic cardiac issues and things like that. But equally, those clusters are also clusters where you could have people targeted for their dietary habits, and that's not a hypothetical issue anymore in India; unfortunately that has happened a fair bit over the last few years, though not necessarily based on NPD analytics. Given the direction that a lot of policymaking is going, there is some consultation and so on. But by and large I think we've seen a fair bit of what I could call national interest prevailing, and national interest today seems to believe in this concept of sovereign data, that the state decides on data, that data is a national good, and so on, and that's been contested. But I fully expect to see some form of this actually go through. So one would really like to see very clear transparency about the public good basis on which a high-value dataset is initiated and created, a clear statement of purpose, and a discussion around possible alternative uses of that data.
So that there is a strong assessment of the harms which are possible, harms from a public good and a societal point of view of course, and then there is the issue of the harms to the company in terms of protection of the IP that directly affects its business. Favorite examples for data use and metadata use are of Uber, and you can use that Uber metadata to do town planning and all of that. But the fact is that yes, it is open to use and interpretation by potential competitors. And those competitors could be ostensibly public good kinds of competitors; for example, they could be city transport, or it could be to improve the planning and coordination of the Delhi Metro or the Delhi service or something. But it could equally be for other competitors in there. So clearly there has to be an assessment of the impact of that, and believe me, Uber or Ola or all of them will do this assessment and will challenge data demands, or HVD demands, which very clearly and visibly affect their business potentially, or let their proprietary IP and data be visible to competitors. So that's kind of a quick overview, and I'll leave the floor open to the others for intervention.
I think that is fantastic. You talked about two big things. One is legitimate uses, directing this in the right fashion. And the other one that you talked about is the complexity because of the far-reaching nature of this particular change. It will touch every nook and corner and that will have second-order effects in terms of exposing, you know, the lack of commitment to this direction. I have one question before I ask Subhashish to respond on the larger trust and incentive issue. This is going to be a step change, no matter whether it is now or three years from now, and this seems to be the larger direction in which the world is going.
So you talked about transparency as one of the tools that will help in this journey, transparency as far as the objectives, processes, and so on. Given that we have to take this step change, any other thoughts on what will help with this?
So I think it's related to the transparency angle in a couple of ways. One is the capacity; I mentioned the capacity building for the regulatory authority. But to go back to the transparency issue, it's not enough to just say that there should be transparency. Who's going to question the transparency? Just as an example, for a public policy development process, transparency requires that there is consultation, that there is a draft and the draft is responded to, and so on. But take an individual request for a high-value dataset. If there is no transparency, or if there is a stated purpose but there could be alternative use cases which lead to harm, then I don't currently see in place a plan to ensure that form of transparency, because you clearly are not going to have a consultative process for every request, right? So it's going to be up to the NPD authority to be able to adjudicate on that and to look deeper. So possibly there is a process of feedback there, which may not be inline, in the sense that it won't lead to a delay in the approval for that particular thing. But when it comes up, and it is brought to the public that this was the stated purpose, there should be a process to absorb that feedback. I think that is a non-trivial process. That's why when I just say transparency, it's one word but it amounts to a whole SOP out there which will really take thinking.
I think this is wonderful. At this point I want to bring in a few different perspectives.
Because of Prasanto's emphasis on the system aspects, the design aspects of this particular space, I will change the order a little bit and start with Subhashish. Subhashish is a principal at Omidyar Network India, investing in both for-profit and non-profit entrepreneurs with the aim of helping create a safe digital society. He works on data protection, internet governance, digital ID and regulatory design. Given the breadth of experience that he has and the thought that he has given to this, Subhashish, I want you to comment on three questions that I'm going to pose, which emerged out of earlier conversations and also Prasanto's presentation. We know that there are significant concerns around the overall architecture, whether it is the NPDA, its design and impact, or potential leakage, anonymization methods, and so on. The question is how do you create a trustworthy non-personal data ecosystem? That is the first problem. The second problem is that this still does not assume or require that the organizations are aligned. How do you create incentives for organizations to be good actors in this entire space? And lastly, do you believe that the data process and the architecture that is proposed today is enough? Or do you feel that there are further refinements that need to be done, going in the larger direction?
Thanks Venkat for those questions, and I'll probably take them in that sequence. Before I do so, the underlying theme in whatever I'm about to say builds off Prasanto's point around regulatory capacity. If you look at what has been written on this topic, especially Ajay Shah and Vijay Kelkar's book, in cases like India where regulatory capacity is low, you typically try to make the system as rule-based and as little discretionary as possible. And therefore I think that is what we have to aim for if we have to make this entire system work.
So let me take that question of trust to every element of the data journey. First, the data as it is with the data custodians, or the businesses that give data. I think that's something you will probably discuss later in the event in a lot more detail, about metadata and what the practical challenges are there, so I'll skip that bit. Next, there is a data trustee who can mandatorily ask for non-personal data from businesses. There are, as of now, not too many checks and balances around the roles, the functions and the accountability of this kind of data trustee. And building off the point of transparency that Prasanto spoke about, I think that's very important. For example, one of the things that should practically be done is that the data trustee should have a mandate to put out an annual report every year. Because if you're serving a public good function, you should be open to the public, saying what are the HVDs you created, who asked for access, et cetera, so that we can make sure that data trustees are performing well. The next step in the data value chain is the data requester asking the data trustee for data. And I think this is where some of the incentive questions or discretionary questions become important, in the sense that the data trustee is meant to ensure that this data is being used for public good. But at the same time, it's supposed to be non-discriminatory in providing access. Now, this question of whether the data trustee really has any discretion on who it provides this data to is somewhat unclear and, one might even say, contradictory. And therefore my suggestion is, given that we really do care about non-discrimination probably more than other things, that the data trustee have a mandate to share data with whoever asks for it, unless they have a problem with it, in which case they can take it to the NPDA and have that argued out in court.
But I think providing more discretion to the data trustee in terms of who they can give access to and who they can't probably creates a lot of possibility for the data trustee to be, in that sense, owned by vested interests. And finally, the question is in the final leg of this trust, when the data reaches the data requester: at that point, how do you make sure it is being used for public good purposes, and how do you trust that the data will be used well? I think this is a very big challenge, because you obviously don't want very heavy-handed regulation that is monitoring each and every user. As Prasanto said, you can't have a public consultation process on each and every request like that. So you will have to make it somewhat free-flowing, but then you have to have post facto checks and mechanisms to ensure that data has not been used in a misleading way. For that, I think the kinds of things that will be important are, first, that there should be clearly laid out penalties for misuse. Second, that there should be a quasi-judicial arm of the NPDA, because that's what most Indian regulators miss a lot: they don't have a very good judicial function. For example, half of SEBI's orders are overturned by the appellate tribunal. So because there's going to be so much litigation on this issue in the first few years, we need to ensure that we do have a very good judicial arm of the NPDA. And the third is that there has to be some sort of whistleblower protection, because I think any misuse of non-personal data is probably going to come to light from inside the organization. And in India, I don't think we have whistleblower protection to that extent. So this is how I would think of trust at each level of the framework. Now, your question of incentives for people to behave well, share data, and so on. I think that's the key challenge here, which I'm still not able to wrap my head around.
Because this basic question, how do you set a price for something that is non-excludable, is something that economists haven't answered. So I don't know how it will play out here. Take things like defense: you can't charge for those because they are not excludable. If one person gets it, the other person can get it too. In this case, for example, if I'm the data requester, I get the non-personal data, nobody's monitoring it, and so I just provide it to everyone else. Then on what basis am I paying in the first place, and why does anyone have any incentive to pay? One person can pay and then share it with everyone else in the ecosystem. So I think this question of pricing is a challenge. There could be some sort of licensing regime where you can only use the metadata if you have actually paid for it. But how you enforce this licensing regime in a country like India is difficult to say. There have been lots of private sector models of data marketplaces, more on the personal data side, but those have also failed because of the difficulty of creating two-sided marketplaces. So I think that itself is a challenge. A lot of regulators have used almost a kind of public funding of these infrastructures, where for example the fines collected by a regulator are used to subsidize public good purposes like awareness. One could imagine that kind of an incentive architecture for the NPDA, but will the NPDA really ever generate so much in fines that it can, and is that really the kind of incentive we want to set for it, that it has to? So I think the key problem in this entire ecosystem is this question of pricing and how you price it appropriately, because this framework does say that a data requester has to pay a fee to the data trustee, even though that will be somewhat nominal.
And finally, to bring it all together and answer your last question about the role of the data trustee itself and the NPDA and a lot of other structures in this case. Public good has been defined in section 7.6 as saying that here are the 15 things, like poverty alleviation and so on, which are public good, and there's a 16th which says "and others". That kind of ambiguity is never helpful in any kind of regulation. But more importantly, how do you ensure that a public good purpose is being served? At the time of asking for the data, will a data requester have to say what it will use it for? Does the data trustee have any discretion in saying no to it? And if the data requester takes that data and does something completely different with it, is there any kind of recourse or not? All of these remain undefined in the current framework. I think it will be useful to tighten it with all the suggestions that I gave before. I also think, given how important the role of the data trustee is, that it will become important for the data trustee to be financially independent of any other structure, so that it is not serving other corporate or other interests. So I think the kind of data trustee who gets the privilege of hosting an HVD also has to be defined far better. And finally, the overarching point that I will make is about how much litigation this is going to result in, which I think is natural. And I think the committee also acknowledges that jurisprudence will evolve. But the players who will have the financial wherewithal to actually fight those cases again and again and again will be a certain kind of company. So I'm not so sure that the startups will have the management and financial bandwidth to actually fight it out with the larger tech companies and therefore make this thing a success. So these are all questions. I think the heart's in the right place, but some of these nitty-gritties need to be detailed out a lot more.
It sounds like there are so many of these big-ticket questions that have not been answered. I can see this playing out for the next few years, but unlike GST, which has a much narrower mandate, this is a lot more open-ended. And if the Government of India is planning a step change in the ecosystem, where one day it passes the law and then unleashes chaos in the world, what do you think would be a way to introduce this law, and to discover what are the appropriate institutions, processes, incentives, structures and so on? It's not like somebody will magically come up with these things overnight.
Yeah, I think in the discussions you had previously within this group, as well as in preparing for this event, one of the suggestions was: can we start with the government data itself? You open up the government data, figure out how the system works with that kind of data, and then scale it to private players as well. That's a good suggestion, though I don't think there might be as much political appetite to do that. The alternative could be: do you want to open it up in a particular sector before you take a cross-sectoral view? So if you say that in healthcare, for example, or in education, this is really important for us, then let's do it in one sector and figure out what the problems are in this kind of thing. I think it's really ambitious, so it's probably wise to start small and in a somewhat more controlled setting.
I think that's wonderful. This is a nice segue to a big domain that the committee has talked about often, which is transportation. They're very interested in transportation data. So we get to hear from an insider, Naresh, about how the ecosystem might react to this. Naresh is a senior product manager at Zendrive, which is a transportation and insurance company. He will explain the context to us. He's a multidisciplinary product leader.
He has over 11 years of experience delivering innovative and high-value consumer and SaaS products across the product-market fit, growth and maturity stages of the product. That was a mouthful. But it really helps that he understands the domain inside out and actually builds data products in this space. There are a few key questions that I would like Naresh to touch upon, in addition to anything else that may be on his mind. The first is, what kind of data do you have that is of interest to the community overall? When they keep talking about transportation data or Uber data, what are they talking about? Second, if even companies like Uber have to share, what are the realistic challenges that they have, whether legal or technical? For example, are there multiple people who own the data? Third, if the transportation industry has to share this data, what are some things that it would look to get back in return? In what way can it benefit from such an ecosystem? And of course, finally, if you were to introduce this kind of thing as a product, what are some thoughts?
Thank you very much, Venkat. I know that introduction was a mouthful, but I'm glad you went through it anyway. So I think transportation is interesting. I'm a product manager in this space who has not only worked with Zendrive, where we deal with road safety and driver engagement use cases; I've also previously worked at Swiggy, where I've seen how we address some of the challenges that they generally face on the road. So the first thing that one needs to understand when thinking about the transportation space is that we deal with location data and sensor data at scale. And taking them to be the building blocks, what we then have are certain aggregate results about users and their trips on a daily basis. These, in a very blunt sense, would be the units of interest, whether you're looking at Uber or Swiggy or even Zendrive.
Now, when I think about all the challenges that NPD poses, I'll leave to some of the other panelists the commentary related to addressing the regulatory challenges and having the capacity to meet the expectations that the NPDA or a data trustee would have. I'll talk about some of the product and engineering challenges. And I'll perhaps frame it in three different ways. First, there are the costs of documenting and delivering non-personal data as per the expectations of the authority. Then you have the fact that different types of metadata need specialized treatment, not only in how they're stored, but in how they're ingested and how they're secured. And then finally, I'd also want to talk about perhaps the most challenging of the problems that we face, which is that metadata can reveal certain proprietary mechanisms which offer competitive advantages to companies which use this data for different use cases. So first I'll talk about the costs. I think it's become somewhat popular to imagine that data businesses harvest the last KB of data. I can emphatically say that that's not the case. Very often there are trade-offs made on a regular basis, where stale data is often purged, where the costs of storing and processing the data can exceed its purported benefits. And there are, of course, different policies that different companies may have on how much data they wish to retain in a general sense. The draft report does not have anything in there which talks about how much data must be retained, or what type of data may be of interest for a specific sector. In our case, we know location data must be treated differently. And then one other concern which is allied: we know that companies have created a lot of specialized infrastructure to deal with different types of data. Do we anticipate the data trustees having the same level of sophistication in how they secure their data and, of course, manage it in a general sense?
I think all of that comes through with the same argument. Then there's this realization that, since GDPR, a lot of companies, transportation companies included, have invested a lot of effort into isolating personal and non-personal data. Through that, extreme care has been taken over how we could comply with the laws that exist in different countries. This specialized approach has told us, at least in the transportation ecosystem, that location and sensor data require special infrastructure. I don't want to get into the technical details here, but it's not the same as storing, say, transactional data or even aggregated data of different forms. And it still is not clear to me whether a trustee can independently arrive at the sort of designs some of us have in the transportation world for the durable infrastructure that we've set up to manage different types of non-personal data. And in this sense, I do acknowledge that there is public utility to the data that the transportation companies have, but we need to ask ourselves why we want to recreate this infrastructure. And if we do want to recreate this infrastructure, or leverage some of what already exists, should the existing data custodians also, in a sense, act as the trust beneficiaries? This might actually work, since defining the purpose of acquiring the data for a trustee, and then of course for the HVD, precedes the access of data. Having some data custodians as beneficiaries might actually avoid certain principal-agent problems, problems related to commercialization of data, and even data governance. Finally, perhaps the most pressing of the challenges that companies in the transportation sector would have would be regarding how certain types of metadata could offer a peek into what may offer competitive advantages to different businesses.
I know, for instance, that if we exhaustively catalog metadata, we would, at least for certain businesses and use cases that I've seen, offer a peek into the quality of data that we aggregate and offer to our customers as well. There is nothing in the report which acknowledges this challenge. So I would want some perspective on how we could perhaps isolate those parts of the data which could be more valuable than others. Lastly, as a comment, Venkat, I have a slightly different view on the larger direction the world is going. I understand that the governments of the world are very keen on acquiring a certain sovereignty over data that is collected from the citizenry. But let's also acknowledge that we have giants like Apple and Google who control large platforms and who in many ways offer guidance and determine the direction in which the industry is moving. For instance, it's been clear to us in the transportation world over the last couple of years that a lot of the data processing that we do for location as well as sensor data is incentivized to happen on the device itself. So if you're using a smartphone and, say, a certain piece of technology needs location data in order to improve a certain user experience, that need not happen through a server round trip; it can all happen on the device itself. Which means increasingly we're going to see more and more data restricted to a device, and perhaps only transaction data coming through to hit the servers. So in this universe, how ambitious can governments really be about extracting public utility from non-personal data, when transportation companies are also perhaps waiting for the blessings of big tech before they use it themselves? There's one other question I remember you asking, which is how I think about it as a product manager. I think in the world of product management we often talk about go-to-market, and it's something that we think about at the top of the engineering cycle.
So if I were a government, I wonder whether I could run a small-scale experiment, with a small group of stakeholders in the industry, to validate all that I seek as goals before enshrining things in the policy itself. I think that would be my key suggestion as a product manager.
That's all interesting. So what I'm taking away from everything that you said is that there are strong technology changes, and there is a lot of complexity and sophistication in this infrastructure, and so on. What I did not hear in the whole thing is one thing that would incentivize you as a transportation person to actually share the data. Is there nothing at all for you in this?
That's not true. A lot of us are mission driven. For instance, at Zendrive we really care about road safety, and we realize that by enhancing the public utility of the data that we have, we might ultimately improve the infrastructure upon which the data that we generate exists. So I see value, albeit value over a longer cycle. There is public utility, and this of course is my point of view, but it's clear to me that the transportation industry will have data that will be of public utility.
Yeah, that's wonderful. So with that, I think Prasanto, Shubhashish and Nareesh are all talking about capacity limitations in the system. So we'll move to a slightly different entity about which we're also going to talk, which is all these data businesses, Nareesh's Zendrive being just one instance of it. In order to give us a broader perspective, we have Mangalam. Thank you for waiting on us while we got our heads around the many aspects of this. Mangalam Nandakumar is a principal consultant at ThoughtWorks. She has 18 years of experience in product management and software delivery. She's a published author and speaker, and is passionate about equitable and inclusive technology. This is the, you know, impedance should advance the topic for sure.
Let's talk about some of the implementation challenges that we hinted at earlier. Mangalam, could you talk about how best we can empower these data trustees from a technology standpoint, as well as the enterprises? What is the current state? Is every organization as sophisticated as Nareesh's? When you look inside the enterprises, how do they look in terms of the capacity to comply with the NPD law? And thirdly, as far as the sharing of this data is concerned, Nareesh indicated this, but essentially we are looking at creating new data infrastructure, including cataloging, isolating, anonymizing and so on. Is it realistic that companies have the capacity, or is there a need for public goods in the technology space to support NPD? What is your take?
Thanks, Venkat. It's my privilege to be part of this discussion, and it was not at all a problem waiting; I think there were wonderful opinions all around. Today, interestingly, is also World Data Privacy Day, so it makes it very interesting actually. So, to your question, initially on the data trustees and how they can be supported, I think there were a lot of good points that both Prasanto and Shubhashish brought up in terms of the discretionary aspects of the data trustee as an entity in this system, and how independently they can make these decisions. The draft of course places a lot of responsibility on data trustees. And this is not just looking out for the interests of the community that they serve, but also maintaining the HVD infrastructure. So for a Section 8 company to be able to do this, will they have the necessary capability, the capacity and the expertise to pull off something like this? I think it seems a little impractical. And the outcome of it could be one or the other. One, this entire thing proves to be an entry barrier, and then you just don't end up having the right kind of data trustees to begin with.
Or you end up having these Section 8 companies partner with, or be set up as proxies by, some of the larger corporates. And then you have to start questioning whose interests they are really serving. So I think that is the real concern, I would say. But I wanted to actually take a step back and look at data trustees from a different perspective altogether, if that's possible. I think right now the draft looks at them as gatekeepers. So there's this data, let's protect it, and let's make sure that it's providing the right benefit and not harming, and so on. But what if we look at them more in the context of advisors? Taking a more forward-looking view, rather than looking at today's state, and saying: okay, here are data sets that can empower these communities and that probably don't exist today. So can we actually promote that and say, hey, this is the data set that should probably be collected? Maybe that can foster some data collection happening to begin with. Or another aspect that I think Shubhashish brought up, which is purely in terms of their audit role. One part is audit, but can they also be the channel that publishes insights on what the most frequently requested data is? What is the market looking like? We may have a lot of data from the transportation companies, but maybe that's not what people need. Maybe that's not what is spurring innovation in the public good sector. Maybe it's something else, and maybe we don't have the data. Or another aspect is just providing insights on what high-value data is to begin with. Is it metadata sets and everything? Whatever is collected is in the context of the business that collected it to begin with. Now that may or may not necessarily constitute public good, but can trustees become the advisors in that context, who provide perspective on this?
And maybe decouple that whole administration of data to a different entity altogether. That is at least my perspective. But assuming that trustees have to play that role of data administration, then I think there is definitely a lot of tech support needed. Even in terms of just understanding what metadata is, I think even large corporates sort of struggle with it. Data governance is a huge aspect in itself, and to expect Section 8 companies to suddenly wake up overnight to this concept seems a little impractical. So there is definitely support that's needed for them as they ramp up on this journey, and in a way that doesn't influence the objectives of why they exist in the first place. Leading. I think your second question was in terms of enterprises themselves: are they equipped to deal with this, and how? I'll make a sweeping generalization here, but I think businesses definitely are seeing data strategy as essential to their business agility. So I would go on to claim that businesses have probably been investing in their data strategy for a while now, in the hope that they're going to reap benefits from those insights that will help them steer ahead. Now, the thing is, a lot of the panelists here spoke about the incentives. What are the right incentives for a business to even be motivated to share this data? This whole monetization thing seems a bit of a damp squib to me, because data in that sense is not a commodity. It's an asset. The value is not just in the sale of that raw data. It's about what that data gives you in terms of insights, how timely those insights are, and how valuable they are in the context of the business that you run. And it gets magnified, right?
So you're looking at your data set, and it gets magnified when you combine it with a different data set, and so on. In that sense, if I look at it from a requester's perspective, it could be all or none. I'm requesting a certain combination of data sets, and let's say one of those entities refuses to give it to me; is it still valuable data for me? I think that's a different aspect. But in terms of enterprises, if the carrot doesn't work, I'm guessing the stick does. They would potentially see this as a necessary evil, like a box to be checked. If you have to survive as a business, you need to comply with this. It's almost like what GDPR does. You would do it for your own sake, but whether you would do it for the sake of benefiting another business or the public good is another matter. And I don't think data monetization is the right incentive. I think at a bare minimum we should at least look at removing all these unreasonable asks and demands. For instance, take turnaround times. How soon is the business supposed to respond to a data request? What happens if they contest a request and say they don't want to provide this data; how long does it take for the matter to be resolved? I think those are the aspects that would definitely concern larger businesses. It is also important to look at the context of businesses operating across multiple countries. You're going to have to comply with multiple versions of data privacy policies. How do you reconcile those? This shouldn't become one more burden that they need to deal with. So my take is that incentives are just one aspect; lowering that threshold of inconvenience, if I may say that, is equally important as well.
You might have to repeat the third question for me, I couldn't keep that.
So, in terms of just data quality, and whether the data is organized in a way that it can be shared: you know, we can't go back and clean the data, right? That is the best way to be heard. So is it in a state where it can even be shared, assuming that we build some of this infrastructure around it?
I would say there are a few aspects to this. One is that initial cataloging itself. Are you able to catalog the data? Are you also able to segregate the PII data from the non-PII data? All of that is, I think, essential. And then there is the cost of doing that. You would be doing this internally, but then there are other aspects to it. For instance, the way you hold the data for your internal usage may or may not be the way that you want to expose it for the purpose of NPD. So are you going to have to set up a parallel infrastructure? The second part of it, I think, is data retention. Businesses, let's say, may have an internal policy of archiving everything every six months. Now, we don't know what the ask is in terms of NPD. What if you're asked to maintain your data for the past 10 years? Those are significant changes in how a business might want to approach its data strategy. The third aspect is that in sharing data itself, there are issues. The draft calls out innovation and entrepreneurship as a desirable outcome. Now, I think Nareesh mentioned it as well, speed to market is essential. You can't innovate based on data that's two years old, and if you wanted it today and you're getting it six months later, that's as good as not having it. So where is the accountability in terms of turnaround times?
Just by reading the draft, it looks like there are multiple layers. A trustee takes some view and then goes to the data business, and then what if they reject it, and then it goes to the authority. By that time, can startups really afford to stay that long in the game? So that is one aspect, but the next is accuracy itself. Quality: what if there are gaps in the data? The metadata is one part of it, but take the timeline of the data. I want the most recent one. Now, is the data business going to be willing to give more recent data? And then there is also the question of purely the quality of the data itself. Who's going to be checking it or verifying it? Is it the trustee? Does the requester have to worry about it? And then what is my recourse? How do I go back and ask for more data? What does that process look like? And is this a one-time request? What does it look like when there is a more frequent, more-than-once type of request, versus a real-time data request? Do those processes look different? I think there are several aspects. And the last one that I think would concern enterprises in terms of data sharing is this: enterprises are likely to put a lot of thought into cybersecurity, data protection and so on. Now, once you've exposed your data to data requesters, what's the guarantee that the same level of diligence is there on the requester's side? And it's not just that they're using this data internally and have, let's say, a lack of security; that is one risk. What stops them from, say, reselling that data? That's another aspect as well. And I think those will be the larger concerns.
I think much of these concerns, I mean, as somebody who plays in this world of data day in and day out, I recognize all the issues that you're talking about.
We are running short of time. Prasanto, I don't think anything that you have heard so far is shocking or surprising. I think you have heard these things before. What do you see as the path forward?
You know, I think to step back to the 10,000-foot level, this is something again that has come up in discussions: the question of, is this a solution looking for a problem? Do we need this? And I'd just take a couple of examples from what has been said to summarize that. Nareesh, for example, gave a very good example on the incentive question and said that safety is a driver. Now, take any of the other critical infrastructure areas, and by the way, transportation is one of them. Take FinTech, for example: banking, financial services, FinTech. Again, safety there is currently a very big driver. But having said that, there is a very strong regulator there. In fact, there is more than one, but there is RBI to start with in the entire payments infrastructure space. There is a lot of data that RBI actually asks for. Now, each entity, each payment system operator, would be very keen to share data and collaborate to improve safety in the system. But it doesn't need to go through the future infrastructure of an NPD and HVD data set framework for that. There is an existing framework, and that is under the regulatory oversight of the Reserve Bank of India. So in this overall framework, I remain unconvinced that we actually need this huge mega framework, where we are going to be struggling with these questions and so on. So that is a very existential 10,000-foot question back onto this. And if it is done, and I suspect a lot of these things will just be done regardless, then I'm really looking forward to some of the transparency things, which I think Shubhashish also elaborated further on.
Sorry, you're on mute. Yeah. I mean, it will be a neat thing. Time for sure.
If we can make it work for the whole country. Any closing thoughts, Shubhashish, Nareesh?
I'll probably come in with one, because I think a panel where everyone agrees is not a fun enough panel. So just to disagree a little bit with what Prasanto just said, and again putting my economist hat on: I do see the need for something like this, because of the nature of data, which is non-excludable. When data markets are left to themselves, a purely private market cannot emerge in those kinds of settings. So I think there is definitely a nudge that is required to make data more freely accessible, to get to socially more optimal outcomes. Of all the possible options, is this the best one? That is certainly up for debate, and the committee in its report hasn't looked at or evaluated other models for doing so. So I do see the need, and there I disagree slightly with what Prasanto said, but I do broadly agree on the question of whether this is the best possible way to achieve it. We don't yet know.
So there is one question from Shubhashish. Can there be incentives such as tax credits, viability gap funding for data custodians, and so on? Apparently other jurisdictions have explored this. Basically, is there a way to provide support on the cost side, on the management side, that can be put in place within the NPD framework?
So if I can just very quickly, I mean, sure, the incentives question is a good one, and it should be looked at in terms of other types of incentives that would work, but I doubt the incentive part will be an overriding factor. I started with examples like Amazon's overwhelming potential dominance, and there is a reason why that is: that data is so valuable to it. So a little bit of an incentive here and there is not going to override the fact that there is very valuable data that it is going to fight to retain.
So yes, if we are going ahead and doing this, then we must consider the incentive question, and that incentive can be a public good which these entities are part of. Safety is one; safety and security. And these are things which, by the way, directly benefit the players also, and FinTech is an example. If you reduce the number of frauds from the level that we have seen happening in 2019-20, in the pandemic year, all the PSPs and PSOs will actually be happy. It's just that there are different ways of doing it.
Wonderful. I think we are slightly over the one-hour budget that we have. Let me highlight a few things that came up during the conversation. One is of course all the different ways we can ensure trust in the system. We talked about various transparency mechanisms, feedback mechanisms and so on. We also talked a little bit about the complexity of implementation at the level of the NPD authority itself and of the data businesses, and how we need much tighter rules and a much narrower framing of this whole space in order for this to become manageable. In terms of potential, we looked at whether there could be an incremental path for this. Some sectoral approaches are possible. We can introduce some form of pricing, and so on. The big thing seems to be that there is a lot of uncertainty along many different directions in this space, in scope and in implementation, especially in a country where the regulatory capacity is low. So it will be interesting to see how this whole thing plays out. And of course there could be new directions. Mangalam pointed out how data trustees could take a slightly different role, where they would actively engage the data custodians to advance the community interest. That is a much more positive, proactive notion of trustees, as opposed to a passive channel for data. And there is the significant compliance burden that we will introduce on organizations.