This is Satish KS, and I head engineering at a company called Zeotap. Prior to Zeotap, I worked at Ola Cabs, where I was taking care of the data platform and the fraud prevention platform. I have 18-plus years of experience, and I have spent the last eight years working in data and big data areas across various domains: the current one, by virtue of my current company, is ad tech and mar tech; prior to that it was e-commerce, and before that a couple of IoT domains. Today I have a small presentation around my personal thoughts on the NPD draft, the non-personal data bill draft which the government, through MeitY, is proposing, and what I perceive as its impact on data businesses. A disclaimer here: these are not the views of my company or my organization; they are my personal views. So with that, just an intro on the NPD draft; I know a bunch of you in the audience have already gone through it. First and foremost, non-personal data means a dataset which is stripped of its personal information, or personally identifiable information; that is how the Indian government defines it under the draft. If you take a read of it, it has implications for the whole org, and when I say the whole org, I mean across the value chain: your engineering, product, infra and security teams, your legal teams, and possibly also the teams working on your pricing and external business contracts. The second thing which is very striking about the bill, if you think about it, is that it defines something called a threshold, which doesn't have a number attached to it as of today. If any business is collecting data beyond that threshold, it automatically has to register itself as a data business. 
Now, if you think about it, with the explosion of data which any online business or tech company is seeing today, within one or two quarters every business would become a data business on its own. That is another impact of the non-personal data bill. And if they have to cater to the asks of democratizing and commoditizing data, which may not be part of the core business they operate, that requires additional investments across the org. That is another aspect. One of the foundational thoughts in the NPD bill is that it strives to create a level playing field between small businesses, medium businesses, and the big tech companies. As we go through the presentation and come to the conclusion, we'll give some thought to whether this bill is really in the interest of, say, a startup or a small business, and whether it has adequate protections to help them as well. So that is a quick intro on the non-personal data bill. Now, a primer on data processing: whether you are a data business running data as a service, or you are in some other domain but running data processing pipelines, what exactly happens there? There are three pillars to this whole data aspect. The first pillar is sourcing. Let me first define first-party data: it is the data a company collects about its own customers. A classic example could be Ola collecting customer info from their rides, their registrations, and so on. The sources of this data could be myriad: web SDKs, app SDKs, walk-ins, discount coupons, subscriptions, and so on. 
So there are a bunch of online as well as offline sources for first-party data. From a third-party business angle there is a complete flip: you will have specialist sourcing teams which scout around on various aspects of the data, whether it is the right data, whether it is in the right format, what integrations are required, and what the contractual requirements are, and based on all of these they source the data. This is data that some company B is collecting from some company A; for company A it is first-party data, and once it crosses over it is termed third-party data. That is the sourcing aspect of the whole problem. The second pillar is refining. There is enough literature, and lots of data engineering talks, around what refinement of data means. In refinement, you standardize the data into a common taxonomy, you do mapping and transformation, and you do cleansing of the data to weed out the wrong data points and get clean data inside your system. On top of that you will have a bunch of enrichments, and then you can apply heuristic or AI/ML-driven intelligence addition. Then you have the quality measures, whatever is needed, whether statistical quality checks or anomaly detection; a bunch of quality requirements will be there. And the final point is that you add something called temporality. Temporality is nothing but the time dimension, because the data you collect, I wouldn't say all of it, but almost 90% of it, loses value over a period of time, so you have to add temporality to the data as well. From there, you move on to creating consumable datasets which can be delivered to the end consumer. 
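The refinement stages just described (cleanse, standardize into a common taxonomy, enrich, add temporality) can be sketched roughly as below. The taxonomy, field names, and enrichment rule are illustrative assumptions, not from any real pipeline or from the draft.

```python
from datetime import datetime, timezone

# Hypothetical common taxonomy for one attribute.
TAXONOMY = {"m": "male", "male": "male", "f": "female", "female": "female"}

def refine(records):
    refined = []
    for rec in records:
        # Cleanse: weed out records missing the key identifier.
        if not rec.get("id"):
            continue
        # Standardize: map raw values onto the common taxonomy.
        gender = TAXONOMY.get(str(rec.get("gender", "")).lower())
        # Enrich: derive an extra column from what is already there.
        age_band = "18-24" if 18 <= rec.get("age", 0) <= 24 else "other"
        # Temporality: stamp when this fact was observed, since its
        # value decays over time.
        refined.append({
            "id": rec["id"],
            "gender": gender,
            "age_band": age_band,
            "observed_at": rec.get("observed_at",
                                   datetime.now(timezone.utc).isoformat()),
        })
    return refined

rows = [{"id": "u1", "gender": "M", "age": 21},
        {"id": None, "gender": "F", "age": 30},   # dropped by cleansing
        {"id": "u3", "gender": "female", "age": 40}]
out = refine(rows)
print(len(out), out[0]["gender"], out[0]["age_band"])  # 2 male 18-24
```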
Now delivery, if you think about it, is more or less the reverse of your sourcing problem: you have the same kinds of format and integration challenges, and you will have push-based mechanisms, pull-based mechanisms, batch-mode mechanisms, streaming mechanisms, API-based mechanisms, or some customer might come and say, I need a self-serve, discoverable tool on top of your dataset where I can go and define my own criteria, create my own datasets, and download them for my consumption. These are all the various modes of delivery. Then other non-functional aspects come in, such as security and reliability. When I say reliability, it is about classic SLA management. For example, you could have a contract saying that in the first week of every month you will ensure a data dump is available to the customer; so what happens if you don't honor that SLA? These are the various non-functional aspects of delivery. If you think about it, this is very similar to what happens in the oil industry, and it's intended to be so. A data business more or less operates on these three major pillars: sourcing, refining the data, and curating datasets for delivery. With that in mind: if you are already a data business, you will have pipelines doing all of this; if you are in a different domain, say some other ad tech domain or a fintech domain, you might internally be running similar pipelines to solve internal growth, revenue, or fraud detection use cases, so these same structures and constructs would exist within your company. Let's start with the data catalog, or the metadata derivation problem. 
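The SLA example above (a monthly dump due in the first week of the month) reduces to a simple check against the actual delivery date. Reading "first week" as days 1 to 7 is an assumption purely for illustration.

```python
from datetime import date

def within_monthly_sla(delivered: date) -> bool:
    # Contract (assumed): the dump must land within the first 7 days
    # of the month it is due in.
    return delivered.day <= 7

print(within_monthly_sla(date(2021, 3, 5)))   # True
print(within_monthly_sla(date(2021, 3, 12)))  # False
```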
If you think about it, you might have 100 sources, internal or external, and each source will have its own metadata around how it collects the data and what value types it feeds into your system. You need a catalog of all of these; that is the source catalog perspective, covering what I'm calling the raw data. Once you refine the data, you might add 10 or 11 columns to it, describing how it was refined, how it was enriched, and what additional things you have figured out about the dataset; that creates a whole other body of metadata, the refined data catalog. Then, when you are curating datasets, there may be additional views, inferences, and derivatives created on top for various end customers. Suppose you have 10 customers; each of them may not be consuming the same data, so each consumes its own dataset, which has its own catalog implications, with tags around who is consuming it and what its source is. You may not run into a major catalog issue unless you attempt a complete denormalization, like a union-all of all the dimensions of data you collect across all sources into a single denormalized table for consumption, which is practically not possible. From the NPD angle, the challenge here is that the draft does not clearly mention which catalogs I need to make available to my end user: whether I should expose everything from my source catalog, my refined catalogs, and my consumption data catalogs, or whether I should invest exclusively in creating a new dataset with its own metadata for NPD consumption alone. This is, I would say, a gap in the NPD draft: it does not clearly define which catalogs I need to maintain. 
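The three catalog layers just discussed (source/raw, refined, and curated per-consumer views) can be sketched as entries in a tiny registry, each carrying a lineage pointer to its parent and a tag for who consumes it. The schema and dataset names here are hypothetical.

```python
catalog = {}

def register(name, layer, columns, parent=None, consumer=None):
    catalog[name] = {"layer": layer, "columns": columns,
                     "parent": parent, "consumer": consumer}

# Source catalog: the raw data as collected.
register("clickstream_raw", "source", ["user_id", "url", "ts"])
# Refined catalog: refinement added derived columns (two shown here).
register("clickstream_refined", "refined",
         ["user_id", "url", "ts", "device_class", "geo_region"],
         parent="clickstream_raw")
# Curated catalog: each consumer gets its own view, tagged with who
# consumes it and where it came from.
register("clickstream_acme_view", "curated",
         ["geo_region", "ts"], parent="clickstream_refined",
         consumer="acme")

def lineage(name):
    # Walk parent pointers back to the original source.
    chain = []
    while name:
        chain.append(name)
        name = catalog[name]["parent"]
    return chain

print(lineage("clickstream_acme_view"))
# ['clickstream_acme_view', 'clickstream_refined', 'clickstream_raw']
```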
That becomes a question which the data architect or the data engineering team has to answer down the line. And there are other common challenges in data cataloging as well. There is cardinality management: you could have high-cardinality data and low-cardinality data. You will have your tags towards compliance, towards sources, towards consumption. A couple of other unique things would be the paths where the data resides: there could be logical paths and physical paths where it resides and flows through. And there are versions: what is my current version of the data, and which versions have I sent out for consumption? These are all the various pieces of metadata you need to manage around the data, and all of this gets compounded if you have to manage it for the NPD scenario as well. Another challenge in cataloging is the tool set; as we speak it is fast maturing, with a bunch of cloud-native as well as open-source tools available for data cataloging, so tool-set maturity is a concern, but one that is being addressed quickly. The third thing is the cultural aspect. Cataloging, or creating an asset inventory of your whole data estate, typically takes a backseat in any organization. Classically, being a tech person, I would draw a similarity to how unit testing takes a backseat during development: unit testing ends up playing catch-up, so development of the code is at 100% while test coverage is at, say, 40%, then slowly reaches 50 or 60%, and so on. Similar challenges appear in the cataloging scenario as well. 
So if you have to adhere to the PDP or NPD or drafts of this kind, cataloging has to be given first-class citizenship in your architecture and treatment, and that has to be in place for you to cater to the requirements of the NPD. Now think about data processing, starting with the life cycle of how a data processing pipeline is created. Typically the business team comes and puts in a bunch of asks; these go to the product team, the product team curates those asks, and they decide, based on existing datasets and any new datasets which might be applicable, whether we need to create a new processing pipeline, whether we need to invest in it, what the life cycle of the pipeline is, and what the frequency is; everything you can think of from a productization perspective happens there. Now, given that the NPD draft says I have to make all my data publicly available, that means making it discoverable via a data catalog. Think about the scenario where 10, 20, 30, 40 asks come in saying, I need this part of your data, I need this item of your data. They can ask for a subset, they can ask for an aggregate, they can ask for a view of your data. Now, who will curate these asks, and how do they get curated? That is one of the major open questions. You could also run into a scenario where you as a business don't need to run any aggregation or tertiary pipeline at all for your own purposes, but it becomes a requirement purely because of an external NPD-based discovery request. And sometimes you may not even be protected from it, because if the request runs into a conflict and you go to the ombudsman and they say, okay, you have to make this available, that means you actually have to invest in that new pipeline processing as well. So that is one. 
Now, when you are building these new pipelines and new requirements for each additional consumer, your metadata automatically increases. So it becomes a cycle: you have a bunch of metadata, and based on it, new metadata comes in because you are creating for that new consumer, which can trigger yet another cycle. It creates a never-ending cycle in the metadata derivation problem as well. The third thing about processing is running the proper algorithms. The NPD draft talks about a bunch of algorithms, such as k-anonymity, differential privacy, and homomorphic encryption, which may not be required for your core business; homomorphic encryption may not be needed at all for what you do. But by virtue of an NPD-based ask from an external consumer, you may be forced to run that algorithm. Say you want to run a differential privacy algorithm on a dataset of a million records; that might be feasible. But if you are talking about a one- or two-billion-record dataset, there is a possibility it may not even be possible for you to run that algorithm. So in cases where there is a genuine issue and you as a data provider simply cannot honor the ask, what is your protection? Whom do you go to? This is a gap which I see is not addressed in the NPD draft. Can the provider say no, this is not possible for me to give at any price, it is not within my reach to cater to this ask? Even though the consumer has figured out from my metadata that this dataset looks possible, it may not be possible for me to actually create it and hand it over. And that is the related challenge I mentioned: based on the metadata alone, we have no control over what the asks might be. 
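Of the techniques the draft names, k-anonymity is the simplest to illustrate: a released table is k-anonymous when every combination of quasi-identifier values appears at least k times, so no row can be narrowed down to fewer than k people. The columns and k below are illustrative assumptions.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_ids, k):
    # Count how many rows share each quasi-identifier combination.
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return all(count >= k for count in groups.values())

table = [
    {"age_band": "20-29", "pincode": "560*", "spend": 120},
    {"age_band": "20-29", "pincode": "560*", "spend": 90},
    {"age_band": "30-39", "pincode": "560*", "spend": 300},
]
# The third row is a group of one, so the table is not 2-anonymous.
print(is_k_anonymous(table, ["age_band", "pincode"], 2))  # False
```

Checking a million-row table this way is cheap; actually transforming billions of rows until the check passes (generalizing, suppressing) is the expensive part the talk is pointing at.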
And any data processing pipeline you put into your system, be it in your first-party or third-party systems, comes with some non-functional baggage. You might be running reporting on top of it, whether quality reporting or client-facing number reporting, and you could have alerting and monitoring on top of it. All these additional pipelines create back pressure in those supporting systems as well. That is another aspect. The third thing to think about is that any pipeline will have some failures. Note that the failure rate doesn't increase; the rate remains more or less constant, which means the absolute number of failures grows with the number of pipelines. For example, at a failure rate of one percent, if 100 pipelines give you one failure to manage, then the 1,000 pipelines you reach by catering to these new requirements give you ten. That also throws up a challenge for the implementer in terms of managing the whole processing life cycle. This is another area which needs thought when this bill comes to fruition and reaches a full-fledged stage: there should be protections around everything we have been talking about, and we need to figure out how all these additional burdens are handled. Now for an interesting aspect: every dataset has a quality dimension. Quality can be in terms of the absolute quantity of data, the dimensions of the data, or the value set of the data; all of these contribute to the quality aspect. And if you are running the data within your own system and within your consumer ecosystem, which could be one or 10 or 20 consumers, any quality issue has a blast radius, as it's called in classic security terms, that is contained within those systems. 
Now with NPD, what happens? You give out a dataset; that dataset can be federated to another party, and from there to yet another. Now say at layer three there is a quality issue, because at your ingestion a partner who was supposed to give data on a snapshot basis just dumped the whole data on one day for some reason. That nuked a couple of your versioning capabilities, a couple of your quality capabilities, and it got propagated one or two layers down the consumption chain, where it broke someone's algorithm and created havoc in their system. Now, who takes liability for these issues? What are the protection criteria, and what legal measures apply here? This is something to be thought about. When you have a quality issue today, as you would have seen on a classic e-commerce website, there is a status page showing there is a quality issue, we are rectifying it, and here is the RCA. That is the classic broadcast: think of yourself as a hub with a bunch of spokes you are connected to, which you can clearly broadcast to. Now instead think of a federated tree of consumers: how do you broadcast the issue, and how do you stop the downstream layers from doing any harmful processing of the data? That is the challenge I am talking about. The problem it creates is unnecessary cost across the overall ecosystem, and again it comes down to the question of who is liable at that point. That is one aspect: the propagation of incoming quality issues having a systemic effect on the entire ecosystem. The second aspect is that anybody consuming the data has to invest in quality gates, or checks on the veracity of the data. 
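The broadcast problem just described can be made concrete: in a federated tree, the original provider only knows its direct consumers, so a quality advisory has to be relayed hop by hop through whatever sharing graph exists. The graph and the relay mechanism below are hypothetical.

```python
# Who each party has (hypothetically) shared the dataset onward with.
downstream = {
    "provider": ["c1", "c2"],
    "c1": ["c3", "c4"],
    "c2": [],
    "c3": [],
    "c4": ["c5"],
    "c5": [],
}

def broadcast_advisory(root):
    # Breadth-first relay of the advisory through the federated tree.
    # In reality each hop must cooperate and forward the message,
    # which is exactly the gap the talk is pointing at.
    notified, frontier = [], [root]
    while frontier:
        node = frontier.pop(0)
        for child in downstream[node]:
            notified.append(child)
            frontier.append(child)
    return notified

print(broadcast_advisory("provider"))  # ['c1', 'c2', 'c3', 'c4', 'c5']
```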
Suppose you are collecting a particular attribute from three different data sources; you can get conflicting values. Now, how do you figure out which of the three sources has given you the right attribute? That by itself is a challenge, and if a startup is trying to solve it, they'll need a bunch of ways to do so. This is not a simple investment; it is a huge investment in terms of creating an incoming gate and protecting yourself when you are receiving data. So it is a two-pronged problem: what the producer side needs to invest in, as well as the consumption side. I have put two challenges there; one is veracity, where the same entity arrives from different sources with conflicting data attributes, and the question of what happens then. That is the other pillar we need to think about when trying to implement anything towards these drafts. On security, the NPD draft says that security is governed by each vertical, and that if your non-personal data is derived from personal data, whatever laws exist under the PDP automatically apply to it as well. But there are a bunch of problems in how this can morph. One is the cascading effect: say one person has opted out; how are you going to audit that the opt-out is honored across all the downstream federated data as well? There are contractual clauses, but the thing is, if something goes wrong there, how do you prevent it? First and foremost, think of it from the customer's side: all these bills are trying to increase customer trust. Now, if something happens in three or four of your federated systems, that trust is automatically broken. Even though you have a contractual clause protecting your business, saying that whoever leaks is liable, when people trace the leak back to the source, the trust in the entire ecosystem takes a hit. 
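The veracity problem above (the same entity arriving from three sources with conflicting values for one attribute) needs some resolution policy at the incoming gate. One common, and here assumed, strategy is a per-source trust score; the sources, scores, and values below are made up for illustration.

```python
# Hypothetical trust scores a consumer might maintain per source.
source_trust = {"src_a": 0.9, "src_b": 0.6, "src_c": 0.4}

def resolve(observations):
    # observations: list of (source, value) pairs for one entity's
    # attribute; the value from the most trusted source wins.
    return max(observations, key=lambda obs: source_trust[obs[0]])[1]

obs = [("src_b", "25-34"), ("src_a", "18-24"), ("src_c", "35-44")]
print(resolve(obs))  # 18-24
```

Other resolution policies (majority vote, most recent observation) are equally plausible; the point is that someone has to build and maintain this gate, which is the investment the talk refers to.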
So contractual protection alone does not work well there. This is a technical challenge; I wouldn't say it is only a legal challenge. Another aspect is ownership: how is security ownership transferred at each hop, and who is the data owner? That becomes an additional cataloging problem: who is the owner at each point in time, who has been given access permissions, and who has the audit rights to verify that the security mechanisms are working? In an actual exchange today, if you think about data as a service or a data business, the specific business team sits one-on-one with the consumption team and jots it down: I need three layers of security, or two layers; I need IP whitelisting, I need a VPN on top of it, and you need to encrypt my data with my public key or a shared secret key. All these nuances are captured. In the NPD case, who is going to capture those nuances, how will that happen, and how many such contracts are we going to handle? Those are the various security aspects we need to think through. And then there is key management. Take a simple scenario: are we going to create one key per consumer, and will every key we create be governed by a key update and rotation policy? How does all of this pan out across the entire federated ecosystem? This is another consideration when you are implementing these drafts or thinking about how they will play out in your ecosystem; it has a major impact on the data security areas. Next, a small point. We saw the three pillars: sourcing, refining, and the data distribution system. 
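The key-management question just raised (one key per consumer, governed by a rotation policy) can be sketched with stdlib primitives. The 90-day rotation period and the key format are assumptions purely for illustration.

```python
import secrets
import time

ROTATION_SECONDS = 90 * 24 * 3600  # assumed policy: rotate every 90 days
keys = {}  # consumer -> (key, issued_at)

def get_key(consumer, now=None):
    # Return the consumer's current key, minting a fresh one when none
    # exists or the rotation window has elapsed.
    now = time.time() if now is None else now
    entry = keys.get(consumer)
    if entry is None or now - entry[1] > ROTATION_SECONDS:
        entry = (secrets.token_hex(32), now)  # fresh 256-bit key
        keys[consumer] = entry
    return entry[0]

k1 = get_key("acme", now=0)
k2 = get_key("acme", now=1000)                   # within window: same key
k3 = get_key("acme", now=ROTATION_SECONDS + 1)   # past window: rotated
print(k1 == k2, k1 == k3)  # True False
```

Even this toy version shows the scaling concern: every additional NPD consumer adds a key, a rotation schedule, and a revocation path to manage.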
You as a business could have IP in the sourcing itself. A classic example could be Yahoo Finance or Bloomberg, whose unique IP is access to very high quality finance data which they are able to sell to their customers. Second, you might provide really good insights on top of your inferences; that could be IP in your refinement layer. Or the IP could be in your delivery: you are able to deliver to, say, 200-plus or 300-plus channels, and that delivery itself is the IP; or you can have a combination of all of these. From the NPD perspective, this IP more or less becomes commoditized. In which case, what would be your USP as a business? How do you survive? The whole survival of data as a business comes into question, especially if you are a small player. And when you are creating that USP, obviously you will put some patents around it; what is the protection around those patents? That is another area. Operational effort comes in two parts. One is pipeline life-cycle monitoring. The other is support: you will have multiple support systems across your customers. One customer might have invested in Zendesk, another runs on Salesforce, another on Jira, another on Kissflow; any system which has ticket management, support SLA management, prioritization, and so on. So one issue is the support load as the number of consumers increases, and the second is, how do you federate across these systems? Is there any thought process on how the support process has to work? If there is none, the organization has to start investing in the support process it will provide to the additional consumers this bill will throw into the picture. 
Then there is a very interesting area: pricing. Data pricing is a very nuanced and complicated problem; you have a bunch of specialists working day and night on figuring out the right pricing. Generally, the value perception of data is on a one-to-one basis. Say actor A is looking at your data and actor B is looking at your data: how A perceives the value of your datasets versus how B perceives them will be completely different, so pricing is generally negotiated one-on-one. Second, there are other nuances: whether you go with a pay-as-you-go model or a subscription model, and if a subscription, what is the tenure and what are the commitments? These are classic pricing problems. Now the NPD draft says that if the pricing is not agreed upon, a person will arbitrate between the two parties to figure out the pricing. That is a complete gray area: how is that going to work, and how specialized can this person be in data across multiple verticals, whether fintech, ad tech, plain product data, weather data, or whatever? How will they arrive at the pricing, what are the criteria, and is there going to be any annexure or blueprint to help with this whole pricing exercise? The other thing in the pricing context is what you typically have in your SLA agreements, such as service credits in case of an SLA breach. And you will have your own contracts with your incoming data providers, saying they get a share of the revenue; how does the revenue you generate under the NPD flow back into those contracts? These are all additional aspects you need to think through. 
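The pay-as-you-go versus subscription distinction above can be shown as a toy pricing function. Every rate and tier here is a made-up number, purely to show the structure of the decision; nothing comes from the draft or any real contract.

```python
def price(records, model, months=1):
    if model == "payg":
        # Assumed per-record rate.
        return records * 0.002
    if model == "subscription":
        # Assumed flat monthly fee plus overage past an included quota.
        base = 500 * months
        overage = max(0, records - 1_000_000) * 0.001
        return base + overage
    raise ValueError(f"unknown model: {model}")

print(price(2_000_000, "payg"))           # 4000.0
print(price(2_000_000, "subscription"))   # 1500.0
```

Even in this toy, the "right" model flips with volume, which hints at why a generic arbitrator across fintech, ad tech, and weather data would struggle to pick prices.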
And the second thing: when the receiver resells your data, what is the pricing there? That runs into what I would call a dy-by-dx problem, a second level of arbitration over how that pricing works. So this is another thing I want to bring to the table on pricing: data pricing is a very, very complex exercise, and to be honest, many data businesses, or data-as-a-service businesses, run on low margins. This is going to be a really sensitive area when the bill comes into force, if it is not corrected upfront. You also have costs across the organization, across the value chain. I have listed a bunch of items: the data processing costs we talked about, the cost of creating additional pipelines, and the cost of additional aggregations, inferences, or derivations. Then you have the transfer cost: if you are sending a dataset out of your system, there is always a data transfer cost associated with it, and if you are bound by an SLA to send a daily dataset, that transfer cost recurs and adds up as well. Additional algorithm runs have a cost too: running any algorithm on a large dataset beyond a certain scale is going to incur a lot of cloud computing cost, or on-prem computing cost, however you operate. So if I have to bear all these additional costs, what are my incentives? I have to give this data, and to give it I am incurring additional cost; what is my incentive to do so? Am I going to get a meaningful incentive? That should partly be addressed by the pricing framework as well, because the value of the data can change down the line. 
Say in one month the data is worth so much, and three months down the line it is worth so much more; how is the incentivization going to happen, and are you, as the company providing the data, going to be a participant in that incentivization? This is something to be thought about for the broad ecosystem as a whole. The other thing is that startups are already running into multiple challenges. As we discussed in the intro slide, one of the foundational principles of the NPD draft is to create a level playing field where everybody is able to take data and achieve scale as soon as possible. On the taking side, we saw a bunch of challenges: creating new pipelines, verifying quality, putting in the gates, and handling the contractual aspects. On the giving side, we saw the processing costs, new aggregations, security considerations, and so on. Now think about a startup that might be in a completely different business, and within three months achieves a scale of data at which this draft forces them to register as a data business. They now have to spend money and resources, people's time as well as actual rupees, on compliance. Do they have those resources, or does Big Tech have them? In my opinion, the bigger companies are probably better placed to either put in these new pipelines or procure new data and process it faster. So the equation is a bit swayed towards whoever has the deeper pockets rather than towards a startup. In terms of ideation a startup may have an advantage, but in terms of actual execution they definitely don't stand to gain from this. With that, I want to move to my concluding thoughts. 
The topmost item which needs consideration from the NPD angle is the right pricing, and some blueprint or framework for how the right pricing should be arrived at. It has to have considerations across the verticals and across the data domains as well; there are two aspects, where the data domain is a horizontal aspect and the vertical is, say, your finance data, people-centric data, weather-centric data, or car-centric data. How are you going to achieve that? It is, I would say, a hard problem to solve, and something that needs to be thought about. The second, as I said, is the protection for the data provider himself. Of course, there are a myriad of laws across, say, finance data and so on. But suppose a company holds user-centric data as well as finance data, and the NPD forces them to give out an aggregation blending both, since they have user dimensions as well as finance dimensions. What liability protection is available for that data provider at the source? And then there is usage: under the PDP you need to get explicit consent from a customer for a particular usage of their data. Would you be able to enforce that usage restriction across your ecosystem? We talked about the federated tree and how the data flows out. What if some algorithm downstream tweaks the usage? What happens then? And who has the onus of audit: is it going to be a government authority, or should I spend my own rupees going out and auditing, each time, whether the whole ecosystem is using the data the right way? That is another concern which has to be addressed. 
And if you think about it, anonymization itself is a technical challenge. If you look at the latest technical literature, a truly anonymized data set is very hard to achieve. It is achievable within a contained ecosystem, within your own company. But say a person collects aggregates from your company and ten other companies and adds some intelligence on top of the sum of the parts: he may not get the exact name and identity of an individual, but he still has the digital persona of that person. So there is a problem in anonymization itself. The whole non-personal concept, combined with data democratization in the sense that all data sets from all companies are available outside, sounds a bit oxymoronic to me in terms of how to achieve both together.

And the last thing, which we talked about in terms of IP rights: if data, or its derivatives or inferences, is the USP or IP of your organization, I believe the commoditization which the NPD brings about, and the public-domain scrutiny of all the assets you hold, can pretty much nuke that IP or USP of yours. How can that be addressed for data as a business, especially from a startup perspective? A big tech company with trillions of data points and petabytes of data can probably still survive and go on to create new things, but if you are a data-oriented startup, whether you have a viable business at all comes into question.

Thank you, Satish. So now we have Chaitanya. Hi, Chaitanya, I'd like to ask you to share a bit more on the concerns regarding data anonymization.

Satish covered most of the topics that I had concerns about as well. Basically, see, right?
I mean, when the government mandates the sharing of non-personal data at an aggregate level, basically saying that it is anonymized, a couple of things come up. The first is: how much anonymization is actually anonymization? As Satish pointed out, if you have enough access to data from different points, you can always re-identify the anonymized data. Enough research has been done in that respect. I think a university in Britain also did research where they figured out that four data points are enough to identify a particular person with 95 percent accuracy. With that kind of data available, and the compute power available to different people, true anonymization, I think, is a dream. I don't think it will happen.

The second question is: okay, anonymization works at the individual level, but what about group anonymization? You can always say you are anonymizing a group of people at an aggregate level, but the group itself is not anonymized. Looking at the different things happening in our country, certain groups can lose their privacy. You can say that one group is moving from one location to another: it is anonymized in the sense that it is a group that is moving, but that group may be under stress. So there are multiple angles here, depending on how you look at it.

And the other point is that, by the mandate, as Satish pointed out, if my business is the data and I still have to give this data out at an aggregate level for public use, then I may lose my competitive edge. I have to think about that also. And what public use? That is also not defined. It is just mandated that at the aggregate level you have to provide it for public use. But what is the public use, and who gets access to it? There is no clarity on that.
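The re-identification risk both speakers describe can be made concrete with a toy sketch. The point is that even after names are stripped, a handful of quasi-identifier columns often makes most rows unique, so anyone who can link those columns to an outside source can re-identify them. All records below are fabricated for illustration.

```python
# Toy illustration: measure how many "de-identified" rows are already unique
# on just four quasi-identifiers.
from collections import Counter

# Names stripped, but four quasi-identifiers retained per record (fabricated).
records = [
    {"zip": "560001", "birth_year": 1985, "gender": "F", "car": "hatchback"},
    {"zip": "560001", "birth_year": 1985, "gender": "F", "car": "sedan"},
    {"zip": "560002", "birth_year": 1990, "gender": "M", "car": "sedan"},
    {"zip": "560003", "birth_year": 1990, "gender": "M", "car": "sedan"},
    {"zip": "560003", "birth_year": 1990, "gender": "M", "car": "sedan"},
]


def uniqueness_ratio(rows, keys):
    """Fraction of rows whose quasi-identifier combination is unique."""
    combos = Counter(tuple(r[k] for k in keys) for r in rows)
    return sum(1 for r in rows if combos[tuple(r[k] for k in keys)] == 1) / len(rows)


keys = ("zip", "birth_year", "gender", "car")
print(uniqueness_ratio(records, keys))  # 3 of 5 rows are unique -> 0.6
```

Even in this tiny made-up sample, three of five records are singled out by four attributes; at real-world scale and dimensionality, the ratio tends towards the near-total re-identifiability the research cited above reports.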
So I think these are some of my concerns, Amisha.

Thank you, thank you for that. We also have Satish with us today. So Satish, what are some of your suggestions for SMEs to comply with such regulatory bills and laws, given that the data protection bill may come into effect sooner or later?

Okay, so there is a first step. If the NPD comes into effect, India is probably the first country to have this kind of bill. Irrespective of the NPD or the data protection bill in its current format, one of the immediate action items, and this is what I have personally felt about the Indian industry, is that nobody even knows what data assets they have. So the first investment is in giving first-class citizenship to all the data assets in your company: having a clear catalog of what is lying where and who has access to it. The basics around it, a basic sanity around it, is where I would invest as the first step. After that, we could solve a bunch of things in an operational manner. Even though the NPD in its initial draft asks for anonymized and aggregated data to be released, that can be done somewhat operationally. But what is currently plaguing much of Indian industry is that we don't have any kind of sanity in how we manage our data assets. So that would be the first step if you ask me. Then get into all the additional items: how do you want to process it, what pipelines, what technologies? That is all secondary. First, identify where your data is. That in itself is the gap: if you just walk around and take stock of ten companies, I believe six may not know exactly what is lying where. That is the current situation in India. So that would be the first step, if you ask me.
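The "first step" suggested above, knowing what data assets exist, where they live, and who owns them, can be sketched as a bare-bones catalog. The field names here are my own assumptions for illustration, not a prescribed schema, and the entries are invented.

```python
# Hypothetical sketch of a minimal data-asset catalog: one record per asset,
# with location, an accountable owner, and a PII flag that drives what could
# ever be a candidate for aggregated, anonymized release.
from dataclasses import dataclass


@dataclass
class DataAsset:
    name: str
    location: str       # e.g. a warehouse table or bucket path (invented)
    owner: str          # accountable team
    contains_pii: bool  # gates any non-personal-data release


catalog = [
    DataAsset("ride_events", "s3://lake/rides/", "data-platform", True),
    DataAsset("city_heatmap", "warehouse.agg.heatmap", "analytics", False),
]

# Which assets could even be candidates for anonymized, aggregated release?
candidates = [a.name for a in catalog if not a.contains_pii]
print(candidates)  # ['city_heatmap']
```

Real deployments would use a cataloging tool rather than a list in code, but the discipline is the same: no asset exists outside the catalog, and release decisions start from the catalog, not from tribal knowledge.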