I'm Parminder Jeet Singh. I am with IT for Change, which is an NGO based in Bangalore, and we work at the intersection of digital and social change, largely from an equity and social justice viewpoint, and it's important that this background be stated. I'm a part of the committee which came out with this report. First, a disclaimer: though I will try to give the background of the kind of ideas which are the reason things went the way they did, they are of course either my ideas or my interpretation of what we were probably thinking. Another thing: I think time is short, so I'll be forthright. I understand a lot of people here actually believe that the NPD report should not have been there. They're ready to talk about whether we can have less pain or more pain, but behind it there's also a feeling that if we had no pain altogether, that would have been a good thing. That also comes out a lot from the Hasgeek survey, which was very well done. I got very important points from it, but something like two-thirds of people said, can we just not have NPD governance? So in a way there's no point in discussing downstream issues about details when there is either a lack of clarity or a lack of agreement on upstream issues. So let's go in the hierarchy of how a meeting of minds actually takes place, or how arguments and discussions can take place. First, the question: why do we have to govern NPD at all? That's where I would like you to start thinking now. So let's start with things we may agree on. We may agree that data is the most important resource of a new industrial era. I call it post-industrial; digital is different from industry. But whatever you call it, there is a distinct new era which is different from the industrial age. We are on the threshold of it, and in it data is the single most important resource. I also know that computing resources are important and that intelligence derived from data is important, but people recognize that data is the single most important resource, and on this I think most people here would agree.
People still think that personal data is the most important resource. It's not true, and I'm happy that Sushnath talked a lot about "people like us" and the data of people like us. It is a lot of data about people of certain categories which provides the real insights. AI is very happy to treat data in an anonymized manner as long as the basic patterns remain. So perhaps the most valuable data today is non-personal data. Personal data is also valuable, because finally, when I have intelligence, I have to target that intelligence at specific people, and in that last phase I again need personal data. Now, if non-personal data is one of the most important resources of the world, the question is whether it should be governed at all, and whom it benefits to govern it. Let's look at whom it benefits to govern it and whom it benefits not to govern it at all. I think people may agree with me up to this point. The next stage is: would people agree that there is a hugely inequitable distribution of data power? It is a very lopsided distribution, even more lopsided than the distribution of economic and social power used to be in the industrial age. I don't know whether people would agree with it, but that's another assumption: that there is a very acute imbalance in the distribution of data power between nations, whereby China and the US are becoming two poles which will suck up a lot, if not all, of AI power, and the world would revolve around them. Other countries will outsource their intelligence to these two powers. This is why Europe is acting, and I will come to what Europe is doing about non-personal data; they are also very much hovering around the kind of things which this committee has done. So there's a lot of imbalance between nations. But perhaps you are not so much into geopolitics and geoeconomics; there is also a lot of imbalance between people and big platforms.
We all, privately or when we are not discussing data governance, are very happy to discuss the power of Twitter to be able to switch off an elected president, which, however we may feel about Trump, does not instinctively look nice for somebody to be able to do; or the power of Amazon; or the kind of problems Australia is having with Google over whether content makers should be paid something or not. So I hope people agree with me that there is a huge imbalance of data power as well. Now, if there's an imbalance, the question is whether we should govern that data, whether we should allocate value, decide who takes the benefit of data in given circumstances, and this is what we have tried to do. We all agree that a person is a subject of data, but this committee goes on to the community as a collective subject of data. Again, this was talked about: even if I don't contribute data, there are people like me who would have contributed data, and things can come back to me. And though I agree with Sushna that a lot has to be done about authorities who can protect against harm etc., before that it was important to establish a community's rights in its data, and that's what this report has done. But I must also remind you that this concept has a history. If you look at the document which was the mandate of this committee, it already had the concept of community data. And if that concept is already there, it is already implied that there is some data to which the community has a right. That right is of two kinds: one is to prevent harm, and the second is to decide the benefits of it, and that is where high-value data sets etc. come in: how would the community control its data? So that's the background. If you want to really understand what data sharing is, think of data infrastructures.
Now, this issue was talked about: the word "data trust" has disappeared; it has been replaced by "data infrastructure". If you want to understand what this committee is trying to do, it is trying to develop data infrastructures. In the physical terms of the industrial economy, there used to be certain economic activities which were infrastructures, and on top of them businesses used to run. Think of a road: a road is developed in a community and public manner. Data-wise also there's a lot of thinking along these lines, and this committee believes that certain data is of infrastructure quality. And this is one major proposition I want to put in front of you: if there is a big imbalance of data power, the imbalance is owing to a few global corporations who have captured the infrastructure of data. In the industrial age, if somebody captured roads and ports and banks, the power that company would have would be disproportionate, and that's why banking, ports and roads were separated out as infrastructural activities, on top of which different businesses would do their work. In that same sense, there are certain kinds of data which are of an infrastructural kind. If you want to understand what a high-value data set is: a high-value data set is data which is considered, under certain circumstances, to have an infrastructure value which is necessary for all businesses in a category to be able to function properly. And if that infrastructure is captured by an early starter, as the Googles and Amazons and Twitters and Facebooks of the world have done, then it becomes an unequal situation. Almost all digital problems, if I may make that bigger claim, are owing largely to the fact that data infrastructures are captured by early starters, and therefore you can never be equal. So this is the imbalance which the committee is trying to correct. Now I'll come to what has really changed. It is very difficult to try to correct an imbalance around an artifact which is considered the
central value of a digital economy or of a whole society. A lot of people have loosely spoken of proprietary rights over data. If you really go and read the literature, there's hardly any real legal right on data; they don't exist. I have studied for weeks what the existing set of IP rights over data is; they really don't exist in any strong manner. So the proposal was that certain kinds of data, not all data, not even all non-personal data, but certain kinds of data which is sourced from outside, from people, from communities, from public infrastructure and so on, would be called community data, and the community would be considered to have a right over it, and would thereby establish very high degrees of duty of harm protection and also be able to use it when necessary for building data infrastructures. First of all, to make that kind of shift it was very important to establish a clear right, because unless you establish a right, all this talk about voluntary data sharing and data altruism simply doesn't work. Let's be real: we are on hard-law ground when we are talking about the most valuable resource of this society. Now, while we were establishing this right, and that's the big change in the new version, the worry was: would it happen that the data legislation comes, everybody can ask everybody else for data, and the old system completely collapses? So while we wanted to establish a default right, the way data sharing would take place would be very gradual, and that gradualness has been established through the high-value data set: only such data as is recognized and consented to by the Non-Personal Data Authority as being of an infrastructural nature would be asked for. Even the kind of data which can be asked for, not the grounds but rather the fields etc., will be determined. I won't ask for a whole database; I will only ask for certain transportation data which is considered to be of an infrastructural nature, and I keep on using that
term "infrastructure" because that's key to understanding what this committee is trying to do. Data of an infrastructural nature can be taken, and we will go one by one, and that's why the committee says: as a pilot, start with health data. When we do health data, we will learn a lot of things and things will change. So we established a default right, that the community has that right. It is necessary because this is hard-core political economy. This is my thing: unless you really say, "No, sorry, this kind of data is not your property, it is community property", unless you make that a legal property right kind of a thing with the community, you can't really ask for data sharing. But once you have established it, the right only kicks in through a very constrained process. The process has been defined, and its guidelines will be made clearer and clearer, so there's a lot of certainty for the startups and other businesses. The last thing I want to say is why startups would then be unhappy. Before that, I would also say that there is a Data Act which will come in Europe this year, 2021, which is going to talk about compulsory data sharing and about a B2B right to access data. These are words which are already there in the data strategy, which says that these are the things the Data Act will come up with: a right to access in a B2B context. They are going to come up with it, and I think we have done a little better than what they have done. I don't have the time to go into how they do not use a community rights framework and in which ways our committee's report is much better. But there is also the Digital Markets Act, the draft of which is out, and I don't know whether people have noticed that it gives business users of platforms a right to data about them. That is, Amazon traders who have their collective data about, say, diapers, or fountain pens; makers of fountain pens have data on Amazon individually, and then
aggregated data can be taken back by them; they can ask for it. So there again they have established an entity right. This committee did not establish rights for business users; it only did so for a community of natural persons. So that's the difference. And why, therefore, would startups not immediately be happy? That's my last point; I'll just take 30 seconds. I can understand that currently startups exist in a business environment which is determined by a certain Silicon Valley model, and most businesses in this area have a six-to-eight-month vision. Those are the hard facts of this business. They've developed their whole business model within that Silicon Valley model, whether it is to be the next Google or to sell out in three years to Google, or to Facebook, or to Twitter, all the biggies, or to Baidu or any of those Alibabas. So the problem is that it's very difficult for them to see, within the six-or-eight-month window in which they have to sell their business model to the venture funder, how things will change. This is normal; it's been called a business dilemma, a social choice theory problem. If you ask individuals, "Would you develop infrastructure?", they say, "We are very happy with the way we are." So somebody else has to come and say, "No, no, give up a part of the land of all the houses, because the road has to be made here." So it's like developing infrastructure, and I very much understand why current startups may be a little nervous about it, because they've only projected themselves in this mainstream global model. As things stand, I don't think they will change too much, because the whole process of implementing this committee's report is to go one by one. Even in one sector where they go, for example health, they would go for very few data sets which are supposed to be compulsorily shared. We will learn. But finally, like we had in the industrial age, certain data would be infrastructural, which will be commons,
and other data would be proprietary, on which competitive advantage will be built. So my vision is: have an infrastructure of data, and on top of it have intelligent services, which are your competitive advantage, and don't make data holding your competitive advantage. Things will start shifting in that direction, and this is what this committee has tried to do. I've stuck to the overview; specifics I can answer. But I think I should stop now, because if people don't get this vision of what is being attempted, it's very difficult to discuss downstream issues.

Hi Parminder, thanks for the clarification. I think the big-picture understanding is really critical. So how are you seeing the roadmap forward? Will there be another v3 of this, or is the committee going to hand off and then look for more implementation learning before you revisit? What is the current thinking about the roadmap going forward from the committee's perspective? Scribble Data is a data processor. If the NPD implementation actually goes through, it is likely that our systems will be the ones generating the metadata that will be submitted, and we will be building the HVDs on our kind of platform. We are a data preparation company for machine learning. So I come at this not so much from a value-judgment point of view; I know that this is a massive battle, but I'm looking at it from the practical implementation challenges. When my clients ask me, they're essentially asking: in what time frame is this going to be implemented, and in what sequence, so that they can appropriately build up the capacity in the organization? I can talk a little bit about the enterprise data environment, but what I really wanted to take away from Parminder is this: I think we have seen the journey until now; how does the committee see the way forward?

What happens going forward is generally a decision of the committee and then also of the government people who are associated. However, the
very point of having consultations is that they would be taken into account. It's very unlikely that no point we received was worth tinkering with the old version at all; that's unlikely. So the whole point of taking views is to see those views against what is in there, and something or other may just get changed. I described the larger thing, which I don't think will get changed, because it's part of this committee's mindset, it's the government's mindset, it is developing countries' mindset; this is a bigger thing. Though, and you can ask me offline, I wasn't very sure whether the kind of metadata you are talking about as your core work is the metadata we are asking to be shared. It's probably not affected in that manner. The metadata registry is just a kind of public disclosure, like the ones public listed companies make about their basic businesses, what divisions of business they have, what they do, etc. It is a very skeletal description of the nature of your data processing: "I collect data from hospitals and I do this, I have a back-office thing", that kind of a thing. The actual data sharing is a different thing, and whether your kind of work would come into that is a different matter. So they're two different things, but we can talk about that.

Yeah, I think that helps. We do see that there will be multiple levels of this. One major concern that I had was this: if you look at enterprise data environments, they also generate a lot of data sets, but our experience has been that a lot of that is very messy internally; it's a very people-heavy and very laborious activity. One of the things I was reading in v1 as well as v2 is a lot of tangible costs. I can literally point out that you will have to hire this kind of people and that kind of people, and track this kind of information, and so on. But in the economic architecture, what I'm not finding is as many upsides or incentive for
the data business to be a good data citizen. You can force it through the lines of the law, saying "thou shalt do this", but I think the intent of the committee is to go beyond that and have good data citizens in this ecosystem, and that incentive alignment is something that concerns me. All of this, both the data engineering and the sheer costs of all of these things, and the availability of people: there are lots of resource limitations in organizations. So the first point is the larger economic architecture. There is still a little bit of uncertainty; I expect that it will resolve over a period of time. I just wanted some thoughts about how you see the economy developing. The other thought that came to our mind is that, from everything we know about data usage, it is absolutely necessary to have a guarantor of the data. Somebody should sign on the dotted line saying that this data is complete, it has integrity, and so on. HVD could in that way be a potential template, if the committee decides to expand, because what we saw in the trustee is ultimately somebody who will sign on the dotted line and say this data is good for the following reasons, and without the guarantor there's no way we saw that the economy would work. That was the second thought that we had. The third thing came up during our conversation: you talked in terms of infrastructure, and I wondered if you would go to the next step and extend the notion of neutrality to this space as well. Is there an analogy of net neutrality in the data space that you are seeing? Let me hold here; there are a bunch of other interesting thoughts as well that I'm hoping to hear from others.

Sure, thank you so much, Venkat. So now I'd just like to invite Saranya to give her thoughts on the conversation so far, as well as quick questions. Over to you, Saranya.

Sure, yeah. Okay, absolutely. Parminder, please. Saranya, then you come
Sorry, I mean, whichever way. So, I understand, Venkat. Cost-wise, we have only tried to lay down the principle. Once we agree that we also have to see the larger society: when we want to talk about Twitter and Google issues, we say, "No, no, those powers are bad", but when we come to businesses, they say, "Don't do anything." If a financial audit, which is a very expensive enterprise, is required because finance is something very valuable, then so is data. We are already talking about AI audits, and we can't have AI audits before we have data audits. So we cannot do without certain processes, and in this committee we have only recognized the principle. We have not even gone one step down to say what it would be, and the report keeps on saying it will be light, it will be light. So whether you agree at the level of principle is the only point here; all the rest will come as we work to maximize our digital industry's interest. Second, I love the point that infrastructure is like roads: you can't just make roads that are full of potholes and not up to proper engineering. When governments or trustees get into this business, it is like the industrial age, when governments also became experts at making roads and making ports. Earlier, governments were only for security, with your danda and your horse, only doing security, but they became experts in certain kinds of productive activity. And I completely agree that they should become experts not only in taking data and making an infrastructure; the infrastructure should only provide, as infrastructure, data which is more or less guaranteed to be good data. This definitely should be a layer added to the job of the trustees, or rather to the responsibilities of the trustees. I agree with it. Neutrality in the data space: very much so. There is a draft Data Governance Act of the EU, right now under consultation, and we are going to submit our comments, which develops...
Europe is developing this concept of European data spaces. They're very much like data infrastructures, actually; they just don't use the community rights framework. In a separate discussion I will tell you why this community rights framework is better than the eminent domain framework, where the government takes the data without a community right. The community rights framework gives a lot of inherent potential for participative design, by the very fact that it is a community rights framework and the government does not just take the data. But there is no time for that. And yes, they talk about data sharing services being neutral services, and the government will only stamp them as neutral services if they comply with a certain list. So it's a beautiful Data Governance Act; if you search for it, you will get it. I think these kinds of things should be required for any data sharing service, and the data trustee is supposed to be a data sharing service, so it should have a very long list of compliance on neutrality.

Good evening, everyone. First off, thank you to the Hasgeek team for setting up this conversation. Most importantly, I like the framing of this panel, which is the motivations and the intentions behind the drafting of the framework itself. And needless to say, thank you to Parminder for sharing his thoughts. I think whichever way you look at it, the panel was given an imposing mandate of squaring a huge circle. They've drafted one framework, have taken in feedback and applied it, and have come out with the next one. So I would like to say at the outset: kudos to the work being done, and unfortunately I think there's still a long road ahead.
I think some aspects have already been touched upon. I will reiterate them from a slightly different perspective and defer to the other panelists for their thoughts. One part on which I think there should be some sense of agreement is, as has been said, the limitation of consent. I think that is an understood limitation of this grander, larger concept, and I won't get into it in detail. But even in seeking consent to anonymize data, I think a certain granularity is required, because there is a difference between consent to anonymization of the data for use by the organization or company with which you're interacting; consent for use by the company to give to third parties, for purposes which are in the normal course of business; and then, at a third level, additional consent to anonymization of data, as contemplated within the non-personal data framework, to become a high-value data set to be shared with a third party which currently nobody can really name. Which brings me to the second point. If consent is the bedrock on which we as a country have agreed, as far as personal data sharing as well as non-personal data sharing is concerned, then a slightly allied concept, the choice of the consumer, seems to be neutralized by the concept of non-personal data. For example, consumer choices today are not fungible, as in "which is the cheapest biscuit, I will go with it". Understandably this is a concept which has not permeated as much in a young economy, or, okay, not a young economy, but still a developing economy like India. But as choices develop, there are now choices made on several factors. For example, I choose to work with Flipkart because, let's say, I believe in their labour practices, I believe in their inclusivity track record. You can already see that in, let's say, the choices around many brands, and we can get into the ethics of that in a bit. But the point is that choices are more 360-degree than just fungible, in which case I, as a
consumer, am choosing to interact with or buy my product from this particular company, and in choosing, I am voting with my purchase as to the company that I want to benefit from my purchase. In which case, saying that this company's data, the data of value which is given to this company, can then be shared with any other company is not something I'm interested in. I'm very conscious of the choice that I'm making; I'm conscious of the company that I'm enriching, not just with my dollar but also with my data. And that choice of the consumer is getting neutralized to a large extent in this concept of "the consumer has participated, now leave them out of the conversation, and we can trade between the organizations". I agree that there is the idea of the consumer being a data beneficiary who can then frame or influence their argument through the data trustee. However, one can easily argue that in a country like India we have a lot more nuance to get into as to how data beneficiaries will organize themselves, within and across groups, to be able to adequately influence the movement of that data. Will there be veto power? Will it be by majority? Will any one person be allowed to come in? I think obviously there's a lot to be taken on, and it will be taken on a case-by-case basis. So that is the second piece; it will come back a little bit later. The third piece, to what Venkat had mentioned, is definitely the absence of incentives to be an enthusiastic participant. One part is organizing the data properly; another is even ensuring the data is up to date. Even if we assume that the requester is somebody who is registered in India (we can get into the foreign and Indian aspect and the coloring of it and where the line is; I'm sure that's a conversation that has been part of many forums, so I won't get into it at this point), a data requester, in going and
making a demand of a company to expend resources to update its data set, is doing so, we are saying, on the basis of the community data aspect, and I think that's a fairly nebulous framing on which any organization can go and demand an expenditure of resources. On this point, I think it will be interesting to see how we enforce this beyond a stick method, beyond "you will be fined for it" and so on and so forth, because I think that is really the core of what we are trying to achieve here. If there is a way in which we can positively encourage organizations to update and work on their data sets without necessarily penalizing them, I think we've cracked a huge part of the problem, and can address even existing open data sets. Case in point: RBI's data sets are unavailable. As a co-founder of an industry body which works in the payments data space, I can tell you that the data is something which we publish every month, and by that I mean it's only something that we collate and reframe from what is available in government data sets. I can tell you that a particular data set of intense value for me from the RBI has not been updated past October 2020. I'm not finding fault in that; RBI has no doubt been very caught up with things. But if there is a positive framework with which we can encourage RBI to update its data sets, I think that does a large part of the job, without saying that only if Amazon gives all of its data sets do we crack the non-personal data problem. Which brings me to my second-last point. I think the road infrastructure example that Parminder quoted is spot on, and I think it's a wonderful framing which gives a great understanding of what the piece is. In expecting good public infrastructure which everybody in this country can benefit from, the expectation of a citizen is not that everybody takes a piece of gravel from their house and goes and participates in the road building, right? The job I've done is, as a
tax-paying citizen, I've paid my taxes and I've done my job in that. As long as every company is a tax-paying organization, as long as the income tax laws of this country have in their wisdom made choices about how foreign-owned entities will pay tax to this country, as long as DIPP, the erstwhile DIPP, has in its wisdom taken a choice to allow 100 percent investment in the technology sector, how are we well placed, over and above them paying tax, hiring, and complying with whatever foreign investment compliances there are, such as hiring so many percent Indian professionals and so on, to go above that and "penalize" them? I put that in quotes because I know that may not be a fair framing. But it is over and above that, saying: you still have more blood to shed, maybe to make up for the origin of your organization. Or rather, because that's not a distinction made in the report at all, it is not fair to read into the report that it was only aimed at foreign entities. It is very much expected that a large Indian entity will also be subject to this, and by "large" this is only a threshold of data. As an Indian company which has paid taxes and hired Indians, am I forced to also do more? Admittedly, there is a contour of "without causing harm to the data custodian", but in the absence of defining that harm, when the very intent is that this non-personal data from the data custodian is going to enrich somebody else, how much more can we, on a principled basis, demand from the companies? Which brings me to the last point in all of this. I fully recognize that there's a larger problem statement which is being addressed, and as has been said, there is an appreciation of where we are as a country and an appreciation of where we are geopolitically. This is not something on which we can possibly move in inches or miles through conversations, and I fully appreciate that the committee has been
open to feedback, and I genuinely respect that. It's not lip service; so much participation by the committee members is not to be taken lightly. My only suggestion would be this: considering that this is a mammoth task that we're all trying to perfect in one shot, which is a little difficult to do, is there a possibility that we can first perfect this attempt for, let's say, government data sets or government-funded institutes, ensure that the framework works perfectly in those contexts, and then extend it to private entities, where all of these conversations will become a little clearer? There is so much data, whether with regard to government-funded data sets or with regard to various forms of financial inclusion; the State Bank of India, if we have to get very nitpicky, where so much taxpayer money has gone in. The demands on that organization are possibly on a slightly stronger footing, to be able to say: "Hey, you are funded by us, the tax-paying citizens of the country; you have an obligation to create and clean up data sets and make them available to the larger country. Do that." I'm not saying that that obligation doesn't involve the private companies, but setting it up in that framework will itself be a very important and big task, and then everybody will have a little better feeling when we try to extend it to the private economy. And with that I will stop. Thank you.

Hey, thanks. I'm Arvind. I'm a co-founder of nityo learn. We are in the human learning space, not the machine learning space; we train people. It's a B2B SaaS company. I first want to thank Parminder, and I would probably start off by saying that it's not only about where we disagree. The first part of what you said, that non-personal data is huge, that it's actually larger than personal data in the value that it has: completely agreed. And I also think a regulation for this is necessary. So the basic question of whether
if you're looking for agreement here, I think there is agreement, at least from my side, and I'm talking here more as a citizen than from a very startup-focused founder's view. I also agree with the overall point you made about larger companies: in a way it is winner-takes-all. Once the first one collects data, the data becomes a monopoly, and once you have a data monopoly in one area, you can create data monopolies across multiple areas. In some way this has to be handled, or you need to create enough regulation to help even antitrust frameworks start working here. One of the biggest concerns many startups have is how much of this will start impacting smaller companies and smaller startups. So the thresholds are very important, and specifying a little more about the size of the threshold, the values of the thresholds, would help: is it really only very large companies that we are talking about, where this would be mandated? I think that would help continue the discussion with many people, because there is clearly value in saying that only a firm above a certain size has valuable data that you can use for any of your so-called data infrastructure. If the data collected is not sufficient in volume, then that data is probably not going to be very useful either. So maybe set a threshold to say that this whole thing is applicable only at a certain size, and that size is large enough that small players do not have to worry about the red tape it would cause, since red tape would stifle innovation here. That is one of the points I had when I was looking through this. The second point: the high-value data that we are talking about is very focused on how you use that value for creating more public good. I think there should also be more points around how this non-personal data can actually be misused and how we can prevent
that. So the point I made about audits and so on was along those lines. There are ways of using data which are not illegal today but which take us in the wrong direction: a large amount of data used to, let's say, break democracy in some way, help change election results one way or the other, or create silos and echo chambers that give people a certain set of views, because you understand from the data who is inclined in what way. So we now understand there are ways data can actually be misused as well. Then there is the point I was making about the experiment we are running, with a lot of people actually giving away that data, so that a lot of intricate personal information is available to the big techs. Controlling this means not only asking them to share that information but also looking at what kind of information, what non-personal data, can be collected and how it can be put to use. The regulation should look at that aspect as well. Those are the main points. There were other points, maybe touched on by others, and I don't want to repeat them.
Yeah. First of all, it's a very nice session; thanks to Hasgeek for organizing it. It clearly explained why the NPD framework is required. In fact this session is very helpful, because if you go through the framework itself you may not get the details of why it is needed. I liked the analogy of the roads that you gave, Parminder. I have three questions, basically. The first is with respect to the privacy education and engineering side. If you see the national AI strategy paper, it clearly mentions that academia and industry need to work together to offer various courses for engineering students, so that people build expertise and experience on the AI and machine learning side. On the privacy side, in my opinion, we don't have any such initiatives. In fact, in the US only one university, Carnegie Mellon University,
offers a privacy engineering master's. In Europe there is one online course from Switzerland. But if you look at Indian universities, right now I don't see any offering privacy engineering at the master's or even the graduate level. So that should be one of the first things: before framing any personal data framework or non-personal data framework, the government has to work with academia and industry to build expertise and experience on the privacy engineering side as well. The second one is data consent management. Right now, if I visit any website, it says: hey, we are storing some cookies, do you opt in or opt out, and so on. If I am in a hurry, I will not read their privacy policy; I will blindly accept it. There are disadvantages to doing that. Nowadays people may not have the time, and there is a lack of education as well. Mobile penetration in India is very high, and people visit all kinds of sites without knowing what data the companies are taking. On top of that, if you say that all companies have to collect consent for anonymized data sets as well, that will be too much, and people may not be in a position to judge what companies are collecting and how they are using it. So my first point is basically that we have to educate. If a bank asks for some data, it will typically explain why it is collecting it. Education is important in the same way, so that people come to know what they are contributing and how their particular data is being utilized. The third point is like what Saranya mentioned. I know you may say that first of all we have to agree in principle about all the
regulations you're talking about, but it would be good to showcase some reference implementations. Assume that we have agreed on the principles; then we will come to know the typical challenges in processing these large amounts of data, like what she mentioned about data cleaning or providing the data sets in a regular manner. So then we'll come to know the real challenges. I completely agree that we should have some framework, and I'm glad that from India we are driving some of these non-personal data frameworks, which are much stricter than in other countries as well. That's all from my side. Thank you.
Thank you so much for organizing this event. I'm Subho, from a company called Appknox, which is into application security. I want to advocate for privacy; that's what I stand for. I have a few comments and a few questions. One thing I did not hear anywhere in this whole talk is the word democracy: democratizing the data itself. I understand that this is not personal data but non-personal data, but still the user should own the data, and it is the user who owns the data, so he has every right to say: no, I don't want to share my data. Then what happens? Does that get reflected back in your high-value data set? Does it go ahead and erase that? What is the action going forward? Is there a way for me, as a citizen, to remove anything which is related to me? That's something I am a little concerned about. The second thing I have in mind is the data security part, mostly because this is community data, and it was rightly said that this is the most powerful data
because even if it is not personal data, if I have community data I can basically create models to mimic a community, one fake model of a community, because I have all the data points there. So how are you going to control access for those who access this data, which is there for other companies to use? Is there a fair usage policy that governs access to that kind of data? That is another very important question I have in mind. The third thing I want to bring into the picture is open governance, because a lot of data is provided by our government itself. I understand that finance, for instance, has a lot of open data which we already have, but there are other realms of government that are open too; for example, law has a lot of open data. So it would be better for the government to first come up with high-value data sets of non-personal data, and then step two would be to ask private companies to follow the same. The fourth is a very important point. Since I'm running a business out of India that operates across the whole world, what will be the boundary of the data which I, as a private company, should be sharing under this regime? Today I may not have demarcated data, where this set of data is aggregated in India and that set is aggregated from outside India. We do not have those kinds of demarcations. Have those kinds of demarcations been spoken about or thought about? That's another very important thing I can think of as a startup. And one last question that I am concerned about is data sharing. I think this NPD policy which is coming up should also be taken to the world stage, because let's say if the European Union comes up with one NPD
and if India comes up with another NPD, and let's say Singapore comes up with yet another NPD, it becomes a real hassle for a small startup, or even for a bigger organization, to make sure that we align our data with whatever inputs you want or the other countries want. So I think there have to be two umbrellas: one global, and if India can push for that it would be an amazing thing; and of course one for India, a smaller umbrella under the bigger global umbrella. So these are my thoughts and questions.
Yeah. On Saranya's question about consent: I argued in the committee also that consent for anonymization is already a part of PDP. There has been a difference of views, and I leave it at that, but my belief is that because anonymization is itself processing, and processing is covered under PDP, it's already there. In the committee also I argued that. You are talking about a more granular consent, which is a different matter, and that's a big problem: once the lines have been drawn, that I give consent for only this company to use the data and not to give it to another company or to a trustee, this creates an issue. I think we will come to it. I am myself a believer that the provenance of data should be recorded permanently in some way, even without identifying it personally. It's a very complex thing, how the consent structures remain once the identification has been removed, and I'm ready to explore that new system, because I believe in the rights of both the people and the community whose data it was. I have been a votary of an international arrangement where the source country and the source community would have rights wherever the data finally resides, and that again needs the provenance tracing you are talking about. Whether by veto power or by majority, how communities will take those decisions is a complex thing. But such things happen in the world; communities already take many decisions about
forests, about water; these are not unknown. That's why community resource frameworks have been talked about in the committee. We can always keep on saying that it can't happen; the problem is what trade-off is involved. On existing open data, there are two different questions here. We cannot say that the problem is over here, so why don't we do that first. Why doesn't the government improve primary education before it goes to higher education? Why doesn't it improve technical education first? It doesn't work like that. Governments do whatever is important. Open data is a different kind of problem, and if somebody says that data power lies in RBI data and not in Amazon data, that's not true: 90% of data power right now is in data with the platforms. Governments have to do something about it, whether or not they are doing great work elsewhere. Then there is the point that you've already paid taxes, so why should you contribute to a data infrastructure. Data infrastructure is a very different kind of infrastructure. Even if the government had all the money, it could not make data infrastructures. You can make new roads, you can make ports, but you can't make data infrastructures, because first of all the data would not be sold by the Googles and the Amazons, and second, they need a real-time flow of data. There are many complexities; I've written about them in a paper, and I can share it. Even if you had the billions of dollars which you can use to make roads, you can't actually make data infrastructure. So that's a problem out there. Then, what Arvindan has said: I agree the threshold is very important. I am myself a great votary of the view that it is the big data sets which are the infrastructural ones, and we shouldn't be going after smaller companies. The threshold should really be a good one; I agree with that point. On how it can be misused: absolutely a great point. Jurisprudence and guidelines on how communities work have to evolve. There's a big potential for misuse.
Again, there's nothing in the principle which helps that misuse, but how data can be misused is a downstream issue, and a very important one. A duty of harm has been put in, and all these high standards of duty of harm require a community right to data, because if you don't have a community right, the harm thresholds are very low: it's the normal rule that you can't cause harm. But if it is my data, the harm thresholds become very high. Say it is data about South Indians living in a certain area and their food habits. Establishing the principle that the data is theirs raises the harm-threshold jurisprudence very high, and within that a lot of things can be worked out. So the principle has been laid down, and it will always remain. I just want to contribute one important point on why, and this is for Saranya also. This is a very new kind of asset. It is intelligence about me, it is intelligence about people like me, and there's a moral, philosophical argument that intelligence about me and people like us should be somehow controlled by me. So this is not that old-time...
I completely agree with you, so my point is...
Yeah, sorry. Physical property, we can say something. Saranya, you think...?
No, no, please go ahead, I didn't mean to...
So this is not that old-fashioned physical property thing, that my house has been taken for your road. It is intelligence about me, intelligence about people like me, whether it is a certain sex category or a cultural category. This is just laying down a principle, and then equity and fairness will come into the kinds of issues you are describing, that people who have done business also need their entitlements. Democracy... and what Srinivasan said about education, I completely agree. On reference implementations: that's why they have said, first do health data. On health data there is globally much more agreement that there are community rights to it. There's a WHO framework, there are other frameworks, and on COVID a new one
had come. So that becomes a reference, because there's greater agreement about the public interest aspect of health data. Once we do that, we will be much better able to do education and transportation, and that's why we are trying to say: use a pilot, and that's going to be the reference. But unless you make those laws and principles, nothing moves. Subho, you spoke of democracy, but just one thing: what we don't always understand about democracy is that it is also a community right, not only an individual right. I vote in a certain manner, but my constituency votes in a certain manner, and I accept that representative. I cannot keep on saying, no, no, I want to have a say in what this guy says in Parliament. Many things in democracy are bunched into community and collective decisions. So democracy is both an individual right and a community right. On open data, the question which Saranya raised, we have already spoken. Then you spoke of outside data and inside data, and here the whole question of provenance comes in. I think it is like finances, where we are now forced to keep many documents: I have to keep my logistics documents, my supply documents. It will become complex, because if data is underpinning the value system, then its auditing and its requirements will be complex, and therefore there will be some complexity, but inside data and outside data are at least separable in the same way. Then, harmonization globally. That's the last point I want to make. We have always been norm takers; we never made norms. There's a recent paper by the EU which very clearly says: we want to be norm givers, not norm takers, on digital policy. So India can also get up and say: I will not wait for a GDPR to come and then do something; I will give the norms first. We will negotiate internationally. We at IT for Change are working with the South Centre, which is trying to develop a global model law that is pro-South and pro-developing countries. Then negotiations will take place and improve it, but you cannot
wait for the US or the EU to give you the global format. You put the ball out first and negotiate it, and that's what we're trying to do this time.
So I wanted to quickly... Thanks so much, Parminder. Sorry, you were saying something. Do you want to very quickly respond to that, and then we'll wrap the session up?
Absolutely. Parminder's points are well taken. A couple of caveats. First of all, I completely take the point that it's not a matter of saying that unless and until we do it for X, we can't do it for Y. I take your point on board. I think it's a question of, just as you mentioned, seeing an implementation with all of its nuances. And considering that there is still a negotiation, that there is strong feedback on the basic principle of whether or not the data can be demanded from private institutions on the same footing as it can be demanded from a government institution, the thought process was to get it implemented through a government institution, where there is large consensus on the rights and the demands to be made. That was the thought process. But having said that, inasmuch as India is taking this bull by the horns and saying, let's be the front runners in this conversation, there is complete value in that. The only thing is this: in 3.6 of the framework, if I'm not mistaken, you enumerate the interests pursuant to which this framework is pursued, which are directed towards innovation and towards benefits. So far, and maybe this is putting too fine a point on things, I'm not able to see a very clear example of the economic benefit to the community. I'm happy to take this offline, and possibly there is some context to be added there. Nonetheless, I'm very glad that this conversation is moving forward, and thank you so much for your responses.
Thanks. It's a small thing, just a comment. I just wanted to define democracy. Democracy doesn't always mean
deleting data. Currently, let's say my father has not signed up on any of the social platforms, and today he signs up for one, and now his data is available across all the data aggregators and everything. So democracy would be a democracy in choosing whether to share it or not. That's one small thing which I wanted to say. The other thing which I have been hearing, and just wanted to comment on: the European Union is coming out with a non-personal data act and everything, but as far as I know, and I'm not sure what's happening internally, till today they still couldn't categorize what non-personal data is. That's something which is still up for debate and a lot of speculation. But I'm happy that we are also doing this. So yeah, just two comments which I wanted to bring in and do the points