Hello, everyone. Welcome to today's talk on data trusts and data stewardship. We have with us Astha Kapoor and Siddharth from the Aapti Institute. Both of them are going to introduce us to this new concept of data stewardship: why it is being discussed so much within data governance circles, and how this entire idea of data trusts and data stewardship works. Astha?

Yes. Hi, everybody. It's so nice to be here. As Srinivas said, Siddharth and I will take you through what we call data stewardship as an idea of data governance. A couple of things before I jump into it, by way of a disclaimer. First, data stewardship is new. It's being discussed, but it remains in a certain way at the idea stage, and real-life instantiations of data stewardship in its different forms are few and far between. So do regard everything that we say as something that is largely at the idea or pilot stage. The second thing is that this is a very, very large subject, so in the interest of time we will only touch upon some examples. We will go through the logic for why data stewardship is important and then run you through some models of how data stewardship can be implemented and the different ways in which it can be structured. So that's the format of this conversation.

I'll jump into it. One of the things that we've been hearing as part of this ongoing COVID health crisis is: how do we share data better? We believe that there's a huge demonstrable value in sharing data, and health is one of those sectors. There's an example out of Switzerland called MIDATA which, as you'll see on the slide, pulls data from various sources and holds it in the MIDATA cooperative. That cooperative is inhabited by the people whose data it is.
This data can then be used for research and, over time, for the development of new health treatments, for instance. MIDATA right now is working on multiple sclerosis, on pollen allergies, and on some kind of obesity treatments. The way it works is that you, as an individual, sign up for the MIDATA platform, which is where your personal data is stored. You are now part of a cooperative, which takes collective decisions on where the data is going to be used and what kind of governance mechanisms will be allowed for certain kinds of data. Whether there's a value attached to it, et cetera, will also be decided by the cooperative. Only individuals have access to personal data, whereas when it goes out for research, it's all aggregated and anonymized. To us, the MIDATA cooperative, the way it's structured, is a data steward: it facilitates the sharing of data and it also affords individuals certain rights over their data sharing. It has a huge amount of public value because it allows data to be shared for research and for the development of treatments over time. It's a very small project, but it is something that exists in the real world. And now, particularly in light of the COVID-19 health crisis, it is becoming more and more relevant.

Another example is in the space of mobility. This is an example that we found in Seattle; it's governed by the University of Washington. The centerpiece, which is the transport data collaborative, pulls data from mobility service providers in the city. It's anonymized data that is shared by your Uber, for instance. It's largely for two-wheelers right now, but soon to be expanded. So Uber and other two-wheeler providers share data with the transport data collaborative, which aggregates it, reports on it, and shares it back with the city.
That helps the city make better decisions around transportation. It also shares the data with researchers and with innovators, who can be startups that help build better mobility options, and it allows individuals, through the Seattle app, to discover their transit options better. So, all in all, the transport data collaborative converts the existing data into synthetic data to ensure privacy, returns the data to the city in aggregated form with insights, gives the data to startups, innovators, and researchers, and allows individuals to discover transport options much better. So again, the TDC in this case is a data steward. You'll see the difference between the two cases: in the MIDATA case, it's personal data, or personally identifiable information (PII), whereas in the TDC it can be both PII and non-PII. These are just two examples of where we think the steward is useful. Its value is pretty demonstrable: it generates value across stakeholders and, in the shape that it exists right now, it safeguards the privacy rights of individuals and communities.

But obviously we've seen that there isn't a lot of data sharing, and that happens for multiple reasons. Like I said, there's privacy, obviously. The societal value of data, and the fact that this value can be heightened by sharing, is not necessarily understood. The trust in the process of sharing is extremely low. Right now, data is largely shared through data sharing agreements, which are often pretty dubious. So there is no mechanism through which data can be shared in a reliable, trustworthy, responsible manner, and data stewardship may be part of that answer. The incentive to share is low.
The government and the private sector don't necessarily trust each other with each other's data. Governments are moving towards mandating data sharing, which obviously affects the private sector, and they don't trust that process at all. Data lies in silos, in companies and within the government. It's disaggregated, it's incomplete, and it's often incompatible across different systems. That's another huge disincentive to share data; there's a huge amount of friction in doing so. And again, the overall problem is that there's no framework through which data can be shared safely, actively, and responsibly.

But there's a lot of conversation happening now. There's the Srikrishna Committee, which talks about community data and about data as a natural resource that needs to be shared. Then there's the conversation on account aggregators, which we'll touch on a little more later; it's a regulation covering data handling and practices for sharing individuals' financial data with service providers. There's the National Data Sharing and Accessibility Policy that came out in 2012, and there's more conversation around it. And then there's the mention of community data in the national e-commerce policy, and the committee on non-personal data that was set up last year. So the government in India is talking increasingly about data sharing, and it's happening in different ways. There is an effort to evolve certain kinds of governance mechanisms, and certain definitions of what data can and cannot be shared and what protections exist for data that is shared. All of this is being considered right now. That's why it's an important moment to ask: if we are going to share data, what are the responsible mechanisms to do so? And, coming back to that, we think data stewardship is part of that answer.
So, just jumping into two important conversations that the Government of India is having. One is on non-personal data. Now, it's not that India is the only place where non-personal data is being discussed. The EU, in its latest data strategy, has also said that there is a need to make data available so that there can be innovation and a greater public benefit. This is also at the ideation stage, because the strategy came out, I guess, in April or maybe earlier. There is a sense there that data is, in some cases, a matter of sovereignty, but also that we need to create data spaces which will be managed, possibly stewarded, and to regard data as a collaborative resource. It may even push towards the idea that data, because it's generated by us and by our communities, is a sort of commons and should be shared. India is similarly, like I mentioned, going through this process of defining non-personal data. We've got the mention of community data, and we are also thinking about public benefit. Questions of privacy and anonymization, governance, and accountability still remain unanswered. We're hoping that the non-personal data committee will be able to answer some of these questions. But one thing is for certain: data is going to be increasingly shared.

Then, as I mentioned, there are the account aggregators. These are consent managers for financial data. They've been in the realm of imagination since 2016. The RBI has also granted licenses, I want to say 11 or so, to companies that are developing account aggregators. Basically, these are consent managers that move data from those who have it to those who need it, and they are a data-blind exchange layer. But again, we believe that there's space for account aggregators to be more than just a data exchange layer.
They can be broader consent managers that actually work on behalf of individuals, provide some kind of advisory service on what data should and should not be shared, and take on a more expanded role as a steward, looking to actively collaborate and share data in a privacy-minded manner.

So that's the broad context of data stewardship. But before anything else, I think it's important to define it. A data steward sits in the middle of a set of relationships. Data principals are the users, individuals, and communities; they provide data to the steward, and they need to be protected and their rights safeguarded. Data requesters are people who might want data. For instance, in the MIDATA example, the researchers want data to push their thinking, so they are the data requesters. In the case of account aggregators, it's the banks from whom you may need services. Data fiduciaries, or holders, are people who have the data. Uber has my mobility data, so it is a data fiduciary, and fiduciaries provide the data generated by users and companies so that it can be protected. The data steward sits in the middle of this relationship. It works with users and fiduciaries to aggregate data, makes sure that it's safe for sharing, and pushes it out.

These are some of the responsibilities of a data steward. It actively seeks to collaborate. It is the manager of the data. It defines the usability guidelines so that different types of data are shareable and in the same standard. And, most significantly, it intermediates on behalf of the people and communities whose data it is stewarding, so it becomes a point of negotiation. As in the case of MIDATA, it allows people the space to express their own desires or aspirations with regard to data: I want to share it for this. I do not want to share it for that.
It should be shared for developing vaccines, but it should not be shared for something else that I don't believe in. So it allows for that intermediation between us and the technology companies that want our data. It's a question of control and collaboration, which is important.

Why do it? Why do data sharing through a steward? Well, it is a way of ensuring trust in the data sharing process. Fundamentally, users can exercise more control. You can control the purpose for which data is being shared. Privacy is possible. Otherwise, users invariably have no control over how their data is being shared. It's not that data is not being shared; it's just not being done in a collaborative, transparent manner. It's being done through data sharing agreements between companies, which are beyond scrutiny and beyond control. Also, as I mentioned, it allows for some kind of collective bargaining. If all of us pool our data with a steward, that steward now has the ability to negotiate on our behalf. So for data principals, which are people like us, it's a huge tool of empowerment. For data requesters too, for instance if Facebook or whoever else wants our data, it is useful, because there's reliability and quality of data. It's efficient, because the steward is doing the job of managing the data in a certain way. It also reduces the cost of accessing data, because you could go to a data steward that holds, I don't know, my asthma data, or some other specific kind of data, and request it. So the cost of data collection for requesters is also reduced.

We think that there are four possible kinds of data stewards. This is not an exhaustive list, but we believe that this framework manages to capture the types of roles the steward can play and also the myriad stakeholders who can manage access and use in a certain way.
So I'll explain this framework. On the vertical axis you have who decides how the data is accessed and used: I can do it myself, or it can be a nominee or a steward, or legally defined trustees. We'll come to each of these models in detail right after. At the bottom, you have the role of the steward. The steward can just be a data flow mechanism, a pass-through, like the account aggregators I mentioned, which just move data from point A to point B as a data exchange layer. You can have somebody who both stocks and flows the data: it is a holder, it aggregates the data and brings it together, and it also allows people to share it, or shares it on their behalf. And then there's the data stock, which is just a data wallet or a personal data store where I can keep my data and then share it as I feel comfortable. So the data stock is a very individualized experience, as is the data flow in a certain way, whereas the data stock and flow is something that allows for collaboration, collection, and insights.

One thing to keep in mind is that not all data stewardship models are equal; they are defined by why they exist. Creation of societal value is a big, big thing. If you are like MIDATA, you are a non-profit and you want to make sure that, say, pollen allergy does not exist in the world, so you are setting up a data store for that. The values of a MIDATA are going to be very different from those of, say, a bunch of shipping companies that have come together to share their data in order to generate commercial value from it. Their responsibilities are totally different.
And then the final one is to empower individuals, which is something like a personal data store that allows me to store my data, to control it, and to monetize it in ways that I think are feasible or important, or that I care about. Again, these are not mutually exclusive categories, but the intent is something that defines the governance model more deeply. Now I'm going to hand it over to Siddharth to take you through the different models of stewardship that we mentioned here, to go into them and explain what they are and what they're not. The other thing that I forgot to mention is that the data type is also important, because in most legislation, sensitive personal data, personally identifiable information, and non-PII are all dealt with differently. That is another thing that defines the kind of data stewardship model you would choose. You would want a highly protective model for sensitive personal information like health data, whereas you may not want a highly protective model for something that is of public value and is anonymized and non-personal, such as mobility data. So that's just something to keep in mind as well. Siddharth, over to you.

Yeah, so, speaking about the models of stewardship now. When we say models, we're not talking about different contexts of governance, or even purposes or use cases for that matter. We're talking about models in terms of the broad design of data sharing. In one of the earlier slides, Astha described how data might be a stock or a flow or both, and also where the authority, the permissions, or the decision-making flows from. That's also linked to user consent. Speaking just about the design and structure of how stewardship happens, the first model that we identified was the data trust.
Now, the data trust here is an intermediary, not in the legal sense of an intermediary with all the baggage that comes with it, but rather a body that operates on fiduciary principles to share and store data for a certain agreed purpose. The idea of a data trust is also, in many cases, that it is created or set up for a specific purpose, so the principles or policies guiding its activities would follow from that purpose. A data trust set up for medical purposes, for example, might be designed differently from one that's designed for mobility data and research on that. Those policies would be tailored to the purpose of that data trust. But suffice it to say that the data trust operates on fiduciary principles and takes the decisions: it has control over the decision to share this data further, as opposed to the user having direct control over it or it being directed by any other third party. We can switch to the next one.

Now, the next is data collaboratives. In our analysis, this is analogous to what are generally known as data exchanges. The idea is that data is pooled by third parties. These third parties might come together through just a horizontal contract. Or it might be a collaborative in the sense that one of the organizations has data and the other doesn't but is focused on using data in certain ways, and these two organizations collaborate. Or it might be a closed contract where a few parties get together to share data with each other for some kind of collective benefit. Both kinds of examples exist in the world, and their purposes can cover varied kinds of either public or private benefit.
For example, in terms of a data exchange, I might be a company engaged in a certain sector of business, and I find it useful to share my data with a few peers who are engaged in that same business, and we get some kind of collective benefit from that arrangement. On the other hand, a collaborative might also be a situation where a municipal entity is working with a private company, so some kind of public-private partnership, where the municipal entity uses some kind of aggregated data either for policy making or just for executing its functions better. That kind of arrangement would also, in its larger design, fall into a data collaborative or a data exchange.

Now, the account aggregator model is something we came across mainly because it has a lot of existing policy tools in India as it is. It also has a prominent example in Europe in the X-Road system, and in India we have the account aggregators. The RBI released regulations on account aggregators that set them up in this design as well, which is to say that you have a central steward aggregating sharing controls. It aggregates the information on which users have consented to share their data with which third parties. In the Indian example, these third parties are known as financial information users and financial information providers. As the names suggest, the providers are the ones who have the data, and the financial information users are the data requesters, as explained earlier. Sitting at the center of this is the account aggregator, controlling the protocols on sharing and consent. So the idea is that the end users, which is the customers, interact with the account aggregator and have consent controls over what data is shared with which third party.
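As an editorial illustration, the consent flow described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the names `ConsentRecord`, `AccountAggregator`, `request_consent`, `respond`, and `may_share` are hypothetical and are not taken from the RBI regulations, X-Road, or any real account aggregator API. It only models the idea that the aggregator records who has consented to share what with whom, and never holds the data itself.

```python
from dataclasses import dataclass


@dataclass
class ConsentRecord:
    """One request by a financial information user, awaiting the customer's answer."""
    customer: str
    provider: str    # financial information provider: the party that holds the data
    requester: str   # financial information user: the party that wants the data
    data_type: str
    status: str = "pending"  # becomes "approved" or "rejected"


class AccountAggregator:
    """Hypothetical consent ledger; it stores permissions, not the underlying data."""

    def __init__(self):
        self._records = {}

    def request_consent(self, customer, provider, requester, data_type):
        # Requests come from the third party, not from the customer.
        consent_id = len(self._records) + 1
        self._records[consent_id] = ConsentRecord(customer, provider, requester, data_type)
        return consent_id

    def respond(self, consent_id, customer, approve):
        # The customer's only control in this sketch is to approve or reject.
        record = self._records[consent_id]
        if record.customer != customer:
            raise PermissionError("only the customer concerned can answer this request")
        record.status = "approved" if approve else "rejected"

    def may_share(self, provider, requester, data_type):
        # A provider may send data only if a matching approved consent exists.
        return any(
            r.status == "approved"
            and (r.provider, r.requester, r.data_type) == (provider, requester, data_type)
            for r in self._records.values()
        )


aggregator = AccountAggregator()
cid = aggregator.request_consent("customer_1", "bank_a", "insurer_b", "account_statement")
aggregator.respond(cid, "customer_1", approve=True)
print(aggregator.may_share("bank_a", "insurer_b", "account_statement"))
```

In this sketch the provider would call `may_share` before sending anything, which is one way to keep the aggregator "data blind": it answers questions about consent without ever seeing the financial data.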
So if I have some data with a certain service provider, let's say my telecom service provider or my financial service provider, that is pertinent to me, I might choose to let them share that data with another third party, let's say an insurer or a medical health provider, or somewhere where I need to get a license using certain information that I need to provide. I could choose to share data with this third party through the account aggregator. The idea is also that the account aggregator provides protocols and uniform standards for data sharing amongst these third parties. But there are certain differences of design. The account aggregator model in India, for example, only requires direct contracts between the steward and the third parties, whereas the European model, X-Road, requires contracts between all the third parties individually. So if I wanted, in that earlier example, to share my financial data from one third party to another, those two third parties would need to have a contract between them. There are also other differences. In the Indian example, the user is only given the option of consent and nothing else, and the requests are made by the third parties and not by the user. In some ways that is also the case in the X-Road example, except that there is less user control there and it's more directly a common platform for data sharing. And the last one is personal data stores.
Personal data stores are the model which gives users the most control. The sharing controls are, to a certain extent, tailored by the user, and the policies are designed so that third parties approach the steward to access data for whatever purpose they may need, and these purposes are made available to the user as well. So users have greater control over what aspects of the data are shared and for how long, and over consent revocation, which adds a lot to user agency. A few of these models exist, such as the Solid project, which I think started in Europe, and I think something like Apple Wallet would loosely fall into this as well, where users basically have control over what data sets are shared and how. So those are the four models. Our report from earlier this year, which hopefully we should share later, is called Understanding Data Stewardship, and this is where we've gone into depth on all these models, along with an analysis of examples of each model.

We outline certain early principles, or an early understanding of the principles, on which data stewardship should function, and we've mapped them out here. The first is to have some kind of participation and representation, which is what I touched on in terms of user agency and user control over their data. The idea of data stewardship is, of course, to move away from a model of data sharing where users have no expression or participation in how they are affected by it. It speaks to how users relate to their data in very real, very consequential ways, and how it's shared should be a bit more easily controlled, or at least a bit more visible, if nothing else. And that
relatedly speaks to representation as well, which is to say that it's all well and good to have that information, but you need to have some kind of control over it too. These models have different ways of making that work. For example, a data trust may not give you as much direct agency or direct control as a personal data store, but you can design a data trust so that it works for certain specific purposes that are aligned with the interests of users. A bit more on the governance side are accountability and transparency, which I would say apply across models. Transparency is linked to accountability, but it is just users knowing what's happening with their data. Accountability also speaks to modes of grievance redress. It matters in terms of how the steward is regulated, or the laws and policies governing it, and how you enforce those policies. A question that we did not address in a major way in our report, but are thinking through now, is how you design the governance principles for these entities. I think that speaks to larger principles of data protection and larger principles of accountability in data sharing, and to other debates as well on privacy, participation, and public accountability. So these are questions that we intend to address going forward. And this is what I've been talking about here in terms of user participation and in terms of structure and design.

Something I could touch on a bit more now is conflict of interest. One of the arguments that we have received when we've spoken about this idea of data stewardship is that you're just introducing another point of failure in the system. I think it is a fair criticism to say that you have all these companies that are sharing data in unscrupulous ways;
what's to say that the data steward is not going to do the same? Now, obviously, the idea of the steward is to reorganize this pattern of data sharing, to give some expression to users and provide more strength to user rights. Data protection principles already attempt to do that to some extent. Of course, in the Indian context we don't have them codified yet in terms of what's enforceable, but the principles of data protection are very heavily linked to user expression, whether in terms of deletion of personal data, modification, or just pure transparency measures such as notification: being notified of the entities with whom your data is being shared, what's being shared, why it's being shared, and what exactly is being done with it. All of those things speak to transparency, and these are interests that need to be aligned with the purpose of the steward, because the steward needs to be set up for a purpose that has some public benefit. That's the entire idea behind using data for good, for a public good in that sense. In order to do that, you make sure that the business model and the financial model for the steward are not designed in a way that compromises the intent of stewardship. That's a very important structural design and governance question that always needs to be kept in mind.

Lastly, on sector and data type specific stewards. This becomes an important question legally and structurally, because you may want different kinds of regulations for different sectors, and different kinds of regulations based on what kind of data the steward is handling. For example, if it's handling
highly sensitive personal information, you may want to regulate that differently from data collected from, let's say, speed guns in public places; that data might be viewed differently from sensitive personal information. The idea is to have stewards that are geared towards specific purposes.

And just to end: we are very much at the exploration stage. We are, as Siddharth mentioned, starting to ask and answer some of these questions, and also finding examples of instantiations of different kinds of data stewardship models. We've got a database that we keep adding to and will soon make public. We are also writing a couple of papers: one on the basis of data sharing itself, that is, how do we think about data? Is it a commons, is it a matter of sovereignty, is it just contractual? And another, as I mentioned, on conflict of interest, where we are trying to think about what revenue models can exist for data stewards such that they are sustainable but not necessarily making money off the data that they are stewarding and are responsible for. These are also questions that we would love to discuss with the group, and we'll hopefully present this work in the future. Thank you.

Thank you, Astha. I think there are a couple of questions already coming in. If you are on YouTube or watching live elsewhere, please ask your questions in the chat and we will point them to Astha. To start with, I think there's one question from Dvij. Dvij, I think I can unmute you. Do you want to ask your question directly to Astha? You're allowed to speak now.

Sure, thanks, and thanks for that presentation. I just wanted to ask if there are any standardization efforts around these protocols. I know a lot of this thinking is very preliminary right now, but has anyone tried to think about how these protocols could look from the technical side, and what kind of data entry
fields might exist, even in the case of, say, simply financial data sharing or data stewardship protocols?

From what we've learned so far, they don't necessarily exist. There's no collective protocol or standard at all. Any model that's experimenting with this kind of data sharing through an intermediary or steward is developing its own protocols around it. Part of our effort in this research is to actually align some of these protocols and build up what we are calling the principles of a good steward, so that it's something that anybody who's building a data steward can subscribe to. This will include technological protocols as well.

If I could just say something about standardization: I think we're at the point of standardizing the models themselves right now, or even the structures around those models, the kind of institutional structure that carries out these functions. That is where we're at in terms of fixing on a common idea. Data trusts are discussed a lot, we have account aggregators here, and we have data collaboratives, which are a significant idea in their own right. These overlap with each other to different extents in terms of their function, how they're governed, and just how prevalent they are. To take a stab at the technical aspect, many of these initiatives are small projects, either pilots or just entrepreneurial efforts, and the nature of such efforts is also that they work in a proprietary manner. You have your own setup, which works even for data for good, even for data that helps some kind of public purpose; you have your own protocols and controls for it when it's a startup or an entrepreneurial effort. So that's the stage for a lot of it. Speaking from a kind
of, I don't know, top-down approach of standardization, I think we're at the stage of agreeing on the concepts and principles right now, is what I would say. Hi, so, sorry, do you want to add more, Aastha? No, no, we can move on to the next one. Okay, cool. So we have one question from an anonymous attendee, who's asking: are you looking at data trusts the same way as Wendy Hall's report, thinking about data trusts as a way to develop artificial intelligence? I think this is in relation to the UK's parliamentary report, if I'm not wrong. Yeah. As far as data trusts are concerned... sorry, Aastha, you want to take this? Okay. I think as far as data trusts are concerned, we are looking at that for sure. In terms of just giving it context, that was a useful point to see how data trusts are operating, and I think it is extremely relevant, because the idea of data trusts there is, yes, to reorganize data sharing for purposes of public good. Now I think it's important to note that there are different political forces at play here, because you have all kinds of regulations and all kinds of political efforts trying to regulate data. I think it's important to look at that context in terms of things like the digital tax, and this is contextualized in a larger politics of data protection and data sharing in that sense. But as far as the idea of data trusts is concerned, I think it is definitely a useful formulation of that idea, which you'll find in those UK parliamentary reports. Based on that, the Open Data Institute has also done and is doing some good research on how to look at data trusts. And like I said, it's an evolving conversation, so now they're looking at how to build, so to speak, data institutions. I'm not completely clear about what the divergence between those two things is, but the idea is that you have an institutional design that is able to
reorganize data sharing for public good. But yeah, just to answer the question in short: it is relevant, and I think we may diverge from data trusts in terms of that report, because the idea of a data trust in that report is coming very much from a top-down legal perspective, or a state perspective in that sense. What we look at as a data trust is a governance layer based on fiduciary principles. Yeah, and just to add to that, I think it's important to note that a lot of these data stewardship models, whether collaboratives, exchanges, trusts, or even account aggregators, have a bunch of different interpretations coming from different parts of the world, and part of our effort has been to align and harmonize some of those definitions. So the report that was mentioned actually relies on a lot of existing work and is an effort to find ways in which we can align the taxonomy through a study of the use cases. Wendy Hall's work; ODI's work; the work of scholars such as Sylvie Delacroix and Neil Lawrence, who've written a fabulous paper called "Bottom-up Data Trusts", which is really, really useful; the work of Sean McDonald; and also the work of people like Bianca Wylie, who have been at the forefront of citizen resistance against a certain model of data trust that was being imposed by Sidewalk Labs, have all helped inform our understanding of what we think are possible principles for responsible data sharing. There was one question on how you consider Aadhaar here: do you think it is an aggregator or a personal data store? Would Aadhaar fit the personal data store? But I guess there are issues of governance or control that UIDAI provides to an individual, right? Yeah, based on our definition, we would not regard it as a personal data store. To us it is a digital
identity, as it is for a lot of people, but yeah, we would not regard it as a personal data store. Siddharth, what do you think? In terms of a personal data store, I think what's more analogous to the idea is DigiLocker, which gives the user more control over how that data is used and shared, though I'm not fully informed of how the project works. Aadhaar is more of an identification project, from what I'm seeing, and I don't think it's linked inherently to the idea of data stewardship the way I see it. Okay, I think that's it. We have one last question, from Priti, who's asking us: what is the threat of big tech when it comes to data sharing, and how can one circumvent their influence? Is this something that can go into the thinking on data governance in policy making? I think the question is rather who's going to build these data models, right? Does it have to be the private sector, especially the big five (Google, Microsoft, Apple, Facebook, or Amazon), or can it be the government, or can it be, say, even a citizen collective? We haven't seen any citizen-led data trust, have we? Most of the examples that you put up were pushed by the private sector or by the government. Yeah, and maybe Siddharth and I disagree on this, but I think that there is space enough in data sharing right now for different kinds of trusts, or not trusts, stewards, for different kinds of purposes. So it's possible that you go to a health data steward like, as we mentioned, MIDATA, that's an entrepreneurial effort or a nonprofit working on research. You may also choose to go to a data steward that, and this may be a problematic assertion, allows you to commodify your data and draw value from it. You may also go to a data steward that is willing to negotiate on your behalf. And I think that the different functions of the
steward will define its governance and who builds it. I think that there is a huge amount of space, and something that we would love to explore and see, which is already being discussed, is the idea of these data cooperatives, these data unions: stewards that are community built and community led in a certain way and negotiate on your behalf, the way a citizen representative would. Yeah, go ahead. Sorry, finish. No, I was just going to say that I would not believe that big tech should be building this, because there's obviously a huge amount of conflict of interest that cannot be resolved. Also, as we mentioned, a lot of the questions around sustainability are important to this. How the data steward is made sustainable is important, and we've been exploring ideas such as stewarding of the commons in the way that Elinor Ostrom talks about: if data is indeed considered a commons, is that a way to think about sustainability? There's another possibility of thinking of data as labor, and then considering data unions as part of that. And then finally, if you think of data as an asset, then maybe the role of financial intermediaries, and the way that they charge commissions on transactions, could be something that can be considered. So, as we've been saying, there's a question of the kind of sector and what kind of questions you're answering; the specific purpose of the data steward is also something that came up on this. Yeah, so on the question of whether big tech should do it, or who's going to do this: I agree that there's a problematic element of a conflict of interest, which I think needs to be sorted out every time you're setting up a data steward. But I think there is an overlap between these principles of governance and the
specific task of data sharing and how to do it. Now, a lot of data sharing projects are being taken up, in terms of either open data or otherwise, by big tech companies. But I suppose what's required for a robust steward is the political will behind it to create it, which is why you have a lot of discourse in places like the UK and Canada looking at these structures of data governance. I think there is a certain political will for it there as well, because it is very much aligned with an anti-big-tech or an anti-US-centric interest, which I think is relevant in this debate. So, to answer that question shortly: yes, there is space for that kind of perspective to be relevant. But I also think that there is the important question of incentivizing data sharing, and when you're trying to walk that fine line between incentivizing data sharing and empowering users, you need to show value and you need to show benefits accruing to these users in order to make it a viable suggestion. Venkat had a question. Hey, thanks for the presentation, it was very insightful. I swim in this data world, and when I look at it from that perspective, the implications of this trust are actually pretty profound. So, related to that, a couple of questions. One: where is this conversation happening today, whether in India or elsewhere, and who are the people at the table as of today? Second, when I look at it, again from the trenches, most markets seem to be oligopolies, with market power. If the existing players, for example, let's say Ola and Uber, get together, price fixing is the simplest thing that will happen. So conflict of interest, transparency, and some of these things are very hard problems, given the monetary incentive that exists today for coordination and sharing of the data. Any thoughts on the first one, where is this discussion happening, because a
lot of details have to be worked out? And second, this whole focus on conflict of interest: I heard something, but I worry a lot, because you can obfuscate at any number of different levels. Yeah, I could take the question on where the conversation is happening. I think you are right, it's largely driven by a certain kind of West as of now, but we see that increasingly the idea of upending data governance in a way that empowers people and gives them more control is something that we've noticed happening in different parts of the world, whether it's India, South Asia, or Africa. So I think it's important to start to think of these bottom-up mechanisms of data governance, which is an opportunity that data stewardship provides, and to offer alternatives to how we've been thinking about data: whose it is, who controls it, how it's shared, and who draws value from it. So I think the broad-basing of the conversation is increasingly happening. The models are something that we have to consider for ourselves, wherever that is, and they need to be in some ways localized and contextualized, so that the governance can be something that makes sense at a lower level. And just on the conflict of interest, I absolutely echo your concerns. I think these are questions that need to be considered more deeply. There need to be a few things that come together: alternative revenue models, a governance check, and a political check, to make sure that the data steward remains true to its main function, which is to represent individuals or communities and intermediate on their behalf. So I absolutely agree that it is a huge concern, and, towards what Siddharth mentioned earlier, it does have the risk of just creating another sort of concentration
of power that has access to your data. So, just to reiterate, that's something that we are very acutely aware of and considering deeply as well. Did you want to add to that? The only thing I have to add is on that question of who has a seat at the table. I think that's a very relevant question. It speaks to a lot of what we've been discussing and what we're looking at right now in our research, in terms of the governance and institutional design for data sharing. You can have any of these models, and all of them may work, all of them may fail, but I think what really determines success or failure for any kind of data sharing project is the intent, and where the power lies: where the political will is coming from, where the controls over the data sharing are coming from, where the design of how data is shared is coming from, and who gets to design that and who gets to say that. I think these are all very fundamental questions before even entering the specific question of data sharing. For that matter, even for data collection as it stands today, these are relevant questions. And it's something we've tried to caveat, because what we've spoken about today are models of data sharing in terms of the design, or rather the patterns, of sharing. What we always try to caveat is that these need to be ensconced in solid governance models and solid principles: respecting privacy, respecting user rights, respecting these principles of conflict of interest and public interest. And I think it speaks to what the previous question brought up as well, that these are all institutional and governance questions that always need to be kept in mind while designing these processes. I think I get to ask you one last question. So
often these ideas are discussed in silos, right? We're talking about how to store data, how to share data, and how to govern it. But if you look at the institutional hierarchy, say, take the case of the account aggregators: RBI has been a closed institution for a really long time. It doesn't matter who the government is; that's for their own independence, and that's the way the institution has run. So when you look at implementing any of these models at any of the existing organizations, you tend to take on the baggage of the existing organization's policies, right? It's not like you can suddenly undo them; you inherit them. So where would one look at placing these inherent issues when something like this is designed? I think TRAI is trying to work on something similar with telephone data, right? They're trying to share some of these datasets or become a data trust in itself. TRAI, on the other hand, is a really good, transparent institution, which actually brings in your sort of experience. So when you look at RBI versus TRAI, both of them are fine institutions in their own respect, but they have different kinds of policies. So what would you think happens when a different data model comes into play in different ecosystems? Do you think the ecosystem will have those issues because the governance structure has those issues? Absolutely. I think there's no escaping the baggage that the institutions come with, which is why we're trying to flag these institutional questions as being relevant. I think the idea behind bringing data stewardship into the public realm, or at least the most benign kind of motivation for it, is to reorganize how data is shared and then to tackle the various problems that already exist today regarding power concentration and regarding privacy, and how those two problems are inherently linked, to be honest. And sure, there are other
problems, like policy making and so forth, but there is no escaping the institutional baggage that we have, whether nationally or transnationally. It would be remiss to ignore that and just bring in what we see as this ideal, all-loose-ends-tied-up model, which really isn't the case. I think you need to start from the questions you are asking, Srinivas. You need to start from the institutional nature of how data is already being used and shared, and you need to modify that to bring in these principles of fiduciary responsibility, of user controls over sharing, of users being informed of what's being done with their data, purpose limitation, the right to delete your data, duration controls, and such. These are things that are, I don't know, very basic, and we need a data protection law in the first place before we can start really meaningfully talking about designing data sharing systems, right? Because what is a data sharing system without data protection principles? It doesn't mean a lot. So I don't think there's a question of ignoring existing institutions. I think we need to look at how they operate and modify them. Aastha? Yeah, I just wanted to add that I absolutely agree with what Siddharth is saying, and also with your question. We get into this knowing that it is a difficult place, with a huge amount of friction and a huge amount of baggage. But also, to what Siddharth was saying, I think the starting point is a huge amount of user information and awareness on what is possible and where the controls are, and then, as we were saying, building up certain kinds of bottom-up governance mechanisms where they are possible. That might sound a little bit utopian at this moment, but a potential data protection law does give users certain kinds of rights, and those rights can be a way of solidifying more
equitable, more responsible methods of data sharing, which then over time lead to the kind of empowerment that we are imagining. Thank you for that. It was a lovely session. I hope anyone else who has questions can reach out to you separately. But on that note, I had this comment: I think there will never be a perfect data store, whether it's a trust, a personal data store, or whatever you call it. The institutional politics, conflicts of interest, and accountability issues will remain, and we will have to strive to ensure that they are fixed over time. Thank you, Aastha. Thank you, Siddharth, for joining us today. We will end the session now. Thank you so much. Thank you.