Hi everyone, thanks for joining. This is conversation number seven, I think, in our series of conversations with data practitioners. Our series is called Making Data Science Work, which is all about what it takes to get data working in production, with all of its messiness and complexity. Today's conversation, very interestingly, focuses on something you might otherwise have considered to be at the periphery, but which is becoming extremely relevant very quickly: privacy law and data product design. In a few months, India will be passing its GDPR-equivalent law, the PDP, or Personal Data Protection Bill. Data science teams obviously tend to be the most extensive users of data, and their work impacts organizations at a huge scale, so this starts to raise a number of different questions.

We're very glad to have with us today Sreenidhi Srinivasan and Shivangi Nadkarni. Let me introduce them. Sreenidhi is a senior associate at Ikigai Law, working in the policy and regulatory teams. She advises clients on compliance with the IT Act and India's upcoming PDP law, and also on policy positions and strategies in data governance, cloud computing, and cybersecurity. She previously worked at the Vidhi Centre for Legal Policy, advising government ministries and regulators on issues related to the digital economy. She also spent a year as research faculty at Georgia Tech's Scheller College of Business, and she holds a Master's in Law from Columbia Law School, where she focused her program on privacy and technology. So, very, very relevant. And Shivangi is the co-founder and CEO of Arrka.
That's A-R-R-K-A. She has over 24 years of experience in the domains of information security and risk management, e-commerce, and networking across multiple geographies. She has earlier worked with organizations like Wipro and Sify, handling a variety of roles and heading different lines of business. She was instrumental in setting up the first licensed certifying authority in India, in association with VeriSign. She authored the privacy book of knowledge for DSCI for its DCPP certification program, and she's a regular speaker, writer, and faculty member at various forums and programs. Shivangi has a bachelor's degree in electrical and electronics engineering from BITS Pilani and a master's degree in management from IIM Calcutta.

So we're about to get started, and I'd just like to give a shout-out to Hasgeek and The Fifth Elephant, under whose umbrella we are able to bring you this series of conversations. And of course, Venkata and I are from Scribble Data. We are an ML feature store company focused on helping businesses do feature engineering in a streamlined, accelerated, and trustworthy way. With that, I'd love to get started. Welcome, everybody.

So, a question to Sreenidhi and Shivangi to kick us off (both of you are on mute, by the way): why is this conversation timely right now? PDP has been brewing for a while. Why are we feeling the fire now? What's changed?

Sure. Thank you so much, Indra. Thank you for organizing this and for having me here. Let me just start by saying we've been having these conversations on the data protection law in different silos, especially in policy circles. I'm very happy to be here with the data science and product community, which will actually have to implement a lot of these changes. Like you said, PDP has been brewing for a couple of years. The version of the bill that is currently in Parliament was introduced last December, and it was referred to a parliamentary committee.
So it's currently being vetted by a parliamentary committee. The committee was actually expected to submit its recommendations around the monsoon session, but that got somewhat sidelined because of COVID. Things seem to have picked up pace again, though. They've held depositions with industry associations and other private players, and a couple of government folks have also deposed before the committee on what needs to change: what the key concerns with the bill are, what it should look like, how it will impact the industry, all of those things. So things have picked up steam on that front and do seem to be getting back on track now.

The bill is a fairly comprehensive, detailed data protection law, which is expected to significantly affect the way businesses collect and handle data about individuals. If it were to pass in the near future, which is likely to happen this year or the next, there will be a very short uptake period for businesses to adjust and align themselves to the law. And the obligations are fairly comprehensive, involving both structural and behavioral changes. There may be changes needed to your UX, to product design, to back-end tech architecture, to legal and internal processes, and to the involvement of different teams and players within an organization. So there will be a whole bunch of organizational changes required, and it's very important to start having these conversations now about what needs to change, so that businesses can be better prepared.

Just as a reference point, the GDPR, which was notified in 2016, gave companies two years to comply and came into force only in May 2018. But even then it seemed like that wasn't enough lead time. A lot of businesses seemed to be grappling, the week before it was going to be enforced, with questions like: what is the universe of
changes that we are required to make internally? Can our product models even fly under the new GDPR regime? All of these are questions that require a fair degree of lead time and thought. I know Shivangi can speak to the compliance perspective on how long it typically takes to comply; even an information security program takes a while, and privacy strikes at the very core of data processing operations. So I'll stop there so that Shivangi can also jump in.

Thanks, Sreenidhi, and thanks, Indrayudh and Venkata, for organizing this and for inviting me. Adding to what Sreenidhi said, I think it is already late. Honestly, I don't think we have the luxury of time anymore. Sure, the Indian bill is around the corner. Let's say it doesn't happen for another year or so. Even then, every other country has a law, and most organizations maybe started the journey with GDPR, because GDPR sort of shook everybody, thanks to the penalties. We are the last country in the world of our size and relevance in the global scheme of things to actually bring out our law.

Given that, Arrka focuses on actual implementation, so we have a fair idea of the kind of challenges that are faced on the ground, and we've seen that curve play out. First of all, there is a need for what I call conceptual clarity and contextual familiarity with privacy. Most people first think it's a security problem. Then they have a bit of a moment where they realize that it's not a security problem; it is a completely different problem by itself. Then there is a phase where people realize:
Hey, this is hitting at the core. Sometimes it goes into an existential sort of thing, because organizations are oriented towards never questioning the kind of data they collect and process. They're only oriented towards asking, what do I do with this data to maximize value out of it? Now here comes a paradigm that forces people to question: Why are we collecting this data? Are we really using it for what it is supposed to be used for? And are we deleting it after that? These are problems nobody has thought about, and this goes back to the core. So the initial pushback that we typically get from business teams starts from that wrong perception, and then you have this big journey of trying to start complying.

The other misconception is that this is just a law: my legal team will give me a checklist, I just have to tick off maybe 25 different things, and I'm done. It is not. It hits at the core, which means that you go back to questioning, reprogramming, and resetting some very core processes, and the very core infrastructure and applications that drive your business today. That means it is not something that will get done in a couple of years. Sure, if you're a small company of 100, 200, maybe 500 people, you will probably do it in a year or so. But any large organization or enterprise is looking at a three-to-four-year horizon. So let's get that perspective; I think it is very important for most of us to understand.

I gave a talk a couple of months back on the architectural implications of some of the provisions in the PDP law, and I was accused of, well, stopping all data science and all data work, because the provisions are so broad. "Forget me", for example, applies to pretty much
all the data that you have in your organization. If you don't even know where the data is, you've got a challenge to begin with. Before we jump in further, Sreenidhi, can you briefly introduce the top-level entities and concepts in the law, so that the audience is on the same page as the rest of us? Just a quick overview. It's not a substitute for a full reading of the law, just enough to be able to follow.

Sure. I think the starting point is: who does this even apply to? Essentially, it extends to all businesses that collect, store, use, or process personal data. "Process" is a fairly widely defined term; it includes anything to do with data. Personal data is data about individuals, or data that can be traced back to individuals. So it will include obvious identifiers like names, addresses, phone numbers, and contact details, but also indirect identifiers like unique device IDs or IP addresses, or even inferences that can be traced back to an individual. Say user X likes to buy a certain brand of shoes: that is inferred personal data, which is also within the scope of this law. So any business that handles this kind of data is covered, which is practically every business that interacts with a user, and there is a range of obligations associated with data use.

On the entities that are implicated, I won't go into too much detail, because that might sidestep the whole conversation. But there are data fiduciaries, who define the entire basis for processing: why we are collecting and processing the data. All of those decisions to do with data are taken by data fiduciaries. And there are data processors, which will typically be your third-party vendors, who process data on behalf of these data fiduciaries. A range of obligations applies to data fiduciaries, and
there's a core set of data protection principles, which are reflected in almost every other regime in the world, like purpose specification, data minimization, privacy by design, and individual participation. Just a couple of examples. Purpose specification means that when you collect and process personal data, it has to be for very clearly defined purposes that are communicated to the user at the time of collection. An over-broad purpose, like simply saying "research", might not work anymore. Data minimization means you have to think about whether you need a particular piece of data for the purpose for which you're processing it. If you're building, say, a payments app, do you really need microphone permission? Or if you can make do with a PIN code, do you really need location data? These are very top-level examples; if you dig deeper, there will be so much more that strikes at the core.

And everything is being operationalized through consent. All your data processing has to be on the basis of user consent. Historically, organizations have tended to rely on a broad, catch-all agreement to the terms and conditions or privacy policy. That might not work anymore, because the law says that consent has to be clear and specific, and more standards are laid out on what that looks like. This will require you to know essentially your entire data lifecycle, to be able to map it out, and to make sure that each part of the journey is supported by consent or some other legal basis for processing.
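To make the idea of clear, purpose-specific consent a little more concrete, here is a minimal Python sketch of a per-purpose consent ledger. It is only an illustration, not anything the PDP bill prescribes: the names `ConsentRecord`, `ConsentLedger`, and the purpose strings are hypothetical, and a real system would also need to cover other legal bases for processing, notices, and audit trails.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ConsentRecord:
    """One user's consent for one clearly stated purpose (hypothetical schema)."""
    user_id: str
    purpose: str
    granted_at: datetime
    withdrawn_at: Optional[datetime] = None


@dataclass
class ConsentLedger:
    """In-memory store of consent records; a stand-in for a real database."""
    records: List[ConsentRecord] = field(default_factory=list)

    def grant(self, user_id: str, purpose: str) -> None:
        # Record consent for exactly one purpose, never a catch-all.
        self.records.append(
            ConsentRecord(user_id, purpose, datetime.now(timezone.utc))
        )

    def withdraw(self, user_id: str, purpose: str) -> None:
        # Mark every active record for this user and purpose as withdrawn.
        for rec in self.records:
            if (rec.user_id == user_id and rec.purpose == purpose
                    and rec.withdrawn_at is None):
                rec.withdrawn_at = datetime.now(timezone.utc)

    def allows(self, user_id: str, purpose: str) -> bool:
        # Processing is allowed only if an active record matches the purpose.
        return any(
            rec.user_id == user_id and rec.purpose == purpose
            and rec.withdrawn_at is None
            for rec in self.records
        )


ledger = ConsentLedger()
ledger.grant("user-42", "payment_processing")
print(ledger.allows("user-42", "payment_processing"))  # True
print(ledger.allows("user-42", "marketing"))           # False
```

The point of the design is that every processing step asks `allows(user, purpose)` for its own purpose, so a consent given for payments never silently covers marketing.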
There are also individual rights. Individuals should have the ability to access their data and have it corrected if needed. They might want you to tell them who you've shared it with; they can ask for that. There are data portability rights as well: if I want to shift my entire social graph from one social media platform to another, I should be able to do that under the law. And every business also has to have a privacy by design policy, which is a construct that essentially says you've embedded privacy into every part of your business or product design. So at each stage of the data lifecycle, you've thought about user privacy, and you are able to demonstrate that.

There is also a business category called significant data fiduciary, which applies if you process a large volume of data, or data that has a potential for significant risk of harm. All these concepts will be further distilled; it's a fairly new conversation around what risk of harm is. If you're classified as a significant data fiduciary, you may have to do data protection impact assessments, appoint a data protection officer, and have your data practices audited. So these are just, broadly, who the law applies to and the obligations that one needs to watch out for, and they can require fairly large-scale changes to internal processes and to the way a business collects and holds data.

So it sounds like a lot of doom and gloom for a number of these companies, who have matured up to a point where we're able to use this data. But I see one silver lining, and tell me if I'm going right here. If the rest of the developed world has already adopted these standards, then this is about us having level footing with them.
It seems like this is a minimum requirement, but also something that signals credibility. Once we go through the pain of this journey, it will signal credibility to partners, customers, anybody that we do business with in the EU or in the US. Am I right about that? This helps put us on a level playing field?

Yeah, definitely. I think it's about credibility and reputational risk as well. Outside of the law, there is also value in thinking through compliance from the lens of privacy and building that trust with users, because at the end of the day, people are becoming somewhat more aware of privacy and data protection, and of the way businesses are holding and using their data at the back end. Especially if there's a data breach and it's found that the company had really abysmal data practices, even in the absence of any express legal mandate, that can severely hit a company's reputation. It reflects poorly on customer relations as well. One example that comes to mind is a hotel group in the US that had three separate data breaches or hacks over a few years. The US doesn't have a federal data privacy law; they have a regulator, the FTC, which looks at certain instances of data breaches. But they were subject to intense regulatory scrutiny after that, and just the cost of responding to those requests for information and to the lawsuits was in itself a significant drain on time and resources for them. If they'd had better processes, perhaps they would have been in a better situation.

Shivangi, do you want to add something to this?
Yeah. In addition to what Sreenidhi just said, I think it's just smart business moving forward. If you want to play on the global scene, you have to be a privacy-smart organization in whatever you offer. Instead of being reactive and bolting stuff on later, it makes sense to be proactive. Look at Apple: Apple is using privacy as its positioning pivot, and things are literally changing. Take any sector. Take marketing: the whole ad tech cycle is going through a complete upheaval, and sometimes it happens literally overnight. Browsers saying they will not support third-party cookies has tremendous implications. Apple saying that they are going to restrict the use of the Apple advertising ID has far-reaching implications. Suddenly the carpet is pulled out from under your feet. So it makes sense to be proactive and prepare for it.

In fact, at an anecdotal level, a lot of the inquiries that we get are what are called emergency inquiries, and I feel that the emergency is sort of manufactured. There are entities who are talking to a client. They go through the whole hassle of a sales process, and they finally reach a stage where they think they've closed the deal. And then suddenly the compliance group says, "But you're not compliant with the basic privacy requirements." That sets them back by several months, because they think this is something they can just go and get done in a month's time. It doesn't work that way. So it just makes smart business sense to be prepared, and prepared now, if you want to play on the global scene. As we said, forget the legal compliance bit; I feel that's what I call the tail wagging the dog. The most important thing
is to do it for your business.

Yeah, and just on this thread of global operations: the EU, for instance, now has the GDPR, which is fairly stringent. So if an Indian business is transacting with an EU business, there will in any case be a whole bunch of contractual privacy compliance obligations, because their data controllers have to pass those obligations on to contractors as well. And Europe has a stringent adequacy regime, so they can't transfer data outside of Europe unless certain protections are in place. There's this whole thing about the EU-US Privacy Shield, where data transfers between the two have now come to somewhat of a standstill, because it was struck down by a court in Europe saying that there aren't adequate protections in the US. So similarly with India: without those baseline protections, just being a global participant in these data transactions becomes all the more challenging.

We saw this in our own business. We do data preparation for machine learning, and recently we added a customer from Mauritius. Their first question, even before they knew anything about our product, was: are you going to be compliant? This is the point that Shivangi made, that there is a broad convergence across legal jurisdictions on data protection. There are minor differences between them, but at the level of principles there is broad agreement. Companies cannot escape this anymore. This is no longer about finding loopholes or setting aside some amount of money to pay a fine.

I remember that as recently as three or four years ago, there were companies in the Bay Area who were touting the fact that the data they held was now an item on their asset list, right?
From an accounting principles perspective, they could actually assign a value to it. And I'm thinking right now that if those same companies don't have the processes, tools, and mechanisms to be compliant, that same data can start to become a liability. You hold all of this, but you don't have the kind of safeguards and guardrails that one would need. I can see the serious risk that such a company holds.

I wanted to ask Shivangi about this. If you look at the impact of this law on the life of a data engineer, a data scientist, or a data product owner, how do you see the processes and the program management aspects changing? Do you see privacy as an add-on to an existing program, or does privacy by design have to be part of the DNA of the product design itself?

Good question. See, you may start off by doing what I call the quick fixes, because you're trying to comply. But in the longer term, privacy by design is absolutely necessary. Take a step back and look at where privacy impacts data as such. There are three layers. The first is this whole data classification bit, and how you incorporate it into all your data in the organization. You have what we call personal data, sensitive personal data, and critical personal data; then you have children's data; and every law slices and dices it in different ways. You have to accommodate that, whether you need to comply with one law or with 25 laws simultaneously. So at some point you need to build it into the design, because you can't keep applying band-aids to deal with this. The second is this whole thing around what I call purpose. We just talked about purpose actually being the pivot, right?
What matters is whether, in what you collect and use the data for, you're sticking to the purpose for which you have got consent, or any other sort of okay, from the end user. Now, to ensure that at the data level, what are the kinds of controls, mechanisms, processes, and architecture that you build to support it? Because it's one thing to have it in your policy, and one thing to put it up on your website and say, "I will use your data only for these purposes," but how do you actually translate that downstream into your actual business operations? The data scientists have a very critical role to play there.

And the third is a specific use case within the purpose scenario that I wanted to bring up. We talk so much about AI and ML. One apprehension, and there's a lot of discussion around this, is that you collect data for a specific purpose and commit to using it for that purpose, but running it through your algorithms can actually create a new purpose by itself. So how do you marry that new purpose that has emerged with the original purpose, and how do you track it and balance the whole question of fairness? That's something for which I don't know if there are answers, but one needs to think about it and figure it out, because it's not a trivial problem to deal with.

Right.
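One way to picture the first two layers described above, classification and purpose, enforced at the data level is a field catalog that tags each column with a sensitivity class and the purposes it may be used for. The Python sketch below is a rough illustration under stated assumptions: the `CATALOG` contents, the purpose names, and the `project_for_purpose` helper are all hypothetical, and only the three class names come from the conversation.

```python
from enum import Enum
from typing import Any, Dict, Set, Tuple


class Sensitivity(Enum):
    # The three classes mentioned in the conversation.
    PERSONAL = "personal"
    SENSITIVE_PERSONAL = "sensitive_personal"
    CRITICAL_PERSONAL = "critical_personal"


# Hypothetical catalog: field name -> (classification, permitted purposes).
CATALOG: Dict[str, Tuple[Sensitivity, Set[str]]] = {
    "email": (Sensitivity.PERSONAL, {"account", "billing"}),
    "pin_code": (Sensitivity.PERSONAL, {"delivery", "analytics"}),
    "health_notes": (Sensitivity.SENSITIVE_PERSONAL, {"care"}),
}


def project_for_purpose(row: Dict[str, Any], purpose: str) -> Dict[str, Any]:
    """Keep only fields whose catalog entry permits this purpose.

    Fields missing from the catalog are dropped, so unclassified data
    never flows downstream by default.
    """
    allowed = {}
    for name, value in row.items():
        entry = CATALOG.get(name)
        if entry is not None and purpose in entry[1]:
            allowed[name] = value
    return allowed


row = {
    "email": "a@example.com",
    "pin_code": "560001",
    "health_notes": "n/a",
    "device_id": "abc-123",  # not in the catalog, so always dropped
}
print(project_for_purpose(row, "analytics"))  # {'pin_code': '560001'}
```

A pipeline that only ever reads data through a filter like this has to declare its purpose up front, which is also where a newly emerged purpose, such as a model output being reused for something else, would become visible and need its own review.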
And for this, I feel we also need to look at data architectures themselves. Given that you have different laws in different formats around the world, how does that impact your architecture, so that you're able to comply with different laws in different ways without having to go back to the drawing board every single time you come across a new law that you want to comply with? So I would say there's a lot of thinking to be done: first throwing up the problem statements themselves, and then figuring out solutions for them. And I think it has to be done at a community level. I don't think individuals will be able to do it; it needs all our intelligence put together.

Absolutely. I can see the whole model development process itself changing. I can see a significant education component. One other thing that I see developing in the ecosystem is the Ikigais of the world, which are in some sense providing guidance or guardrails for this whole thing, because engineers can't always interpret the legalese. We like precision, but a lot of the words are very broad. For example, erasure: what does it mean? So how do you see the ecosystem developing in terms of the different players who will help achieve overall compliance with this law?

Yeah, I echo what Shivangi said and what you've just pointed out. There are so many different aspects to this. There are the legal teams and there are the product and business teams, all of whom are involved in this conversation around data. We've also started having a lot of these conversations, because at the end of the day, PDP is also a set of principles, right?
But how it's operationalized, how it's translated into product design, and what changes will go into the UI and UX will be figured out by a mixture of legal and technical experts coming together, maybe through codes of practice. This codes of practice concept is recognized under the PDP bill, which says that the regulator will issue these codes. These could take the form of the industry coming together and deciding that, say, this is what consent should look like on a UI, on the app itself, or for a payments app, this is what it should look like. It's hard to anticipate the kind of granularity this can go into; it could be any number of products. I'm just throwing out a few themes there.

And I do see a lot of privacy engineers cropping up: people who can actually understand the principles and then implement them in the tech design and tech architecture. Because from a legal lens we often say, do data classification, do this inventorization process. But that can be a fairly involved process, with back and forth between a bunch of teams, doing those interviews, and discovering where all your data resides in different workflows. That in itself can be fairly challenging. So there may be a crop of privacy engineers who understand the law, but can then implement it at the back end and bridge that gap.
There's also this concept of consent managers under the bill itself, which are entities that will act as facilitators between a user and the data fiduciary or the business. Maybe this will be in the form of a dashboard, so that it becomes easier for users to understand where all they have given their consent. So we'll need people who can design those sorts of products as well, and people who can design privacy compliance tools. Shivangi can of course speak about this a lot more, because they've worked with these tools, for things like identifying where all your data resides. Of course, as lawyers we also do a bunch of this, marrying the law with the tech design part, but actually implementing it will require engineering folks to step in.

I was going to ask Sreenidhi: do you foresee that at some point certification agencies will start to crop up? I'm thinking about that pillar of consent that you talked about earlier. A country like India is experiencing this Jio broadband boom, where everybody and their mother has access to a host of new services that they're signing up for willy-nilly. What does PDP finally mean to them? How will that information about what they're giving their consent to be passed on to them? So this is a question in two parts. One: do you foresee standardized certification? And two: any thoughts on how all of this is finally going to be made digestible to a layperson like me?

Yeah, I do think so. We have these parallel certification regimes for security, like ISO 27001. There are certain privacy certifications as well, say, the ones DSCI has come up with, and in the U.S.
there's a NIST framework for privacy management. There could be certification bodies that crop up under the PDP bill itself as well. There are auditors that are to be designated, and they may conduct data audits and sign off that at least a certain set of processes is in order. So that could be something. There is also this concept of a trust score under the bill itself. So maybe scoring agencies can develop, which say that this is an organization with a privacy score of 7 out of 10, which is perhaps somewhat more digestible to an average user than a complex delineation of what the privacy policy says. It would be an independent body coming in and saying, we've done our audit, and here's where they stand on these different parameters. That's something that might crop up as well. But these are all nascent conversations still. We don't have a lot of thinking, or a lot of organizations, building these certifications or these trust-score type businesses. So we'll see how these actually play out in the coming months.

But you know, when you say that this is nascent, I strongly feel that if it doesn't hit puberty soon, we may not have the luxury of the two-year window that GDPR afforded everybody, for a few reasons. I can see three drivers. One is the greater good of the consumer, right?
If we want to catch up, two years might be a luxury when everybody else has already implemented this or moved forward. GDPR, being the forerunner, might have taken some time to figure things out. Two is that if we want to compete on the global stage, if we want to be able to offer our services on an equal footing with other countries, we may want to do this really fast. And three, and this part might be a little controversial, is the other aspect of PDP, where the government can potentially quickly get hold of whatever data it feels it needs, under some variant of the Patriot Act. So for all of these reasons, we may not have the luxury of companies taking their time; it seems to me that these conversations need to accelerate. So to that end, I want to ask Shivangi...

There's an audience question.

Okay.

On to the next one. There is an interesting couple of questions from Srinivasan, all around how institutions develop. The first one is about intra-organization institutions. Srinivasan is asking: once we start taking privacy into account, a lot of folks in companies are going to ask questions about the purpose and about the methods. How do organizations resolve these kinds of internal conflicts? Is there a review board to which you can go? And the second one is about external institutions, whether it is these trust-scoring institutions or the DPA and so on. How do these agencies develop over a period of time? What should people expect to see?

Yeah, I can definitely address the first one.
So if you see parallels in how security programs have developed, or even how IT programs have developed, a lot of mature organizations tend to have governance committees at the top, which are typically cross-functional: leaders from cross-functional teams who form the board for that. And I would expect that privacy, too, is something which will typically rest with one of these existing committees, or new committees will get formed, like a governance kind of framework. So governance is something which organizations, as they mature, typically set up for all programs, and similarly it will happen for privacy. So I don't see that as a major challenge. There are playbooks available for this which one needs to follow.

On the external side, things are definitely growing, and especially in the Indian context I think it will take a little time to mature. But it ties back in with what Srinidhi brought up, which is about certification bodies and ecosystems around certification, maybe at a different level. And it's very closely interlinked with what is happening globally, right? We're not in a silo. So I'm sure we will figure things out along the way. The good news is that there is structure even in the law that accommodates some of these challenges, right? So it's not just the regulator; she touched upon this whole issue of setting up codes of practice. With governance at the industry level, along with the whole certification mechanism, checks and balances will emerge. That's my anticipation, like in any other industry.

Srinidhi, any thoughts on staffing of these regulatory bodies?
Yeah, setting up the regulator will definitely be a challenging administrative task, because data is such a routine business activity, and because the regulator also has the scope of hearing complaints. So you may need to set up one in each state, or different bodies at multiple levels. That would be a big area of focus, I think: just setting up that architecture for complaints redressal, more than anything else, if that is the idea. So maybe something along the lines of the consumer forums, the consumer dispute redressal mechanisms.

Yeah. So let me ask this question. Shivangi, earlier you talked about some companies coming to you in emergency mode. Let's assume, for the sake of this hypothetical, that somebody has heard our little talk and there's a fire now starting to burn under their feet. Where should they start? If you had to advise them, where would you advise them to start looking? What are the low-hanging fruits that they should start with? What are the milestones that they should look at? How would you advise them on the steps forward, basically?

Yeah, we at Arrka always advise people to start small and start with the data, right? So start with what personal data you're dealing with. Get an understanding of that, because that's the crux. You need a view of the personal data you're dealing with. What types? How is it coming into your system? Where is it flowing out? Is it crossing borders? What is it being used for? What are the applications it resides on? How does it flow around? Until you build that view,
you are never going to be able to figure out what needs to be done with it, right? So first build that view. And don't try to build it for the whole organization. We always say: pick a small business unit or process or team or geography, whatever makes sense. Do it end to end for that, and try to implement the privacy elements on top of it. In the process, everybody in the organization comes up to speed, and then the subsequent scaling up across the organization becomes a lot easier, and everybody knows what needs to be done as they go along. A term frequently used is "don't try to churn the ocean", because it's not a problem that you can solve overnight. So it's best to accept that reality and do it in a structured manner.

Down the line, once this becomes a law, how much do you think the interpretation of penalties will be based on intent, the showing of intent to do good, versus actual breaches where they're not able to comply? I think that's a question for Srinidhi, really.

No, I'm happy to take a stab at that. I think overall the tone and tenor of the Bill, like I said, does seem to be based on principles. So the ability to show that processing is fair, the ability to show that you have a privacy-by-design approach: a lot of that is being able to demonstrate that privacy is a factor in your decision making or in your risk matrix. It's just another business risk, perhaps, like all the other business risks that you might factor in while making any product or business-related decision. But embedding that in your systems will probably put you on a stronger footing when it comes to actually... Like, the core of it seems to be accountability.
So taking on that onus of having these conversations, having your data mapped out, having those records to show that you've taken consent for this particular piece, or having regular audits done, or just those sanity exercises of making sure that, internally, once a particular process is over, you look at what data you have and delete extraneous stuff from your systems: all of that will also bring down your risk of an attack or a hack at some point in time. Of course, I don't want to conflate privacy and security here, but breaches and hacks seem to be something that most people are familiar with, and there is a lot of conflation between privacy and security when it comes to that. But if you have all of those processes, if you're able to demonstrate those records, then in investigations, or if there is a complaint against you, that does give you a somewhat stronger basis to make your case. Because at the end of the day, take stuff like "data processing should be fair". There are n number of ways to interpret that. But your interpretation should perhaps also take into account the kind of data that you're processing and the sensitivity of the data. For instance, if you process and hold a lot of biometric data, which is sensitive under the Bill, then there may be increased obligations on you, or just the amount of thought that you give to privacy will be a little more than if you're processing, say, only device identifiers and nothing else.

So, absolutely. In all of what both of you have said, one of the aspects that strikes me is that it is almost imperative that the business know beforehand the use cases that they're
going to gun for, meaning you collect only that much data. You really define the minimum amount of data that you collect, enrich, and store. But so much of the competitive edge comes when you look for patterns, when you look for newer opportunities that may not have been possible right at the very outset, by which time, according to some of the tenets of PDP, you might have deleted the data, or you might not have sought explicit consent for newer business cases. Is there a way around this? Are there any provisions to help with this?

There is this concept of a sandbox. I don't know if the sandbox concept can be used to deal with the problem that you just articulated. Srinidhi, would you be able to better...

Yeah, sure. There is that concept of a sandbox, but I'm not entirely sure how relevant or useful that could be. Businesses which agree to participate in this sandbox will be exempt from purpose limitation, I believe, if I'm not wrong. Yeah, I think it is purpose limitation, or purpose specification. World over, there's been some conversation around whether it is completely at odds with this whole concept of big data analytics, because, like we've been discussing, as you go deeper there are different use cases that you discover. So what do you do if, at the outset,
you hadn't clarified that to a user from whom you've taken this data? So perhaps that sandbox is something, but it will not be available to everyone. It will only be available to those businesses that have gotten their privacy-by-design policy certified and are agreeing to participate in that sandbox, and that's also time-capped, so it will not be endless.

I think one concern that a lot of people have also flagged is that consent seems to be the cornerstone of this law, as opposed to the GDPR, which has other legal bases also. So going back to an individual for a fresh consent is sometimes extremely onerous. It adds friction to the process, which a lot of product teams can vouch for; just asking might lead to consumer drop-off as well. Under the GDPR there is this concept of legitimate interest. So if you're processing data for a legitimate interest, of course you still have to show that all other principles are complied with, but you don't need consent as the primary ground. Under the PDP,
there is some sort of reconstruction of that in the reasonable purposes ground. So for stuff like fraud prevention or credit scoring or network and information security, if you're processing for these purposes, you might not always need consent; you can fall back on that reasonable purposes option. And the regulator can add to these purposes, so I just wonder if that's something that can be explored as well, because you don't want to stunt or stifle innovation.

That makes sense, and if you have some latitude in how broadly you can define that purpose, meaning today you have some sense of what that purpose is, and tomorrow... I mean, PDP itself is fairly broad, right, when you say that there are these bases. So imagine if somebody said, "I want to be able to offer better products to my end customers." I'm trying to figure out how much of this is going to be between lawyers on two sides versus how much is going to fall on data science teams. And it sounds to me like that is going to be an evolving discussion.

So we have one question from Sucana. How would the law treat synthetic data? Do the purpose limitations or consent or any of those things apply if we just make up data for our modeling and testing?

I can take a stab at that. With all my legal expertise,
I can say that, because there's no personal data involved, I don't think that there should be any...

Yeah, I mean, there could be a slight overlap, in the sense that, let's say, it reflects the distribution of the people in the real world, and that itself is biased, right?

Oh, Venkata, you make a very good point, because I think this leads into this other fire that is now burning, which is about NPD. And I would love it if either of you could speak about this storm that's coming, where my broad understanding is that a lot of data that is not at the granular level of being personal, things that may be either clustered or not necessarily to do with individuals, is soon going to be on the open market to even out playing fields. If either of you can speak a little bit about this fire that is in the offing...

Yeah, I can. That's a whole other can of worms, I think. So there can be broadly two categories of data, identifiable personal data and non-personal data, and a government committee has come up with certain proposals to facilitate or encourage this sort of non-personal data sharing. So there'll be a new law which will be proposed for non-personal data. There'll be a new regulator for non-personal data as well. And just to give you a sense of what non-personal data is, there are different classifications that they make: raw data and processed data. Processed data is like insights, and raw data is just a raw data set, but anonymized. So take any anonymized, aggregated data that you hold, say, if it's a food delivery app that has aggregated personal data of everyone and masks those identifiers, so it's anonymized data in that sense: that data set should then be freely shareable with anyone on request. And there's a whole bunch of nuance there on who this is applicable to:
it says that there will be data businesses that can be registered with the authority. These data businesses are determined on the basis of volume, of how much data you process. And then you have to have these metadata directories, where you say, "this is all the data that I process", which will be available on an open-access basis to everyone. So another startup, or anyone, essentially even a competitor, could look at that and request that data set from you. And if it's raw, in the sense that you haven't added any value to it, you may be obliged or required under that law to share it. So it's a framework for sharing data which is historically thought of as business data. And I believe this is unprecedented in some sense, because across the world we're very familiar with personal data, especially because of GDPR and all the other data protection regimes, but non-personal data is somewhat new so far.

Is it fair to say, and I'll try and stay brief about NPD, that if this open market for all non-personal data actually takes effect, then for the biggest players, the ones that have large volumes of this data, if they have to give that away, the asset is no longer the raw data that they hold, or the amount of data that they hold? It is now the ability to enrich that data and do something with the processing of that data. And whoever in a particular domain is able to do this more accurately, in a more targeted and domain-specific fashion, is going to take the lead there.

I think, yeah, that does seem to be a recurring theme across the report as well. That's the underlying idea. But of course there are a bunch of problems with it, just in terms of overlapping regulators, because the Competition
Commission of India... that's their mandate as well, addressing exactly what you said: these big businesses that have access to large volumes of data, fair market competition, all of that is its remit. So, with a new regulator, what exactly is that regulator's role? All of these are open questions right now.

To communicate in the language that my community understands: data is not a moat anymore, right? Your data is potentially available, because of the NPD law, to your competitors. So it is the strength of the modeling and the enrichment that is going to be your value add, or your protective advantage.

Absolutely.

And I wanted to come back to one part, which is the career implications, which a lot of the audience of this meetup is interested in. You mentioned privacy engineers, you mentioned privacy program management. Any other designations or functions you see coming within the organization, touching data at some level?

I can just start off with auditors as well, or data auditors, and this will be legally required. It is a legal structure, because the Bill talks about data auditors being registered with the authority. So that would be another piece.

These will not be model auditors, right? We are not talking models. We are just talking about data auditors.

Okay. Yeah. Anything involving fairness or other aspects of the data? Are you targeting particular communities, or algorithmic fairness, accountability, any other dimensions you see coming?

Yeah, I mean, plenty, right from the whole thing about algorithmic biases being built in. That's opening up a whole can of worms. There is the whole issue of how you manage the life cycle of the data itself, retention and the regulatory impact on it. So let me give you an example. If you just take employee data as a data set, right? It has financial data.
It has health data. It has data to do with, let's say, PF in India, and it has your identity. It has all kinds of data, right? So there are multiple regulators under which each data set falls. And while a privacy law says something like "you will delete data that is no longer required", every regulator has a different norm, and it's different in different countries. So now if somebody invokes a right, say the right to erasure, or the organization has to implement this requirement to erase data after it is no longer required, it has to be done keeping in mind all of these requirements, which is a huge thing by itself. It's not something which can be done overnight, because people haven't done that mapping. Now, as a data scientist, you have to take two steps back and say, "I need to build this into my design." Because today I am present in XYZ countries, tomorrow I'm going to add 10 more countries, and I can't go back and tell my business that they have to go and completely re-architect my model. So stuff like this will emerge, and therefore privacy by design becomes so important, because you need to decouple the law, and compliance with the law, from the design itself, and design for the long term, and design to accommodate the...

Do you see...

No, I just had a thought on this fairness and algorithmic accountability. So there's this right to explanation under the European data protection law, where you have to tell people the logic behind decisions. They haven't imported that into the PDP as yet, at least. Having said that, there is still some element of fairness. There is one principle which says that data should be processed in a fair and reasonable manner. Distilling that into actual practice,
I'm not sure how that would be interpreted. There are also data protection impact assessments. There may be an element of fairness there. But I think this conversation around algorithmic explainability is also happening in different parts of the government. NITI Aayog recently came out with a working document which calls for public auditability of certain AI systems, and ethical committees to be set up for any organization using AI. So outside of the data regime also, there might be more developments around this.

Good stuff. We have about four minutes. So I would like to ask both of you: for our audience today, comprising a number of different data science professionals as well as folks that have P&L responsibility for data science initiatives, if you could leave them with one or two takeaways, does anything come primarily to mind?

I would say that your jobs have gotten far more interesting. It has introduced many completely new parameters. So I would look at this as a huge opportunity, because it is something which has added dimensions, which enriches the whole thing. And I also feel that as individuals we don't often get to see too many paradigm shifts while we are working, right? This is something which is happening at a rapid pace. So it makes sense to look at it positively.
Look at it as opportunities and ride that wave, because if you step back and look at this from a holistic perspective, you realize this is almost like an opportunity for complete creative destruction of what we are doing, and we look at things afresh. So that's all that I can say. And learn to live with ambiguity.

Wonderful.

Yeah, that's such a nice thought. I almost don't want to add to that. I think I echo that sentiment. If you look at it from a purely legal compliance lens, you might be missing out on a lot of the fun that you might actually have with product design when it comes to embedding privacy into the system. So yeah, I echo that sentiment: very interesting times ahead.

I'd like to add something, and then you guys can tell me if I'm not a hundred percent right. There is no more a question of if it will come; it is a question of when it will come. There may be nuance, and a little bit of tweaking this way and that way, but it's only a matter of time. The penalties that will accompany it will be significant. The reputational risks that will accompany not being able to comply with this will start to be significant. So the number of excuses behind which a company can risk deprioritizing its efforts is reducing. Fair?

Yeah, definitely. This is going to happen; this is no longer up for discussion. There'll definitely be a data protection law sooner rather than later, and yes, significant penalties and a whole new architecture.

Yeah, wonderful. Sorry, Venkata, I didn't...

And it's the right thing to do.

Yes, yes. Being responsible with your data and your customers' data is a good thing. Yeah, absolutely. So to all of our listeners today, thank you for joining us. I think that both Srinidhi and Shivangi have given you the tip of the iceberg.
Part of the intent was to sound the alarm bell. In terms of specific questions that you might have, both of them are reachable. Well, you can leave comments here and we'd be happy to relay them to them. But Srinidhi is with Ikigai Law and Shivangi is with Arrka, spelled with two Rs: A-R-R-K-A. So please feel free to reach out to them. Srinidhi and Shivangi, if Twitter is an option for you, do you want to just quickly announce your handles, please?

Or you can see our latest announcement. We have copied both the...

Sure, but in case they don't have access to the announcement, very quickly: Shivangi, what's your Twitter handle?

Minus Shivangi Nadkar, without the i at the end.

All right.

I'm N. G. R. N. N. K. R. N.

All right.

Minus Srinidhi BS.

Got it. Thank you very much. Okay. Sorry, I was about to close that. Go ahead. What are your thoughts?

So, two thoughts. First of all, we will be doing a deep dive in the next session on Tally's very privacy-conscious modeling approach. We'll have their data scientists come and tell us what approach Tally took when dealing with the data of pretty much every small enterprise countrywide. So that should be very exciting. And on this particular forum we expect to have more conversations around what the law means as far as day-to-day practices are concerned. So there'll be a lot more discussion around tools and techniques, approaches, and the codes that Srinidhi talked about, taking place on this forum.

Wonderful. And also, to all of our listeners: all of these sessions that we've been having are accessible on the Hasgeek website under the Fifth Elephant series of talks. So please be sure to check those out. And with that, I'll thank both Srinidhi and Shivangi for having joined us today. Thank you so very much. I look forward to continuing this conversation, especially on the NPD aspects of this, the non-personal data.

Thank you so much once again.
Thank you. Thank you for calling. Thank you so much for doing this. Thank you. Take care. Bye. Bye.