And with no further ado, let me hand over to Vizlan. Thank you, Greg. Thank you very much for inviting me. Alex, Greg, thank you very much for agreeing to be moderator and discussant. This is all quite new to me. I'm coming from operations; I'm in the operations department of HEC Paris, but this is joint work with my coauthors from Johns Hopkins and from London Business School. Just as a word of precaution: I'm in the US and I just taught a class at 5am, so if I fall asleep, please wake me up. I will talk today about digital privacy. This paper has been around for quite some time, but we are still collecting feedback and suggestions, so all comments and questions would be very much appreciated. The motivation for this talk is the observation that more and more of our social and economic activities are being conducted through platforms, and all these platforms collect enormous amounts of data about us. There is a good side to this data, but there is also a bad side. On the good side, the information that platforms collect about us helps them provide us with better products and services. For instance, think about Uber: the fact that Uber knows my GPS location helps Uber match me to a nearby driver. That also means that Uber can earn more money by leveraging this information and matching me to drivers. So this is the good side. But at the same time, there is a bad side to the information: there are certain risks of the data being misused. We all know some examples from the past, like the Cambridge Analytica case, where the data of 50 million Americans was used to sway a presidential election in the United States. We've also seen Facebook admitting that they paid contractors to transcribe users' audio chats.
Facebook also recently admitted that there was a data leak in some of their groups where users were discussing medical information. But this is not only about Facebook: we've seen the examples of Marriott and Equifax and the data leakages they had. Even on a smaller scale, any application on our phone collects enormous amounts of data about us. And what is even scarier is that all this data is shared with third parties. A recent report by the Financial Times showed that the median application could transfer data to up to 10 third parties. Some of these third parties can be criminals, hackers, data brokers, and so on. But sometimes they can be governmental structures like the FBI, as was the case with the company Family Tree DNA. This is the company where you spit into a tube, you send the saliva sample to the company, and they give you a DNA analysis. They forgot to mention, though, that they were also sharing this data with the FBI. Of course, these privacy scandals led to customers and users being more worried about their privacy, and to the appearance of new laws and regulations. One of those is the European General Data Protection Regulation, GDPR. This law essentially requires companies to introduce privacy by design in their operations. In the United States, there is no centralized regulation, no centralized law. One of the strictest regulations the United States has is the California Consumer Privacy Act. It is a little bit different from GDPR, but the premise of CCPA is pretty much the same. All these laws and regulations are aimed at protecting the data that consumers share with companies. But there is a problem with them, and I'll use the words of Edward Snowden, the NSA whistleblower.
He recently said that the problem with data protection laws like GDPR, CCPA, and so on is that all these laws presume that the data collection was okay. Yes, they protect the data, but the data was already collected, and potentially more data was collected than what the users of the platforms wanted the platforms to collect. So in some sense Edward Snowden makes this distinction between data protection and data collection, and together these identify the data strategy of the digital business. There are these two dimensions to the data strategy: data protection and data collection. This leads to the questions I'll try to answer in today's talk. In this world we live in, where our data is amassed by platforms and there is a good side to the data but also a bad side, how does the data strategy of the company affect the trade-offs of the customers and the incentives of the platforms? How can we design data strategies from the perspective of the platform, and are these data strategies efficient from the regulator's perspective, from the societal perspective? To give you a sneak peek at today's results: the answer to some of these questions is that, in general, data-driven platforms collect too much data, and the more data-driven they are, the more data they collect, and the higher the discrepancy between what the customers want and what the platform collects. And we will show that some of the current regulation, focused on only one dimension of the data strategy, is not helping. So this will be the focus of today's talk, and I'll jump directly into the model. I'll skip the literature for now; we can talk about it later in the talk. We'll build a stylized model with three types of strategic agents.
We'll have users, who decide how actively to use the platform; the platform, which decides what kind of data strategy to implement; and adversaries, whom I will sometimes call criminals, who decide whether to enter the platform or not. I'll focus on each of these elements in more detail in a second. Users decide how much to use the platform. Think of our decision of how to use Facebook: how much time to spend on Facebook, how many photos to upload, how many stories to write, and so on and so forth. The model also extends to the decision of users whether to adopt the platform at all, whether to enter the platform; this extension is in the paper. The platform decides on its data strategy, which, as I defined before, consists of two elements: the data collection strategy, how much data to collect, and the data protection strategy, how to protect that data. At the beginning of the talk we'll focus mainly on data collection, and then we'll add data protection on top of it. Platforms differ in terms of their data strategies, and we can think of a couple of examples. For instance, Facebook's WhatsApp launched end-to-end encryption, which presumably means that the conversations that WhatsApp customers have on WhatsApp are not available for WhatsApp to decrypt and extract data from. Recently we've seen a scandal where WhatsApp presumably shared more data with Facebook than customers expected them to share, but that is a somewhat different story. Skype, on the other hand, also offers end-to-end encryption, but it is not switched on by default. So whenever customers type something on Skype or have their conversations there, these are not end-to-end encrypted.
That means that Microsoft can potentially take these conversations, process them, get some useful data out of them, and show us ads based on the conversations we have on Skype. So this is the data collection strategy. Platforms also differ in their data protection. In general, platforms might invest in antivirus software, in firewalls, in hiring ethical hackers to find vulnerabilities in their platforms, and so on and so forth. Adversaries, the criminals, decide whether to enter the platform or not. I use the word criminal or adversary, but this can be pretty much any third party whose possession of the data brings a certain discomfort to the user. Sometimes this possession of data is illegal, and sometimes it is legal. I have a couple of examples on this slide: Family Tree DNA is the company we talked about a couple of slides ago, and ICE, the United States immigration authority, recently used Facebook to track down illegal immigrants, so this could also be an example of an adversary. These are the three players in the model; they are all strategic, and they will all be making decisions. Here is how the timeline looks. The platform first decides on its data strategy; at the beginning of the talk I'll focus only on data collection. So the platform decides how much data to collect, and this is the decision ψ, which is between zero and one; I'll explain what that means in a second. Users decide how active to be; their decision is the variable a_i, which is positive. And the adversary decides whether to enter the platform or not. Let me show you how users' activity is translated into information. We assume a very simplistic form for this transition.
We assume that a user's activity a_i is directly translated into information about the user. Think of all the activities the user does on the platform; we can think of this activity as equal to the information generated about the user. But only a fraction ψ of this information will be stored on the servers of the platform. This is the information the platform will use to actually provide better products and services to customers, but it is also the information the adversaries will see whenever they enter the platform and try to steal users' data. So once again: a_i is the user's activity, and a_iψ is the information available to the platform. Think of the rest, this white segment here, as end-to-end encrypted and completely removed; neither we nor the platform sees it at all. Greg, is there a question on the chat? Yes, there's a question from Yossi Spiegel. He says that you present data protection as being all about protecting consumers from attacks by an adversary, but this is quite different from data protection in the sense of protecting the consumer from exploitation by the firm. Can you comment on that? Yes, so this model focuses on data protection from attacks: we specifically model these third parties who are strategic and endogenously make the decision whether to attack the users or not. So I would say this model doesn't capture the exploitation concerns, and I would need to think about how to translate this model to those examples. Let me get back to you later, and then we can discuss this question a little bit more. Okay, good. So this is activity, and this is the information that we have. Now I'll give you a very simple motivating example for this model, an attributes model.
Imagine that the customer has an attribute θ_i, which is either zero or one, and the platform would like to show ads to the customer based on this attribute. The attribute could be whether this customer likes comedies or dramas, and the platform wants to show either comedies or dramas in the Facebook feed of the customer. The platform collects the data a_iψ, and we can say that with probability f(a_iψ) the platform learns θ_i, shows the recommendation matching θ_i, and the customer obtains some value V from this recommendation. Otherwise, with probability 1 minus f(a_iψ), the platform learns nothing about the customer and chooses a recommendation at random, in which case the user obtains just V divided by two. This is a simple model, but it illustrates the value the user obtains from the platform's possession of data, and it illustrates the transition from the user's information to better products and services. Additionally, on top of that, another layer is to add the adversaries, who come to the platform, hack it, observe the data a_iψ about the customers, and steal it. I'll explain how the model of adversaries works in a second. Before that, I would like to show you the utility function of the customer, who is choosing an activity level a_i. The utility function consists of four components. The first component is the direct benefits and costs of using the platform (you cannot message an infinite amount of time), and it is just a very simple quadratic form. The second component is the usual positive network benefit that the customer obtains from using the platform: ā is the average activity of other users, a_i once again is the activity of the customer, and β is a coefficient between zero and one capturing the strength of the network effects.
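To make this benefit concrete, the expected value the user gets from recommendations in this attributes example can be written out (a small derivation under the setup just stated):

```latex
\mathbb{E}[\text{value}]
  \;=\; f(a_i\psi)\,V \;+\; \bigl(1 - f(a_i\psi)\bigr)\,\frac{V}{2}
  \;=\; \frac{V}{2} \;+\; f(a_i\psi)\,\frac{V}{2}.
```

So the incremental benefit of the platform's data is f(a_iψ)·V/2, increasing in both the user's activity and the collection level.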
The third component is the benefit the customer obtains from the platform knowing something more about the customer. This is exactly the benefit I showed on the previous slide: the customer enjoys seeing better recommendations based on the data provided to the platform. The coefficient ρ is a fixed exogenous factor capturing the marginal benefit to the consumer, and the function f(a_iψ) maps the information the platform has into this benefit, in some sense the probability that better matches are shown to the customer. The fourth component is the expected loss from adversarial activity; it is a negative component. This is how much customers are damaged by adversaries being present on the market: ω is the adversaries' demand for information, which we can think of as the expected number of adversaries present on the platform, and the function g maps the user's data into the damage the customer faces from adversarial activity. So this is the utility function of the customer. You will notice that I write the index i here, talking about one individual consumer, but there is pretty much no heterogeneity in this utility function. This is the base model; we can add heterogeneity in the direct benefits, in the customer's sensitivity to the platform providing better services, or in the customer's sensitivity to adversarial attacks. I keep the i in the utility function for these future extensions. Okay, are there any questions about the utility function of the customer? Greg, the chat? No questions so far. Okay. So this is the utility function of the customer, who again chooses the activity level on the platform. For simplicity, we will assume that the two functions f and g are linear.
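Putting the four components together, and stressing that the exact functional forms below are my reconstruction from the talk rather than the paper's notation (in particular the quadratic coefficients are illustrative), the user's problem looks roughly like:

```latex
U_i(a_i) \;=\;
\underbrace{a_i - \tfrac{1}{2}a_i^2}_{\text{direct benefit/cost}}
\;+\; \underbrace{\beta\,\bar a\,a_i}_{\text{network effect}}
\;+\; \underbrace{\rho\, f(a_i\psi)}_{\text{better services}}
\;-\; \underbrace{\omega\, g(a_i\psi)}_{\text{adversarial loss}},
\qquad \beta\in(0,1),\ \omega \ge 0,
```

with f and g taken to be linear for the talk, e.g. f(z) = g(z) = z.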
This is just to simplify the exposition; we can easily extend beyond that. I have a very quick question, if I may. Yes, please. It seems to me that one of the biggest things in this literature is the existence of externalities in information transmission. Am I right in assuming that you assume there are no externalities, in other words, that the adversarial effect is individual? So, that is right in the sense that when we look at the utility function it looks like there are no externalities, but I will show you in a couple of slides that because this adversarial effect is endogenous, it will be structurally equivalent to negative externalities. Through the decision of the adversaries whether to enter the platform or not, we will be able to introduce these negative externalities into the model. Is that clear? Okay. Sorry, can I also ask another question? In your model I just see activity; I don't see data as separate from activity. Are you implicitly assuming that data is just... I mean, what's the difference from data, and where does the choice of data intensity by the firm come in? Essentially, we assume this very simple model where you are active on the platform and the firm collects all these different types of activities. User activity translates into information generated, so in some sense it is a simplification to say that activity equals information. What we are then saying is that the choice of the platform is this variable ψ, which sets the fraction of the information about the customers that will be stored, used, and processed by the platform. So this is the decision of the platform.
So the difference between data and activity here is that the data is the part of the activity that was collected and stored and can be processed and used by the platform. Is that clear? Okay, that's why the adversaries also get a_iψ. Exactly. Whatever is in this red segment on the picture is exactly what is stored, and this is what the adversaries will get. Everything else, we can think of it as end-to-end encrypted and removed from the platform. By the way, why should ψ be between zero and one? If the platform combines data and analyzes it, there could be increasing returns to scale; it can generate more insights. Thank you very much for this comment; I think it's a great suggestion. We didn't look into combining data, and we didn't look at ψ larger than one in the sense of increasing returns to scale, but it would be a great extension of this work, considering connected data sets, merging two data sets and getting more data. I don't have a good answer to this question; for now, for the purposes of the base model, ψ is between zero and one, and we assume there are no such effects. Okay, let me get back to the payoffs. That was the utility of the customers; this is the payoff function of the adversary, who decides whether to enter or not. Whenever an adversary enters, he gets exactly this red segment of information about the customers. I put ā here, the average activity of the customers, so āψ is the average information available on the platform about the customers.
The second component is just the cost to access the stored information. We assume adversaries are heterogeneous in their ability to get past the data protection, and γ_j captures this heterogeneity. The constant factor c captures, in some sense, the data protection strategy of the platform: the platform can increase c and make it harder for adversaries to enter. This is how we will capture data protection later in the presentation. So, once again, we have a large number of potential adversaries; they decide whether to enter or not, and they are heterogeneous in their abilities to enter the platform and steal the customers' data. The platform's profit looks like this: it is a function of two things, average user activity and average user information. For the first part, the platform might enjoy users being more active on the platform, posting more messages, paying more subscription fees, and so on and so forth. The second part is about using the data itself to earn money. We can think of different platforms depending on their marginal benefits of activity and of information. Usage-driven platforms, for instance, have a large marginal benefit of activity but a small marginal benefit of information; these are platforms that do not capitalize on the data that much. If you think about Uber, Uber doesn't use the data that much: the only data they use is GPS location, which is quite a lot, but they don't use the data for advertisement purposes and so on. Well, now they do, but not in their original model. In some sense, these are subscription services where you pay a certain fee every month. These are usage-driven platforms. On the other hand, we have data-driven platforms.
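In symbols, again as a hedged reconstruction rather than the paper's exact specification, adversary j's entry payoff and the resulting demand for information can be sketched as:

```latex
\pi_j \;=\; \bar a\,\psi \;-\; \gamma_j\, c,
\qquad
\text{enter iff } \gamma_j \le \frac{\bar a\,\psi}{c},
\qquad
\omega^{*} \;=\; M \cdot \Pr\!\left[\gamma_j \le \frac{\bar a\,\psi}{c}\right],
```

where M denotes the mass of potential adversaries. The key feature is that ω^{*} is increasing in both the stored information āψ and decreasing in the protection level c.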
These are, let's say, data brokers, which sell the data about the customers. And somewhere in the middle we have ad-driven platforms like Facebook and Google, which need a certain activity of the customers, for them to be active and generate information, and which also sell this information and capitalize directly on the data. Again, this is not very precise, but it gives a certain flavor of what the business models of the platforms might be. So once again, yes? The platform's payoff function doesn't include the criminal activity. So if they get hacked all the time, it's not part of their profit? It is not part of their profit directly, but it enters indirectly through the response of the customers. If they get hacked very often, customers will stop participating, and that will lead to lower profits for the platform. Okay, so I will continue. We start with the first question: the impact of the data strategy on the users and on the platform's incentives. For that, we focus on the second stage of the game, the equilibrium response of users and adversaries to the platform's strategy. The result is that there exists a unique equilibrium, which we can characterize. What is more interesting is that if we take the equilibrium ω* from here and plug it back into the utility function of the users, we get the following. This red term, with ω* plugged in, is pretty much the same as the network-effect term in blue; the only difference is the sign and the coefficients. The sign is negative, which means this is a sort of negative network effect, a negative network externality.
That also means the data strategy of the platform, through ψ and through the choice of c, has a direct effect on this negative network effect. I think that relates to Lewis's question about negative information externalities: through this adversarial activity we are essentially structurally bringing negative externalities into the picture. Does that answer it, Lewis? No, I'm still lost here. First of all, I also have to ask: are you making functional form assumptions regarding f and g? Before they were generic functions and now they seem to be linear, am I right about that? Yes, f and g in the utility, for this talk we assume they are linear. So what I meant by externalities was that the information I provide benefits the adversary in ways that are different from a one-to-one problem. In other words, I can be hurt by information provided by people like myself. I see what you mean: the externalities in the data that you provide. We don't capture this in the simple model. I suspect that we can capture it, but I'm not sure how it would change the model; it's a slightly different angle, I would say. There is a paper by Acemoglu, Makhdoumi, and coauthors; I think it's about privacy, I forget the exact title, I can send you the paper. That paper looks directly at information externalities and privacy considerations, where your information is informative about me: for example, when I upload a photo to Facebook, the photo might contain myself but also my friends, and if this photo is leaked, my friends are also hurt through my information. Okay. It also relates to a bigger question which I did not want to raise at this point, but maybe we can talk later.
I mean, the empirical evidence regarding the harm that using information does to the individual user, as far as I have seen, is very, very small. That begs the question of what it is we are measuring here, what it is we are modeling here. Yes, I will get back to that in, I would say, five or so slides; I'll show you some news sources that say how much users are damaged. The claim will be that users might not be damaged that much directly through the adversarial activity, but because users became more aware of all these privacy issues, they change their activities, and that has certain implications for their surplus. So the overall effect might be quite significant, even though the direct effect of adversarial activity, just this ω·ā·ψ term, might be quite small. This is the claim we will make, and I'll get back to it in five or so slides. Okay. There is one other question from Khan Melly, who asks: in the model, the number of users does not influence the utility of users or the cost of the platform, right? Yes, we can assume a continuum of users; it doesn't. Okay, I will continue, because I think I have eight minutes or so. So, once again, it turns out that the structural form of this adversarial activity is essentially a negative externality. And if we look at activity and information, ā and āψ, as functions of the data collection strategy of the firm, they have this increasing-then-decreasing form, with the maximum of the information lying to the right of the maximum of the activity.
Essentially, that means that when we start thinking about the choice of the platform, the platform wants to be somewhere in the middle, because it is maximizing some function of both activity and information. The platform does not want to be here on the left, where data collection is so low that the customers, not being afraid of adversarial activity, would actively use the platform even more if data collection increased, because of the positive benefits. But the platform also doesn't want to be on the right-hand side, where data collection is so strong that adversarial activity is also very strong and customers are very afraid of it. So the platform ideally wants to be somewhere in the middle. What we look at next is what happens if the platform chooses this ψ. As I said, the platform is somewhere in the middle, and this ψ* is the solution to the first-order condition. One thing to note here is that on the left-hand side we have a ratio (I am omitting the arguments of the functions to make it a little cleaner): the ratio of the marginal benefit to the platform from activity versus the marginal benefit to the platform from information. That means that if we look at the claim of Mark Zuckerberg, who says that the advertising model encourages companies like Facebook to use and store more information, that would in some sense be true: a more data-driven platform is a platform with a lower ratio of the marginal benefit of activity to the marginal benefit of information.
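To see the inverted-U shapes and the "information peaks to the right of activity" pattern numerically, here is a minimal sketch. All functional forms and parameter values are my illustrative assumptions (linear f and g, uniformly distributed adversary costs), not the paper's calibration:

```python
# Illustrative symmetric-equilibrium sketch with linear f and g.
# Assumed model (my reconstruction, not the paper's exact specification):
#   user utility:  a - a^2/2 + BETA*a_bar*a + RHO*a*psi - omega*a*psi
#   adversaries:   enter iff gamma_j * C <= a_bar * psi, gamma_j uniform,
#                  giving omega = a_bar * psi / C.

BETA = 0.5   # strength of positive network effects
RHO = 0.5    # marginal benefit of data-driven services to the user
C = 1.0      # data-protection level (scales adversaries' entry cost)

def equilibrium_activity(psi):
    """Symmetric equilibrium activity a_bar for collection level psi in [0, 1].

    FOC: a = 1 + BETA*a_bar + (RHO - omega)*psi with omega = a_bar*psi/C;
    imposing symmetry a = a_bar gives
    a_bar = (1 + RHO*psi) / (1 - BETA + psi**2 / C).
    """
    return (1 + RHO * psi) / (1 - BETA + psi ** 2 / C)

def stored_information(psi):
    """Average information stored by the platform (and seen by adversaries)."""
    return equilibrium_activity(psi) * psi

grid = [i / 1000 for i in range(1001)]
activity = [equilibrium_activity(p) for p in grid]
information = [stored_information(p) for p in grid]

psi_act = grid[activity.index(max(activity))]         # activity-maximizing psi
psi_info = grid[information.index(max(information))]  # information-maximizing psi

print(f"activity peaks at psi ~ {psi_act:.3f}")
print(f"information peaks at psi ~ {psi_info:.3f}")
```

With these toy numbers the activity peak sits around ψ ≈ 0.12 while stored information keeps rising up to ψ = 1, so the information maximizer indeed lies to the right of the activity maximizer, as described above; a platform weighting both objectives would choose a ψ* in between.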
It would collect more data, so ψ* moves to the right. But that also means users will use the platform less, the consumer surplus of the users will be lower, and in general there will be more adversarial activity; it seems like bad news for the customers. The question about this ψ* is: is it efficient, and if it's not efficient, how can we correct the inefficiencies? We can introduce social welfare as the weighted average of the consumer surplus of the customers and the profit function of the platform. It turns out that any data business with at least some data-driven component, that is, any business that uses the data of the users to earn money, to capitalize on it, will always collect more data than what is optimal for the customers. Only purely usage-driven businesses, which rely only on, say, subscription fees and on the activity of the customers, collect no more than the socially optimal amount of data. So how can we fix this? Currently in the United States, the FTC, the Federal Trade Commission, imposes certain fines for data breaches and imposes a minimal requirement on data protection. In terms of the model of the adversary, that would mean the FTC comes and says: you need to set this level of c, you need to hire ethical hackers, you need to have this firewall level, you need to protect the data to a certain standard. But the problem is that the results and inefficiencies derived before hold for any level of data protection c. Irrespective of the data protection c, there are inefficiencies in data collection. We propose in the paper two solutions to fix these inefficiencies. One solution is to impose liability fines on the platform based on the inflicted damage.
Another solution is to introduce a tax on the amount of data collected. The liability fine is equivalent to introducing a liability component into the profit function of the platform, proportional to the damage inflicted on the users, this ω ā ψ term. We can derive this optimal liability fine: if we focus only on consumer surplus, it is just the ratio between the marginal benefit of the data to the platform and the marginal benefit of the data to the users. This ratio essentially aligns the incentives of the platform and the users. In the same way we can derive the tax, a tax on each byte of information collected by the platform; when we focus only on consumer surplus, this tax is exactly equal to the marginal benefit of information for the platform. Are there any questions here? Lewis has a question, but it's quite a broad one, so maybe we can leave it for the end. Okay, so let me finish up. When we think about data protection, we can introduce it here: data protection is this factor c, and introducing it means we need to introduce some cost of data protection to the platform. It turns out that when we assume sufficiently strong complementarity between users' activity and information from the perspective of the platform's profit, data collection and data protection are complements in equilibrium. For instance, if you assume the simple linear model, that could mean we can be in any of three regimes: a regime with under-collection and under-protection, a regime with over-collection and under-protection (again, compared to the social optimum), or a regime with over-collection and over-protection.
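As a hedged sketch of the two instruments described above (the notation is mine, reconstructed from the talk): with platform profit Π(ā, ī) over average activity ā and average stored information ī = āψ, and u denoting a user's utility, the consumer-surplus-optimal instruments take roughly the forms

```latex
\lambda^{*} \;=\; \frac{\partial \Pi / \partial \bar\imath}{\partial u / \partial \bar\imath}
\qquad \text{(liability fine: platform's over users' marginal benefit of data)},
\qquad
\tau^{*} \;=\; \frac{\partial \Pi}{\partial \bar\imath}
\qquad \text{(per-unit tax on collected information)}.
```

Either instrument makes the platform internalize the users' side of the data trade-off, which is why each one alone can restore the efficient collection level.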
But it turns out that irrespective of the regime we are in, we can always fix the inefficiencies in the same way as before, by introducing liability fines or taxes, but additionally requiring a minimal protection level equal to the socially optimal one. Such a two-level policy, a liability fine or tax plus a minimal requirement on data protection, ensures that the platform will collect and protect the data in the best possible way from the social perspective. Getting back to the question about empirical evidence of these effects of adversarial activity on the customers: we talked about how to guarantee efficiency, but imagine now we do guarantee efficiency; the question still persists of how much consumers are damaged by this adversarial activity. If you look at the news, you will find citations like this: the BBC recently said that UK cybercrime victims lose £190,000 per day, which would translate to about 0.3 pence per person per day. Seems like not that much; seems like consumers are not damaged that much. But what we argue is that this is the direct damage from adversarial activity, essentially only this one component, and it doesn't take into account the fact that when adversaries are present in the model, users change their behavior. With all these scandals, Cambridge Analytica and so on, users became more aware of the problems and started, maybe, to use Facebook less. And platforms also change their data strategies, so there is a response loop to the changes in the users' activity, which means that the consumer surplus of the users changes. So what we can do is write down what we call the adversarial loss multiplier, which is the difference between the consumer surplus of the users with no adversaries present in the model
So essentially, completely removing this term, versus the consumer surplus of the users with this term, divided by the direct loss. It turns out that we can put a lower bound on this adversarial loss multiplier of two divided by one minus beta. If you remember, beta was the strength of the network effects, so whenever the network effects become stronger, this adversarial loss multiplier expands. This is quite sensible: it means essentially that because users are afraid of adversarial activity, they change their activity, but that also has an effect, through network effects, on other users. And when network effects are strong, there will be quite a large response, and therefore the change in consumer surplus will be much larger compared to the direct loss from adversarial activity. So, I think I will spend twenty more seconds on finishing up. What I tried to show in this talk today is that we live in a world where there is a good side to data, but there is also a bad side; there are certain risks that the data entails. And the presence of third-party adversaries, as we showed in the model, can be put as structurally equivalent to the presence of negative externalities in some sense, and the platform, through its data strategy, has an effect on the strength of these negative externalities. So the platform, by choosing data collection or data protection, can actually change the strength of these externalities. And in a world where this adversarial activity is present, platforms that are more data-driven collect more data about their customers, but they are also less attractive to the users: users get lower consumer surplus from these platforms. We can fix the inefficiencies of such platforms by introducing either liability fines or taxes, with this additional requirement on the minimal data protection level. So, this is all for today.
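The per-person arithmetic and the multiplier bound from a few paragraphs back can be sanity-checked with a minimal sketch. This is my own illustration, not code from the paper: the population figure is a rough assumption, and the function simply evaluates the stated lower bound 2 / (1 − β).

```python
# Back-of-envelope check of the per-person loss figure cited above, and of
# how the lower bound 2 / (1 - beta) on the adversarial loss multiplier
# scales it. The population figure is my own rough assumption.

UK_POPULATION = 66_000_000   # approximate UK population (assumption)
DAILY_LOSS_GBP = 190_000     # the BBC figure quoted in the talk, in pounds

# Direct loss per person per day, in pence.
direct_loss_pence = DAILY_LOSS_GBP / UK_POPULATION * 100


def multiplier_lower_bound(beta: float) -> float:
    """Lower bound 2 / (1 - beta) on the adversarial loss multiplier,
    where beta in [0, 1) is the strength of the network effects."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return 2.0 / (1.0 - beta)


print(f"direct loss: {direct_loss_pence:.2f}p per person per day")

# Stronger network effects amplify the consumer-surplus loss
# relative to the direct loss.
for beta in (0.0, 0.5, 0.9):
    m = multiplier_lower_bound(beta)
    print(f"beta = {beta:.1f}: total loss >= {m:.0f} x direct "
          f"= {direct_loss_pence * m:.2f}p per person per day")
```

At β = 0.9, for instance, the bound is 20, so even the small direct figure is amplified by an order of magnitude once behavioral responses and network effects are counted.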
Sorry, I think I finished with the phrase that I use in my class: this is all for today. This is all that I have for you. I'm happy to engage in more discussion, answer some questions, and learn more about what you think. Thank you very much for attending and listening.

Thank you, Ruslan. So we do have a few questions lined up in the chat. What I would suggest first is that we go to Alex and hear his discussion, and then perhaps, Ruslan, you can respond to that and to some of the questions we have in the chat as well.

Thank you, Greg. Thank you, Ruslan. Great presentation, and I like the paper. I think in general it works on a very important question, data privacy, because it's really a hot policy issue, and we need theoretical models to give predictions and guide economic policy. And I think your paper also gives an important vision of a methodological way to think about the problem, because it discusses both the benefit and the harm of the data. It also gives a nice taxonomy between data-driven and usage-driven business models; that is, it asks how to think about the business model of a firm and how it affects its data collection practices. I think this is important and should be relevant in practice. Now, let me continue to the ways in which I think the paper could be improved; I think that would be more useful. I have three broad comments. The first comment is that, as I read the paper, it aims at broad and bold policy recommendations about what we should do with these industries, and to address these questions you offer a very specific model. You look at a monopolistic setting, you have a reduced-form treatment of information in how you model it, and you also have specific functional forms for the surplus of the adversary and of the customers. I think it would be very beneficial to provide empirical support for this model.
Because in some sense the policy recommendations that you give, and the qualitative results you obtain, follow quite naturally from the assumptions that you make, and I think it's very important to justify these assumptions. I understand that this is not an empirical paper, but I would very much appreciate seeing more references to the empirical literature that would justify the particular assumptions you make in the paper. And also, perhaps, some more institutional details: what is the real market structure, maybe even a discussion of cybersecurity and the policy landscape on digital privacy. Because you discuss these minimum requirements, but I was very curious to know what exactly they mean: it can't be just a number that something is required to be, so it must correspond to some concrete policy. I think a discussion of that, and positioning your model accordingly, would be very beneficial, along with providing empirical support. So that is the first comment. The second comment regards the modeling of data, or information, and I think it resonates with some questions during the talk. You currently model it in a very reduced form, as a single-dimensional variable, in a model with no uncertainty. I understand this is done for tractability, but it's useful to think about whether we're missing some important aspects of the environment when we do that. From the micro-theory perspective, information is relevant when there is uncertainty, and information is used to resolve this uncertainty. Right. And in particular, in these settings, you may think about yourself as a user: there is some uncertainty about your type, your characteristics. And you know these characteristics, but the platform doesn't know them and the advertiser doesn't know them. The platform would like to know them, perhaps to provide better services, maybe for something else; the adversary would like this data for different reasons.
And, first of all, their problems, their uses of the data, may be misaligned; in the current model they are perfectly aligned. In reality, as a platform you may want to collect some particular subsets of the data and protect the others, the part relevant for the adversary. But more importantly perhaps, and easier to incorporate in the model, is this idea of adverse selection. Because in some sense, if you know your own information, it may affect whether you join the platform, what information you're willing to disclose about yourself, or, if you do join the platform, how you behave and act on it. That is, there is this behavioral response to the information, to the type, which I think is currently not very much present in the model. And I think it could be incorporated in the heterogeneous extensions that you consider, but perhaps you would need the value of this data for the platform, and the value for the adversary, to also depend on this heterogeneity, not just the value for the user. I think this would be one way to incorporate and study this adverse selection, which might be important in practice. And another issue relevant to this modeling of information is the difference between data and information, and I think this is what Jack asked a little bit in the chat right now. Because usually in our models we conflate the two, data and information. But I think, especially in this setting, there may be an important distinction, because the amount of data, measured just in gigabytes, may be very different from the amount of relevant information contained in that data. A lot of data may contain very little information, and vice versa: very little data about you can actually reveal a lot. Effectively, if the platform wants to assign you some type for advertising, it doesn't need too much data to do that.
And I think this is very relevant, especially given one of your suggestions, when you suggest to tax data. I thought that would be a crucial thing, because if you do impose this policy, then companies could just shrink the amount of data they store while keeping the amount of information they collect. And I think that's very important to think about here. So, from this perspective, I find your policy of liability fines more convincing; perhaps I would stress that more, rather than the tax on data storage. And the last comment, maybe just opening the discussion: you advocate governmental intervention into the market, and in some sense it would be useful to understand why the market cannot solve the problem itself. In particular, in your model the government can use monetary transfers, but the firm itself does not. So I would first ask: what if the platform could use monetary transfers? For example, the platform could allow the users to pay for extended privacy protection. Would the platform like to do that? Wouldn't it mitigate the problem? And if not, why? It would also be interesting to know whether market competition would resolve it. So in this sense, one possible policy to promote privacy is perhaps to encourage competition between different platforms. And that would perhaps be easier from the government's point of view, because regulators are not experts in the platforms: they can observe how many firms there are, which is easier, but they may not understand exactly what parts to tax and how to restrict the actions of a particular platform. Yes, so these are my comments, and I hope some of them were useful.

Thank you, Alex. Ruslan, would you like to respond briefly?

Yeah, so Alex, thank you very much.
This is an enormous amount of comments that we've got. I wrote down some of those, but I'll have to rewatch the recording to collect all of them. Just a couple of comments in response to some of the items that you've mentioned. You said that the model is quite specific. We tried our best to make the model as general as possible; in some cases we showed that we can generalize the utility function of the customers further. The only thing that we need from the customers' utility function is this trade-off between data and information: we want the equilibrium activity and the equilibrium data to have this form, these humps, with the humps somewhat separated from each other. That creates a tension for the platform. And yes, thank you very much for the comment about adding more references and empirical justification of the assumptions; we'll take a look at this. So, the data.
So when you spoke about the data and the type of the consumer: I showed this attribute model, right, with theta being the type of the customer, and the recommendation that the platform wants to provide is based on this theta, but it doesn't have to be. So that could be one way, but I think what you meant is a little bit different, right? What you meant is this type of the customer being part of the platform's profit function, or... No, it could be the example that you have; that could also work, but then the consumer itself knows this theta and can act based on it. So if we assume that the customer knows this theta, I would say that not that much will change in the model, because we can introduce this heterogeneity either in this theta, or we can introduce heterogeneity where the consumer knows the private information. But I'm not sure; I need to take a look at how this will change the model and the results. I'm more or less convinced that it will not, but I will have to check. You're right that the incentives of the adversaries and the platform in this model are in some sense aligned, in the sense that they want to get hold of the same data. But I can see how that can be different in reality. So, what we do in the paper right now: we introduced a new extension, I would say, which covers the idea that the data can be different; there might be several dimensions to the data. Some data might be the data required at the registration step: let's say, when you register on the platform, the data about your credit card might be required, the data about your name, and so on.
And this might be the data that the adversaries want to steal, versus the data about your activity, the photos, the GPS location, and so on; this data you need to process, and it might be harder for the adversaries to extract something useful out of it. So we make this distinction in the paper now. I would say the base model focuses on the data about activity, but then we have this additional extension where we also focus on data like credit card information. We address this by studying a model of adoption, where customers decide whether to join the platform or not, knowing that whenever they join, they reveal this information to the adversaries and to the platform, and this information is essentially revealed at that point. So, in terms of taxes and liabilities, that's interesting. I saw a comment in the chat, I think from Louise, about how we can actually implement this... actually, I think it was Jack who asked. Okay, so Jack asked how we can implement this taxing of the data. When we thought about this, for some reason we thought that taxing the data would be easier than liability, because with taxing the data you can just count the number of bytes transferred from the user to the platform. But what you're saying, if I understand you correctly, is that the data might be different, right? There might be several dimensions to the data. Yeah, the information may be different from just the sheer amount of data. Yes, so the useful information one can extract from it might differ.
Yeah, I see that. And you're saying that liability might actually be more convincing. This is a good comment; we always thought about it the opposite way, so we will need to think about this. We had an extension where the platform pays users for their privacy: essentially the same transfer of money, but instead of going from the platform to the government, it goes from the platform to the users. It helps in some regimes, and sometimes it fully restores efficiency, but there are regimes where it doesn't fully help. What if the users could pay the platform? That we didn't consider, so that could be one possibility. That would really be a disastrous move for platforms like Facebook. Actually, this new Proposition 24 in the United States, introduced in California: some claim that it opens the door for users paying the platforms for privacy. So maybe they would have to view more ads. How many more? We already have so many ads. More, but less relevant, ads. In terms of market competition, we did some preliminary models, and it looks like market competition will indeed drive down these privacy concerns; in general, the platforms will start to collect less. We didn't proceed with this analysis, because we always thought of this market as having platforms like Facebook that are pretty much monopolies, and for this paper we wanted to look at the monopoly case. But we'll take a closer look at competition. I would say the preliminary results suggest that yes, it can indeed drive down this data collection.
So that would be my reaction. Once again, I'll have to rewatch the recording, and thank you very much for all the comments that you gave. Thank you, it was very helpful.