Thank you very much. It's a great pleasure to see all the friends here, so I don't need to introduce myself. As you can see from the title, this paper is a study of data privacy regulation: we want to examine the impact of the General Data Protection Regulation. The GDPR was enacted in 2016 and entered into effect in May 2018. It is widely recognized as one of the toughest privacy and data security laws in the world, and after its rollout it became a blueprint for privacy regulation in many other countries and states, with a very similar regulation in California, the so-called CCPA. The key rule in the GDPR is the requirement of consent from data subjects, stated in the GDPR as a freely given, specific, informed and unambiguous indication of the data subject's wishes. That's the key rule, and we are going to examine how this rule affects the market.

In 2020, two years after the rollout, the European Union claimed in a formal publication that the GDPR has been a great success: as it stated, the GDPR has been an overall success, meeting many of its expectations. Unfortunately, economists disagree, and many empirical studies record its adverse effects. For instance, there is wide evidence that the total number of cookies, and therefore the amount of data collected, dropped a lot, and that is going to reduce platforms' revenue, by 50% according to Johnson's estimation. Jia and others also estimate that venture capital investment in European countries dropped 26% compared with the United States. People would also argue that GDPR compliance costs a significant amount of resources, and eventually those costs will be passed on to consumers. So clearly there is a disagreement between regulators and economists, and this disagreement raises several fundamental questions for economists to think about and to study.
The first question we want to address is: how do we reconcile the conflict between consumer data acquisition and privacy protection? Second: on what economic basis do we evaluate the overall impact of regulation? Third, and also very important: what is the optimal regulation, the optimal mechanism of data acquisition that maximizes social welfare? This paper aims to answer these questions, and I hope I can answer them. In this paper I present a new model for the analysis of consumer data acquisition and privacy regulation.

Before introducing the features of the model, I would like to rethink the economics of consumer data. We all know that consumer data has become a fundamental resource for the modern digital economy, but under the current business model consumer data is actually treated as a free good. The existing business model for data acquisition is very simple: free service for free data. That is, digital platforms provide free digital services or content for consumers; when you browse their website you generate consumer data, which they can collect for free and then monetize. Of course, there is a common-sense argument behind this business model. Economists or businessmen would argue that data acquisition has zero marginal cost: digital platforms mainly invest a large amount in digital infrastructure, which is a fixed cost, whereas the marginal cost of collecting and processing data is almost zero, so data should be priced at zero. But this common sense is wrong, because it does not count the cost of data acquisition to consumers. Data acquisition generates a negative externality on consumer privacy, and consumers incur a privacy cost when digital platforms harvest their data. This is very important, and this feature is going to completely overturn the concept of free data.
So now I would like to briefly introduce some key modeling features of this paper. First of all, I consider the concept of privacy cost. Harvesting consumer data causes a privacy concern, and from the point of view of economics this concern can be regarded as a loss of utility, a cost. It is also natural to think that the privacy cost increases with the amount of data being collected; I call this the data scale. What is also important is that privacy sensitivity and attitudes are subjective and idiosyncratic; that is, we can imagine that consumers are heterogeneous in their privacy sensitivity. This is a very important assumption, and it is motivated by experimental evidence and empirical tests, which I have cited here. Of course, the concept of privacy cost is also used in several other papers, but it is treated as homogeneous in most of them, so introducing heterogeneous privacy sensitivity is a key difference from other papers.

Our second modeling feature is to introduce data analytics. The existing literature has mostly focused on processed data, regarded as an informative signal for product pricing and recommendation. Here we depart from this methodology by separating data and processing as two essential inputs. As we all know, raw data does not have much value; the capability of processing raw data to extract valuable information is called data analytics. Data analytics is an umbrella term that covers many activities, including data capturing, data processing, data storing, data analyzing, and also the interpretation of data. What we need to emphasize is that it is not simply a piece of software, as people think of machine learning; it is an exploratory undertaking, like research and development, and investment in data analytics becomes the most important asset for most digital platforms. So we consider this data analytics. Now, with this
new concept, we can consider raw data and data analytics as two essential inputs, and the combination of these inputs generates revenue for digital platforms and benefits for consumers. In the baseline model we take a reduced form for the revenue and benefit functions, but we do discuss the microfoundation of these functions. Having characterized the benefits of data, we can now discuss the costs: digital platforms bear the investment cost of data analytics, while consumers incur privacy costs for data provision. With this methodology we can identify the benefits and costs of data acquisition, and we can construct a social welfare function equal to the aggregate benefit minus the aggregate cost. So our methodology is very simple: we examine the impact of regulation by comparing social welfare.

Let me briefly introduce the baseline model. There is a representative digital platform that provides a free digital service. Because we do not focus on this service, we assume the cost of providing it is zero. There is a continuum of consumers with total population equal to one, and each consumer derives utility u from using the digital service; we assume that u is sufficiently large. From data collection, the digital platform earns a per-consumer revenue R(s, φ), and a consumer derives an additional benefit b(s, φ). Here s is the measure of data scale harvested from each consumer and φ (phi) is data analytics. You can think of the revenue as ad revenue generated from online advertising, and the consumer benefit may be the benefit from improved experience in online activities or from personalized services. Digital platforms can control the data scale by varying the number and types of cookies. On the cost side, digital platforms bear the investment cost I(φ), which is convex, and consumers incur a privacy cost s times
θ. Here s is the data scale, and θ, the key parameter, denotes privacy sensitivity and is distributed according to a CDF F. We assume the upper bound of θ is not too large, to ensure full participation. A consumer derives a gross utility U(s, φ, θ) = u + V(s, φ, θ), where little u is the utility from using the free digital service and capital V(s, φ, θ) is the net surplus from data sharing, benefit minus cost. We also need to impose regularity assumptions like concavity. We denote by capital B the per-consumer social benefit, that is, revenue plus consumer benefit. A key assumption is that s and φ are complements, that is, the cross-partial derivative is positive. This can be interpreted as the relationship between data analytics and data scale: better data analytics allows for a larger data scale, and of course more data can improve data analytics.

With this construction of costs and benefits, I am going to introduce a benchmark for welfare comparison. We consider the first best as the benchmark. Suppose a social planner runs the digital platform and has complete information about θ. The social planner chooses a type-contingent data scale s(θ) and transfer to maximize total social welfare. Total social welfare is equal to the integral you can see: in the bracket, B(s(θ), φ) minus s(θ)θ is the net social surplus from data acquisition, and there is also the investment cost I(φ). The first-best data scale is very simple: it balances the marginal benefit of data with the marginal cost. The first-best data analytics can be characterized in a similar way. Now I should pause here and see if there are any questions. No questions? Good.

[Audience member] Can I just ask one quick clarification? Is it important in your model that this utility of consumers does not depend on φ? Because in practice you could imagine that
consumers don't necessarily care about the firm having the raw data, but about the fact that the firm can interpret the data and learn things about them.

[Speaker] That's U? You mean U?

[Audience member] Yes. You have s times θ, something like that, and I was wondering why you don't have φ there.

[Speaker] That's the cost. Oh, that's the cost, yes. So you mean the privacy cost. Here I have assumed a specific form, but in general we can consider quite a general cost function, although the analysis is quite messy. I should consider that extension. Thank you very much. Yes?

[Audience member] When you compare the social planner and the firm, the social planner is allowed to offer a menu of options, and you don't seem to allow the firm to offer a menu of options, more privacy at a higher price, for example.

[Speaker] You will see; I will actually consider this menu of options.

[Audience member] Okay, thank you very much.

[Speaker] Okay, so with this simple setting we can discuss the impact of GDPR. First, it is quite natural to see that there is a market failure before GDPR, because before GDPR digital platforms just bundle the free service with data acquisition, and they pay a zero price for data acquisition. As a result, the platform chooses s and φ to maximize its profit. Here we assume full participation of consumers before GDPR, and clearly the digital platform will collect as much data as possible, that is, the upper bound of the data scale. There is over-collection of data, and this over-collection causes a consumer welfare loss. The extra surplus from data collection could be negative for consumers with high privacy costs, although the overall gross utility could still be positive because we assume u is sufficiently large. That is, in order to use the free digital service, consumers with high privacy costs incur a loss: they receive a negative surplus from data sharing. Another effect is that excessive data could lead to excessive investment in data
analytics, because they are complementary. So, bearing in mind that there is a market failure before GDPR, we are now going to address the role of GDPR. Under GDPR, digital platforms must have consumers' consent to collect data. Under this requirement, platforms are required to unbundle the digital service from data collection and to change the default to consumer consent. According to GDPR, platforms must allow users to access their service even if users refuse to allow data collection; this is called GDPR opt-out. Further, if consumers allow data collection through non-essential cookies, which we call GDPR opt-in, they enjoy the additional benefit b, or maybe extra compensation from the platform, as they incur a privacy cost. With this option, consumers with high privacy sensitivity will opt out, but they can still use the digital service, and we can see...

[Audience member] One comment: you assume that the quality of service is the same whether I accept or reject. But I get the feeling that in practice they have various ways to degrade the quality of service if you refuse, like a pop-up, I mean a pop-up every five minutes.

[Speaker] Yes, a very important comment. I was worried about this point, but I checked the GDPR website, and at least according to this regulation it is illegal to degrade the service. But then we can see that the platform will upgrade the service for opt-in consumers as compensation.

So we can see that with this consent requirement GDPR opens a door to fixing the market failure, because GDPR now entitles consumers to trade personal data for extra benefit. Of course they need to incur an extra cost, and consumers will do so if the extra benefit exceeds the privacy cost. In order to attract more consumers to opt in, the digital platform now needs to compensate consumers for data acquisition. As I mentioned, they need to upgrade services for opt-in consumers and offer personalized services. So in addition to b(s, φ), maybe they want to
offer some kind of extra transfer, which we call t. Many digital platforms now provide a uniform option of accepting all cookies for opting in: when you log in to their website, you either accept or not, and if you accept you can use the extra service. This is called the uniform policy, so we will first analyze this uniform policy. We assume the platform offers a pair (s, t), where t is the transfer. Under this uniform policy, consumers will opt in if the transfer plus the benefit b is greater than the cost sθ. This determines a cutoff, the threshold for participation, which we call τ. With this τ we can write down the digital platform's profit as F(τ) times [B(s, φ) minus sτ], minus I(φ), the investment cost. That is, the digital platform needs to compensate each opt-in consumer with an amount equal to the privacy cost of the marginal type, and it retains the rest of the net social benefit from data sharing.

So GDPR actually changed consumers' default choice, and under GDPR all consumers are guaranteed a non-negative surplus from data sharing. Opt-out consumers receive zero. Opt-in consumers receive a positive surplus, equal to the difference between the privacy cost of the marginal type and their own true privacy cost. We can use the first-order conditions to characterize the equilibrium. Clearly, the equilibrium data scale is endogenously determined by balancing the marginal social benefit with the marginal cost of the marginal type τ, so the digital platform collects less data from opt-in consumers, and this further increases consumer surplus.

Based on this uniform policy, we can evaluate the impact of GDPR. We can clearly see that consumers are better off. But consumer opt-out actually reduces equilibrium data analytics, because the investment cost can be recouped only from opt-in consumers, and this further leads to a lower data scale due to the
complementarity between s and φ. This kind of reduction can be quite significant and will generate a negative impact on data analytics. Looking at the evidence from venture capital, as mentioned by Jia, you can see it would be quite significant, about 30 percent. So the long-run negative effect on data analytics is a real concern.

Now, as I mentioned in the first-best benchmark, the uniform data policy is not optimal, because we have heterogeneous consumers. So, quite naturally, we need to think about a mechanism design approach. Actually, what we found is that many digital platforms, like the Premier League and Prolific, have already started offering a menu of cookie options: when you click, they give you a menu, and you can select different types of cookies according to your preferences. So we are going to characterize the optimal mechanism, which is a kind of menu of options. Suppose the digital platform offers a type-dependent data scale s(θ) and transfer t(θ). Of course, the digital platform does not know the value of θ; that is private information. So there is a standard IR constraint and an IC constraint. We can use a standard mechanism design approach to characterize the platform's profit. After some calculation, we can rewrite the platform's profit as the integral of w(θ) dF(θ), minus I(φ), where w is the net social benefit from a type-θ consumer. Compared with the first best, you can see an extra cost term; that is the information rent. So here the profit π is equal to the social welfare function under asymmetric information, which indicates that the optimal policy actually maximizes total social welfare under asymmetric information. By solving the first-order conditions we can characterize the second-best contract, the second-best policy: the data scale and the transfer for each type θ are characterized by first-order conditions. This optimal
mechanism design approach can generate some important policy implications. First of all, we can see that GDPR established a set of principles, but it does not provide detailed guidelines, and in order to implement the second best there is a long way to go. In particular, under current GDPR compliance practice, cookie policies can take different forms. Most digital platforms do not provide detailed specifications of the different types of cookies. Even when some platforms offer a menu of options, consumers do not understand how much and what kind of personal data will be collected by a particular cookie. So, of course, choosing a cookie option is not as straightforward as choosing a mobile phone.

In this paper we therefore propose some guidelines for the implementation of the second best. The most important one we would suggest is to characterize and standardize cookie specifications. For instance, a non-essential cookie specification should include the variety of data being collected, the volume of data being collected, the purpose of the data, how the data is going to be analyzed, and the value of the data. Coinciding with my idea, I later found that Apple indeed introduced a similar concept, called privacy labels, in December 2020. They require Apple developers to disclose their data collection by filling out so-called nutrition labels. This is a similar idea to the nutrition facts labels on food packaging, where maybe you are most worried about calories and protein. From Apple's website you can see that the Apple form defines 14 data types, 32 specific data items, and six data uses. Further, Apple also classifies three categories of purposes: data used to track you, data linked to you, and data not linked to you. So Apple's move is an important endeavor to standardize
specifications of data features. But there is one limitation, because its design does not allow menu pricing. So it contributes to implementing the optimal uniform policy, but not the second best. We expect Apple can improve this in the future.

Okay, so with limited time I can only introduce one or two simple extensions and applications. One important, significant impact of GDPR is on third-party cookies. Third-party cookies are generated by external domains, typically by advertising platforms. After GDPR was rolled out, several dominant browsers, like Safari and Firefox, blocked third-party cookies, and recently Google announced its move to block all third-party cookies in 2023. This move is of course going to have a significant impact on online publishers; according to empirical estimates, it could reduce their revenue by 50%. But we think it is also important to examine the welfare effect on consumers. So I use a variant of the baseline model to address this issue and try to capture two features of third-party cookies. First, third-party cookies can track consumers across different websites, and therefore cause a higher privacy cost than first-party cookies. So I assume the privacy cost is γsθ, with γ greater than one, versus sθ for first-party cookies. Second, digital platforms do not bear the cost of data analytics, but they share the third party's profit from selling ads.

With this very simple setting, we can understand that before GDPR the digital platform does not bear the cost of the data collected by the third party but can share the profit, so of course it will allow maximum data collection, and it does not care about consumers' privacy concerns. After GDPR, however, the digital platform needs to compensate consumers for the privacy cost caused by third-party cookies, so now it faces a cost-benefit trade-off. Small platforms, which do not have their own ad servers or demand-side platforms, still rely on the third party to
generate revenue from advertising, but dominant platforms like Google will phase out the third party when the privacy concern is significant enough. We find that phasing out third-party ads increases consumer surplus. Of course, the main concern here is that Google could further enhance its market power through this move, so the two need to be balanced. I also consider data acquisition with personalization, but I don't have time to discuss this. The baseline model can be easily extended to incorporate personalized pricing, and the results and policy implications are quite similar.

Now, with about two or three minutes left, let me briefly discuss the related literature. There is a growing literature on data collection and data intermediation; I just need to list several, like the excellent work of Acemoglu, Bergemann, and Ichihashi. They treat processed data as information, as a signal, and their studies provide a very important and solid microfoundation. Here we separate data and processing as two strategic inputs and emphasize the different nature of the benefits and costs of these inputs. They also treat consumer privacy costs as a reduced form of utility and assume homogeneity across consumers; we consider heterogeneous consumer privacy costs. The most important relation is with Jay Pil and co-authors' excellent paper, but we have different modeling features and policy implications. They consider heterogeneity of privacy sensitivity over different types of personal data; for instance, consumers may be more concerned about their income data than their address. What they find is that excessive data collection before GDPR is mainly driven by data externality, and they also argue that monetary inducement for opting in should not be allowed. We consider heterogeneity of privacy sensitivity across consumers, and we do not consider data externality, but we find that excessive data collection is caused by a market failure due to bundling, and
we show that the monetary transfer plays an important role in fixing excessive data collection, and we also characterize the second best.

So, concluding remarks. We develop a new model to analyze consumer data acquisition and privacy regulation, and it has two features: we treat data and data analytics as separate inputs, and we consider heterogeneity of privacy costs. We identify a market failure before GDPR, and the model reveals GDPR's role in fixing that market failure. As for the limitations: first, we do not consider data externality as in Choi's paper and others; second, we do not consider heterogeneity of privacy sensitivity over different types of personal data, as in Choi's paper. Okay, thank you very much. That's the end of my presentation. Let me stop sharing.

[Chair] Thank you, Zhijun. Now we have a discussion by Jay Pil Choi. Jay Pil?

[Discussant] Okay, so it's a great pleasure to discuss Zhijun's paper on privacy. It has a very clean analysis with policy implications for an important topic. Given the limited time allocated to me, let me highlight some technical aspects of the model that are crucial for the main results. After my discussion, I'm sure other people can also chime in on policy implications. My comments are somewhat technical, so let me use slides to facilitate my discussion. Okay, can you see the slides? Yes? Okay.

So one crucial assumption of the paper, the basic premise of the paper, is that the platform is ad-financed, so services are provided for free. These assumptions are fine, because there are so many ad-financed platforms and they are very important. However, since the main results are driven by this assumption, we need to understand what role it plays and why. To summarize: services are provided for free, and in addition the value of the service is sufficiently large. What are the implications? All consumers subscribe to the service even with maximum data collection. In other words, there is no extensive margin, so the market is essentially covered
and, without regulation, this automatically implies too much collection of personal data, because privacy is not priced. And what happens with regulation such as GDPR, where GDPR requires opt-in? The platform now needs to compensate consumers to induce them to opt in. So basically the role of regulation in the model is to create a market for personal data.

In this model there is an asymmetric treatment of the basic utility and the extra utility. The gross utility has two components, one is the basic utility and the other is the extra utility, and in the paper all the action takes place in the extra utility term. This point was also raised earlier by Bruno. The assumption is that there is some basic utility and nothing can be done on this component: you cannot charge any price, and you cannot adjust the quality depending on the amount of collected data. So all the action comes from the other component. One question, then, is why nothing can be done on this component. For instance, if the business model is different and the services can be charged for, what happens? And what if the basic utility can be adjusted depending on the amount of data collection? If that is allowed, it essentially creates a market for privacy, so one can say that even without regulation we are going to have exactly the same result. Let me go into this point a little bit more.

By the way, with regulation what actually happens is that there is too little collection of data, and this is due to the Spence effect. In other words, the marginal consumer in this model has the highest sensitivity among the consumers who opt in. That means the average consumer's sensitivity is lower than the marginal consumer's, which automatically leads to too little data collection. This is actually pointed out in the paper.

So what if the service is not provided for free? Then, as I mentioned earlier, even
without regulation we have a price mechanism that will induce the same result as regulation. We can then imagine a situation where data collection gives so much advertising revenue that the platform may want to charge a negative price to induce more consumers to opt in. But we can have a situation where a negative price cannot be charged; in other words, the price cannot be negative. Then the price mechanism does not work anymore, and we have a situation akin to the no-regulation situation in Zhijun's model.

So here we have two different interpretations of the model. In Zhijun's model, he considers no regulation as a kind of absence of property rights for the consumer, and with regulation consumers now have the right to their data. In a sense, the role of regulation is to decide which party gets the property right to the data. That's the point I wanted to think about. I'm looking at this situation from a different angle, more from a Coasian perspective. We know that as long as there is efficient contracting, the allocation of property rights does not matter for the overall final allocation. So basically, as long as the price mechanism is there, we are going to have exactly the same outcome, and then the whole issue becomes the non-existence of a market, or property rights that are not well defined. That's another way to look at it. So that is my main comment.

There are also some minor comments. The first one [inaudible], and this has already been covered. The second point was actually raised by Alex earlier: the assumption is that consumer privacy costs depend only on the data collected, but not on how the data is processed. I think the main qualitative results will go through, but I saw in the paper a few results that actually depend on this assumption. Also,
one thing to think about is the data analytics investment: we can also think about what kind of investment it is. Some investment may just try to enhance consumer experience, while other analytics investment may focus more on how much revenue to derive. So there is also the question of what type of data analytics investment is made, and this could also be discussed. Finally, on the timing in the personalized pricing model: this model was not presented by Zhijun, but there was a slightly strange assumption about the timing. Consumers were allowed to opt out after receiving a personalized offer, if I understand it correctly, but the personalized offer is based on the data collected. So there seems to be a bit of time inconsistency in the setup. These are some minor comments. Okay.
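
The uniform opt-in policy from the talk, and the discussant's remark that the platform caters to the marginal rather than the average opt-in consumer, can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: privacy sensitivity θ is uniform on [0, 1] so that F(τ) = τ, data analytics φ is held fixed so the investment cost I(φ) drops out, and the per-consumer social benefit is the hypothetical form B(s) = s - s²/2. Under these assumptions the platform maximizes F(τ)[B(s) - sτ], while a planner facing the same set of opt-in consumers would weigh the average privacy cost instead.

```python
# Illustrative sketch only. The functional forms are assumptions chosen for
# tractability; they are not taken from the paper discussed in the talk.

def benefit(s):
    """Hypothetical per-consumer social benefit B(s) = s - s^2/2 (phi held fixed)."""
    return s - 0.5 * s * s


def platform_profit(s, tau):
    """Uniform opt-in policy with theta ~ U[0, 1]: F(tau) * (B(s) - s * tau).

    Every opt-in consumer is compensated for the privacy cost of the
    marginal type tau, so the platform keeps B(s) - s * tau per consumer.
    """
    return tau * (benefit(s) - s * tau)


def welfare_given_optin(s, tau):
    """Social welfare from types in [0, tau] at a common data scale s.

    This is the integral of B(s) - s * theta over theta in [0, tau].
    """
    return tau * benefit(s) - s * tau * tau / 2.0


N = 300  # grid resolution (assumption); 1/3, 2/3 and 5/6 lie on this grid
GRID = [i / N for i in range(N + 1)]


def solve_platform():
    """Grid search for the profit-maximizing (s, tau) pair."""
    return max(((s, tau) for s in GRID for tau in GRID),
               key=lambda pair: platform_profit(*pair))


def planner_scale(tau):
    """Data scale a planner would choose for the same opt-in set [0, tau]."""
    return max(GRID, key=lambda s: welfare_given_optin(s, tau))


s_star, tau_star = solve_platform()
s_planner = planner_scale(tau_star)

print(f"platform: s = {s_star:.4f}, tau = {tau_star:.4f}")
print(f"planner scale for the same opt-in set: s = {s_planner:.4f}")
```

In this toy parametrization the platform's first-order condition sets B'(s) equal to the marginal type's sensitivity τ, whereas the planner sets B'(s) equal to the average sensitivity τ/2 of the opt-in consumers, so the platform's data scale comes out lower than the planner's. The numbers are only as meaningful as the assumed functional forms.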