We will have an introductory talk followed by short presentations of six papers. We'll have a short Q&A session after the third paper presentation and a hopefully longer Q&A session after the last one. Please feel free to use the chat room to post your questions or ask them directly during the Q&A sessions. The introduction will last for 15 minutes and then each speaker will have eight minutes to present their paper. I will be very strict in enforcing time limits because we're on a very tight schedule, as the session needs to end at 5:30 sharp. At the end of the session, this room will close and everyone will be automatically transferred to the main room, where you will be able to choose between the next two parallel sessions.

The introductory talk will be given by Alessandro Bonatti. Alessandro is an associate professor of applied economics at MIT Sloan. His main research fields are microeconomic theory and industrial organization. He has worked extensively on the economics of data and is widely acknowledged as a world-leading expert in this area. Alessandro, we're thrilled to have you with us today. You have 15 minutes for your introduction and then eight minutes for the presentation of your paper. The floor is all yours.

Thank you, Yasin. Thank you so much. Please do keep me on time on both segments. Also, thank you to the organizers, Alex, Daniel, Jacques and Paul, for thinking about me for this. It is a great honor. It also goes without saying that yesterday was not a very easy day to focus, at least in the United States. But as we all know, data sharing and data markets matter for things beyond economics as well, so, motivated by that, let's go.

Thinking about the session, I was reminded of the catchphrase that data is the new oil. The Economist has been running this for at least four years, but it is ever more true and is shaping wider and wider sectors of the economy. And it's true that data sharing, which is the theme of our session, enables the creation of enormous surplus for consumers and firms alike: ratings, recommendations, traffic directions, means of transportation, personalized results, tailored products, tailored news coverage, but also, for advertisers, custom audiences and various types of consumer metrics like scores or profiles. There are associated risks: product steering (we heard John talk about search results), tailored prices, election influence operations, addictive social media, and the ever more relevant and related phenomenon of echo chambers. As the picture suggests, this enormous potential with an ambiguous sign comes with incredible concentration of power, market power, in the hands of a few platforms. We know which ones they are in the U.S., and it is possibly even more so in China, with JD, Tencent and Alibaba. So I think it's useful to begin this session by thinking about how market power and the success of these players is related to their ability to facilitate data sharing from consumers and for consumers.

So let me take a theory step back and think about how you want to share data and how you do not want to share data. We often hear, at least in econ and IO talks, about companies selling data, but we should not take that too literally. You could imagine Google as an intermediary collecting consumer data and then selling it back downstream to some firms. That type of model would run into a lot of problems, and they've been pointed out by Arrow in 1962. First of all, when buying consumer information directly, quality is hard to verify.
Second of all, once you've put the data in the hands of a firm, they can use it for, presumably, a very long time. So this doesn't sound like a very successful business model, and it's not the business model that they use. Most data is shared in the form of bundled services and information online. We all know that those large players act as platforms with two sides, where users submit queries to the platform (think of sponsored search) and advertisers submit queries to the platform: who do you want to target, where do you bid? And the interaction happens on the platform, or is mediated by the platform, in a way that never makes the actual data change hands. Of course, the value creation is exactly the same, because as a firm here I would like to use data to customize my action, my price, my message to consumers, and I get to do exactly the same thing in the platform model by bidding for targeted keywords. I think this distinction is important for two reasons. One, to clarify what we all mean in this session when we say we're selling data: we mean the second model. Second, a lot of the key elements that we're going to see in this session are actually central to this indirect-sale-of-information type of business model, but I should say it's not even our invention; it has been pointed out in the context of finance since at least Admati and Pfleiderer in 1990.

So what are the main themes of the session that we're about to participate in? One is that there's a lot of potential and value from combining data, and combining data is going to be a feature that plays out on at least three levels. One is statistics: can I merge two data sets and learn more from them? Another one is strategy: how should I leverage additional data? And the third one is regulation: should it be allowed or should it not be allowed, and to what extent? The second theme is that the platform's market power manifests itself to consumers through, let me say, the advertisers, or whoever buys the data. Platforms monetize the data on the advertiser side, which means that a lot of the implications that are going to come for consumers actually come through something like the pass-through rate of advertising costs, or the implications of putting data in the hands of firms. And third, but in some sense first, even in my earlier diagram: the information must be sourced first. So consumers' privacy preferences are going to matter for this and for the amount of data that is endogenously available. So these three themes: combining data, market power essentially exerted through the buy side of the data, and privacy preferences mattering. These are the three themes that, if I were you, I would look for in all, or at least most, of the talks that we're about to see in this very short format.

Now let me spend the remaining, I guess, seven minutes talking a little bit about the details of the papers we're about to see. I'm not even going to try to summarize them; I will not do them justice in 90 seconds per paper, it seems. It doesn't seem optimal. So here are the catchy titles; I'm going to fill in a network graph or something like that. But I'm going to start with Tesary's paper on privacy preferences.
At the end of the day we're all interested in data and in observing market outcomes, and her point in the paper is that it's really hard to infer privacy preferences from realized market data, because we don't know whether the consumers who opt out are the ones with the most at stake or the ones who are most privacy conscious and have an intrinsic preference for not revealing their information. Now, she will tell you all about how she set up an experimental study to separate the intrinsic and instrumental components of preferences in a way that is, from my ignorant perspective, impossible with actual reduced-form market data. But my favorite feature of this paper is that it can actually then estimate the gains from trading data in the context of the experimental setup. In the experiment there are users deciding whether to reveal their information and a firm that's going to do something with it. And the main result there, for me, is that acquiring data from consumers to build a model and then leveraging it, say to set prices, does not generate positive gains. What does generate enormous gains from trade is acquiring data from a sample of consumers, a very small number of them, using it to build a model, and then leveraging it with everybody else, which is of course what you would do in A/B tests and the like. So this highlights the presence of a data externality across different agents. This data externality is going to come back in my talk, but I don't want to advertise it now; I'll come back to it at the end. It also relates to the paper that Alex will present on data and product targeting, where, in a very different sense from Tesary's paper, this externality is going to come across markets.

So Alex's paper is one of two in this session where we're going to be essentially looking at information structures in the Hotelling model. In Alex and Uli's paper, two firms on the Hotelling line compete in prices and product varieties and have access to differently informative signals about the consumer's location. And to me the key element that he will emphasize is that better signals allow you to offer better-targeted products, but if you have worse signals than your competitor, then you're going to have to cut prices. What does this have to do with data externalities across markets? Well, if the signal precision is endogenized and one firm has an exogenous advantage from a different market in capturing more data about its own consumers, then the resulting equilibrium is going to be asymmetric, meaning one firm is going to have very precise information and one is going to have very imprecise information, with interesting implications for consumer welfare that we'll maybe talk about later. But the asymmetric nature of what we will see emphasizes how combining data, not from one consumer to the other but from one market to the other, can be a barrier to entry.

You can almost view Alex's paper as a microfoundation for one step in what Daniele is going to present, a paper that doesn't name too many names, but it's essentially a study of Google/Fitbit and what we can learn from that, and it shows us how a firm in a large primary market can use a data-intensive, unrelated secondary market to deter entry into its own ground. It's almost like using Fitbit data as a barrier to entry.
The idea being that if you're Google, operating in an advertising market, you can supplement your data with the Fitbit data from a secondary market, not to offer better deals or monetize Fitbit users, but to protect yourself and your core business from entry in targeted advertising. So a key question that's going to be prompted, and Daniele and Jorge are beginning to study it, is how competition unfolds in the primary market here, and a key distinction will be whether the Fitbit data gets merged with the Google data or gets siloed. That is, again, a statistical and a strategic question, and it also has a regulatory dimension. There is tangential work on this topic by Darren and co-authors and by Shota Ichihashi, but, you know, competition with different levels of data, it's an open topic.

So at this point we've seen how externalities and combining data across consumers and markets can generate a ton of value and also have rich welfare consequences. The last two talks, in my classification here, are going to be about the implications for consumers when prices are essentially strategic, and there are two very related papers that use two very different methods. So Rishabh's paper is, if anything, bringing a fresh modeling device, at least to me, to this IO theory topic: it's a directed search model of a platform, where data sharing by consumers makes the platform more attractive. What happens at this point is that better data and better matches are going to lead to externalities through a different channel, the channel of product market concentration. The story there is going to go along the following lines: if the platform is more attractive, the advertisers are captive, because all the consumers are there and the advertisers have to go there; in a free-entry world that means there will have to be fewer of them, and since they're more concentrated, they'll get better terms of trade with the consumers. So this is one way in which we can come back to that point about the consequences for consumers coming from the other side: here better data squeezes advertisers and leads to greater market concentration. You could say exactly the same thing in a market with sponsored search, where the platform would offer better matches at higher prices, and then again the question of pass-through would be critical for the consequences for consumers. I think this is a fruitful area of research. There's an empirical paper by Decarolis and Rovigatti that in the end connects to this, but it's not a direct match.

And last but not least, Antoine's paper is also going to look at selling information to competing firms and the implications downstream, with an information structure that is very dear to me, because I call it coarse cookies, from an old paper that I had with Dirk, where essentially firms are enabled to identify specific consumer types and then pool everybody else. The innovation in Antoine's paper is to compare very different selling mechanisms, auctions, negotiations, and take-it-or-leave-it offers, that really get at the heart of whether information should be sold exclusively or not. There's also an implied reference there to an even earlier paper about how you can't sell information to Bertrand competitors, because there will be no gains ex post. So the finding here is that when the data is very precise, it gets sold exclusively, via an auction for exclusive access.
And so here again we have a connection between the amount of data that gets collected, whose hands it's going to go to, and what the implications for consumers are going to be. Again, I could tell the same story here with selling data on each consumer point by point and then looking at the implications of raising the marginal cost of advertising. These are just different channels through which this idea, that the welfare effects for consumers are mediated by the data buyers, is going to come up.

So those are the three themes that we're going to see: the data must be sourced, and privacy preferences, be they instrumental or not, matter; and combining various kinds of data, mostly endogenously priced, or just information acquisition. All the papers are different, of course, but that, I think, is the starting point for why I think this literature is cool. So I am very much looking forward to the other five talks. Yasin, how are we doing on time?

Well, the time for the introduction is over. Thank you.

Thank you. So I mentioned the data externality a couple of times, and of course my own reading of all of your papers is shaped by what I've been thinking about. The paper that I am presenting, with a co-author who's a graduate student at Yale, is entirely focused on this idea that correlation of traits across consumers can lead simultaneously to a loss in privacy, if this data changes hands, and to a gain in information, because I get to learn from you. Of course, this only happens in the context of a platform that mediates information. The picture here at the bottom is small, but the idea is that there are going to be many consumers who will trade information, really in this indirect way (but let me not say it again), with a platform that will send information back to them in the form of recommendations. On the other side, the intermediary is just going to sell the data to advertisers, and then at the bottom of my triangle here, at the base, the advertiser is going to have an interaction with the consumers which will be informed by the data. I'm about to tell you a story of price discrimination, but that's really to give you the easiest illustration thereof.

What's the key economic force? It's an externality, so there's a wedge between social efficiency and equilibrium outcomes. The wedge is given by the fact that consumers are only compensated on the margin for the consequences of revealing their data, given that everybody else is already doing it. So we will highlight two types of market failure: intermediation of data that is inefficient, and lack of intermediation of data that would actually be, sorry, efficient to intermediate. The most distinguishing feature of our paper, I think, is that we want to think about what kind of data will get endogenously intermediated. Will it be aggregate or will it be personal? And the main result, which I won't have time to go into in any detail, will be that this platform will optimally choose to sell anonymous, or aggregate, data (the two are synonyms in the model) if and only if intermediating personal data would be inefficient. So while on the one hand we are going to obtain inefficient intermediation in general, the aggregation level of the data will be socially efficient; and not only that, it will drive increasing returns in sample size and sort of provide a foundation for how and why we think that these platforms have an advantage in acquiring humongous data sets. So let me tell you a little bit more about the model.
In the model we are going to try to be as detail-free as possible while maintaining quadratic payoffs, so really we're going to be thinking about a joint distribution of consumer types, but also incomplete information on the consumer side. Each consumer has a type; these types are correlated, and that's why there are externalities; but we only observe informative signals. Signals and their errors can also be correlated: we can all have the same biased impression of the value of a new product, and filtering out the noise will be the source of value. This is going to be cast in a world where my type is the intercept of my demand, but I don't know my own demand, and the producer could charge personalized prices, but they don't know our demand functions either.

So in the data market, the information of the consumers is going to be exchanged as follows: each consumer can agree to sell their data, their signal, for a payment, and the producer, who is going to buy this data, can also pay a fee and access a data-outflow policy. The data-outflow policy specifies how much data the firm gets and what recommendations for buying the consumers get on the other side, which of course is also payoff relevant for the producer. So how is the data going to be used? Well, I'll use the data and the recommendations to update beliefs about my type; the firm will use it to update their beliefs and charge personalized prices. That's the illustrative example.

Let me just tell you two things: what the key modeling choices are that we see as crucial and descriptive of this environment, and then a summary of the results. What are the modeling features that we insist on and that are important? I'm de-emphasizing the price discrimination aspect because, again, you could plug in any other game. One: any information beyond the common prior is held by consumers, so again it must be sourced before it can get used, and it creates value for consumers by teaching them about their own preferences. It could be that we all have the same type, and so by learning your signals I learn something about my type; it could be that we all make the same mistake, and by learning all your signals I can filter out the error and, by difference, learn my type. At the same time, this social data can be exploited by any data buyer; the price-discriminating monopolist stands in for all the other negative uses that I mentioned at the beginning of my introduction. And the third feature, which is not ubiquitous but I think is representative, is that participation constraints hold ex ante: I don't decide whether to adjust my posts on Facebook depending on my type, but I do decide, on average, whether to accept the terms of service and conditions and whether to use the platform or not.

So with these, essentially, four key ingredients, we look at the contracting game between the platform and the consumers. We show that those m_i, or m_i*, the payments that go to consumers, are actually only determined on the margin, where consumers are compensated for the individual harm the data might cost them, not the social one. And we show that the cost of acquiring information vanishes, while the gains persist, as the market grows, even if the total effect of information is to reduce social surplus. So that's a pretty bleak picture: it's just saying there's very little chance that, in this market with externalities and market power, you're going to get an efficient allocation of data.
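A minimal sketch of this marginal-compensation logic, with assumed notation (H_i for consumer i's harm from intermediated data and x_i for the signals are my labels, not the paper's):

```latex
% Participation is priced on the margin, given that everyone else shares:
m_i^{*} \;=\; H_i(x_1,\dots,x_n) \;-\; H_i(x_{-i}),
% i.e., the incremental harm of adding signal x_i to the pool. With
% correlated types, the other signals x_{-i} already reveal most of
% consumer i's type, so
\lim_{n \to \infty} m_i^{*} \;=\; 0,
% and acquisition costs vanish as the market grows, while the platform's
% gains from the pooled data persist, even when intermediation is
% socially inefficient.
```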
Things become a little bit better once we look at the optimal data-sharing policy, because we find that it's actually more profitable for the platform to induce uniform prices rather than personalized prices, and to give personalized product recommendations if we extend the model in that direction. Why is it optimal to forbid price discrimination by intermediating aggregate data? The wrong intuition is: oh, you're protecting the consumers' privacy, therefore it's cheaper for you to buy this data. That's true, but it's also true that the firm is willing to pay less because the data is less precise, so that can't be it. The reason is that aggregate data makes the consumers' signals closer substitutes, and because you're only compensated on the margin, that makes the marginal compensation proportionally smaller and therefore makes intermediation more profitable. It remains that there's a lot of work to be done on the policy side, on the regulation side, and on the design side, because we do get socially efficient anonymization but not socially efficient intermediation decisions.

Alessandro, I must... I think I'm going to leave it at that and look forward to the rest of the papers. Okay, thank you so much. We're now moving to our next talk. The next speaker is Antoine Dubus from the Free University of Brussels. Antoine, the floor is yours.

Thanks, Alessandro, for your very nice introduction, and thank you everyone for attending the presentation of my paper, Market for Information and Selling Mechanisms, which I have been writing with David Bounie and Patrick Waelbroeck from Telecom Paris. In this paper we deal with the issue that access to data provides firms with a very strong competitive advantage. In particular, the starting point for this research was that in 2018 it was revealed that Facebook was providing several firms with a very specific, special access to data, and this was giving those firms a very strong competitive advantage, while other firms were denied this very same access. By doing so, Facebook was shaping competition in markets. It is, I think, one of the many reasons why in 2019 the German competition authority decided to prohibit Facebook from combining consumer information from several sources, and in doing so there is this recognition that controlling access to data, which is usually done by a data protection agency, will actually also help restore competition in markets. So there is this strong relation between access to data, data protection, and market competition, and we are investigating this relation in this paper. What we do is show that this one-way relationship actually goes both ways: if you control market structure, and in particular we will focus on the selling mechanisms, you will also have an impact on the profitability of data, on the incentives of firms to collect consumer data, and on market competition and consumer privacy. So what we do in this paper is consider several selling mechanisms, and we see how the strategies of consumer data collection and consumer data sale by intermediaries change with different selling mechanisms.

To give you an idea of what we consider in this paper, we look at the following, now classical, representation of the market for consumer information, where you have a data intermediary that collects k segments of consumer data, recombines these segments, and sells them to competing firms, which then use them to optimize their interactions with consumers, in our model with price discrimination. What we are interested in is the relation between the data intermediary and the firms:
in particular, how information is sold, and how this impacts, on the one hand, the amount of consumer data that is collected by the intermediary, and on the other hand, consumer surplus and competition in the downstream market. In the paper we consider a more general representation of selling mechanisms, but for the presentation we focus on three of them, take-it-or-leave-it offers, sequential bargaining, and first-price auctions, which are commonly used in the industry. For each selling mechanism we look at the number of consumer segments that is optimally collected and sold in equilibrium, and then we compare across mechanisms.

The timing of the game is the following. First, the data intermediary chooses how many consumer segments to collect (I will define on the next slide what k and j are). In the next stage, the data intermediary sells information strategically. And then firms, with or without information, set prices and compete. The way we represent the market is a standard Hotelling competition model where firms are located at the extremities of the unit line and are uninformed about consumers' locations. The data intermediary, here a data broker, has information that allows it to segment the unit line into k segments, and with this information it can recognize whether a consumer belongs to one segment or another. The number of consumer segments k is actually a proxy for the precision of information, because with more segments you have finer segments, and so you can recognize consumers more precisely. This is for the data collection stage.

But the intermediary will also sell information strategically; that is, it will sell an information structure that we describe at the bottom of this slide. For instance, here it will sell to a firm all consumer segments up to a cutoff point j and nothing after. Why is this partition optimal for the firm? Because consumers with a high valuation are identified, and this allows the firm to extract surplus from these high-willingness-to-pay consumers. But the more segments are sold, the higher competition will be on the market; this will lower the price set by the other firm as a reaction, and that will lower the ability of the informed firm to set high prices. So there is a threshold after which selling an additional segment increases competition too much compared to what you can expect from additional surplus extraction from consumers, and hence there is an optimal j.
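A minimal sketch of the information structure just described, with assumed notation (equal segments of the unit line and the cutoff j are my rendering of the slide, not taken verbatim from the talk):

```latex
% The intermediary partitions the unit Hotelling line into k segments:
S_m \;=\; \left[\tfrac{m-1}{k},\, \tfrac{m}{k}\right), \qquad m = 1,\dots,k.
% The structure sold to a firm identifies segments only up to a cutoff j:
\sigma(x) \;=\;
\begin{cases}
m & \text{if } x \in S_m \text{ and } m \le j \quad \text{(identified)}\\
\varnothing & \text{if } x \in S_m \text{ and } m > j \quad \text{(pooled)}
\end{cases}
% The optimal j trades off extracting surplus from identified
% high-willingness-to-pay consumers against intensifying price competition.
```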
With the three selling mechanisms that we consider in the article, this j will be the same for a given precision k, but in the general formulation we see that j can be a strategic element chosen by the intermediary to optimize its revenue, and it will also be linked to the level of competition on the market.

To jump to the results: what we show is that the selling mechanism has a strong impact on the number of consumer segments that are collected by the intermediary. In particular, as you can see here, consumer data collection is minimized with take-it-or-leave-it offers and maximized with the sequential bargaining mechanism, and you have an inverse relationship with consumer surplus. You can see here that a data protection agency wanting to minimize consumer data collection would rather have the take-it-or-leave-it mechanism, and the same holds for a competition authority wanting to maximize consumer surplus. But then, if you turn to what's chosen by the industry, take-it-or-leave-it offers are usually not what's preferred, because they minimize the profits of the intermediary. So one conclusion of our model is that you have this potentially strong conflict between what is best for the industry and what is best for regulators: as the selling mechanism has a strong impact on the amount of consumer information collected and sold, but also on consumer surplus, there is a tension between what is chosen by the industry and what would be chosen by regulators.

Now, if you look at selling information to both firms, the information partition has the same feature as before, and a share of consumers is left unidentified to firms so that competition on the market remains relatively low; this increases the profits of the firms and their willingness to pay for information. So you have the same mechanism as when selling information to one firm. And we show that, among the three selling mechanisms, only with take-it-or-leave-it offers does the intermediary want to sell information to both firms. With the two other mechanisms, selling information to only one firm is preferable, because selling information to both firms lowers the price of information: it increases competition on the market too much and decreases the firms' profits too much. So another conclusion of this research is that the selling mechanism has a strong impact on which firms can acquire information, and, in particular, take-it-or-leave-it offers are again preferred by regulators, because they allow fair and equal access to information and higher competition on the market.

Finally, among the regulatory remedies that we explore, one stands out, which is open data, as has been proposed by several reports and scholars. We show that in our setup open data brings us back to the same equilibrium as the one with take-it-or-leave-it offers, where both firms compete fiercely and there is fair and equal access to information. This is better for consumers, but it also minimizes consumer data collection. So this is the end; I am happy to answer any questions if you have some.
Perfect timing, thank you very much, Antoine. We'll take time for questions later, but for now let's move to our next speaker, Tesary Lin from Boston University. Tesary, the floor is yours.

Right, thank you, Yacine. So I'm going to talk about measuring consumers' valuation of privacy, and here I will distinguish between intrinsic and instrumental preferences. The paper also uses the variation in privacy preferences as an entryway to understand the consumer side of firms' data collection and inference strategies. Now, the distinction between intrinsic and instrumental preferences was first proposed by Gary Becker in 1980. The intrinsic preference is a taste: people may value privacy regardless of any particular economic consequences associated with data sharing, because they associate privacy with personal freedom and autonomy. You can see this in the recent contact-tracing example, where people seem to be intrinsically averse to the idea of being tracked. The instrumental preference is more familiar to us as the concern that revealing one's private information to a firm may lead to negative economic outcomes. For example, a risky driver may be very reluctant to share their driving history with an insurance firm, whereas a safe driver may be pretty okay with doing so.

Now, why is it useful to empirically tease apart the intrinsic and instrumental preferences? The first reason has to do with selection. In the previous example, we can see that a model with pure instrumental preferences predicts that the consumers who choose not to share their data are mainly the low types, meaning the drivers, or consumers, who would otherwise get negative economic outcomes upon revealing their private information. However, such an insight does not necessarily hold when consumers also have heterogeneous intrinsic preferences. For example, suppose safer drivers actually intrinsically care about protecting their location information more than the rest of the population, because they feel more strongly that letting the firm know where they are at what times is creepy. In that case, we should expect that the set of people who choose not to share their data may be statistically tilted more towards the safe drivers. So empirically characterizing the respective heterogeneity of the two preference components helps us better understand this selection pattern. The second reason has to do with endogeneity. The intrinsic preference is a utility primitive, but the instrumental one is endogenous, driven by the firm's ability to extract insights from the data and by how the firm uses the data to deliver targeted payoffs. So measuring their respective magnitudes is going to help us understand, for example, the impact of a new policy shock; it can also help us understand, when the firm's data-use strategy changes, for example when the data collection strategy changes, whether, or to what extent, privacy choices actually respond.

So, for the sake of time, here is what I'm going to talk about. First, I will talk about how I use an experiment to empirically separate the intrinsic and instrumental preferences. I will show you the revealed-preference measures and the dollar magnitudes, and I will also show the heterogeneity across different demographics. Then I will show how the structural model that I estimate backs out consumers' beliefs about the instrumental outcome as the utility primitive; in doing so, it takes care of the fact that the instrumental outcome itself is endogenous.
Then I will show you briefly how the empirical selection pattern is driven by the heterogeneity of, and the correlation between, the two preference types. Now, one part of the paper, which Alessandro also mentioned, is the data collection strategy, which takes care of this information externality; for the sake of time I'm not actually talking about that, so I'm very grateful that Alessandro summarized that part at the beginning.

So let me give you a very simplified version of the consumer's utility model. There are actually several components that often get confounded with each other, especially in observational settings, so I want to show you why they are different and how my experiment is going to help in teasing them apart. Think of a consumer that gets a request from a firm to share their personal data. The decision is binary and involves a tradeoff between the privacy cost and the benefit from sharing. Here you see there are actually three components: the intrinsic preference, which is the primitive; the instrumental preference, which is induced by the targeting scheme; and then an additional benefit term, the compensation. This compensation is what the firm offers to consumers before the firm knows their private information. For example, it can be a one-off discount that the firm offers to encourage drivers to adopt a UBI (usage-based insurance) device. The compensation is independent of the private information, because at that point the firm doesn't know the consumer's private type yet; as a result, it's very distinct from the instrumental incentive, which is always a function of the consumer's private information.
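A minimal sketch of this three-part decision, with assumed notation (τ_i for the intrinsic cost, θ_i for the private type, b(·) for the targeted bonus, and m for the compensation are my labels, not the paper's):

```latex
% Consumer i shares data (d_i = 1) iff net utility is nonnegative:
d_i \;=\; \mathbf{1}\Big\{
\underbrace{-\,\tau_i}_{\text{intrinsic}}
\;+\;
\underbrace{\mathbb{E}_i\big[\,b(\theta_i)\,\big]}_{\text{instrumental (belief-driven)}}
\;+\;
\underbrace{m}_{\text{compensation}}
\;\ge\; 0 \Big\}.
% m is paid before the firm observes theta_i, so it is type-independent;
% b(theta_i) depends on the type, and the experiment turns it on or off
% across treatments to separate the two preference components.
```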
So the experiment solicits revealed preferences for privacy from consumers by requesting that they share their responses to sensitive personal questions. I turn the instrumental incentive on or off across treatments by using a bonus term that the firm offers: this bonus is an additional term paid when the firm learns from the data that the participant is a likely customer, based on their income profile and their enthusiasm about the product that the firm sells. In doing so, I am able to tease out the intrinsic and instrumental preferences. In addition, I independently vary the compensation term, a price for data that is the same across participants; this allows me to translate the preferences from the utility space to the dollar-value space.

For the sake of time, I will show you three sets of results briefly. First is the intrinsic preference distribution. This shows the intrinsic preferences across the different personal data requested, along the y-axis, and across consumers, along the x-axis, with the scale in terms of willingness to accept; a wider spread of the distribution indicates more heterogeneity. You can see, for example, that a consumer at the 95th percentile often values their privacy intrinsically more than twice as much as a consumer at the median. I also find meaningful variation in privacy preferences across different demographic strata. For the instrumental preference, what we are interested in is what consumers' beliefs look like and whether they are consistent with the actual targeted incentive scheme that the firm offers. Why do we care about this? It is this belief that ultimately determines the scale of the instrumental preference. For example, suppose an insurance firm gives vastly different contracts to different types of drivers, but the drivers are actually not aware of that; in that case, the instrumental preference would be zero. My estimation results show that consumers' beliefs are first-order consistent with the actual payoffs. There is a caveat here, though, which is that the estimation results are based on an information environment that is relatively transparent.

Lastly, we want to understand how the two preference components jointly determine the empirical pattern of selection into data sharing. It turns out that what matters are two things: how the two components are correlated, and which one is more heterogeneous. If the two are positively correlated, or when they are independent, then the classical prediction holds, which is that the consumers choosing not to share their data are more likely to be low types. However, the opposite selection pattern may hold when the two are negatively correlated and the intrinsic component is more heterogeneous, so that it dominates the selection pattern implied by the instrumental preference. To sum up: consumers care about privacy both intrinsically and instrumentally. The heterogeneity patterns and the magnitudes shown here have implications for data collection, namely that the firm will only find data collection feasible by leveraging the additional data externality among consumers. My paper also talks about inference: from a statistical or econometric standpoint, it develops methods to analyze consumer data that account for the selection pattern. I think that's all from me for now, so thank you.

Thanks very much, Tesary. So let's move to our next speaker, Alexander Guembel from TSE.

The project is called Data, Product Targeting and Competition; it is joint with Uli Hege, and it's research in progress, as it were. I don't need to motivate the general topic of the paper, just to say that, on this issue of how this increased access to data is used, a lot of the research that has been done looks into data enabling price discrimination, and that's obviously a very important issue. What we want to do is move a little bit away from that and think a bit more about how data can be used for product targeting, that is, how to actually tailor a product offer on the basis of available data. What we are then interested in is, of course, understanding how this affects competition, incentives for data collection, incentives to provide data, and possibly regulatory implications. What I'll present here is really the basic model and some of the economics that go on in it, without going too much into the extensions and regulatory implications.

The model is incredibly simple. We just have a consumer with a unit demand and an ideal product specification, so this is like a Hotelling line of infinite size, and there are two firms who get independent noisy signals about the preferred location of the consumer: they basically learn in which interval the true taste parameter lies, but they don't know where within that interval the preferred product specification lies, and the size of the interval measures the precision of the information. Based on this information, the two firms simultaneously choose a product specification and a price; there's no first choosing a location, learning from it, and then setting a price. So you obtain this game where you choose the location and the price at the same time, and production costs are normalized to zero. The consumer's utility from purchasing from firm i is just a valuation v, which is a common-knowledge parameter in this model, minus the distance between the actual product that's being offered and the preferred specification, minus the price that you need to pay. So that's the model; it's really very, very simple.
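A minimal rendering of this setup, with assumed symbols (θ for the consumer's preferred specification, a_i and p_i for firm i's chosen location and price, σ_i for the length of its signal interval; the notation is mine, not the slides'):

```latex
% Consumer's utility from buying firm i's product:
u_i \;=\; v \;-\; \lvert a_i - \theta \rvert \;-\; p_i .
% Firm i's information is an interval of length sigma_i containing theta:
\theta \;\in\; I_i \;=\; \Big[c_i - \tfrac{\sigma_i}{2},\; c_i + \tfrac{\sigma_i}{2}\Big],
% so a smaller sigma_i means more precise information. Given I_i, firms
% simultaneously choose (a_i, p_i); production costs are normalized to zero.
```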
Now, what are the economic effects going on here? Let's think about what happens if you improve the information of one firm, meaning you make the interval smaller in which the true realization of the preference may lie. What happens? Well, on average both firms will now offer more similar products, because they know more precisely what the preferred product of the consumer is. If they offer more similar products, they end up competing more fiercely with each other; so more information actually makes competition fiercer. There's a countervailing effect to that: if you give one firm better information, it knows that it can tailor the product better, so it increases consumer capture; it allows that firm to increase the price and still sell with the same probability, if you want, so it can afford to increase the price. So there are two effects that go in opposite directions. How do they pan out in equilibrium, as it were?

We distinguish between what we call an information laggard, the firm that has less precise information, and the information leader, who has more precise information. The laggard will charge a lower price in order to compete against the informationally superior leader. But what's interesting here is that improving the laggard's information actually reduces both firms' prices, so effect A, if you want, is dominant: competition gets fiercer. If you improve the leader's information instead, this has an asymmetric effect: improving the leader's information will increase the leader's price, so effect B dominates there, but it will reduce the laggard's price.
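A compact restatement of these comparative statics in the interval notation assumed above (smaller σ means better information; the signs are my transcription of the stated results):

```latex
% Improving the laggard's information (lowering sigma_lag) lowers both prices:
\frac{\partial p_{\mathrm{lead}}}{\partial \sigma_{\mathrm{lag}}} > 0,
\qquad
\frac{\partial p_{\mathrm{lag}}}{\partial \sigma_{\mathrm{lag}}} > 0 ;
% improving the leader's information (lowering sigma_lead) is asymmetric:
\frac{\partial p_{\mathrm{lead}}}{\partial \sigma_{\mathrm{lead}}} < 0,
\qquad
\frac{\partial p_{\mathrm{lag}}}{\partial \sigma_{\mathrm{lead}}} > 0 .
```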
So what's the effect of this on profits and consumer surplus? If you have more data, then this always improves the available choices for the consumer, because products will be better tailored. If you give more data to the laggard, the market becomes more competitive, so that's good for consumers but bad for profits. If you give more data to the leader, then the market actually becomes less competitive, in the sense that the expected purchase price that the consumer ends up paying actually increases, because the effect of the information leader charging a higher price comes through here; and this is obviously bad for the consumer, good for the leader, and bad for the laggard. So there are quite nuanced effects, if you want, going on. Overall, more data will always improve consumer surplus and also total surplus, but, as I showed before, and I want to just show this briefly again, the effect on profits is not that straightforward. If you think about the incentives for data collection: if you're the information laggard, so this is the profit line in this segment here, then more information is actually bad news, because it makes the market more competitive; you actually prefer not to have that information, because with it you end up, if you want, coordinating on a lower price. Now, the only time this is not true is when we're not talking about marginal changes in information but a real jump in information, so that you can, if you want, leapfrog as an information laggard: jump from being a laggard to actually becoming the information leader. So if you can improve your information a lot, then this may be desirable for the laggard.

Just briefly, to compare this with monopoly: the incentives to provide information are actually quite different. In duopoly, the consumer always wants to give more information, because essentially this improves the average product offering and makes firms, broadly speaking, more competitive, so this is good news. In monopoly, that's actually not true: you have a hump-shaped consumer surplus with respect to information, basically because the consumer trades off, on the one hand, a better product with better information, but also ends up being subject to stronger price discrimination, and that effect may dominate. Now, maybe the one result we found quite surprising is that total surplus in monopoly may actually be higher than in duopoly. The reason is that, because the laggard offers a lower price, he sometimes ends up selling a product even though it's less well tailored than the competing product, and that's an allocational inefficiency. You of course have to trade that off against the increased choice that's available when you have two firms with two signals making offers, but in particular the monopoly dominates when the additional firm that enters is, if you want, significantly less well informed, because then you have very different prices and this allocational inefficiency may be important.

Now, this is the basic framework; we can use it in a number of directions to think about various extensions. One is to think about data-driven mergers, where you combine the two signals in one firm, and this may increase total surplus. We can think about the predatory effects of data, to destroy the profits of a competitor and thereby, if you want, deter entry or force exit. And we can think about cross-market learning, where we show that firms may actually offer biased products, but may do so at lower or higher prices, so this effect can go in either direction. Okay, I'll stop here. Thank you.

Okay, great. So thanks to the organizers for having us, and thanks to Alessandro for a wonderful introduction. This is joint work with Thomas Philippon. I think by now this introduction is redundant, but it goes without saying that large internet platforms have fundamentally changed the way, you know, market participants interact, and one reason for this is that they gather and analyze large amounts of data from both consumers and sellers. Of course, there are a variety of benefits and costs to this, some of which have been discussed in the talks so far. So in this paper we're going to ask three related questions. The first is: how does data gathering affect buyers and sellers interacting on the platform? The second: is there a role for regulating the data collected by platforms? And third: how do new internet platforms differ from traditional brick-and-mortar retail ones?

As a preview of the answers: as to how data gathering affects buyers and sellers, we're going to focus on one set of benefits and costs. The benefit is going to be the enabling of better matching between buyers and sellers; the cost is going to be the increase in the market power of the platform relative to sellers, and I'll describe this in a second. That leads to a preview of the answer to the second question, as to whether there's a role for regulating the platforms: it turns out that, because of the latter effect, regulating data can have the effect of reducing the market power of the platform relative to sellers, which in turn incentivizes seller entry and then benefits consumers as a result. Finally, as to how new internet platforms differ from traditional ones, we find that this need for regulation arises only when the data processing capacity of the platform is particularly large, and what we argue is that this is exactly what distinguishes newer platforms from older ones.
Let me briefly talk about the implications for regulation. As we know, there are currently two largely separate regulatory measures being taken against platforms: on the data side there's GDPR, and of course there's a bunch of recent antitrust measures being taken in both the EU and the US against platforms. Our paper suggests that these two measures should be closely linked. In particular, in our model, the ability to gather data contributes to the rising market power of the platform, and so, consequently, the regulation of data collection may have the benefit of increasing competitiveness.

Let me give you a brief overview of the model. It's sort of a standard platform model, like the ones you've seen so far. It consists of buyers, the platform, sellers, and another trading venue which we call the outside market. It's a static model. The first thing that happens is that buyers decide whether to search for a new product on the platform or on the outside market. Conditional on participating in the platform, they make a simple disclosure choice that tells the platform some data about their tastes. Next, sellers decide whether to sell the good on the platform or on the outside market; if they decide to sell on the platform, they have the option of purchasing this data from the platform. And finally, buyers and sellers interact on either the platform or the outside market.

Buyers are pretty simple. There's an exogenous mass of them, and they have uncertain tastes across varieties i. We're going to model this pretty simply: they have taste u for some variety, where u is positive, and zero for all other varieties. They make an information disclosure choice on the platform, and this disclosure choice is a delta between 1/I and delta bar; it is associated with a signal with the property that the probability the signal value is i, conditional on the true value being i, is delta. On the outside market there's no such information revelation. Sellers can sell one unit of the good on either the platform or the outside market; they have an entry cost kappa. They can purchase this data on the platform, and its price is going to be determined via bargaining; I'll talk about that in a couple of slides. Once they have this data, they decide which variety to produce. On the platform they end up producing the correct variety with probability delta, and on the outside market, because there's no additional information, they just uniformly pick a variety, so the probability they get it correct is 1/I. And finally, buyers and sellers interact in competitive search markets on either the platform or the outside market.

So the platform consists of a matching technology and a data processing capacity. The matching technology here is simple Cobb-Douglas: if there are N_B buyers and N_S sellers, the probability that a buyer meets a seller is just alpha bar times n to the one minus gamma, where little n is just the market tightness. The data processing capacity essentially captures the maximum precision of the signal the platform can compute, so a higher delta bar means that a more precise signal is feasible on the part of consumers; they can get higher precision. The platform sells this data to sellers, and the data here consists of the information disclosure choice delta and the realization of the signal sigma.
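A minimal sketch of the matching and signal technologies as described, with one assumed detail (writing tightness as n = N_S/N_B is my reading of the Cobb-Douglas setup; it is not stated explicitly in the talk):

```latex
% Cobb-Douglas matching on the platform, with tightness n:
n \;=\; \frac{N_S}{N_B}, \qquad
\Pr(\text{buyer meets a seller}) \;=\; \bar{\alpha}\, n^{\,1-\gamma}.
% Disclosure delta in [1/I, bar-delta] governs signal precision:
\Pr\big(\sigma = i \,\big|\, \text{true taste} = i\big) \;=\; \delta,
% so a seller who buys the data produces the right variety with
% probability delta, versus 1/I from a uniform guess outside.
```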
The outside market is associated with a similar, simple Cobb-Douglas matching technology, pretty much like the platform's, but its matching efficiency may be a little lower than that of the platform; that's one potential difference. The other key difference is that there's no additional data collected there.

Okay, so competitive search on the platform is pretty standard. If there are N_S sellers and N_B buyers, here is what the payoffs look like for the buyers and sellers. The key thing to note is that both payoffs have the direct effect of being increasing in delta, which is this match efficiency, but the buyers' payoffs are increasing in tightness while the sellers' are decreasing in tightness. And the key property that's going to determine the results in our model is how this endogenous market tightness moves as a function of the information disclosure delta.

Bargaining between the platform and sellers is simple Nash bargaining, where theta is the weight on the platform, and here you can see the price is determined by this simple formula, which is pretty standard. The key things determining this price are the outside options of both the seller and the platform, and this is kind of crucial to our results, so let me get to what these outside options are. For the platform, think of the outside option as, you know, Amazon Basics: the idea is that instead of selling this data to sellers, the platform can use the data and produce the good itself, and this V_M(delta) is the payoff from the sales of the product associated with that. The key thing I want you to note is that this object is increasing in delta: as more information is revealed by buyers, the outside option of the platform increases, which sort of increases its bargaining power. We call this the copycat effect of information. For the sellers, the outside option is selling on the outside market, and the key thing to note here is that the sellers' outside option is actually decreasing in delta, because as delta increases the market moves to the platform, and that reduces the sellers' payoffs outside. This is what we call the customer access effect of information.

So the punchline is: as delta increases, the outside option of the platform increases and the outside option of the seller decreases, which leads to a key equation in the paper about the relationship between data and market power. Here's the payoff of the seller from interacting on the platform. There are sort of two key effects I want you to take away. The first is the one we talked about earlier: better data enables better matching. But the second is an increase in the market power of the platform relative to the seller, due to both the customer access and the copycat effects, and this discourages seller entry. Consequently, the effect of delta on seller entry, and hence on market tightness, is ambiguous, and what you can show is that n is actually decreasing in delta if the bargaining power theta of the platform is particularly large.

Given that, when we go to the information disclosure choice, there is a simple exogenous cost function lambda of delta. What I want you to take away is that buyers do not internalize the effect of disclosure on market tightness; they do not internalize the effect of delta on n, which is exactly the difference between the buyer's problem and the planner's problem. The planner, in addition to taking into account the private marginal benefit of delta, which is the first term in this equation, also takes into account the effect of disclosure on market tightness, and internalizes that higher delta can actually reduce seller entry and lower buyer surplus as a consequence.
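A minimal sketch of this wedge, with an assumed reduced-form buyer payoff U(delta, n) standing in for the exact expressions on the slides:

```latex
% Private disclosure choice: the buyer ignores the tightness channel,
\frac{\partial U(\delta, n)}{\partial \delta} \;=\; \lambda'(\delta).
% The planner also internalizes how disclosure moves tightness n(delta):
\frac{\partial U}{\partial \delta}
\;+\; \frac{\partial U}{\partial n}\,\frac{dn}{d\delta}
\;=\; \lambda'(\delta),
% so whenever dn/d(delta) < 0 (strong copycat and customer access effects),
% the planner prefers less disclosure than the private optimum.
```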
So that may be a reason why the planner wants to regulate. In particular, the planner wants to restrict information disclosure if the derivative of this equilibrium market tightness, evaluated at the equilibrium disclosure choice delta_B, is less than zero, and this turns out to be true when delta_B is very large, i.e., when a lot of information is being collected by the platform. And this is exactly what we want to say distinguishes newer internet platforms from older ones: the ability to collect so much data that you're sort of on this decreasing part of the market tightness curve, so that regulation may be necessary. And with that, let me conclude.

Thank you very much, Rishabh.