Good morning, good afternoon, good evening, everybody. Welcome to our seminar on the economics of platforms. I have the honor to moderate today's seminar. We'll have Tiffany Tai present her work on steering by algorithmic recommendations, and then Daniel Ershav will take five minutes to discuss the paper. The structure of the seminar is 40 minutes for the presenter, five minutes for Daniel, the discussant, and then we have five minutes of Q&A. If you have clarifying questions during the presentation, feel free to write them in the chat and I will interrupt Tiffany as she goes so that we can ask those clarifying questions. If you have much broader questions, keep them for the Q&A at the end, in the last 15 minutes. Okay. And another thing: you don't have to, but if you can keep your video on, it may feel more like a seminar for the speaker. And without further ado, I'll leave it to Tiffany to share her slides and introduce her work.
Okay. Thanks for having me. So I'm presenting a paper on steering, and this is joint work with Nan Chen, who is currently at the Department of Information Systems at NUS. So algorithmic recommendations are becoming an important tool of information intervention for many platforms. They are becoming more and more prevalent and may have a large impact on consumers. For example, 80% of the movies watched on Netflix come from recommendations generated with big data and machine learning methods. And at the same time, we also see these large internet platforms develop a vertical market structure, or in other words a dual role, meaning that they are not only the information intermediary but also players in the related market. And this dual role can potentially create a conflict between the platform's profit-maximizing incentive and how it uses its tools for information intermediation. One example would be that Google was accused of search ranking bias that potentially favored its own affiliates.
So this paper is trying to understand empirically whether and how this dual role may affect the behavior and quality of algorithmic recommendations. Our empirical context is Amazon.com, which is Amazon's marketplace in the United States. Amazon is known for having a dual role. It owns the marketplace and guides consumers using product recommendations, but it is also one of the retailers in some of the product markets. Previously when I presented the paper and mentioned Amazon as a retailer, many people would first think of Amazon's private brands. So as you can see from the graph, for example, Amazon offers its own brand of batteries, Amazon Basics, and this product is only sold by Amazon. But Amazon also sells other, non-private-brand products together with third-party sellers. For example, they also sell Duracell. But Amazon does not sell all the batteries, so some of the batteries are only sold by third-party sellers. Due to our identification strategy, in this paper we won't focus on Amazon's private brand products. As you can see from the table, that market accounts for less than 5% of Amazon's first-party sales. We will focus on the non-private-brand products, where Amazon sells the identical product together with third-party sellers. These markets account for more than 90% of Amazon's first-party sales. Here we mainly focus on cases A and B, which we call the Amazon-selling market and the third-party-only market. We can also see these two markets as integrated and non-integrated markets. Amazon has millions of products on the platform, so it is important to use on-site recommendations to guide consumers. It has been shown that 30% of page views come from these on-site recommendations. Amazon has different on-site recommendations with different names. In this paper, we focus on the recommendation called Frequently Bought Together, and I will call it FBT from now on.
And the figure on the slide illustrates how FBT works. The first product we will call the referring product, which is the product that the consumer is currently looking at. The second and third products we will call recipient products, which are the products recommended by the referring product. Each referring product can recommend a maximum of two products, and a product can receive FBT from many different referring products. So a popular product can receive more than 10,000 FBT. Here we would also like to note that FBT are made at the product level, so they do not recommend a specific seller. Consumers can choose to buy from any of the sellers if they click the FBT. FBT are also generated based on item-to-item collaborative filtering, meaning that the recommendation only depends on the referring product. We also checked whether the FBT would differ across devices, and we do not see FBT change across different devices, accounts, or browsing histories. Now let me give a simplified example of why Amazon being a retailer may create an economic incentive to steer. Of course, the real world is much more complicated, and many variables like the referral fee might be endogenous. But let's consider the same product with the same price and popularity. In the first case, the product is sold by Amazon together with some third-party sellers. In the second case, the product is only sold by third-party sellers. Here, if a consumer purchases from Amazon, the platform earns the full retailing profit. If the consumer purchases from a third-party seller, the platform only earns the referral fee, which on average is around 15%. So the platform actually earns a higher profit when it is itself a seller, as long as the platform's share of sales is positive.
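The simplified two-case comparison can be written out as a minimal sketch. It takes the talk's simplification at face value: costs are ignored entirely, so retail revenue stands in for retail profit. The function name, the price of 20, and the 50% Amazon share are hypothetical values chosen for illustration; only the roughly 15% referral fee comes from the talk.

```python
def platform_revenue_per_sale(price, referral_fee_rate, amazon_share):
    """Expected platform revenue per unit sold, ignoring ALL costs.

    amazon_share is the fraction of units bought from Amazon itself;
    the remainder is bought from third-party sellers, on which the
    platform earns only the referral fee.
    """
    if not 0.0 <= amazon_share <= 1.0:
        raise ValueError("amazon_share must be between 0 and 1")
    own_retail = amazon_share * price
    referral_fees = (1.0 - amazon_share) * referral_fee_rate * price
    return own_retail + referral_fees


price = 20.0        # hypothetical price, identical in both cases
fee_rate = 0.15     # average referral fee mentioned in the talk

# Case 1: Amazon sells alongside third parties (hypothetical 50% share).
case1 = platform_revenue_per_sale(price, fee_rate, amazon_share=0.5)
# Case 2: the product is sold by third parties only.
case2 = platform_revenue_per_sale(price, fee_rate, amazon_share=0.0)

print(f"Amazon-selling: {case1:.2f}, third-party only: {case2:.2f}")
```

Under these assumptions the gap is positive for any Amazon share above zero. Once fulfillment or procurement costs enter, the ranking is no longer guaranteed, so the sketch illustrates the incentive argument rather than establishing it.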
So if we think of the platform as allocating the limited slots of FBT to maximize its profit, it will be more likely to recommend Amazon-selling products, which give the platform a higher profit. We will call this steering, because the recommendation depends on seller identity and is not completely driven by consumer preference. Under this example, for a given product, conditional on prices and sales, the probability of getting recommended will be higher if Amazon is one of the sellers. This will be the key prediction we would like to test empirically.
So Tiffany, this is a perfect time to stop. Jacques has a question, a clarifying question: wouldn't the cost of Amazon be higher when it is selling the product?
So here we are only kind of thinking about the margin, right? And the cost of Amazon versus the third-party seller, which one is higher is not very clear to me, right? But Jacques, would you like to unmute yourself and clarify your question?
Yes, sir. I don't understand, because you're saying that the revenue of Amazon is bigger. Basically, if I understand your computation, the revenue of Amazon is bigger when it's selling directly.
Right.
The cost is going to be higher because, I mean, for instance, if a third party uses Amazon logistics, it's also paying Amazon for the logistics and so on. I don't understand what we're supposed to get from this. Why are we supposed to understand that Amazon makes more profit when it's selling directly?
Right. So yes, here we make a simple assumption that there's no cost, right? But indeed, Amazon will incur some cost of fulfillment or shipping or even production, right? So here we didn't take the cost into account. If you think about why Amazon would enter the market in the first place, of course, there's a profit from entering, right?
If third-party-only sales actually generated a higher profit, Amazon would not choose to enter in the first place, right? So here we are kind of thinking that Amazon still earns a higher profit even with some cost.
But do we know that? From what I've read, I thought it wasn't clear that Amazon was making more profit when it was selling directly than when it was third parties.
Because we don't really observe the cost, we cannot say that with 100% certainty.
Then I promise that's my last question. But in your empirical work, do you assume that Amazon makes more profit when it's selling directly?
So we tried to identify steering and we didn't try to make any assumption on the cost side, right? So we cannot say the steering is driven by Amazon earning a higher profit. But I think, yeah, maybe we need to be a bit more careful with this argument.
Great, we can bring it back up in the final discussion. Gary has another question around the costs. I'll just mention it, but maybe we can bring it to the discussion later: shipping time and cost can be quite different when purchasing the two products.
Right. Okay, so this paper would like to ask: does Amazon recommend Amazon-selling products over third-party-only products? Basically we would like to identify whether there's a bias in algorithmic recommendations. However, it is a bit challenging to clearly identify the bias, and this will require a causal analysis. And suppose we find steering: what is the effect on consumers, third-party sellers, and overall efficiency? Answering this can have implications for policy design. It might seem natural to find that a firm maximizes profit, but it's actually not very clear whether the steering will actually happen. First, we don't really observe the algorithm, because the algorithm is not available to the public or the government. And even if the code were available, it's not easy to understand the algorithm's objective.
And even if the steering may lead to a higher short-run profit for Amazon, the platform may not do so if it worries that the steering may lead to a regulatory or reputational challenge in the long run. However, this incentive may also depend on whether the algorithm is transparent or not. So overall we think this is an empirical question, and that's why we take an empirical approach and try to identify steering using a research design and causal analysis. So let me briefly describe our approach. We first collect large-scale data at high frequency. The data cover six million products that have more than 100 customer reviews at the time of data collection, and we track each product's prices, sales ranks, and recommendation records over time. Our data have two main sources of variation. First, we have variation in product recommendations: a given referring product may recommend different recipient products over time. This variation allows us to estimate the effect of FBT on recipient sales. The second, which is the crucial variation in our data, is the variation in Amazon's presence within a product over time. This temporary variation is driven mostly by Amazon's stockouts or re-entry after a stockout event. This variation allows us to identify whether the probability of receiving FBT depends on Amazon's presence or not. And this figure illustrates our identification strategy. An Amazon-selling product has both Amazon's offer and third-party sellers' offers on the product page. When Amazon is out of stock, the same product is still available to purchase from third-party sellers. We will examine whether the recipient product receives fewer recommendations during these stockout periods, conditional on prices and sales. And let me briefly summarize our main findings.
So first, we simply compare the summary statistics, and we found that Amazon-selling products receive 1.5 more recommendations than third-party-only products. However, this does not directly suggest steering, because whether Amazon sells a product is not random. There could be unobserved product features that make Amazon products receive more recommendations and are correlated with Amazon's entry decisions. For example, Amazon is more likely to sell a popular product, or a product that is more complementary to others. So to identify steering, we use within-product variation in Amazon's presence, comparing the same recipient product with and without Amazon as a retailer. We found that, controlling for prices and sales, the same recipient product is 8% less likely to be recommended during Amazon's stockout period. Note that this is unlikely to be driven by product availability, because the same product is still available to purchase from third-party sellers. We will also show more robustness checks later when we discuss the results.
Sorry, quick question from Luis. How important is that 1.55? So what's the baseline? The number of recommendations.
Yeah. Sorry, I forgot a bit what the percentage is here. Let me look at the summary statistics very quickly. So on average, a product will receive about one recommendation. So it's actually quite big, more than 100% of the mean.
Got it. Thank you.
Okay. And the second result is that we try to provide evidence that the steering is consistent with the platform's profit-maximizing incentive. We found that there's more steering in the product categories where recommendations can generate more sales. We also examine efficiency in order to understand the policy implications. We found that recommending third-party-only products can generate more sales than recommending Amazon products.
So this implies that the platform is not allocating recommendations to maximize total sales, and the steering could potentially be harmful to consumers and third-party sellers. But of course, this will depend on some assumptions, and I will discuss the assumptions in detail later when we get to the last result. Let me briefly summarize the literature. This paper is mainly related to three strands of literature. First, it is related to empirical papers on the anti-competitive effects of vertical integration. We can see the platform's dual role as a type of vertical integration, and the anti-competitive effect of vertical integration could come from market foreclosure. We can see steering as a special case of information foreclosure, and this could be particularly important for digital platforms. The paper is also related to empirical and theoretical work on digital platforms' information intermediation, and most of that work focuses on recommendation systems and search design. And this paper is also related to work on algorithmic bias. So let me talk about the data in detail. We track the lowest price among all offers of each product over time. We also collect the sales rank, which measures the relative ranking of a product within its category. We follow previous work and use sales rank as a proxy for sales. These price and sales data are updated daily and allow us to track real-time changes in prices and sales. And then we merge the data with our FBT records.
Tiffany, quick question. How do you record the FBT records?
That is on the next slide. Because we had some technical difficulties, our FBT records are not as frequent as prices and sales. So overall we collected five rounds of FBT over three months. Over the five rounds, about 2% of the products experience at least one change in Amazon's presence. And we checked that most of the variation in Amazon's presence is temporary.
So we are not capturing the effect of Amazon's permanent exit from the market. We also find that half of the pairs of referring and recipient products experience at least one change in the recommendation pattern. Okay, and the table shows the summary statistics of our data. Panel A shows all the data in the sample, and overall we have more than six million products. Here, FBT received means the number of recommendations a product receives. Many products actually receive no recommendations, and some products receive more than 10,000 recommendations. FBT initiated means the number of recommendations a product gives to other products. That number is between zero and two due to the capacity constraint of FBT. Panel B shows the summary statistics for products that receive or initiate at least one recommendation. In this case we have about four million products left. By comparing the sales rank in panels A and B, we found that more popular products, meaning those with a lower sales rank, are more likely to get recommended. So we first just compare the number of recommendations received and initiated for Amazon-selling products and third-party-only products. We do so within sales rank deciles, so that we are comparing FBT conditional on popularity. Here we found that Amazon-selling products receive more recommendations than third-party-only products, and this is consistent across sales rank deciles. But for the difference in FBT initiated, the results are not very strong. However, here we are simply comparing average statistics across products, and it could be that Amazon chooses to sell products that are more likely to get recommended. Basically we are worried about whether Amazon's presence is random or not. So comparing across products tells us that Amazon-selling products receive more recommendations, but this does not directly suggest steering because the effect is not causal.
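The descriptive comparison by sales-rank decile can be sketched in a few lines. This is only an illustration of the idea, not the authors' code: the field names (`sales_rank`, `amazon_sells`, `fbt_received`) and the toy data are made up, and deciles are assigned by position in the rank ordering.

```python
from collections import defaultdict
from statistics import mean


def fbt_gap_by_decile(products):
    """Return {decile: mean FBT received by Amazon-selling products
    minus mean FBT received by third-party-only products}.

    Decile 1 holds the best sellers (lowest sales rank). Deciles that
    lack one of the two product types are skipped.
    """
    ordered = sorted(products, key=lambda p: p["sales_rank"])
    n = len(ordered)
    groups = defaultdict(lambda: {True: [], False: []})
    for idx, p in enumerate(ordered):
        decile = idx * 10 // n + 1  # 1..10 by position in the ordering
        groups[decile][p["amazon_sells"]].append(p["fbt_received"])
    gaps = {}
    for decile, by_type in sorted(groups.items()):
        if by_type[True] and by_type[False]:
            gaps[decile] = mean(by_type[True]) - mean(by_type[False])
    return gaps


# Made-up toy data: 20 products, alternating seller type.
toy = [
    {"sales_rank": r,
     "amazon_sells": r % 2 == 0,
     "fbt_received": 3 if r % 2 == 0 else 1}
    for r in range(1, 21)
]
gaps = fbt_gap_by_decile(toy)
for decile, gap in gaps.items():
    print(decile, gap)
```

A positive gap in every decile is what the talk reports descriptively. As the speaker stresses, such a gap is a cross-product correlation and does not by itself identify steering.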
So in the next step we conduct a within-product analysis, using the variation in Amazon's presence within a product over time. The figure illustrates our identification strategy. First we construct a balanced panel of referring products and recipient products that ever appear in our five rounds of data. Given a pair of products over five rounds, we may observe that Amazon sells in the recipient market, or does not sell because it is out of stock. We will compare the probability of getting recommended under these two scenarios for the same pair of products. By doing so we are comparing the same product, so we do not need to worry about many other product-level characteristics that are correlated with Amazon's presence. So we will run the regression on the slide to estimate the steering. The dependent variable is an indicator of whether the referring product is recommending the recipient product at time t. The independent variable is an indicator of whether Amazon is present in the recipient market at time t. We control for product-pair fixed effects and category-day fixed effects. So in this case, the coefficient on Amazon's presence, theta, is identified by the variation in Amazon's presence within product pairs over time. It therefore measures, for the same recipient product, the increase in the probability of getting recommended when Amazon sells. Again, by comparing the same product over time, we do not need to worry about other product characteristics that could bias theta. The table shows our estimation results. The theta here measures the degree of steering: the higher the theta, the more Amazon's presence raises the probability of getting recommended. In columns two and three, we add real-time sales and prices to the regression. Here the coefficients are all around 8 percent.
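For reference, the steering regression described verbally above can be written down as follows. The notation is an assumption made for this write-up rather than a copy of the slide: $i$ indexes the referring product, $j$ the recipient product, $t$ the data round, and $c(j)$ the recipient's category. Because logit and probit versions are mentioned later as robustness checks, the baseline is written as a linear probability model, which is an inference rather than something stated in the talk.

```latex
\[
\mathrm{FBT}_{ijt}
  = \theta\,\mathrm{AMZ}_{jt}
  + X_{jt}'\gamma
  + \alpha_{ij}
  + \lambda_{c(j),t}
  + \varepsilon_{ijt}
\]
% FBT_{ijt}      : 1 if referring product i recommends recipient j at time t
% AMZ_{jt}       : 1 if Amazon is present as a seller of recipient j at time t
% X_{jt}         : real-time sales rank and price controls (columns two and three)
% \alpha_{ij}    : product-pair fixed effect
% \lambda_{c(j),t}: category-day fixed effect
```

With the pair fixed effect $\alpha_{ij}$ included, $\theta$ is identified only from pairs whose recipient sees Amazon's presence change across rounds, which matches the speaker's description of the identifying variation.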
So this means that for a given recipient product, Amazon's presence increases the probability of getting recommended by 8 percent. Of course, even though our measure of steering compares the same product, there might be some other omitted variables that are correlated with Amazon's presence and lead to a bias in theta. Let me briefly discuss the potential omitted variables that people care about most and how we rule out these channels. First, of course, it's possible that consumers prefer to buy from Amazon. So we might worry that there's a decrease in sales after Amazon is out of stock, and this could potentially explain the decrease in FBT. But since we observe the real-time sales rank, we can control for it in the regression. So in this case, we do not need to worry that changes in sales correlated with Amazon's presence are a source of omitted variable bias.
Before you get to other concerns, Ananya has one: not sure whether this is happening, but could it be that the algorithm just has access to finer information because it is a product that is sold by Amazon, in terms of what gets fed into the system? Ananya, would you like to add color to your question?
Yeah, just that information is being fed into the algorithm, and since it's like an in-house product, they just have a lot more details on the product as opposed to some third-party product, based on the systems that are in place.
Right, but here we are comparing the same product. It only depends on whether Amazon sells or not. So if Amazon just has more detail, the FBT should not temporarily change.
Okay. And then Clara has a question about your data scraping, your data collection. Was it done with a browser where history, cache, and cookies had been deleted? Because otherwise you may risk having personalized results which are affected by your searches.
Right.
So we actually tried different devices and different browsing histories, but we do not really see FBT change across devices. So this kind of recommendation is not personalized in our experiment.
Thank you.
Okay, so let me keep discussing other potential concerns. The second one is that people usually worry there's a shipping charge from other sellers after Amazon is out of stock. We show that the shipping charge would need to be very, very large in order to explain the 8%. And we still find a positive effect when we focus on markets that have at least one seller who offers free shipping using the Fulfillment by Amazon (FBA) service. Third, we might worry that there are unobserved demand and supply shocks that are correlated with Amazon's presence. To rule out this channel we run the same regression but using third-party seller stockouts as a placebo test. We actually found zero effect of third-party seller stockouts on FBT. This also suggests that demand and supply shocks are unlikely to be omitted variables that are correlated with a seller's presence. We also conduct more robustness checks, including using logit or probit regressions or controlling for sales, and overall our results are quantitatively or qualitatively similar.
Tiffany, you have about 10 minutes.
Okay. As mentioned before, we separately run the regression depending on whether one of the remaining third-party sellers uses the Fulfillment by Amazon service. An FBA seller is very similar to Amazon in terms of shipping and related services, and Amazon also receives a higher referral fee from FBA sellers. We find that theta becomes smaller, around 6.5%, for products with FBA sellers. However, we still cannot rule out steering even for products with FBA sellers. And then we examine whether the steering depends on the referring product's capacity constraint.
Remember that each product can only recommend up to two products, and some referring products didn't utilize all the slots. So here we define a product as constrained if we observe it recommending two recipient products for at least one round in our data. Under this definition, about 80% of the referring products are constrained. We then examine whether the steering depends on this capacity constraint. Interestingly, we did not find any effect for the unconstrained referring products. This makes sense because these products did not fully utilize the two slots in the first place. It is likely that there are no other products to replace the existing FBT; otherwise the referring product would have recommended other products in the first place. Our steering effect is mostly driven by the referring products with a capacity constraint. In the next step, we would like to understand the effect of steering on consumers and other sellers. So we first try to understand whether FBT changes recipient sales, and we will call this measure recommendation effectiveness. The figure shows how we identify the effectiveness. Again, for a given pair of recipient and referring products over five rounds of data, the referring product may recommend the recipient product in the pair, or recommend other recipient products. We then measure the change in the correlation between the sales of the two products under these two scenarios and see whether the recommendation significantly changes sales or not. So we will run the regression on the slide to measure the effectiveness. The dependent variable is the log of recipient product sales, and the independent variables are the log of referring product sales and its interaction with whether the referring product is recommending the recipient product in real time.
Question from Eva: does Amazon have an incentive to smooth demand or sales of a product over time? So does it suffer some reputation loss if at some point no seller sells a product?
If no seller sells the product, that might be possible. Yes. Right. Okay, so the coefficient of the interaction term, delta, measures the incremental correlation of sales driven by the recommendation. In the regression, we control for product-pair fixed effects and category-day fixed effects. So again, delta is identified by the variation in FBT within product pairs over time. And here, because we only observe the sales rank, previous work has approximated the log of sales using the log of sales rank, and we follow a similar procedure. The constant term in that function does not affect our result because it will be absorbed by the fixed effects. So we just choose b equal to negative one here. And the table shows our results. The delta, which measures the effectiveness, is positive here, suggesting that the recommendation indeed has an impact on recipient product sales. But the magnitude itself is hard to interpret, because it depends on our choice of b for the approximation. We can, however, compare it with beta, which is the sales correlation between products without a recommendation. Then we can see that delta is about 70% 17% of it. We also examine heterogeneous effectiveness depending on the referring product's capacity constraint, and we define a constrained product in the same way as before. We found that the result seems to make sense, because delta is close to zero for the unconstrained referring products. This can also explain why they did not fully use all the slots in the first place: the recommendation does not really generate more sales for the platform. Most of our recommendation effectiveness is driven by the products that use all the slots. Okay, and next we want to show that the steering is consistent with the platform's profit-maximizing incentive. So the platform is more likely to steer when the FBT is more effective.
That is because in this case steering can lead to a higher profit. So to examine this we use heterogeneity across the largest 30 product categories. We first estimate the degree of steering, which is the theta in the first regression, across categories. This figure shows our estimate for each category, and we can see that categories like movies, kitchen, or supplements have a higher degree of steering. We also examine the heterogeneous effectiveness in the same way, which is the delta in the second regression for each category. We can see that recommendations are most effective in skincare, makeup, kitchen, and diet. We then examine the correlation of these two estimates across product categories, and we actually found that the two estimates are highly positively correlated. So this means that there is more steering in the product categories where recommendations can generate more sales. This also shows that the steering is consistent with the platform's profit-maximizing incentive. And finally we would like to explore the implications for efficiency in the last step. To do so we compare the recommendation effectiveness for Amazon-selling products and third-party-only products. We just estimate the second regression but allow different effectiveness for each type of product, and then we compare the delta for Amazon-selling products and third-party-only products. But here we have to be a bit careful because we are comparing the effectiveness across products again. Therefore we want to make sure that we are comparing a similar set of products where the only difference is whether Amazon sells or not. To do so we use propensity score matching on product characteristics, so that we are comparing the effectiveness for products with similar characteristics and the only difference is whether Amazon sells or not. We match on the referring and recipient products' sales and the number of sellers in the recipient and referring products' markets.
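The second regression, with effectiveness allowed to differ by recipient type, can be summarized in symbols. The notation below is an assumption made for this write-up: $i$ is the referring product, $j$ the recipient, $t$ the round, $S$ denotes sales, $A_j$ marks an Amazon-selling recipient, and the talk does not say whether $\beta$ is also allowed to vary by type. The rank-to-sales approximation with slope $b = -1$ follows the speaker's description.

```latex
\[
\log S_{jt}
  = \beta \log S_{it}
  + \delta_1 \,\mathrm{FBT}_{ijt}\, A_j \log S_{it}
  + \delta_0 \,\mathrm{FBT}_{ijt}\, (1 - A_j) \log S_{it}
  + \alpha_{ij}
  + \lambda_{c(j),t}
  + \varepsilon_{ijt},
\]
\[
\log S_{kt} \approx a + b \log \mathrm{Rank}_{kt}, \qquad b = -1 .
\]
% \delta_1 : recommendation effectiveness for Amazon-selling recipients
% \delta_0 : recommendation effectiveness for third-party-only recipients
% \alpha_{ij}, \lambda_{c(j),t} : product-pair and category-day fixed effects
% The constant a is absorbed by the fixed effects, as noted in the talk.
```

Setting $\delta_1 = \delta_0 = \delta$ recovers the pooled effectiveness regression discussed earlier, where $\delta$ is read relative to $\beta$, the sales correlation without a recommendation.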
And you can see that in the matched sample the characteristics become more similar. So we then estimate the third regression using this matched sample, so that we are comparing the effectiveness for products with similar sales and market structures. The table shows the heterogeneous recommendation effectiveness. Delta one is for Amazon-selling products, and delta zero is for third-party-only products. Columns one and two are the pairs in which the referring product is a third-party-only product, and columns three and four are the pairs in which the referring product is an Amazon-selling product. Regardless of the referring product type, we found that the effectiveness is higher for third-party-only products than for Amazon-selling products. This actually means that recommending third-party products can generate more sales than recommending Amazon-selling products. We have shown in the first result that Amazon-selling products are more likely to get recommended. However, this result suggests that, to maximize sales, third-party-only products should get more recommendations. In other words, the recommendation system does not seem to maximize total sales. Now to think about the implications in terms of efficiency. First, consumers are worse off with steering because they do not get their most preferred recommendations. And third-party sellers are worse off because consumers are steered away from their products. If we assume third-party sellers and Amazon have similar costs of production and fulfillment, then the recommendation system may not be efficient, because consumers and third-party sellers are worse off.
A couple more minutes.
Great. Yeah, let me conclude. So the paper provides causal evidence of algorithmic steering for a large and dominant platform. And we provide an empirical framework and research design to identify bias in algorithms and self-preferencing behavior from observational data.
We focus on product recommendations in this paper, but steering could also apply to other forms of information intermediation, like search ranking or seller recommendation such as the Buy Box design. We think this framework can help us think about the ongoing debate on regulating large internet platforms, and also the discussion relating to platforms' incentives and algorithmic accountability. As you can see from our exercise, it is quite challenging to identify bias using observational data, because there are so many missing variables. So if algorithmic decisions can be made more transparent, this might alleviate many of the concerns we have today. That's all for my presentation. Thanks for listening, and I'm looking forward to your comments and also Daniel's discussion.