Hello everyone, good morning. I'd like to share an interesting story that happened to me a few weeks back. I was walking along the streets of Koramangala, just pub-hopping late on a weekend, when I got a call from my manager. It was an odd hour for a call and quite unusual, so I picked it up reluctantly. My manager on the other side said, "Akash, looks like something's wrong. One Indian Girl is all over the recommendations." Now, being a bachelor, I was curious who this girl was. I asked him for more details, like what girl are you talking about, and he replied that there was this new release by Chetan Bhagat, One Indian Girl, which was all over the recommendation widgets on our website. So I immediately rushed back home, booted up my laptop, and started browsing. I pulled up this book, GRE For Dummies, and just as my manager was saying, One Indian Girl was surfacing somewhere in the middle of the recommendations, looking out of place. Then I went to a book by Shashi Tharoor, and again One Indian Girl was there as well. I was very confused about what had gone wrong and started debugging, but then it struck me: there was a marketing campaign Flipkart had launched, selling this book for, say, one rupee when bought along with other books. That was probably one of the reasons it was appearing everywhere it shouldn't have. Clearly, this was an example of our ranking gone wrong. So in this talk we will look at some relevance and ranking aspects: how we do relevance, how we do ranking, and what might have caused such an error. Before that, let me first tell you what recommendations are. Say you go to an offline salesman.
You go to a Nike showroom and ask the salesman for, say, running shoes of a certain size. You specify your need in terms of maybe a price range, maybe a certain colour, maybe your purpose, say running. And he comes back to you with a bunch of suggestions. He doesn't just use your direct query; he applies context as well. He checks what season it is, winter or summer, or whether it's raining, which helps him prune the options. He might also look at your build and your shoe size before getting back to you with suggestions. In the online world, the recommendation widget is solving the same problem in some respects. We have some context about the user, maybe their previous purchases or their browsing history, and on top of that we know the user has landed on a Puma shoe and we want to recommend a set of similar products. Here again there are size and colour variations that we offer, and different suggestions based on a bunch of criteria; some of those criteria we will cover as part of today's talk. I'm going to structure my talk like this: first I'll cover the relevance bits, what relevance is and the different techniques towards it, content-based and collaborative. Then I'll go into ranking: how we can rank products optimally given the context, how we create an architecture that allows quick experimentation, and what features we use for ranking. Finally, I wish to leave some time for Q&A. So let's jump right into what relevance is. Given a context, and in the previous slide the context was the shoe in question, is the suggestion relevant? We want to ask the recommender system such a Boolean question: does the candidate we are recommending cross a certain relevancy threshold? And how do we define such a threshold?
Here is a graph showing the precision requirement versus the nature of the context. Take something like the Flipkart home page: the user has not typed any query, he has just visited Flipkart, so the nature of the context is quite implicit. We might have some history, but he has not explicitly said what he is looking for. On the other extreme we have the very explicit context of the search page, where the user has typed an actual search query. In that case we have to live with a very high precision requirement; we can't deviate much from the search term. If the query was "Nike sports shoes", we have to stay within those limits. The product page, the details page we saw, is somewhere in the middle: the nature of the context is moderate, so we are neither as lenient as the home page nor as strict as the search page, and the precision requirement is also moderate. That helps us decide which precision threshold we want to operate at. There are a bunch of techniques, majorly divided into content-based and collaborative, which help us figure out relevance. Obviously I don't want to recommend a mobile phone alongside a shoe; that is totally irrelevant. But within shoes, how do you define relevance? Let's get into the collaborative aspect first. Let me tell you what collaborative filtering is. On the left side you see a category hierarchy; this can be any commerce hierarchy we might have. For example, you can start with footwear at the top; below footwear you could have shoes and sandals; within shoes you could have casual shoes and formal shoes. It is basically a taxonomy that helps the user browse across Flipkart. Then we have user activity.
When the user comes to Flipkart, he browses through the website, adds to cart, adds to wishlist, or performs some other activity, finally leading to a purchase. All of that comes under the umbrella of user activity. Once we have this category hierarchy and the user activity, we perform collaborative filtering to produce a set of relevant suggestions. So let's see what is inside this block of collaborative filtering. Here you see three books: Introduction to Algorithms, Computer Networks, and Five Point Someone. The Venn diagram circles are a proxy for the volume of visits to each of these, and the overlap represents the extent of co-visits. It is quite intuitive that the greater the overlap, the more related the suggestions are; for example, Computer Networks might be nearer to Algorithms than Five Point Someone is. However, there is one nuance: popular products end up appearing in most of the suggestions. Something like Five Point Someone, which is a top seller, will creep into academic books, fiction, non-fiction, almost everywhere on the website. So we have to discount the overlap by the popularity of the products. Cosine similarity is one such measure which is widely used, where the intersection count is normalized by the popularity of the individual products. The cosine gives you the extent of overlap and tells you whether two suggestions are related or not. So here we saw three individual products where we mined some kind of patterns; we said that Algorithms is nearer to Computer Networks than to Five Point Someone. But this need not be at the individual product level.
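The popularity-discounted overlap just described can be sketched in a few lines. This is a minimal illustration with made-up visitor sets, not Flipkart's actual pipeline: the raw overlap with the bestseller is larger, but after cosine normalization the niche title wins.

```python
import math

# Hypothetical sets of users who visited each book (illustrative data only).
visitors = {
    "intro_to_algorithms": {"u1", "u2", "u3", "u4"},
    "computer_networks":   {"u2", "u3", "u4", "u5"},
    "five_point_someone":  {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"},
}

def cosine(a: str, b: str) -> float:
    """Co-visit overlap normalized by the popularity of each item."""
    overlap = len(visitors[a] & visitors[b])
    return overlap / math.sqrt(len(visitors[a]) * len(visitors[b]))
```

Here `cosine("intro_to_algorithms", "computer_networks")` is 3/4 = 0.75, while the bestseller, despite a larger raw overlap of 4, scores only about 0.71 once its popularity is divided out.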
So Take this example for the case of saris So this slide might not be fully visible, but let me read it out allowed So you have this in the center you have a fabric Type of a sari called synthetic georgette, right? So there are around 150 or 200 fabric types of saris we sell which which Basically helped user down the decision. But it uh now if you see synthetic georgette, uh Using collaborative filtering and user basically user activity data. We can mine these Things like pure georgette is somehow related to synthetic georgette. So is art silk. So is chiffon So this attribute this gives us kind of an attribute graph which we can work on top of and basically, uh That uh extends our product to product similarity that we saw in the last slide So this example which you saw is constrained to a single category. So we had sari as a category and within sari We were able to recommend few attributes. So but there is no reason just to be restricted to a category. For example During the recently concluded like world cup one of the patterns that we saw was this theme Ronaldo surfacing Cross diverse kind of recommendation. For example, this decal laptop decal is from the electronic accessories category So if we do a category cut, we won't be able to surface such a pattern. But uh since the User activity can be across categories. It can be very broad. Uh, we got such patterns So there these wallets and t-shirts which were from the fashion category were showing somewhere down the electronics category so To uh, if we apply some rules like fashion should be shown only within the context of fashion We will miss out on such patterns and hence it might be important to balance out The category mix with the such kind of patterns which are coming directly from user activity Right, so uh till now we saw how user activity can be used towards mining patterns Another important aspect to derive recommendations might be Using the product attributes. 
We start with some product attributes, apply some kind of attribute similarity, and get a set of relevant products. So what is attribute similarity? There are two TVs here from different brands. They have a bunch of attributes defined; some are visible here, like price, offers, warranty, and screen size, and some are not visible to the users. I have listed some of these attributes: screen size, brand, whether it's 3D or not, HDMI ports, and so on. Some of those attributes will match and some won't; the more attributes that match, the more similar the two products are. Now, there are certain nuances: the attributes might not be complete, so we have to decide what to do for incomplete attributes. Also, these attributes are not all equal. Let me take a quick show of hands: how many of you think screen size is more important for making a TV purchase decision? Okay. And how many think brand is more important? I see a very divided audience. So we don't rely on this intuition to make the judgment. One way to derive the relative importance of these attributes is by looking at the data itself. We have filters for screen size and brand on any TV kind of query. The number of people clicking the screen-size filter is a good proxy for the importance of screen size; similarly, the number of people clicking the brand filter will tell us whether brand is more important than screen size.
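The weighted attribute match just described might be sketched like this. The weights below are made-up stand-ins for the filter-click proxy, and skipping missing attributes is one assumed way of handling an incomplete catalog:

```python
# Hypothetical per-attribute importance weights derived from filter clicks.
FILTER_CLICK_WEIGHT = {"screen_size": 0.45, "brand": 0.35, "is_3d": 0.05, "hdmi_ports": 0.15}

def attribute_similarity(a: dict, b: dict) -> float:
    """Fraction of click-weight on attributes where both products agree.
    Attributes missing from either product are skipped (incomplete catalog)."""
    matched = total = 0.0
    for attr, weight in FILTER_CLICK_WEIGHT.items():
        if attr in a and attr in b:       # only compare attributes both define
            total += weight
            if a[attr] == b[attr]:
                matched += weight
    return matched / total if total else 0.0

tv_a = {"screen_size": 42, "brand": "BrandX", "is_3d": False}
tv_b = {"screen_size": 42, "brand": "BrandY", "is_3d": False}
```

With these toy inputs, the two TVs agree on screen size and 3D but not brand, and neither lists HDMI ports, so the score is 0.50/0.85, about 0.59.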
That is one way in which data can be used to derive the relative importance of these features. Another important signal is the search queries themselves: if people are typing more queries like "42 inch TV", then 42 inch is a more important criterion in the minds of users making a TV purchase. So far we saw the structured catalog attributes. Another thing that can be used is the image of the product itself. We start from the product image, apply some kind of visual similarity technique, and again get a set of visually similar recommendations. Here we have a sari, and our objective is to show visually similar saris. They can span brands and price ranges, but we want them to be close looks-wise. For this, we train a neural network: we give it a query image, a positive image, and a negative image. The objective is to bring the query and the positive image as close to each other as possible and push the negative image further apart. The network learns a lower-dimensional representation: starting from raw pixels, it learns a representation of each image such that the query and the positive are closer in that low-dimensional space while the negatives are further apart. Then, for this sari, we can search for nearest neighbours in that space as a proxy for similarity. One observation we had while dealing with such data: since we are a marketplace, there are a bunch of sellers selling similar kinds of products, and when we first put this model into production, we had almost near duplicates coming up.
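The pull-the-positive, push-the-negative objective described above is commonly written as a triplet hinge loss. This is a minimal, dependency-free sketch of that loss on already-computed embeddings, not the actual network Flipkart trains:

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss on squared Euclidean distances: zero once the positive is
    closer to the anchor than the negative by at least the margin."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)
```

For example, with an anchor at `[0, 0]`, a positive at `[0.1, 0]`, and a negative at `[2, 0]`, the loss is already zero; swap the positive and negative and the loss becomes large, which is the gradient signal that reshapes the embedding space.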
In fact, the exact same sari started surfacing in the recommendations. That was given as feedback to the catalog team: these are near duplicates, they need not be different items, they can be clubbed into a single item with multiple sellers, which leads to a better customer experience than showing the exact same item again and again. So far we saw how user activity and product attributes can be used to compute a similarity metric between a pair of products. Initially we had different widgets for these: something like "customers who bought this also bought", based on collaborative filtering; a module for attribute matching; a module for visual similarity. We could afford to have separate modules, but recently, with Jio and the like coming up, mobile traffic started to dominate. At this point the majority of our traffic comes from mobile, whether the mobile website or Android, and we don't have screen space for a separate visual widget, a separate collaborative widget, and so on. For the user we want to combine these into one widget. So how do we combine them? We had these relevance criteria, collaborative filtering, attribute similarity, and visual similarity; how do we combine them into a single ranked suggestion list? Let's see what we have at our disposal. First, we have user clicks: we showed a bunch of suggestions to the user and got click feedback. More clicks tells us about engagement, and some modules might drive more engagement than others.
That could be one criterion to judge which module is performing better. But one problem with clicks is what we saw at the start of the presentation: that one-rupee book was getting a lot of clicks, which is probably what led to the Chetan Bhagat book surfacing all over the place. People were browsing it a lot out of curiosity but probably not purchasing it. So another important signal is conversion itself: not engagement but purchase. It is important to specify what you want to rank for; you will combine these signals, but how you combine them depends on whether you want to optimize for conversion or for user engagement. These are the typical goals of a recommendation system, and there is a trade-off between engagement and conversion. Also important are diversity and serendipity: users expect the recommendation engine to show some surprising content. One example we saw was the Ronaldo-based patterns, where fans of Ronaldo were browsing products across categories with a unifying theme. So we use learning to rank to combine these different relevance signals. It is a machine-learned model that generates the ranked list. We have a set of items to recommend, obtained from any of the relevance techniques, collaborative or content-based, and we have product conversion data.
Positives will be much rarer: out of all the people who come to Flipkart, many don't end up purchasing, so only a fraction of sessions are positives and there are a lot of negatives in the product conversion data. We train our model on this data, pass candidate items to it, and it scores each of them. So what we are doing, in essence, is taking the click and purchase signals and passing them through a feedback loop back into the ranking block. Let's see what this feedback loop looks like at Flipkart. We started with a linear, logistic-regression-based model. One important thing was ensuring production sanity: once we deployed this model, there were a lot of gaps in the incoming data. We rely on a lot of data coming in, click data, conversion data, and relevance signals, and some of it might be delayed or not up to date. So measuring the health of such a model becomes an important consideration. As for the scale at which we work, these are rough numbers: out of roughly 100 million products, we are confidently able to recommend for a subset, around 15 to 20 percent, where we can say these are the relevant products, now rank among them. For training points, we take a one-month window of conversions that happened through the recommendation widgets, on the order of a billion training points. And then we have the scoring of the candidate pairs.
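The linear scoring step described here can be sketched as a logistic combination of the per-engine relevance scores. The weights, bias, and candidate scores below are made-up numbers for illustration, not learned values from Flipkart's model:

```python
import math

# Assumed learned weights for the three relevance engines, plus a strongly
# negative intercept because conversions are rare events.
WEIGHTS = {"collab": 1.8, "attribute": 0.9, "visual": 0.6}
BIAS = -2.0

def conversion_score(signals: dict) -> float:
    """Estimated P(conversion) for one candidate under the linear model."""
    z = BIAS + sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

candidates = {
    "book_a": {"collab": 0.9, "attribute": 0.4, "visual": 0.2},
    "book_b": {"collab": 0.2, "attribute": 0.8, "visual": 0.1},
}
ranked = sorted(candidates, key=lambda c: conversion_score(candidates[c]), reverse=True)
```

One reason to favour such a simple model, as comes up in the Q&A later, is interpretability: each engine's contribution to the final score is just its weight times its signal, which makes debugging cases like the one-rupee book tractable.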
For each pair, we will have different scores: one from the collaborative filtering engine, one from the attribute-matching engine, one from the visual similarity engine, and they come together to be scored by the ranking module. One thing we experienced during Big Billion Days: we got a lot of traffic, and the models started behaving quite abnormally. We were curious why this was happening. Apparently the traffic that comes during Big Billion Days is a very different kind of traffic; those users are looking for deals, and since we were optimizing for conversion, we were not taking deals as a factor in our recommendations. Initially we treated this data as an anomaly: whenever there was a spike sale on our platform, we just ignored that data. Another way to deal with this is to adjust the training data with the sale day as a feature. There are certain days on the calendar that are marked for us, say Diwali sales or New Year sales; you can take them as a feature and rank specifically for the sale kind of scenario. So far we saw how the relevance signals helped us rank the overall set of products. Let's see what other features can be useful to us. One important set in this respect is the quality features. What do I mean by quality? Every product we sell on Flipkart has a notion of quality.
Even in the mind of whoever is purchasing, there is a rating; we have ratings data, since people rate products. Then there are people writing reviews. Reviews are very useful in that they can help us mine information: is it a positive or a negative review? A positive review tells us that people like this product, so it is a good-quality product. The hypothesis here is that the higher the quality, the higher the conversion. So let's see. We started with human-labeled data: a category team supplied us with a set of good products. Initially it performed quite well, but the data became stale; the category managers can't keep tagging. They initially gave a set of one to two lakh products, human-labeled as good products, but those labels are very hard to refresh. So we moved to the crowdsourced version: ratings, reviews, and the return rate. Return rate is another aspect: if a product is getting returned a lot, that seller or that class of product might not be suitable, because returns are a huge cost on the marketplace. Now, these quality features have a sparsity problem, because we are explicitly asking the user to mark a rating; only a small fraction of our products actually have ratings and reviews. In that case it is useful to fall back to the next higher level of aggregation. The category hierarchy we saw can be used here as well: it can feed into the model, and we can average the quality signal by seller or by brand. Relatively speaking, Bose will have a much lower return rate than something like Skullcandy.
So we can effectively penalize or credit an entire brand: say a new Skullcandy earphone is being compared with a Bose earphone; historically people like Bose earphones more, so they carry a higher quality signal, and quality naturally has an impact on the conversion we are trying to predict. Another set of features we found useful was historical features. All of us here are attending Fifth Elephant, and it gives a very good example of historical feedback: you might have seen those feedback forms floating around. Do you know why those forms are useful? Fifth Elephant collects the speaker feedback and uses it for the next edition, so there is a feedback loop built right in. Similarly, in the e-commerce space, historical features are helpful to us. One thing about historical features is that they go stale very fast: there are new product additions, prices change, offers change, and hence we need to refresh these features very quickly whenever we push a new index update. That is also why historical features are very powerful. A historical feature might be something like: products from the HP brand have been performing very well in contrast to some other laptop brand, so we want to take that into account and rank the historically good performers above the lower performers. Another interesting nuance is presentation. We have different presentation channels, the desktop website and Android, and the nature of the interface makes them very different; there are devices of many different sizes in the Android market.
All these presentation aspects matter a lot when making the ranking prediction. On the left you see a screenshot of Android, which is a two-by-two grid, so your entire focus is on that widget. Next to it is a screenshot from the desktop website, where you have one widget, but in the overall scheme of the page it is just one of the rows, so the user's focus might be on multiple places on the desktop page. What do we call an impression? An impression is basically when the user sees your module, and impressions are quite useful to measure the performance of the widget. On Android, all four items are most likely seen by the user; on the other hand, the six items in the desktop setting might not have been seen, because the user's attention is divided. We need to factor in all those scenarios while accounting for performance. Android will naturally get more clicks because of this behaviour, since your entire attention is on your device, while on desktop we get far fewer clicks in comparison. So we have to normalize those impressions by channel. Another interesting thing we found was the usage of the thumb on Android. You are scrolling on your Android device, and most of us are right-handed, so people tend to click the right-side positions much more: the odd positions are clicked much less than the even positions. This is one of the findings we had when we analyzed the position data for Android. On desktop there is a mouse click, and no such behaviour is observed in the desktop setting.
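The channel normalization mentioned above can be sketched as discounting logged impressions by how likely they were actually seen. The counts and view probabilities below are invented for illustration; in practice the view probabilities would themselves be estimated from viewport or scroll data:

```python
# Illustrative click/impression logs per channel (made-up numbers).
raw = {
    "android": {"clicks": 400, "impressions": 10_000},
    "desktop": {"clicks": 90,  "impressions": 10_000},
}
# Assumed fraction of logged impressions the user actually saw: the focused
# 2x2 Android grid is nearly always seen, a desktop row often is not.
VIEW_PROB = {"android": 0.95, "desktop": 0.40}

def normalized_ctr(channel: str) -> float:
    """Clicks per *seen* impression, making channels comparable."""
    seen = raw[channel]["impressions"] * VIEW_PROB[channel]
    return raw[channel]["clicks"] / seen
```

Raw CTR makes Android look over four times better than desktop (4.0% vs 0.9%); after normalizing to seen impressions the gap shrinks to roughly 4.2% vs 2.25%, a fairer basis for comparing widget performance.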
Now, how do we treat that? It becomes an important question. Initially our thinking was to deploy one model in production that works for all channels. But is that really right, given that users behave very differently across these channels? Do we deploy different models for different channels, or do we just take the channel as a feature in one computation? All those presentation trade-offs have to be taken into account. So overall, let's see what we covered so far. We started from a bunch of input signals: category hierarchy, user activity data, attributes, and images. We applied a bunch of relevance techniques. We also fed the quality, historical, and presentation features into a ranking module, optimized it on the basis of clicks or conversions, and that gave us the final picture of the ranking block. These things were developed over multiple iterations; we started with a few of those blocks initially. Some took a multi-month effort, some took years to build. I have tried to colour-code this on the slide: darker represents more time, up to years of effort; lighter is something we were able to achieve relatively faster. Historical features, although they look very simple, are quite hard to get right and to analyze at the right granularity of category. So, about this pipeline: there are a bunch of pipelines running here, a user-activity-to-collaborative-filtering pipeline, an attribute similarity pipeline, and a visual similarity pipeline. Most of these are scheduled once a day in our case, so each of them gets refreshed daily.
Those features are prepared from the user activity data itself. Historical features are again prepared from the user's previous data; presentation features are prepared from the feedback coming from the Android and desktop channels; quality features are prepared from the ratings and reviews datasets. Those get refreshed daily, and altogether the training and scoring also happens once a day. We have job schedules which start from the raw data sources, compute everything, and every day a new index gets pushed to production. So that's it; these are the references I have used in this presentation, and I'll be happy to take some questions.

[Moderator] Hello. Off-the-charts demand, guys. Can we have the people sitting on the stairs move to one side so that the mic runners can run? Let's start.

[Audience] My question is about studying the behavioural pattern of a customer based on clickstream data, that is, real-time analytics. Suppose the customer is new and he is clicking on your products; after that, he may buy or not buy. Suppose he buys, choosing some product on cash-on-delivery: what is the probability that the customer will return the product or not? And what is the success/failure rate at Flipkart? And if the customer does not buy, how is the classification done?

[Speaker] Okay, so you asked about the usual success rates; I can't actually disclose the raw numbers. On the behavioural pattern of a customer based on clickstream data: on the right you had the click and conversion data, and the previous data is analyzed there. What we do is feed the last 30 days of data back into the model, so we already have the user's conversion rates.
We already have the click rates. Say there is some module which has a 10% click rate; that 10% will be treated as positives and the other 90% as negatives.

[Audience] No, I'm not talking about historical data; I'm talking about real-time analytics.

[Speaker] In this case we are limiting ourselves to historical data. We just take a window of the previous few days, which is sufficient for us to make a recommendation, because the patterns don't change that much within a day. What we figured out is that the cost of doing real-time is much more than the cost of doing batch, so in our case batch was the right fit, and we relied only on the historical data.

[Audience] And how is the classification done when the customer does not buy the product?

[Speaker] In that case it is treated as a negative sample: if you buy, it's a positive sample; if you don't, it's a negative. That's how we divide the data.

[Audience] Hello, thank you for your talk, it was a great one. I have two questions. First, since Flipkart has a large amount of data, how do you schedule your jobs? Collaborative filtering takes so much time, probably hours in your case. And second, have you ever tried supervised machine learning techniques for classification, such as XGBoost or any other technique?

[Speaker] First, how do we schedule our jobs: it depends on the data. Downstream jobs depend on the availability of the upstream data. As soon as there is a refresh, say the ratings data is refreshed, a trigger job notifies the next job that we are ready to process, because the latest ratings data is available. Similarly, this feedback data is collected.
As soon as the feedback data preparation completes, it feeds into the scoring block. For attribute similarity it's just a batch job, as is collaborative filtering; yes, it does take a few hours, so we run it in a distributed fashion over a Hadoop cluster. These jobs are chained to each other: the first job's output feeds the second block, and so on. On the second question: no, we haven't tried XGBoost and other such techniques. We kept to a simple model that was partly interpretable as well, which helped us debug. All the case studies and use cases I talked about were possible because we relied on a very simple, quite explainable model, where we could actually inspect the weight of each constituent feature.

[Audience] Akash, good job, nicely presented as well. A very simple question: do you have an alert mechanism to identify whether the model is failing?

[Speaker] Yes, that's very important. There are alerts at each stage, and two kinds of alerts. One is volume-based: there should not be a sudden volume dip in any of the data sources. If you're computing collaborative signals, they are based on user activity aggregates, which can easily deviate over time; for example, during Big Billion Days there could be a huge volume coming in. So we have anomaly detection on volume. Then, for the model itself, how do you measure it? This is a ranking model, and one of the metrics we use is AUC, area under the curve, which we monitor regularly. There are two kinds: one is the train/test AUC, which you measure during the offline modelling and scoring pipeline, and there is an online component as well. Say you have an AUC of 0.75 during your offline evaluation: does it translate to the online scenario?
When the users actually came the next day, how many times were you able to predict the ranking? The online scenario tells us the contrast. We have alerts on the AUC — even if it dips by a single point, we look back at the model: what weights has it learned, has any anomalous data crept in? Those channel-specific nuances I was talking about were derived from exactly this; we had to segment the data by channel to get the complete picture.

I just wanted to know how you go about deploying your model, because there would be a large number of requests coming in in real time, and the model has to serve them. Caching alone can't be a great solution, because the result would be different for different users, right? So how do you scale up the model for serving requests?

Yeah, so there are two kinds of models here. The one I talked about was a product-pivoted model, and product-to-product preferences don't vary per request, so they can be cached. Once you rank product-to-product pairs, they hold for the entire day — unless the offers or the price are changing a lot, the similarities between a pair of products don't change that much. For the user-side models, we have a distributed service. The heavy users who keep coming back might be served from cache, but there is a tail of users for which we actually make the model call, and that does happen in real time. We have a hosted, centralized, distributed service which does the prediction at runtime, and we try to keep it very lightweight so it can be done within the desired latencies. Sir, please connect with me offline.
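The serving split just described — item-item scores precomputed in batch and cached for the day, user-side predictions computed on demand with a small hot-user cache — might be sketched like this. The class, TTL, and toy scores are invented for illustration, not Flipkart's actual service:

```python
import time

class RecommendationServer:
    """Illustrative serving layer: cached item-item scores,
    on-demand user-side scoring with a small hot-user cache."""

    def __init__(self, item_item_scores, user_model, ttl_seconds=86400):
        # Precomputed batch output: (product_a, product_b) -> similarity.
        # Refreshed roughly daily, so a day-long TTL is acceptable.
        self.item_item = item_item_scores
        self.user_model = user_model
        self.ttl = ttl_seconds
        self.user_cache = {}  # user_id -> (timestamp, scores)

    def similar_products(self, product_id, k=5):
        # Pure cache lookup: item-item preferences don't vary per request.
        scored = [(b, s) for (a, b), s in self.item_item.items() if a == product_id]
        return sorted(scored, key=lambda x: -x[1])[:k]

    def user_scores(self, user_id, candidates):
        # Heavy users hit the cache; tail users trigger a real model call.
        hit = self.user_cache.get(user_id)
        if hit and time.time() - hit[0] < self.ttl:
            return hit[1]
        scores = self.user_model(user_id, candidates)  # the real-time call
        self.user_cache[user_id] = (time.time(), scores)
        return scores

server = RecommendationServer(
    {("shoe", "sock"): 0.9, ("shoe", "lace"): 0.7, ("shoe", "hat"): 0.1},
    user_model=lambda uid, cands: {c: 0.5 for c in cands},
)
top = server.similar_products("shoe", k=2)
```

The design choice mirrors the answer: the expensive, stable item-item work is pushed entirely into batch, and only the long tail of user-specific requests pays the latency of a live model call.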
There are other people waiting. — So when you give this sari example, you are going into the feature space and seeing how much difference there is, right? Can you repeat? Okay. When you give the sari example, you go into the feature space and see how related two items are and how far apart they are. But which features do you take into account? Let's say a guy wearing sunglasses — you're talking about this one, right? Yes. Say a fair guy and a dark guy, both wearing sunglasses: those skin-tone features should not matter when you are comparing sunglasses. So how do you filter out which features to use when you are doing that comparison over products in the feature space?

Yeah, there are two answers to that. There was an earlier time when these used to be handcrafted features: which feature do you give importance to? There was an object-detection block where you would detect the foreground and the background, and then you would say, I'll give more focus to the foreground. In our case all of that is not essential. If you look at this query/positive/negative setup: if you supply enough examples where the query is, say, a white guy wearing sunglasses and the positive is a black guy wearing sunglasses, the network starts to focus on the sunglasses themselves — it learns where to focus. Once you give the neural net enough examples of this kind, it learns these patterns, so you don't need to explicitly hard-code any of these features into the model.
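The query/positive/negative training described in this answer is essentially a triplet loss: the network is pushed to embed the query closer to the positive than to the negative, so the shared attribute (sunglasses) comes to dominate and incidental ones (skin tone) wash out. A minimal sketch, with plain toy vectors standing in for learned image embeddings:

```python
def triplet_loss(query, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embedding vectors:
    loss = max(0, d(q, pos) - d(q, neg) + margin),
    where d is squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return max(0.0, sq_dist(query, positive) - sq_dist(query, negative) + margin)

# Toy embeddings: first dim ~ "wearing sunglasses", second dim ~ "skin tone".
q   = [1.0, 0.0]   # query: sunglasses, fair
pos = [1.0, 1.0]   # positive: sunglasses, dark
neg = [0.0, 0.0]   # negative: no sunglasses, fair

loss = triplet_loss(q, pos, neg)
# Nonzero loss here means gradients would push the embedding to weight the
# sunglasses dimension more heavily than skin tone.
```

In a real system the embeddings come from a convolutional network and the loss is minimized over many such triplets; no one hand-picks the sunglasses feature.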
Oh — in our case the catalog was quite clean, so we didn't have such composite images. In a real-world setting that is a much harder problem, but ours was a very simple setting: in the catalog image it's only a sari — there is no shoe in focus. Similarly, for sunglasses the image is focused only on the sunglasses. So we had the luxury of an already-clean dataset; otherwise you need much more preprocessing.

Do people have more questions? Okay. Clearly this is a pretty awesome talk, because we have more questions than we did before. Before we let you have those questions: our next speaker is Uma, who's going to be talking after the chai break. I'd like to ask her to plug why you all should come back here for her talk.

Hi all, my name is Uma. I currently work at LinkedIn, but as part of my PhD thesis I worked on something called entity search, whose aim is to really upgrade your search experience. All of you use web search engines for your day-to-day work, right? So I will try to give you an awareness of the way in which entity search hopes to upgrade your search experience, and if you're a search practitioner, I hope you can take back some insights to improve your search. Hoping to see you at my talk. Thank you.

Thanks, Uma. So, Akash, do you want to go forward with the questions? Yeah — people who want to go for the chai break can go, but people who want to stick around are welcome to stay. Yeah, so — sorry,
I just want to say: I know asking questions takes a bit of going outside your comfort zone — it definitely does for me. So if you're not feeling comfortable taking the mic and asking a question, feel free to put it on Twitter and I'll try to read it out here on your behalf. If you don't want to stand up and ask, which is completely cool, put it on Twitter and we'll read it out for you here.

Thank you for the talk. You mentioned that across channels you sometimes need to decide whether to use the channel as a feature or build a different model entirely, and a thumb rule as well. What are the factors you usually consider, apart from the cost-benefit analysis — say, the technical analysis you conduct to decide between the channel as a feature or a separate model?

Yeah, like you said, the first answer is the cost-benefit analysis. Running two parallel models and keeping them healthy is expensive — even running this single model takes a lot of effort. It was a multi-quarter effort, almost a year, to reach this state just for the ranking model itself, so maintaining two models is quite hard, and I'll always go with the simpler setup. But in our case we started with a linear model, so we might want to have crossing-type features: the presentation features crossed with the channel could give us an equivalent substitute for having a separate model. What we can do is prototype both — is the performance of two individual models far better than a single model, or a single model with some kind of crossing, perhaps with an extra layer? That is the trade-off, basically. So crossing is one of the answers, which even we are evaluating. Thank you. Quick question.
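The crossing idea in that answer — crossing presentation features with the channel so one linear model can learn per-channel weights instead of maintaining a model per channel — can be sketched as a simple feature transform. The feature and channel names here are made up for illustration:

```python
def cross_features(features, channel, channels=("app", "desktop", "mobile_web")):
    """Expand a flat feature dict into channel-crossed features.

    A linear model over the crossed features can learn a separate
    weight for, e.g., "position_x_app" vs "position_x_desktop",
    approximating per-channel models inside one model.
    """
    crossed = {}
    for name, value in features.items():
        for ch in channels:
            # Indicator crossing: the feature is active only for its channel.
            crossed[f"{name}_x_{ch}"] = value if ch == channel else 0.0
    return crossed

row = cross_features({"position": 3.0, "image_quality": 0.8}, channel="app")
# Only the *_x_app copies carry the value; the rest are zero, so a single
# weight vector effectively holds one sub-model per channel.
```

The cost is a larger feature space (features × channels), which is usually still far cheaper than training, deploying, and monitoring a second model.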
I don't see users' buying behaviour being modelled here. Every user probably has a different buying behaviour, and it also varies based on, say, when you received your salary.

Yeah, so in the scope of this work — as you correctly pointed out — this is product-pivoted: most of it is product-to-product relevance, which has to be overlaid with a user-centric model. We have a parallel, separate model which uses this as a backbone: this forms the backbone of our recommender system, and user signals are overlaid on top. On the salary point — should we recommend more expensive products at the start of the month and more value products at the end? Absolutely; that model sits on top of this one. We have user-to-product, and — we saw the category hierarchy, right? — we have user-to-attribute as well: which attributes you like, and within those attributes which selection you will go for, is a second level of detail which can be overlaid over these two. In our case we have kept user-to-product and product-to-product quite separate so that they are more interpretable; we didn't combine the user-to-product prediction directly. There are alternate ways where you do a factorization and compute a single embedding for user and product in a combined space, but we preferred to keep them separate.

Thanks for the informative talk. My question is specifically about collaborative filtering: it suffers from the cold-start problem. There could be products newly added to your catalog which you might want to show up in your ranking. How do you deal with that?

Yeah, so here we saw that there is a feedback loop, and that comes in handy for this.
Say that for some of the selection we started showing those newly added products. Due to the feedback loop, poor performers get penalized: we have the click and conversion data, and if nobody clicks on the new selection, then — since this is refreshed daily — at most it gets highlighted for a day, and the next day the content which performs poorly is automatically penalized. That feedback loop comes in very handy.

What you're really asking is: how do you make new products surface at all, right? There are various techniques for that. One is called explore-exploit. In the approach described, we are purely exploiting: we just show whatever is at the top of the list. The usual practice is to leave some kind of bucket, say 20%, for exploration: you randomly choose some of your selection from the tail — say beyond the top-100 list — and start showing it to 20% of users. Again, due to the feedback loop, if that starts performing it will start surfacing up. So explore-exploit is another technique used to bring in newer selection. It remains a problem, because we have a long tail of products with no activity — it's an ongoing problem we keep attacking, and explore-exploit is one of the techniques we have found useful for it.

Thanks, Akash. My question goes back to the same thing we have talked about two or three times, even in the equation. You have a user-to-product mapping and a product-to-product mapping, and the product-to-product mapping is what you're caching for faster serving — that makes sense. The question is: is the user-to-product mapping lost in all of this process?
When you come to the final recommendation — I'll just give an example. I like hiking and I browse a lot of hiking products; you like fashion and browse a lot of fashion products. Both of us go to search for electronic cameras. Is the recommendation system more likely to show me a GoPro and show you a DSLR? Does that come into play — is the user feature retained, or is it lost over this whole process?

Yeah, that's a good question. If you deal only with product-to-product, that theme will get lost, as you rightly pointed out. But here we saw something regarding latent behaviours. If you can capture latent concepts somehow — travel as a theme, hiking as a theme — these are not dedicated categories on Flipkart; they are themes which unite products, like a football or Ronaldo/CR7 kind of theme. One way to identify such themes running across products is factorization, which I was mentioning: you represent each of these products in, say, a 20- or 30-dimensional space, and when you compress into that lower-dimensional space you are more likely to find such patterns. But there is a trade-off with interpretability: you might get patterns which you are not able to understand. The model says it's correct, but you might not be able to understand why, and hence there could be false positives as well — so we have to be careful.

Hey, thanks for the talk. You talked about the model that ranks the products. From whatever you spoke about, we have a lot of product-level data — the image-mapping data, the collaborative-filtering data — but what does it actually rank on, taking all that data?
If it's supervised learning, what does it rank on — what is the y?

Yeah, so on the right you saw there were two metrics which we covered. One is customer engagement — is the user clicking a lot — and the other is conversion: did this recommendation lead to a conversion? It's very important to specify which one you want to optimize for. At the start of the process, whenever we are building the block, we first have to identify the criteria, like you were mentioning. It's basically your product managers, along with you, who help you decide: I want to optimize for clicks and don't care about people purchasing; or I care about people purchasing and don't need them to stick around on my site that much; or it could be a balance. That balance helps us choose the trade-off. As data scientists you are equal partners — so yes, you have a say; in our case it was a combined process along with product and business.

Please take follow-ups offline — there are other people waiting.

Thank you, Akash. My question is actually related to your team structure. First, I gather it has taken you a long time to build this entire recommendation system — have you ever considered a build-versus-buy decision, and what led you to the build decision? Secondly, how is your team structured to ensure sustenance of this model on an ongoing basis?

Maybe one question, and the second one you can take offline — a lot of people are waiting.

Yeah, so to answer the first question, build versus buy: at Flipkart we do rely on open-source components, but most of the things are built in-house, because when you buy things there is a big loop involved. Any corrections — like the nuances you saw — are very hard to capture in an external, outsourced scenario.
That kind of precision comes when you are sitting close to the teams. For example, we had to sit with the Android team to understand: is it really true that the clicks are happening on the right side? On desktop, where is your click? What counts as an impression? All of that is best done sitting together — even in an outsourced setting, it's very much recommended that those teams sit with you and do a combined effort, rather than focusing only on the model part without looking at the data and the distributions of the features. So in our case we have mostly preferred building simpler stuff ourselves rather than going off the shelf; we have not gone far into using prebuilt full solutions, because they won't understand such nuances.

Hi, Akash. Based on this recommender system, once a conversion has taken place, how does the recommendation change? Let's say I've been looking for a mobile and one day I finally buy it. The next time I log into Flipkart, what really happens behind the scenes? Because it keeps showing me a lot of mobiles.

Yeah, so there are multiple things in the picture here. One: this talk is about the product-to-product world, and you are talking about the user world which overlays on top of it. There are a lot of reasons why, as you say, the mobile can be resurfaced. I said that the optimization is on clicks and conversion, and there could be an incentive for us: for example, people don't purchase a mobile only for themselves — we have seen people purchase a mobile and then, after a week, purchase two more, maybe for their family. So along with clicks and conversion there might be business considerations as well which lead to that. Also, there could be delays — there was this entire pipeline, right?
Yeah. Across the entire pipeline there could be delays in the incoming data, because all of that is happening in batch. Another thing which could happen — we were discussing it before the talk — is miscategorization: the mobiles might be placed under, say, the tablets category, in which case we are actually recommending tablets and not mobiles as far as the system is concerned. All those kinds of scenarios might lead to this abnormal-looking suggestion. But there might also be a genuine case for showing such a suggestion, because users do purchase it again — there are categories which are repurchased.

Thanks, Akash.