All right, before starting the session, I would like to thank the EuroPython team, especially the organizers and the sponsors who enabled this conference and gave us a great platform to interact, exchange ideas, and learn. So I'm really thankful to the whole team who made it happen. Before the core topic, a brief introduction about myself: I'm Abhishek, and I currently work as a principal data scientist at Microsoft India, where I take care of ML-based implementations for Microsoft Enterprise business solutions and products, and for the different partners and clients we cater to. I have more than 13 years of experience across different domains, technology stacks, and areas, covering NLP, NLU, optimization, computer vision, recommendation systems, and more. I was fortunate to be part of foundation teams, so I got the chance to build things from scratch, and over time I have grown in this area and now deliver and scale ML and AI solutions for enterprises. I have worked in multiple domains like finance, retail, logistics, and payment networks, for Maersk, Visa, Fidelity, and Dell prior to joining Microsoft, and I am a recipient of the distinguished "40 Under 40 Data Scientists" award in India in 2021. Whenever time permits, I try to publish and contribute to research; I have five international publications and a few patents and trade secrets as well. I did my master's at the Indian Institute of Technology Kanpur, one of the premier institutes of India, and, like all of you, I am a big believer and admirer of Python; I have loved working in it since 2009, my college days. So that's about me.
Now let me set up the agenda. This topic is very practical: with the abundance of data available today, many avenues are open for implementing more effective recommendation systems. I will touch upon why recommendation systems are needed and how we are evolving from the traditional ways to more advanced ways of building them, how I implemented the approach I am going to talk about, the improvements and findings I had, and the tech stack I used, and then I will conclude. So, in the interest of time, let me start with the need, to set up the context: why are we going to this level of customization, and what is the need for an advanced, evolved recommendation system? First of all, providing a personalized experience to customers is no longer optional for companies; it is a definite choice. A recommendation system is one of the key ways every company, be it retail, finance, or any other B2B or B2C company in any vertical, tries to serve its customers to the best of its capacity. It not only brings more revenue but also builds customer centricity and connection with existing customers and the potential customers they want to target. So it's a good thing, but what are the challenges? The key challenge in a recommendation system is how to make it a customized, tailor-made solution for your customer; approaches range from generalized to highly customized.
When we try to scale that and achieve more customization, we need to consider different things, from the implementation side to the scale side to the data dimensions that are available. Every company wants a recommendation system that delivers this customization, but there are challenges, and that is the exact crux of this talk. I will talk about how we overcome them and implement a hybrid kind of recommendation system that not only utilizes different data dimensions but also combines them well with the algorithm, so we can generate much more personalized recommendations. I would call it going from personal to hyper-personal. There are existing traditional and even advanced ways to create recommendation systems with some personalized touch, but if you want to go further, how can we utilize different dimensions of the data, and the techniques available today, to create more impactful recommendations? That is what I will explain next. I will cover the evolution part, where I will show how to use this diversified data: if you have additional information but it does not fit the format your algorithm accepts, how are you going to transform it, engineer the features, and enable that information flow so you can create better recommendations, rather than relying only on the purchase history or the interactions that happened in the past? I will reserve the last few minutes for Q&A, so if you have any questions in between, put them in the chat window and I will pick them up later.
Now I will talk about the evolution. We all understand that recommendation deals with the personalization side, and we already have recommendation systems in place, so why is there a need for this, and if we implement it, what demerits of the existing systems is it going to overcome? If you compare traditional recommendation systems with the evolved one I am proposing, you can see clear differences that can be leveraged to implement a more effective mechanism. The current systems, be it the Apriori algorithm, collaborative filtering, or the various embedding-based methods that exist today, all consider purchase behavior, or interaction behavior more broadly: if it is not a purchase, it may be something reviewed or something watched, as with movies on Netflix. So current systems utilize historical information, the transactions that happened in the system where a person purchased, bought, or reviewed something. They match a customer against other customers with similar attributes, compute the similarity between two customers, and then recommend something that customer A has used but customer B has not, because A and B share some similarity and B may have a similar taste. That is how the traditional method works, but there are challenges in it.
First, it only considers user-item interactions. Second, at most it can take meta information, but not in a form it can use directly. Meta information like item descriptions can also be used as features, but then you cannot use one single method; you need some sort of hybrid recommendation system that also utilizes a classification approach. And when it uses historical data, it uses only the transactions that happened in the past. There may be a case where a user was interested but did not buy the item at that point in time, maybe because some priority changed, but that does not mean the user was not interested. That kind of signal will not be captured if you look only at the purchase history, but it will definitely be captured in the user's web behavior: data points like how many times a person visits, the time spent on a particular web page, how many stages of the purchase funnel he or she has come through, and what cross-domain features are available. If I can get additional data from another domain, say I have that customer's information because I operate multiple businesses, I have retail data and I want to apply it to another category, like an electronics product catalog, then how can I use that cross-domain knowledge?
All these things are not considered in the traditional way of creating recommendation systems, and that is exactly where I bring in this evolved technique. We include more data dimensions, not depending only on the user-item interactions, meaning the purchase history, and we also try to overcome the data sparsity issue by bringing in more data points and richer information, which helps create much more impactful recommendations. These are the things lacking in the current recommendation scenario, and that is exactly the gap we are trying to bridge. Now the question comes: we have the data available and we are doing the feature engineering, but how are we going to use it? Say I am using the matrix factorization method; it only takes input in the form of a matrix, which it decomposes to produce recommendations. The answer is that we need to convert all these intermediate data sources into a vector representation, and that is exactly where the core part of this recommendation system, network embeddings, comes into the picture. I will talk about the architecture in more detail, but in a nutshell, we take all this information, which is in addition to the user-item interactions, convert it into a representation that can be utilized by the matrix factorization, and by utilizing that information we also overcome the sparsity problem we normally face in recommendation scenarios.
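To make the idea of "convert behavior data into a vector representation" concrete, here is a minimal sketch. The talk uses node2vec; as a simpler stand-in with the same role, this uses a truncated SVD of a hypothetical user-by-page visit matrix to get dense per-user embeddings (all data and dimensions here are illustrative):

```python
import numpy as np

# Hypothetical user x page visit-count matrix (web behavior): 4 users, 5 pages.
# Users 0 and 2 browse similar pages; users 1 and 3 browse a different set.
visits = np.array([
    [3, 0, 1, 0, 2],
    [0, 4, 0, 1, 0],
    [2, 0, 2, 0, 1],
    [0, 3, 0, 2, 0],
], dtype=float)

# Truncated SVD turns raw counts into a dense low-dimensional embedding per
# user; node2vec over the user-page graph plays this role in the talk.
u, s, vt = np.linalg.svd(visits, full_matrices=False)
k = 2  # embedding dimension (illustrative)
user_embeddings = u[:, :k] * s[:k]  # shape (4, 2)

print(user_embeddings.shape)  # (4, 2)
```

Users with similar browsing patterns end up close together in this embedding space, which is exactly the property the downstream factorization exploits.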
That is exactly how we use this evolved method, and that is what the word "evolution" means here: going from the traditional way to the advanced approach. Now, talking about the implementation, this is the core of how we do this transformation and bring a much more effective way of making recommendations. First, purchase history: all traditional recommendation systems try to use it; they take the user-item matrix, and sometimes they also take content into consideration, as in content-based recommendation systems that use item descriptions and similar attributes. We are also using purchase history, in the same formulation as the typical user-item matrix. We are also using web behavior: we convert the web presence of a customer, how many times a person visited and which items were viewed, into a matrix, and we represent that uniform matrix in the form of nodes. At this stage we build a network and a neural model that converts this web behavior into a learned representation, and the output is again an embedding layer, which we feed through node2vec as the user embeddings. We are also using item descriptions and metadata here; it need not be TF-IDF vectorization, you can bypass that phase and create embeddings here as well, so one disclaimer: this can also be an embedding-based feature vector rather than TF-IDF. So we take these two things and finally feed them through node2vec into the user embeddings.
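The TF-IDF step for item descriptions can be sketched without any library, so the formula is visible; the three example descriptions and the tokenization are purely illustrative (in practice scikit-learn's `TfidfVectorizer` does this):

```python
import math
from collections import Counter

# Hypothetical item descriptions from a product catalog.
docs = ["wireless noise cancelling headphones",
        "wired headphones with mic",
        "mechanical gaming keyboard"]

tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})
df = Counter(w for doc in tokenized for w in set(doc))  # document frequency
n = len(docs)

def tfidf(doc):
    # Term frequency (normalized by doc length) times inverse document frequency.
    tf = Counter(doc)
    return [tf[w] / len(doc) * math.log(n / df[w]) for w in vocab]

item_vectors = [tfidf(doc) for doc in tokenized]
```

A rare, specific word like "keyboard" gets a higher weight in its item's vector than a word shared across items like "headphones", which is why these vectors work as item features.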
In simple words: I have a user-item interaction matrix, which is fed into the matrix factorization, and I need another factor, factor two, which comes from my user embeddings. Both are matrix representations, and before combining them we apply normalization, because the user-item matrix is on a different scale than the user embeddings and item embeddings we are taking. We feed these two into the matrix factorization, run its iterations with the various fine-tuning parameters, the alphas, betas, and gammas, and generate the recommendations. Then you have the feedback loop: you show the recommendations to the user, get real-time feedback on how many clicks you are getting, and see how it ultimately brings more revenue and more customer footfall to your website. That is exactly how the whole system works, and this is the core architecture of the implementation. We take information that was not used earlier in recommendation systems, be it collaborative filtering or the Apriori algorithm, and augment those systems with these embeddings, which bring more information flow into the system; with this additional information, the recommendation system is in a much better position to make effective recommendations. Now I will talk about the results, and shortly I will come to the tech stack and the pseudocode I used. In terms of results, I compared similar implementations, where NEMMF is the network-embeddings-based multi-factor matrix factorization method; this last category is our implementation.
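The factorization step with the alphas, betas, and gammas can be sketched as plain gradient-descent matrix factorization, with an extra gamma term pulling the user factors toward the precomputed side embeddings. This is a toy stand-in for the talk's implementation; the data, the parameter values, and the exact way the side information enters are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse user-item matrix (0 = unobserved) and precomputed user-side
# embeddings (e.g. from web behavior); both are illustrative stand-ins.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
side = rng.normal(scale=0.1, size=(4, 2))  # user embeddings, same latent dim

k, alpha, beta, gamma = 2, 0.01, 0.02, 0.05  # latent dim, lr, reg, side weight
P = rng.normal(scale=0.1, size=(4, k))      # user factors
Q = rng.normal(scale=0.1, size=(4, k))      # item factors

mask = R > 0
for _ in range(2000):
    E = mask * (R - P @ Q.T)                              # error on observed cells
    P += alpha * (E @ Q - beta * P - gamma * (P - side))  # pull toward side info
    Q += alpha * (E.T @ P - beta * Q)

pred = P @ Q.T  # predicted scores, including the unobserved cells
```

The gamma term is the "multi-factor" part in miniature: when a user has few observed interactions, their factors stay anchored to the behavioral embedding instead of collapsing toward zero.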
I evaluated it on precision and recall. There can be other metrics, yes, but I chose these because the earlier methods were tested on them in prior research, so I could see how this revised method compares against similar previous attempts. These are the different methods, and you can clearly see, across both precision and recall, that this approach is effective not only in terms of accuracy but also in terms of the kind of recommendations you want to make. The three color-coded settings are n equals 5, 10, and 15, meaning top-5, top-10, and top-15 recommendations; even as I increase the coverage, this method maintains great accuracy in the top recommendations it makes. Talking about fine-tuning: if you look at the architecture, this implementation goes through two things. One is the matrix factorization, so everything applicable to fine-tuning a matrix factorization, like the latent factors, applies here; and because it uses a neural network, everything like the activation function, the loss function, and the layers you use applies equally. That is what we did to get optimal results out of it. After this talk I will also share the source code, so if you find it relevant to your industry, academia, or research, you can go ahead and use it. These are the two key code snippets.
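For reference, precision@n and recall@n, as used in this comparison, can be computed for a single user like this (the ranked list and ground-truth items are made up for illustration):

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user, given a ranked recommendation list."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    return hits / k, hits / len(relevant)

# Hypothetical ranked recommendations and ground-truth relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = ["b", "e", "f"]

p5, r5 = precision_recall_at_k(recommended, relevant, 5)
print(p5, r5)  # 0.4 0.666...
```

System-level numbers like those in the slide are then averages of these per-user values over all test users, at each cutoff n.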
Apologies if it is not fully visible, but one part is the matrix factorization and the other part is the node2vec embeddings, where I convert the additional meta features and the cross-domain features into network embeddings and feed them, along with the core user-item matrix, into the matrix factorization method. I will provide the code snippets after this talk. In terms of tech stack, yes, I used the favorite, scikit-learn, without which none of these methods would work, plus Keras, TensorFlow, a matrix factorization library, PyTorch, and a few modules of SciPy to create this recommendation system. Now comes the conclusion: how do I evaluate this method against the existing practice of recommendation systems? In terms of accuracy, it is much better. The reason is that the information loss we normally face in a regular recommendation system is too high: we rely only on information that reaches the transactional databases. If someone does not purchase, we do not consider that information at all. That is point number one, and we have addressed it; it is equally important because most of the signal comes before that stage. People may fail to complete a transaction for various reasons, but what we are trying to do in recommendation is to learn the taste, whether a person would really like something or not, and what the propensity is, so that is genuinely useful information we are using here. Second, we also use the metadata information. I'll just spend a couple of minutes here and then open for Q&A: any kind of meta information, text or something else, can be used here.
And third, if you want to extend it with any additional data dimension, you can utilize the same network embedding framework and implement the same thing, so those additional dimensions get good exposure and can be used fully in the recommendations. Now, what can still be improved? There can be features like user acceptance rate, or the different preferences users show when they come to the website or use the web channels, and if you are dealing with offline retail there can be other channels as well, so you could bring in marketing-channel and sales-channel information and data. That is something I have not explored, but it can be added. Also, other embedding methods can be used; I used node2vec, but that is just one way to solve the problem. With that, I would like to pause here and open for questions. You can put them in the chat window or come and ask; I would love to answer. Over to you.

Thanks Abhishek for that amazing talk. We have an on-site question. Could you please come to the mic? Yeah, please.

Yeah, thanks for the great talk, it was interesting. Just a question: you mentioned that you have different implicit and explicit feedback channels that you are considering. Can you share your thoughts on how you are joining them in one matrix? For example, how would you weight a user who just clicks or visits a product page versus a user who purchases? Quantitatively, how much stronger is the feedback when a user purchases something compared to just clicking?
Are you training those separately, or are you joining them in one matrix? You said you reduce the sparsity this way, and that would be interesting to hear.

Really interesting question, and that is exactly where this framework helps. As I said, there are different scales involved. The user-item matrix is a representation of the interaction-based information, in terms of a rating, an amount spent, or some other numeric value. For the other metadata I use alongside it, I first create the embeddings, which are again a numeric representation, but I need to apply normalization, and when I apply it, I apply it to the user-item matrix as well, along the same dimension. Think of it as two different data sources: one is the user-item matrix, on a higher scale, and the other is the network embedding, on a lower one. Once you apply the normalization, taking both into consideration, you can merge both datasets, and that is the whole reason it generates better recommendations: the embedding representation brings a lot of rich information with it, giving me useful insight into a particular user's likes and dislikes before the purchase history has even happened. So to answer your question about merging implicit and explicit information: I take both the matrix representation and the embedding representation, apply a normalization on top, and feed them to the matrix factorization method. Does that answer your question?

Yeah, it does, thanks.

Great, and thank you for asking this great question. Yeah, we have a couple more questions.
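The scale-matching step described in this answer can be sketched as follows; the data is made up, and column-wise z-scoring stands in for whichever normalization the actual implementation applies:

```python
import numpy as np

# Two representations of the same three users at very different scales:
# an explicit user-item rating matrix and low-magnitude behavioral embeddings.
ratings = np.array([[5.0, 3.0, 0.0],
                    [4.0, 0.0, 1.0],
                    [1.0, 1.0, 5.0]])
embeds = np.array([[0.2, -0.1],
                   [0.15, 0.05],
                   [-0.3, 0.1]])

def zscore_cols(m):
    # Column-wise standardization so both blocks contribute on one scale.
    std = m.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (m - m.mean(axis=0)) / std

# After normalization the two sources can be merged row-by-row per user.
merged = np.hstack([zscore_cols(ratings), zscore_cols(embeds)])
print(merged.shape)  # (3, 5)
```

Without this step, the ratings (scale ~1-5) would dominate the embedding columns (scale ~0.1) in any downstream factorization.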
Yeah, please.

Hi, thank you for your talk, very interesting. Can you elaborate on what type of web behavior data you have, and exactly what formulation you did over it?

Great. Web behavior is really easy to connect with: just think about your own journey on Amazon or any other e-commerce site. You click on a page, and that is your first presence, so one dimension comes from the number of visits. Second, in the whole funnel there are many phases before checkout: from the time you enter the portal, you check the product details, you check the price, you navigate again, and a lot happens before you actually check out and make your purchase. So all this information, the time spent, the pages you click, the kind of content you look for, whether you are price sensitive, how many times you check the price page, will be available, and it is standard across whatever product you purchase, be it electronics or apparel. That is what I mean by browsing pattern: basically, everything that goes into a typical e-commerce purchase funnel before the final purchase. Does that answer your question?

Yes, but what formulation did you do over this data?

So once you get that, what I have done is, against every user I have this kind of metric. So one thing is, I'm so sorry, we're going to have to stop here because the time is over, but you can hang out in the Liffey boardroom 4 and maybe over the Zoom breakout rooms.

Definitely, definitely. Thanks a lot, yeah.
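The per-user formulation the speaker begins to describe, one row of funnel features per user, might look like this minimal sketch; the event log, page names, and feature names are all hypothetical:

```python
from collections import defaultdict

# Hypothetical clickstream events: (user, page, seconds_on_page).
events = [
    ("u1", "product", 40), ("u1", "price", 10), ("u1", "price", 15),
    ("u1", "checkout", 5),
    ("u2", "product", 60), ("u2", "reviews", 120),
]

# Aggregate the raw events into one feature row per user: visit count,
# total dwell time, price-sensitivity signal, and funnel depth.
features = defaultdict(lambda: {"visits": 0, "time": 0, "price_checks": 0,
                                "reached_checkout": 0})
for user, page, secs in events:
    f = features[user]
    f["visits"] += 1
    f["time"] += secs
    f["price_checks"] += page == "price"
    f["reached_checkout"] |= page == "checkout"

print(dict(features["u1"]))
```

Rows like these form the user-by-behavior matrix that the talk's pipeline then turns into node2vec embeddings.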
All right, let's give a huge round of applause to Abhishek. Thank you.