Hey, everyone. Thanks a lot for taking the time to join this webinar on three steps to maximizing impact with machine learning. I'm Geo, and I work as a product manager at Glovo. Glovo is a super app that gives customers access to anything in their city. Prior to Glovo, I worked at Revolut, where I managed a team of data scientists and data engineers focused on anti-financial-crime machine learning models. Before that, I worked as a data scientist, so I made the transition from data science individual contributor to product management. Today I'm going to share how I have used that experience in product management, and specifically in machine learning product management.

If you ask me about my journey as a product manager, the key responsibility of a product manager, in my experience, is to make sure the right things are done at the right time. It's all about prioritizing ruthlessly and thinking creatively about how you can ship features that deliver a great experience to customers and, importantly, generate revenue for the business.

Today's talk covers three aspects. The first is fairly generic product management: problem definition, and some strategies I apply to get to the root of a problem. Once the problem is defined, I'll look at how you can mitigate the risks of machine learning models and bring innovation in there. And finally, how you can deliver and measure the impact of a machine learning model through a data-centric product lifecycle approach.

I want to start with a scenario we have all been in: we read something, somebody shared an article, and we got an insight. The first thing we did was go to Google and search for a solution: "We need to apply deep neural networks, because that's trendy," right? We have all been there, and it's something I have done too, especially while transitioning from data science into product management. I focused on solutions, and over time I taught myself to focus on the problem instead. So getting to the root of the problem is the first aspect I'll cover.

Let's do that exercise by assuming a hypothetical situation: we have a recommendation engine, let's assume it uses machine learning, and we realize that customers are not engaging with the recommendations as expected. There are two mindsets here: a solution-driven mindset and a problem-driven mindset. The solution-driven mindset goes like this. The data science team might conclude, "OK, we need to improve this metric, say MAP@K." MAP@K, mean average precision at K, essentially measures the relevance of the top K items you recommended. So they jump to the conclusion that if customers are not engaging as expected, the problem is probably MAP@K: we need to work on it, retrain the model, and improve the metric. Irrespective of what the problem's root cause is, we have already jumped to a conclusion.
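As a quick aside, here is a minimal sketch of how MAP@K is commonly computed; the example users and items are made up for illustration:

```python
# MAP@K: mean over users of the average precision of the top-K recommendations.
# 'recommended' is a ranked list of item ids; 'relevant' is the set of items
# the user actually engaged with.

def average_precision_at_k(recommended, relevant, k):
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this cut-off
    return score / min(len(relevant), k)

def map_at_k(all_recommended, all_relevant, k):
    aps = [average_precision_at_k(rec, rel, k)
           for rec, rel in zip(all_recommended, all_relevant)]
    return sum(aps) / len(aps)

# Two hypothetical users, K=3
print(map_at_k([["a", "b", "c"], ["x", "y", "z"]],
               [{"a", "c"}, {"z"}], k=3))  # ~0.583
```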
But now let's think from the customer's perspective. What does MAP@K mean to the customers? Do they even know about this metric? Unfortunately, no. So what does MAP@K mean to a customer who is using your application and interacting with the recommendation engine? It's important that you translate every problem you get, even as a data scientist, into the actual behavior of the customer.

To reinforce this message: assumptions are cancer to any product. Especially in products where ML is the core logic, with different moving pieces, different parameters, and different input and output data streams, there is a tendency to assume a lot of things. You can assume the behavior. But that is cancer to the product, and it reduces the impact you bring to customers when you are not getting to the root of the problem.

So how can we do that? When we identify a problem, or let's assume we have an insight, I would say the first step is to validate that insight. It could be a quantitative or a qualitative validation. The quantitative validation goes like this: OK, we know the recommendation engine is not getting enough engagement, so we need more data about the users. What does the funnel analysis look like? How do users interact, at what rate, how much time do they spend, and what actions do they take before and afterwards? This kind of interaction data is useful, and we can dig into different dimensions of the usage data we have. The other important step is the qualitative validation, which could be speaking with a beta user who has no machine learning or technical background, simply asking why they are not using the feature and observing their behavior.

From the insight, after the qualitative and quantitative analysis, you will be able to generate what we call a behavioral statement. The truth is, most data science teams don't know the context, or are not well informed about it. They know what the recommendation engine is doing, and they know where it is placed in the app, for instance, but they don't know what user behavior is expected and what is actually happening. As product managers, if you want to drive impact with ML, it is important that you inform your data scientists about the context; I will explain the importance of context later, when it comes to monitoring as well.

Let's take the MAP@K example we discussed earlier. When you put MAP@K into context, you can actually see the behavior of customers and how that behavior translates back into the metric you are measuring. So it's important that you inform your DS teams about behavior, backed by a behavioral statement about the customer, something like: "When the customer is trying to do X and interacts with the recommendation engine, they currently do so-and-so, whereas we expect them to do so-and-so." It is important that you share these behavioral statements, covering both the expected behavior and the current behavior, and make sure the data science team is well informed about them. That is one of the things we can, and should, actively do as product managers.
Now, once you have the behavioral statement established, how do we arrive at a solution? One thing we learned in the past is not to jump straight into solving. Instead, we hold brainstorming sessions where the DS team can come up with different ideas; it can be an exhaustive, massive list. Then, during refinement, as PMs you map the ideas against the behavioral statement based on size, complexity, effort required, and return on investment, and you end up with clear evaluation criteria. This evaluation matrix is really important, because I have seen some product managers just make decisions right away, without informing others, because it's much faster. One thing I used to do is create spreadsheets with the different ideas, the evaluation criteria, the effort required, the impact you expect to create, and so on. Once you have the ideas, you refine them with the team and assess the different aspects and dimensions. From that evaluation matrix you can choose the one idea that is going to help you solve the issue of low engagement with your recommendation engine.

As PMs, we need to be able to justify every decision we make. It is important that you share the evaluation criteria, inform the DS teams, and make sure they are well aware of why you went with a particular solution, that you considered their opinions and ideas, and how the winning idea was chosen for implementation.

Once you have an idea that could potentially address the insight, it's important that you also validate it. There are different ways of doing validation. In a traditional product you can run experiments, but in machine learning it may not be that straightforward. In the past, what I have done with my teams: sometimes we checked with pre-trained models, so that data scientists could benchmark the performance or expected outcome against existing models already trained for that particular use case. Sometimes even that is not easy to find. For instance, we were trying to build a fuzzy name matching system, and finding pre-trained models for exactly that use case was quite hard, so we needed to break the problem down into smaller pieces; we found models for particular aspects of fuzzy name matching and validated those. The other option: the open-source community in machine learning and research is really big, so you can look for open-source models to validate against, checking the performance, how they work with your historical data, and so on. And importantly, recently there is a trend toward AutoML. Some of you may be familiar with it; if not, I recommend having a look. There are tools like H2O.ai that can help you quickly prototype and get validation. But it is not as easy as it sounds: you need to have your data set cleaned first, and pristine data is important here. It may sound like magic, but it really isn't, so you may need to put in more effort to make it work. Still, sometimes it can be useful.
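For illustration, a minimal AutoML prototyping sketch with H2O might look like the following; the file name and column names are hypothetical, and the exact API may vary with the library version:

```python
# Quick AutoML baseline with H2O: train a handful of candidate models
# on a cleaned data set and inspect the leaderboard.
import h2o
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file("engagement_history.csv")  # hypothetical data set
y = "engaged"                                      # hypothetical target column
x = [c for c in train.columns if c != y]
train[y] = train[y].asfactor()                     # treat as classification

aml = H2OAutoML(max_models=10, max_runtime_secs=600, seed=42)
aml.train(x=x, y=y, training_frame=train)
print(aml.leaderboard)  # ranked candidate models, a quick validation signal
```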
I have seen some data scientists in the community using AutoML for prototyping, though it is still quite early. Once you have validated and identified the solution, I think what we need to emphasize is the importance of focusing on how we can deliver impact, not on fancy solutions that merely fetch attention. Here I'm speaking about making sure your solution doesn't create unnecessary dependencies, that it works as expected, and that it is simple enough, in a way, that you could even explain it to a five-year-old.

Once you have gone from the problem, to the insight, to an idea you have validated, it is important, before you start implementation, that you minimize the risk. One of the ways product managers can do that is with research. I believe this is where the actual innovation happens in the product lifecycle, especially with machine learning, and it is important that we as product managers mitigate the risk of any failure or customer impact by bringing in innovation.

One technique we used in fintech companies, where there is heavy governance, was running risk assessments for most products. For products with less governance, there is a term in the industry called a pre-mortem; a lot of PMs are speaking about it, and it is essentially a risk assessment as well. In a risk assessment, what you do is assume that your solution is in production, and then you try to create scenarios: if it fails, what is the impact going to be?

Let's take the example of the customer onboarding solution we were building at Revolut. If you onboard a customer who is on a sanctions list or a money-laundering watchlist, it can lead to severe consequences: our banking licenses may get revoked, we may get huge fines from regulators, and so on. So let's assume we deploy a machine learning model for this process. What happens if it is not performing as expected? It means we will onboard high-risk customers, potentially even terrorists, into the financial institution. What is the risk of that? We assessed these scenarios together with the data scientists, so they also understood the risk of the models they were building.

We asked: what is the financial impact? How much revenue are you going to generate; if you are replacing an existing vendor, how much vendor cost will you save; and if you get a fine from the regulator, what is the worst fine you could get? What happens if your model is in production but not performing well? That could mean some other financial impact, so you need to define at least the expected outcome and impact. Importantly, we also assessed reputational impact, for instance what happens if the machine learning model is biased against a certain race. And what is the fallback strategy? If the model fails, or the system is down, how will you still deliver value to the customer, and what are the minimal rules you need to implement?
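To make the fallback idea concrete, here is a minimal sketch of wrapping a risk model in rule-based fallback logic; the rules, field names, and model interface are entirely hypothetical:

```python
# If the model is unavailable or erroring, fall back to conservative
# rules agreed with compliance, so the business process keeps running.

def minimal_rules_decision(applicant):
    # Illustrative minimal rules: block obvious high-risk cases,
    # route everything else to a human.
    if applicant.get("on_sanctions_list"):
        return "reject"
    return "manual_review"

def onboarding_decision(applicant, model=None):
    try:
        if model is None:
            raise RuntimeError("model unavailable")
        score = model.predict(applicant)  # hypothetical model interface
        return "reject" if score > 0.9 else "approve"
    except Exception:
        return minimal_rules_decision(applicant)

# With the model down, the sanctions rule still catches this case.
print(onboarding_decision({"on_sanctions_list": True}))  # "reject"
```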
These are some of the things you can assess, depending on the circumstances in which you operate. It is important that, as PMs, you work with data scientists to create scenarios, choose the ones that are really important, and make sure your solution mitigates those risks. One scenario we assessed: we had this onboarding machine learning model, and let's assume we onboarded someone who was high risk. Since we were applying for a lot of banking licenses, imagine a regulator asking why we onboarded a certain person; we should be able to explain the logic of the machine learning model, the decision, and how it was made. During our discussions, the solution the data scientists came up with was to build a model that is quite explainable, and to log every query the model receives: the inputs, the outputs, and the decisions being made. You can see that if this risk assessment and mitigation had not been done, the impact the ML model creates would be quite different. Working closely with your data science team is really critical here: as PMs we can bring in the scenarios and define them clearly for the data scientists, and it is important that you draw on their experience and expertise in coming up with defense mechanisms and mitigation strategies for these risks.

Another example: how do you make sure your model is not biased? Take a selfie verification or face verification system, for instance. There is much less training data available for people with darker skin tones, so the matching may not work as well for them. This means there is bias already there when you are building the model itself, and we need to make sure we capture it. At Revolut, what we were doing is assess the inherent bias in our machine learning models and then set thresholds, so that if a model exceeded a certain threshold in production, relative to its inherent bias, it meant we needed to rework the model. It could be about fairness, it could be about any form of discrimination: making sure your model is not biased with respect to gender, race, ethnicity, skin color, and so on.
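As an illustration of that thresholding idea, here is a sketch that compares a model's pass rate across groups and alerts when the gap exceeds a limit; the groups, outcomes, and threshold value are all hypothetical:

```python
# Demographic parity gap: difference between the highest and lowest
# positive-outcome rate across groups. Alert when it exceeds a threshold.

def demographic_parity_gap(decisions):
    # decisions: {group_name: list of 0/1 model outcomes}
    rates = {g: sum(v) / len(v) for g, v in decisions.items()}
    return max(rates.values()) - min(rates.values()), rates

gap, rates = demographic_parity_gap({
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],
})
THRESHOLD = 0.05  # hypothetical tolerance agreed with the team
if gap > THRESHOLD:
    print(f"Bias alert: parity gap {gap:.2f} exceeds {THRESHOLD}: {rates}")
```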
Once you have done this risk assessment, you can create the very important, high-risk scenarios that the model should handle, and data scientists can bring in their innovation here; I believe that is how we cultivate research and innovation when it comes to problem solving. The most critical point to reinforce is that we should never assume the ML system's behavior. We shouldn't assume the system is going to behave like X or Y without doing any risk mitigation, and these risk mitigation exercises help ensure we don't assume too much about the machine learning system itself: we get to the root of each and every aspect that may potentially create downtime or any impact on customers.

One thing we also did was test heavily, and test with pristine data; by pristine I mean a good-quality, trusted data set for testing and validating models. We had five strategies. First, a validation set: like with any normal machine learning model, you validate against it during development. Second, a historical set, built from our historically logged predictions and the data we have in the system, taking seasonality into account; some models may be seasonal, for example in travel-related models I have seen seasonality in the bookings and in the way the system behaves, so your historical set needs to cover those scenarios. Third, a benchmarking set; this may be hard for some machine learning models, while for others it is realistic to compare your model's performance against a standard industry set, or to benchmark against different open-source vendors, for instance. Fourth, once the model is ready for production, we shadow it in production: the model doesn't make the actual decisions, but it logs its predictions, and those are evaluated. And fifth, gradual rollout: instead of rolling out 100% globally, we made sure we rolled out per city, or gradually increased the share of traffic per country or market we were operating in.

Once you have the risk mitigated and the validation strategy set, the most important part is to constantly capture and measure the impact. Here I will be speaking about how you can transform a model-centric product lifecycle into a data-centric product lifecycle. Recently Andrew Ng, the founder of deeplearning.ai, has really been emphasizing how companies should focus more on improving the quality of the data and less on reiterating on the model: iterate more on the data, because you can improve the performance of the model by building a high-quality data set. This strategy is quite simple, and we have applied it in the past. Some of the models we built, as I mentioned, were catering to regulatory needs, so we needed something explainable and didn't have much flexibility to change the architecture. So we kept the same architecture and improved the performance of the model by bringing in a larger labeled training set; in this case we were labeling the data set manually. As PMs, you need to identify your data scientists' data needs: what kind of data is required, whether you need to acquire it, buy it, or label it, and generally work with different teams so this data is available to your data science team.

While we were iterating on the training set, we were also consistently monitoring data quality. We have upstream systems feeding us data and downstream systems consuming what we produce, all of which may affect the quality of the business processes, so we need consistent monitoring of the data quality we are receiving, with profiling, statistics, and alerts set on it. As PMs, I think it is quite critical that we also consistently check these dashboards to make sure the inflow is as expected. And as I mentioned previously, we were consistently logging the predictions. This has a lot of value: you can do analytics on top of it, and you can use it for retraining, for validation, and for prototyping. So it is important that you work with your DS teams to make sure your predictions are logged, and that you have dashboards continuously monitoring performance.
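A minimal sketch of what structured prediction logging could look like; the field names and model version string are hypothetical:

```python
# Log each prediction as a structured JSON record: timestamp, model
# version, inputs, output, and final decision. These logs can later feed
# dashboards, retraining sets, validation, and regulator audits.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("predictions")

def log_prediction(model_version, features, prediction, decision):
    logger.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
        "decision": decision,
    }))

log_prediction("onboarding-risk-1.3", {"country": "ES"}, 0.12, "approve")
```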
The fourth aspect: once you roll out your ML to production, make sure you have ongoing, near real-time monitoring against ground truth data. I know that in some cases the ground truth data comes in a bit late; that is fine, but it still needs to be monitored, and you need alerting set on it. And here is an important aspect of machine learning models: I have seen models being monitored for precision, recall, and so on, which is great, but remember that we crafted a behavioral statement in the beginning. It is also important that you monitor your machine learning model's performance in its context.

So how can we monitor in context? Some of the interesting metrics you can use are drift detection metrics: data drift and concept drift. Evidently is an open-source data drift detection tool that you can use, and at Glovo we are using a vendor called Mona; they are both great options. Data drift, essentially: let's say you trained a model and deployed it in one market, and now you are deploying it in a different market. The model hasn't seen instances of that particular market, so you need monitoring on this. Here the context really applies: customer behavior is different in different markets, so you can detect a drift in what the model is seeing. It's important you set alerts on this and make it part of your data science team's maintenance work; you have to allocate capacity to maintenance, and if there is drift in the models you have in production, you need to allocate capacity to rework them. The other metric is concept drift, which is essentially when the behavior of the market or the customer changes. As I said, there can be seasonality in models: travel can be seasonal, there can be seasonality even in the onboarding of customers, and if there is a campaign happening in certain markets, you can have seasonality there too. The behavior also changes over time; during the pandemic, for instance, customers behaved quite differently. So you need to consistently monitor drift in the features and in the performance of models, and it is quite important as product managers that we also have dashboards and alerts set, to make sure the model stays in the context of the customers.
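To give a flavor, here is a sketch of what a data drift check with Evidently can look like; the Report API shown is from earlier releases of the library and may differ in newer versions, and the file names are hypothetical:

```python
# Compare feature distributions between the market the model was trained
# on (reference) and a newly launched market (current).
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

reference = pd.read_csv("orders_market_a.csv")  # training market
current = pd.read_csv("orders_market_b.csv")    # new market

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("drift_report.html")  # review, or wire into alerting
```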
Irrespective of all this, what ultimately matters is how much revenue you are going to generate, and this is the hardest part: translating from your basic model metrics like precision and recall, to context metrics like drift, and finally to the revenue this is going to generate for the business. In some cases it's straightforward. When we were doing automation-related models, we could say we are automating tickets; we know the cost per ticket, so we can see how much the model is saving the business. For some cases it's not that easy. The simplest way to measure the impact is to create a scenario where the model doesn't exist and see how the business performs, alongside the scenario where the model does exist, so you can see the difference in revenue. And you need to track this consistently; that is how you measure the impact.

Finally, when we were introducing ML, we never pitched the solution to the business or the stakeholders as ML. What we told them was how the system, the product we were building, could bring value to customers and how much revenue it was going to generate for the business. That is how we pitched it; we never said it's deep learning or machine learning or AI. I have noticed there is a tendency sometimes to go and pitch the buzzwords, but it is important that we focus less on the buzzwords and focus instead on what value it brings to the customers as a business and how much revenue it can generate.

So, to recap: we started with having an insight and defining the problem, working with the data science teams on the behavior of the customers; then how we can quickly validate the solutions we have; then mitigating the risk with risk assessments and setting up monitoring; and finally measuring the impact. I hope you enjoyed this, and if you have feedback or questions, don't hesitate to reach out to me on LinkedIn or Twitter. I wish you all the best. Thanks a lot, thank you.