Hello, everyone. Welcome to the second part of my series on machine learning. In the first part, if you remember, around a month back, we covered the data and declarative layers — the basic ecosystem that machine learning operates in. And a lot of you reached out and asked, OK, what about the applicative layer? What about the second part? So this presentation covers the applicative layer of machine learning: after you have set up the tooling, how do you go ahead and identify and target use cases, and what is the team structure that goes along with it?

First, a bit about me. I'm Sanjitha. I did my BE from NSIT Delhi and my MBA from ISB Hyderabad. I have around five years of software development experience; after that I did my MBA, and since then I've had around ten years of product experience, mostly platform product experience across the spectrum. I've worked across Africa, India, Europe, and even the US. And unofficially, I'm an accidental MBA. What is an accidental MBA? It's somebody who wants to reclaim his tech roots, so he gravitates towards platform PM roles. That's what I am all about.

So why are we here? Three points: to impart some knowledge again; to commiserate on our lockdown lives — even though things are getting better, my friends in the UK can attest to the fact that a new variant is out, so if you are in the UK, please stay safe; and, as we discussed, to complete the story. What we discussed last time was mainly the first part of the story, and today we aim to complete it.

So maybe let's go through a reminder of what we discussed last time, especially from a terminology point of view. The biggest superset we discussed is artificial intelligence. What is artificial intelligence? It's basically a sentient thing: something non-carbon-based, unlike us, that can actually think, sense, and reason — basically, something that can pass the Turing test is what we would call an artificial intelligence system. Machine learning is a subset of that: a learning system that learns and adapts in a specific area over multiple iterations, basically like a kid — you learn and adapt over many attempts. To take a very real-life example, something like AlphaGo: a DeepMind system which plays against itself and learns and adapts to play the game of Go, a 19-by-19 board game. And deep learning is an even smaller subset of machine learning, which is basically multi-layer neural networks learning on their own. In the industry, you will see machine learning and deep learning used very interchangeably; for all intents and purposes in the tech industry, we will use machine learning as the umbrella term.

And again, this is what we discussed last time from an ecosystem point of view. On the leftmost part, we have the data formative layer, which is where all the individual data sources within the org are actually generated. Right after that is the data aggregation layer. What kind of data sources can you have? Marketing, identity, et cetera, within your org. They are aggregated somewhere, most likely in something like a data lake or a data warehouse. This is called the data aggregation layer.
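To make the aggregation idea concrete, here is a toy sketch — my own illustration, not anything from the talk's actual stack — of two hypothetical source tables (identity and marketing) being rolled up into one aggregated table of the kind a data lake or warehouse would hold. All table names and columns are assumptions.

```python
# A toy sketch of the data aggregation layer, not a real warehouse job:
# join two hypothetical org data sources into one aggregated, per-user table.
import pandas as pd

# Hypothetical "identity" and "marketing" data sources within the org.
identity = pd.DataFrame({
    "user_id": [1, 2, 3],
    "country": ["IN", "UK", "US"],
})
marketing = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "campaign": ["spring_sale", "retarget", "spring_sale", "newsletter"],
    "clicked": [1, 0, 1, 0],
})

# Roll up per-user marketing behaviour, then join it onto the identity source.
clicks = (marketing.groupby("user_id")["clicked"]
          .agg(clicks="sum", impressions="count")
          .reset_index())
aggregated = identity.merge(clicks, on="user_id", how="left").fillna(0)
print(aggregated)
```

In a real org this would be a warehouse/lake job rather than pandas, but the shape is the same: many source systems, one aggregated, catalogued place to read from.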
The data aggregation layer feeds into something that we call the declarative layer, the machine learning tools layer, which actually enables this data to be used by machine learning models down the line. This is a very important part, because you cannot just feed crap data to your machine learning model — the kind of data you feed in determines the kind of output you will get. So everything within this is counted as a very vital part of the machine learning ecosystem. The declarative layer then feeds into the applicative layer, which is what we're going to discuss in detail here today. What does the applicative layer do? Basically, once you have the tools ready and the input data ready, you actually have to identify the opportunity, identify the application, and work on deploying it and getting those results that we are all vying for from our machine learning deployment. And what you see at the bottom is the observability layer, because anything and everything that you do in a tech system should be observable and monitorable. And again, as we discussed last time — if you don't remember, please go back to the video; it's still available on Product School's website — there can be a similar observability layer across the whole stack, but there are dedicated tools that work on machine learning observability as well. In fact, it's a very hot area right now.

So this is what we discussed last time. Building upon this foundation, we will go into the last part of the ecosystem, which is the machine learning applicative layer. We already discussed this: the data formative layer is the production of data; the aggregation layer is the collection of all the big data sets in one place — that's a very simplistic explanation, you also have cataloging of data, et cetera; the declarative layer is the tooling for the applicative workflows; and finally the applicative layer, which we're going to discuss today. Going further, we are assuming that we have well-labeled data — that's an industry term used in machine learning. As we discussed last time, data with proper data lineage, and tooling, is not a blocker for our rollouts. So going further, we are assuming everything is in place before the applicative layer and we are all set to roll out our machine learning applications.

Now, let's come to the machine learning ecosystem's applicative layer. I'm going to break it down further into three parts. The first one is identification of the problem. It's all well and good to say, OK, we are going to use machine learning, machine learning is going to be a game changer for my e-commerce website. But how do you actually identify what kind of use cases can be solved viably and gainfully by machine learning, and what kind of use cases are better done by a human? The human element is still there — this is not Skynet; we're not going to replace humans. So basically, we are trying to narrow down which problems should be solved by machine learning and which are better left to humans. Who actually drives this? It's the PM along with ML practitioners — mostly the data scientists within your org, embedded within your product groups as a data science craft, who are actually going to drive this. And they do it in three ways.
So there are three ways you can identify machine learning use cases, and it depends on the kind of organization you are and the kind of maturity you have in machine learning. The first one, and the most common one, is C — so I will start with C and work my way back through B and A. If you are a young startup and you don't have a dedicated machine learning craft or ecosystem, and you don't have an experimentation layer set up, it's very difficult for you to go ahead and drive machine learning on your own. So what you do is hire an external consultant who can act as a machine learning consultant — mostly somebody who has worked in a big org and actually knows the ropes of identifying a big machine learning use case — and that person teaches you the ropes, teaches you how to identify machine learning use cases. And you use a lot of third-party tools to run a pilot test case. If it adds value to you from a revenue point of view, you look to invest further. So C is used by startups a lot.

Then we come to B: you have a central, empowered team which finds use cases — say a central machine learning team which goes out, delves deeper into each product group, and finds use cases for you. There are pros and cons to both. This is mostly done by mid-to-large organizations. Mid-size organizations benefit more from a centralized team because there isn't a large spread: if you are, say, a 100-million-dollar, 200-person organization, it's easy for a centralized team to span the scope of the use cases that are available. But this is mostly a mid-tier approach; in a large organization — say 2,000 or 5,000 people plus — it's very difficult for a central team to go and dive deep into the individual use cases of each product group.

So, as we discussed, B is mostly used by mid-tier organizations, and A is something you will find a lot in large organizations — say a Fortune 200 or Fortune 100 organization. What they do is embed ML practitioners within each product group. I'm sure within your product organization you have horizontal and vertical product groups, with platform and consumer-facing product groups. So ML practitioners are embedded within the vertical product groups, and at the same time you also have a central team. So I'm not saying it's either-or; a lot of organizations do a mix of A and B. It gives you the best of both worlds: a centralized arm as well as a distributed arm.

So this is basically how you identify problems in this area. And how do data scientists identify problems? Through statistical analysis, by testing things out: OK, this is one use case which is currently being done by a human, or not being done very well, but if we can automate it, this is the amount of revenue we can expect to see — we'll look at a rough sketch of that kind of sizing in a moment. So it all starts like your basic experimentation deployment, with a value case, with a statement of purpose: if A, then B.

Cool. So let's move to the second part, problem implementation. Once a viable use case is selected, as we discussed in the first part of this presentation, it's then upon the machine learning team to actually go ahead and implement a viable model.
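Before we dive into implementation, here is a minimal back-of-the-envelope sketch of that "if A, then B" value case. Every number in it is a hypothetical assumption of mine — nothing here comes from a real pitch — it only shows the shape of the statement of purpose a data scientist might put forward.

```python
# A hypothetical "if A, then B" opportunity-sizing sketch.
# All figures are made-up assumptions for illustration only.
manual_tickets_per_month = 40_000     # work currently handled by humans
cost_per_manual_ticket = 1.20         # fully loaded cost per ticket, in dollars
expected_automation_rate = 0.35       # "if A": the model deflects 35% of tickets
model_error_cost = 5_000              # monthly cost of escalations from wrong answers

monthly_saving = manual_tickets_per_month * expected_automation_rate * cost_per_manual_ticket
net_monthly_impact = monthly_saving - model_error_cost

print(f"hypothesis: if we automate {expected_automation_rate:.0%} of tickets,")
print(f"then net impact is about ${net_monthly_impact:,.0f} per month")
```

The point is only that the use case enters the pipeline as a testable statement with a number attached, not as "machine learning will be a game changer."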
What you do is — and please remember the assumption I called out, that the data we have is well-labeled, non-biased data — you now go ahead and build a model. This is where TensorFlow and a lot of other model-building tools come into the picture. You are actually sitting down and asking your machine learning practitioners to come up with a model, a model that is self-learning. Once you have the model, you feed it labeled data, and you feed it by the billions — say five billion, six billion data points. There's actually a science behind how you go about this, and that's why you need the machine learning tools, the declarative layer. And as you feed the data, you observe: your machine learning practitioners are watching how much the variance is, and when the model is, say, 80% ready, 90% ready, 99% ready. These kinds of things can be outsourced to a highly specialized and impartial team, or can be done centrally by a highly specialized team as well. In the previous example, when we discussed the central machine learning team — if you have one, this is mostly what they do. This includes building new learning models. A lot of models are available online right now — for personalization and a lot of these standard vanilla use cases, the bare basic model is available online — but for each of our organizations, the model has to be modified. So at this stage, this team modifies that model. All of those models are still based on the very standard learning methods we have in the industry: regression, binary and multi-class classification, clustering, et cetera. But what you will find is that, increasingly, a lot of organizations are bringing in new learning methods, so this ecosystem increasingly becomes a center of excellence. You will have a lot of PhDs involved here, a lot of scientists and researchers coming in and trying to prove their hypotheses. It's a very rich area, but frankly there is not much for a PM to do here.

So now we have a model ready, we've fed it data, and we are relatively confident that the model is ready to go live and show what it can do in production. Now you get into the third part: impact assessment. Once a model is built and cleared for live deployment — you've assessed that this is a viable use case for production deployment — you deploy it. And how do you deploy it? You deploy via specific experimentation infra, because you want to observe the efficacy of the product. One very important differentiation you will see — and it's maybe a shortcoming of current experimentation tools — is that when you're deploying a machine learning model, you're deploying a multivariate change. It's not a single-variable or two-variable change; it's multivariate. So you need a very specific kind of experimentation infra. And once you have run the test for a certain amount of time, the reports — like any outcome of an experimentation tool — have to be peer reviewed. You have to take everything with a pinch of salt. Only when you find a dedicated p-value of less than 0.05 and an impact that is very specific and measurable can you actually go ahead and disseminate this information and say, OK, this is the impact my model brought in.
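Here is a minimal sketch, under heavy assumptions, of the two technical steps just described: training a simple self-learning classifier in TensorFlow/Keras on labeled (here synthetic, made-up) data while watching how "ready" it is on a held-out set, and then checking whether a hypothetical live experiment's lift clears the p < 0.05 bar. None of the data, features, or numbers come from a real deployment.

```python
# A minimal sketch, not a production pipeline. Assumes well-labeled, unbiased data.
import numpy as np
import tensorflow as tf
from scipy.stats import norm

# --- (1) Build and train the model on labeled data -------------------------
rng = np.random.default_rng(7)
X = rng.normal(size=(50_000, 20)).astype("float32")   # stand-in for billions of labeled points
y = (X[:, :5].sum(axis=1) > 0).astype("float32")       # synthetic label

X_train, y_train = X[:40_000], y[:40_000]
X_val, y_val = X[40_000:], y[40_000:]

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=3, batch_size=256,
          validation_data=(X_val, y_val), verbose=2)

_, val_acc = model.evaluate(X_val, y_val, verbose=0)
print(f"held-out accuracy: {val_acc:.1%}")   # is the model '80% ready', '90% ready'?

# --- (2) After live deployment: is the observed lift significant? ----------
# Hypothetical experiment readout: conversions / users in control vs. treatment.
control_conv, control_n = 4_910, 100_000
treat_conv, treat_n = 5_180, 100_000

p1, p2 = control_conv / control_n, treat_conv / treat_n
p_pool = (control_conv + treat_conv) / (control_n + treat_n)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / treat_n))
z = (p2 - p1) / se
p_value = 2 * norm.sf(abs(z))                # two-sided two-proportion z-test
print(f"lift: {p2 - p1:+.2%}, p-value: {p_value:.4f}")
print("ship it" if p_value < 0.05 else "scrap it / keep iterating")
```

In practice the significance check would come from your experimentation platform rather than a hand-rolled z-test; the point is only that the ship-or-scrap call should rest on a measured, peer-reviewable number.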
I think that is where the real value lies. Again, as we discussed, this part can be outsourced or can be done by a highly specialized and impartial team as well. But this is a very important part of the rollout. A lot of teams in the industry do parts one and two and do them very well — one and two are standard deployment, and developers and machine learning practitioners can do it. But being very thorough, being very dispassionate about part three, the point we just discussed, is something that is often not done well in the industry. Especially after practitioners have spent a lot of time on one and two, there's a lot of recency bias: people say, no, no, this model is working fine, it's the data that's not doing well. So a very impartial person has to sit here, somebody who can say, OK, scrap it, this model is not working. Only then will you see value in having machine learning people; otherwise it's as good as wasted development hours.

Good, so maybe we can discuss a few use cases. You see a lot of no-code and automated transactional coding. I am a big fan of DeepMind, so maybe go and read their website. Basically, they've come up with a machine learning model which can handle a certain amount of the transactional parts of coding — a model doing competitive coding better than 50% of the human participants, which is awesome. Just imagine the future: of the coding that software developers do, a large chunk is just housekeeping. If we can automate that housekeeping with software, imagine how much more impactful a software engineer would be.

The second one is automating "read the FAQ manual." Say, in a help center environment, you have tons of help center documents, but you want to prioritize the documents that you see people struggling with the most. So automatically giving higher weighting to the more useful support articles is also a very nice use case. And so on and so forth — the industry applications for machine learning are endless. You see vision-based applications, automation, tooling, biology — again, defaulting to DeepMind, the kind of work they are doing is simply amazing.

Then again, this is something I had to spend a dedicated slide on, because I don't want either of us to forget about it: the machine learning ecosystem is nothing without its monitoring and observability layer. Monitoring and observability become all the more important so that the machine learning output is not garbage — garbage in, garbage out is a term that's used a lot — and a concerted effort should be put in here. If you're spending 80% of your time getting a machine learning model ready, I would spend just as much again making sure your monitoring is correct. Because what you see is what you get: if you're not even able to measure and observe what you are looking to deploy, it's as good as a black box.

So what are some of the trend lines we discussed today? Firstly, the strong trend lines. The applicative layer is vast, enormous, touching everything under the sun. I'm a big proponent of machine learning; I think it's going nowhere but up, only getting better and better. The kinds of applications the machine learning ecosystem is going to solve are beyond anything we can even imagine right now:
life sciences, self-driving cars, vision, et cetera, et cetera. Second, the data ecosystem is the bedrock. Everything that is done in the machine learning applicative layer cannot be done without the data ecosystem — without the initial three or four layers we discussed being in place, you are not going to get anything done in this last layer. So as much importance should be given to those layers as to the results layer. And, as we discussed last time as well, I don't see much for a PM to do in the applicative layer. A PM drives a lot in tooling and platforms, so you will see a lot of strong PM growth in the ML platform layer, the ML tooling layer, the data science layer — that is where you enable these use cases to happen. The applicative layer will be largely research-driven. Of course, I'm happy to be proven wrong and convinced otherwise — I'm always a LinkedIn ping away, and the answer always has to be statistically significant.

And that's me. I hope I didn't give you another death-by-PowerPoint. If you have any concerns, any issues, you can always reach out; I'm available on LinkedIn. Thank you so much. Thank you for your time. Cheers.