So now that it's 6 p.m., let's formally get started. Thank you everyone for joining. I don't know the numbers of people on the various platforms, but thank you for making time. It's Wednesday evening, IST, and God knows where you are. Today's topic is part of this series of meetups; this is the second in the series. The overall series is called Making Data Science Work, where the focus is actually putting data science to work, putting things into production, getting ROI from your data science initiatives, as opposed to it being exploratory. And today's session hits the nail squarely on the head, because the theme is the productionization of ML models. We searched high and low for some people that we thought were credible, people who have faced the firing line and lived to tell the tale. So today with us we've got Nishal and Aditya. Nishal works for a company called Omnius. Aditya currently works on Glance at InMobi, but there are updates on both fronts. On Nishal's side, I would love for him to tell us a little bit about what Omnius does and his role there. And similarly, Aditya, what is the flux that's going on? Once we have a few of those details in, we'll dive into the meat and potatoes of what it takes to put models into production. How does one go from exploratory work, and how does one feel the sweaty palms that are involved in actually taking these models to production? So Nishal, if you wouldn't mind, I'd love to start with you. If you could tell us a little bit about your journey in brief and where you are today at Omnius. You're based in Berlin and you're joining us from there. Please tell us a little bit more.

Okay, thank you, Indra. I've definitely lived for a while now with deploying models. I don't know how long that's going to sustain; I hope for quite some time. I'm Nishal, I'm the VP of Technology at Omnius, and thank you to Scribble Data for having me. Omnius is a Berlin-based AI company that's building products in the insurance industry for claims automation. The insurance industry in general is a different type of industry, where a lot of the interaction happens in forms that have not evolved to the extent that you can assume in other industries. And of course, insurers are rethinking their entire claims handling process, and we are probably one of the foremost enablers in this industry in making that happen.

Indra, you're on mute. Yeah, I'm on mute. Damn it. Aditya, I wish you could tell me this. I know, I know. But it does though, it does. Aditya, if you could tell us all a little bit about your journey, where you are at Glance, and what's happening next.

So in terms of my journey, I've been working in data science for almost 10 years now in the industry. It's been interesting to see the evolution of the models themselves playing a role in actual products. It's been very fascinating for me to see the scale, and how ML productionization has evolved, from when we were fetching data from DBs to actually building these microservices. So it's been pretty impressive in the last few years how quickly the data science field has evolved. Because the fact is, data science in industry is a very nascent field, right? It's evolving, and we are still in that place where we are not as mature as software engineering. But yeah, in terms of my own journey, I was working with Glance till last week, and I just recently moved and joined the data science team at MPL, Mobile Premier League Gaming. So it's gonna be interesting.
So yeah, that's all. It is going to be interesting. MPL is, I think, a forerunner in mobile gaming out of India. So it's very interesting to hear how things might shape up for you there.

Just a little bit about Venkata and me: we are the hosts, and for those of you who are joining us for the first time, as in you didn't attend our last session, we are from Scribble Data. We are an ML engineering product company. Our primary offering is a feature store which, for all of those interested in putting ML models into production, forms a key part of the ML infra that helps organizations not just put multiple models into production, but also think through the aspects of robustness and trust in the data sets that underlie the training of those models. So that is really our interest in this space, and over the course of our existence as a company we've had the opportunity to speak to various practitioners, and we love the themes that come up. By organizing this meetup in collaboration with Hasgeek, it is our way of addressing, from other vantage points, some of the questions that our own customers throw at us about what it is that gives returns on investment in data science initiatives.

So if I can bring the conversation back to the original theme we were talking about. We were saying that despite growing teams and investments and budgets, very few machine learning models finally reach the production stage. So what is productionization and what makes it difficult? And there's this new term that we're all grappling with, which is MLOps. If you can talk through a little bit about what, in your opinion, is the definition of ML productionization and why it is difficult, that would be a good jumping off point for the rest of the conversation. Either of you can take this. Maybe a little bit about the kinds of models that you deal with. Yeah.

Yeah, I mean, Aditya, do you want to go first or should I? Go for it. Okay, cool. So for us, and I'm really talking from the perspective of what we do at Omnius itself rather than something that's very general, we have a bunch of issues taking things into production. The first being that we are not a SaaS-based company. When you're working in the insurance industry and you're trying to build software for that industry, the expectation is that you're able to deploy it on premise or on their private cloud, which you have absolutely no access to. Which means that if you make a mistake, you have to go through a whole release cycle before it actually makes it through to their production environments. That makes it harder than general software development itself. The next level of difficulty for us is of course the privacy of data. And I think this is true for a lot of data science domains: you do not have enough data to train on, and whatever data you do get to train on is not really truly representative of any part of the population. I mean, the sample sizes are so small that you make a lot of assumptions. The only way we know our assumptions are invalid is when these models are actually running in pilot stages at our customers' side. And then that's a business issue, because the expectation a lot of these companies have is that these models work out of the box, right? And that's not truly the case, because we can't really train models on one customer's data and transfer them from customer X to customer Y.
So when we go through the entire process of educating around annotation of data, as we deal mostly with text data on documents, then bringing machine learning models to train on this data and use it for prediction, everything can break. From the very beginning step of identifying what the schema of the data model is, all the way through deployment, after the model has been trained, to something that's in production that they're using for actual predictions. The complexity of this is hard to solve because you have no idea where or what kinds of problems could arise. Is the problem in the data? Is the problem in the fact that there's been a drift in the data from what you used for a pilot to what's being used in production? You also have model decay because of the data drift, or basically even concept drift itself, because the labels that your model's been trained on and what you're actually looking for in production might be completely different. And given that there are other underlying systems that are using your AI systems for automation or augmentation, you bring in a lot of bottlenecks if your systems start to malfunction. That's, in short, some of the challenges that we are facing, and through the course of the discussion today I can probably go a little deeper into some parts of it. Yeah.

Aditya, if you want to add something to that, feel free to jump in. Else, I have a couple more questions.

Yeah, I think the primary reason why there is this new term going around, MLOps, is what Nishal was saying, right? The biggest component of machine learning models is the data part. And that is something which traditional DevOps systems were not designed for. So you have to take into account the uncertainty around data, because, as Nishal was saying, it can decay and your model performance can go down, and you need to have proper data versioning around it so that you can catch feature drifts. So yeah, I just wanted to add that.

Okay. So, but you... Sorry, I had just one quick question, Venkata. Nishal, you were talking about the various points in the development lifecycle where things could go off the rails. When you finally decide that things have been on the rails enough that a model is ready for production deployment, how do you make that judgment call? In your testing mindset, what do you look for? Can you talk about it at a high level?

Yeah, I mean, sure. In general, when you're doing data science itself, you're almost always inferring on a validation set. And the problem with just looking at one model is that in production, you're not really just using one machine learning model. You have a chain of models that are connected together to solve a task. So you have to change the way you look at how you're evaluating machine learning models itself: you're mostly looking at end-to-end performance rather than how good one model is compared to its previous run. Which means that you have to look at the impact on the task that you're trying to automate or solve, through the course of the journey it takes across the different models. For that, you're looking at end-to-end performance. You have tolerance levels for the different metrics that come together, based on how much error or how much risk you can actually take and the impact that it can possibly have.
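To make Nishal's point a little more concrete, here is a minimal sketch of what evaluating a chain of models end-to-end against per-metric tolerance levels might look like. The stage structure, metric names, and thresholds are assumptions for illustration, not the actual Omnius pipeline.

```python
# Hypothetical sketch: evaluate a chain of models end-to-end against
# per-metric tolerance levels, rather than comparing each model to its
# previous run in isolation. Names, stages and thresholds are illustrative.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Stage:
    name: str
    predict: Callable  # takes the previous stage's output, returns its own

def run_pipeline(stages: List[Stage], document):
    """Pass one input document through every model in the chain."""
    output = document
    for stage in stages:
        output = stage.predict(output)
    return output

def evaluate_end_to_end(stages, labelled_docs, metrics, tolerances):
    """Compare final pipeline outputs against labels and check tolerances.

    labelled_docs: list of (document, label) pairs
    metrics: dict of name -> fn(y_true, y_pred) returning a float
    tolerances: dict of name -> minimum acceptable value
    """
    y_true = [label for _, label in labelled_docs]
    y_pred = [run_pipeline(stages, doc) for doc, _ in labelled_docs]

    results, ok = {}, True
    for name, fn in metrics.items():
        score = fn(y_true, y_pred)
        results[name] = score
        if score < tolerances[name]:
            ok = False  # this release would breach the agreed risk level
    return ok, results
```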
So that's kind of the thought process one normally uses when trying to automate tasks or solve a bigger problem with a chain of machine learning models in place.

Yeah, wonderful. Actually, this points to another theme that we see, which is that by the time an organization gets to that point of maturity where they can see the opportunity to put one machine learning model into production, it's almost inevitable that they actually have ideas for 10 more, or 10 more buns in the oven, so to speak. So it's never just one machine learning model and we're done. That organization has many more opportunities, challenges, and eventually ideas as well. So Venkat, you were going to ask something. Do you want to take up one of the questions from the audience?

Yeah, before we get there, there are a couple of questions around versioning and explainability that will come. Sorry, I cut in. Venkat, please go on. Yeah, so before that, see, given all the uncertainty associated with productionization of models, how do you think about the planning around the process itself? In how many different ways could it go wrong, and how are you going to recover, especially when it is at arm's length? In Nishal's case, you can't even access the models. They're all offline, right? Behind the firewalls of the customer. In the case of Glance, for example, the scale at which it reaches, the millions of people it touches, means that recovery from errors will also be so much harder. How do you think about the risk management around these models?

So in terms of Glance, I can give an example. In general, I think the first thing which we have to evaluate is how much you are actually replicating your test environment, right? In terms of Glance, it's about whether you can do a proper feed recommendation. So how much is your actual training data set a replication of your online data set? Because one of the things we always have to understand is that online, there is a feedback loop associated with it as well. The model is making a prediction and you're going to be training on that prediction, which doesn't happen in your training data set. So, as a checklist, the way I look at it is: A, how close is my offline data set to my online data set, like how quickly am I able to replicate that? And B, the infra requirements, right? Any model, when you're shipping it to production, you need to understand which infra environment it's going into, right? And C, whether this model is giving you any critical gain, if it's a very new model, right? For example, say I'm building a neural network model which has 30 layers. Now, to productionize this model will take me a lot more effort compared to productionizing a logistic regression model. And at this point of time, you have to ask yourself whether that gain, whether it be 2% or 10%, is impactful enough for the product itself. Because if it's not, then you should be looking into the simpler model. It's easier to debug, it's easier to productionize. So that's how I'll be going ahead with that. That would be my solution essentially.

I mean, for us, in general, it's a little different sort of a challenge. And this is one of the reasons why we can't avoid the risk, and I'm being very frank about this: there's no way for us to actually say with 100% conviction that everything's going to work. So we are taking risks when we're doing this, but we make the customers aware of this.
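Aditya's first checklist item, how close the offline training data is to the online data, is often quantified per feature. Below is a rough, hypothetical sketch using the population stability index; the function names and the 0.1/0.25 cut-offs are conventional rules of thumb assumed for illustration, not anything the speakers prescribe.

```python
# Rough illustration: how closely does the offline training data mirror what
# the model sees online? One common proxy is the population stability index
# (PSI) per feature. train_df / online_df are pandas DataFrames with the
# same numeric columns; thresholds are conventional, not prescribed here.
import numpy as np

def psi(train_col, online_col, bins=10, eps=1e-6):
    """Population stability index between two samples of one feature."""
    edges = np.histogram_bin_edges(train_col, bins=bins)
    train_frac = np.histogram(train_col, bins=edges)[0] / len(train_col)
    online_frac = np.histogram(online_col, bins=edges)[0] / len(online_col)
    train_frac = np.clip(train_frac, eps, None)
    online_frac = np.clip(online_frac, eps, None)
    return float(np.sum((online_frac - train_frac) * np.log(online_frac / train_frac)))

def drift_report(train_df, online_df):
    """Flag features whose offline/online distributions have diverged."""
    report = {}
    for col in train_df.columns:
        value = psi(train_df[col].to_numpy(), online_df[col].to_numpy())
        label = ("stable" if value < 0.1 else
                 "moderate drift" if value < 0.25 else "major drift")
        report[col] = (label, value)
    return report
```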
So we spend some time educating around the risks and what the potential problems could be. We talk to them about understanding machine learning confidences. So we really train them: when they get predictions, either in their downstream system or in an application where they can look at and visualize our predictions, we have documentation, and we go through different iterations of telling our customers what a confidence is, what the difference is between prediction accuracy and confidence, how you can make use of it, why training data is very important, and how you keep consistent training data. Which leads us to the point that you can't build all of these processes in the beginning. There are certain bridges that you burn. There are certain situations you land yourself in, from which you rise and build the processes that are important and relevant, because it's impossible to identify everything in the beginning, and it also doesn't make sense to build everything from the beginning. I mean, you're a startup trying to solve a problem, so you're really looking at: make it work, make it right, and make it fast. If we didn't embrace that sort of mindset across our entire engineering and product team, we could be building an MLOps platform for the next five years, or even ten, because there are so many conditions and so many different parameters that can go in, but it could then end up having zero impact on our customer. So it's definitely a trade-off.

Yes, but if you have to go from make it work to make it accurate, right, you need a feedback loop. So Keith's question was: let's say the model is behind the firewall and it has failed, for some definition of fail. How do you extract enough information from the context, the weights of the intermediate layers and so on, to be able to actually do it right the next iteration?

I mean, Venkata, I think one of the most important things to understand with on-premise models is that you're more or less providing an ecosystem for the customers to train and use the model there. Most of the time, especially when you're dealing with private information, whether you're working in the hospital space or insurance or finance, you might not have the capability to actually bring these weights back. And it's not really just about the weights, right? It's a combination of your model and the data it has been trained with. At this point in time, we've built that capability into the platform itself, the MLOps platform that the customers can use to retrain. So our feedback loop is that you train on a small set of data, which is not really representative because the sample is so small, but the machine learning model can start predicting. So the amount of time you spend on doing annotations reduces, because the model predicts. Let's assume the model is even 50% accurate, right? It predicts 100 fields, but 50 are right and 50 are wrong. You still reduce the time in annotation. So whatever is correct is left, whatever is erroneous is fixed, and that is the feedback loop to train the next model. And you always keep an inference set, and you provide the metrics so that people can decide, based on the metrics, if they want to publish the newly trained model to production. And this is, I think, the point Aditya was trying to touch on when he said MLOps is very important, because it's not really just about the machine learning model or the operations or the data.
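As an aside, a minimal sketch of the publish-or-not decision Nishal describes, assuming a fixed held-out inference set and a simple rule on top of the metrics; the metric choice, the margin, and the model interface are hypothetical, not the actual Omnius mechanism.

```python
# A minimal sketch of a retrain report: score both the current production
# model and the newly trained candidate on a fixed held-out inference set,
# and let a human decide (or a rule pre-approve) whether the candidate is
# published. Metric names and the margin are illustrative assumptions.
from sklearn.metrics import accuracy_score, f1_score

def promotion_report(prod_model, candidate_model, X_infer, y_infer, margin=0.01):
    report = {}
    for name, model in [("production", prod_model), ("candidate", candidate_model)]:
        preds = model.predict(X_infer)
        report[name] = {
            "accuracy": accuracy_score(y_infer, preds),
            "f1_macro": f1_score(y_infer, preds, average="macro"),
        }
    # Pre-approve only if the candidate is at least `margin` better on f1;
    # otherwise surface the numbers and leave the call to a human.
    gain = report["candidate"]["f1_macro"] - report["production"]["f1_macro"]
    report["recommendation"] = "publish" if gain >= margin else "review manually"
    return report
```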
It's like an intersection of these, where you have the feedback loop coming in and you have to provide mechanisms to reduce your risk around data versioning and model versioning, and have reports generated on top of this so that it can be reproduced and decisions can be taken based on it. So if the new model that goes into production is worse, you have an immediate mechanism to at least roll back to, maybe not the best model, but at least a better model. And those are checks and balances that you have to build into your system.

Okay. Yeah, Nishal, I have a question. So when you're doing these on-premise deployments, how do you isolate your engineering work from the modeling work?

Yeah, I mean, it is quite hard for us to differentiate between them, because there are things that can go wrong on the engineering side and there are things that can go wrong on the modeling side as well. So we've built in, for example, logging mechanisms for both, right? And for the inferencing as well, we've created two ecosystems. So when you deploy our entire application on a customer's landscape, you have a differentiation, or an abstraction, where you're just staging all your data, annotating it and training, and this is not really used for prediction. And then for whatever you're using for prediction, you're building capability into the platform to analyze the quality of the prediction and to analyze the quality of validation and training. So you're trying to build these things there, and of course things can go wrong, leading to tickets being created at the service desk, where it is on us to infer whether it's an engineering issue or a model issue and how we go about it. It's a very thin line, and it's actually quite hard to educate anybody around this, as sometimes it's even confusing for us whether it's a data pipeline that's going wrong, a post-processing step that's going wrong, or pre-processing. So of course logs help us, but it's hard. I don't have a definite answer, as it changes from use case to use case for different customers.

Which is fine. I think one of the promises we made to our listeners during our previous session was that we'd be bringing them war stories, not just here's what you should do, here's the prescriptive PowerPoint, do this and you'll be perfectly okay. It's good to know that there is this uncertainty even at the cutting edge of being able to deploy these models with some amount of confidence. When I say cutting edge, I mean people who are willing to stake revenue-impacting, bottom-line-impacting decisions or outcomes on these models. Even there, there is this uncertainty, because the truth is that despite all the hype, it is only now that we're getting to that reasonable state of maturity where we're actually able to put these things into operation. So I really appreciated that answer. And in fact, this brings me to two questions for Aditya. Aditya, one is if you could help concoct for us a definition of MLOps so that we are all speaking the same language. And two is that business about not releasing on Fridays. I mean, I think people will figure it out very easily, but you can talk a little bit about why not, what can go wrong, and what's a better practice.

So I can give you my definition and then I think I'll let the group come to a definition itself. So for me, the way I look at MLOps is: MLOps is essentially the ecosystem around the model.
And that can be on the pre-processing side, it can be on the post-processing side, and it is actually serving the model itself. So everything and anything which encompasses a model is MLOps for me. That is my definition in short.

In terms of Friday releases, I would say: if you don't want your weekend to be ruined, please don't release a model on a Friday. I've had issues when we deployed an amazing model on a Friday, where all our offline numbers were looking great, we were bullish about this model, and we wanted to get the experimentation started early. And it just went horribly wrong. In that case, it was because of a bug in the engineering. On a Friday, you sometimes have a bit less attention span as well. So we had an engineering bug and we actually had to come back on a Saturday and fix things. But in general, I obviously avoid it, because you're taking on the responsibility of a production environment, right? Even an Azure outage, even a cloud outage, can make your life miserable on a Friday. So that's my story.

So what period of time is that golden window for which, if things run okay, you're like, all right, my weekend, or at least my night, whatever the next night is, is going to be reasonably okay? And what do you look for? What is your mental checklist looking for beyond the usual metrics? How do you look at your model performance, and for how long?

So I think a couple of things, right? A, all your offline metrics should be good to go. I think that is the primary thing. B, your infrastructure should be ready for your model to be deployed, so that you can productionize the model as smoothly as possible. And C, you also need to understand what impact this model is going to have. For example, if it's a revenue-generating model, I would be very, very conservative in putting it out, even if the offline numbers are really good, because I want to study it and think about it before putting the model in production. And if it's a model which is not impacting revenue, it's lower risk, and I can take an aggressive strategy and push it once or twice every week, right? I can do that. So I can push on a Thursday and a Monday and do an A/B together. So that has been my strategy. The best day is Wednesday, by the way, just putting it out there.

Thank you. Did you push something in today? You know, you're done. But the models take different amounts of time to settle down, right? In some sense, there is always some fine-tuning to be done. And you also have to think about refreshes. There was a question about when you refresh a model, right? I mean, how do you understand the performance of the model, and what instrumentation do you have to monitor it?

So I'll take it in two parts, right? The first question is, how do you handle refreshes? And the second question is, how do you monitor the health of that model? Yeah, so at least I haven't been able to find a formula for figuring out the right refresh cycle for a model. And that's where a healthy MLOps ecosystem helps you out, because what you can do is check your metrics, how they are performing in the initial days and how they become stale. And you also check your input, your data feature distribution, right?
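As a rough illustration of that refresh-cycle monitoring (not a formula the speakers endorse), one might track an online metric day by day after each deployment and read off how long the model takes to go stale; the metric, tolerance, and numbers below are made up.

```python
# Hedged sketch: watch how a model's online metric decays after deployment
# and use the observed decay to pick a retraining cadence. The metric,
# window and threshold are illustrative assumptions, not a prescribed rule.
from datetime import date, timedelta

def days_until_stale(daily_metric, decay_tolerance=0.05):
    """daily_metric: list of (date, value) since the model went live.
    Returns how many days it took for the metric to drop more than
    `decay_tolerance` (relative) below its launch value, or None."""
    if not daily_metric:
        return None
    launch_value = daily_metric[0][1]
    for day, value in daily_metric:
        if value < launch_value * (1 - decay_tolerance):
            return (day - daily_metric[0][0]).days
    return None

# Toy example: if the model typically goes stale in ~9 days, retrain weekly.
history = [(date(2020, 6, 1) + timedelta(days=i), 0.80 - 0.005 * i) for i in range(14)]
print(days_until_stale(history))  # -> 9 with these toy numbers
```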
Because as your model gets older, the feature distribution will start becoming out of sample for the model. So at least in my experience, I've seen it becomes more of an art: you keep on monitoring it so that you can figure out the most optimal refresh cycle. It can range from once a month to once every three days, depending on how quickly the data is changing. But yeah, usually the way I have done it is, in the beginning, I try to train it every day, even if it's costing me money to retrain the model again and again. But what I also try to notice is how much the features change during the course of a week or a month, and based on that, I will change my refresh cycle.

There is a related question, which is that you will eventually realize that your model is wrong, that it's already degraded beyond a certain point. How do you handle the transitions between model versions, and also the downtime associated with models? Sorry, I had an interruption.

So I think the biggest thing you need to understand is that you always have to divide. The way I look at it is, I try to track a very simple metric: the number of predicted labels divided by the number of total labels in my test set. And I call it calibration, right? The calibration of the data set. So I'll look at that, and I try to see how much it is changing from my training data set. So I'll have that guiding metric, which I have to keep on monitoring to see if things are going really wrong. Because if I'm not able to make predictions, if that ratio is not consistent, then I have to pull back. In that case, if it's a model which has just been released, I'll try to go back to the base model, where I know that the ratio is correct. So that's how I usually do it.

Nishal, I believe that in your case the client's data scientists manage the migration after looking at the metrics?

I mean, it might not really be a data scientist. The thing that we are trying to do at Omnius is that we're not trying to make this exclusively operable only by data scientists. The idea for us is that data science is a part of the system, and it's the entire product, powered by those data science capabilities, that should support our customers. If we expect a data scientist at the customer to handle this, then we are in a difficult situation, because then data scientists have more questions than just migrating a model from one ecosystem to the other, right? But what we are doing is we have an inference set, both during training and during prediction as well. So for us, if we see the confidence of the model going down on the inference set, compared to how it was previously predicting for a task, that's actually a pointer for us to say either there has been a data drift, that the data the model has been trained on and the inference set we have are actually quite different, and we then have to talk to the customers themselves to understand it when they raise questions about why it's happening; or we know that the automation of the tasks is poor, like they would expect 40% or 50% of their entire tasks to be automated at a certain tolerance level, and they see that it's just 10 to 15% and everything else is back to their manual workforce. That's definitely an alarm for us.
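A hypothetical rendering of that guardrail: compare the live calibration ratio against the training-time ratio and pull back to the base model when they diverge. The tolerance value and the abstention convention are assumptions for illustration.

```python
# Sketch of the "calibration" guardrail described above: the fraction of
# inputs for which the model actually emits a label, compared against the
# ratio observed on the training set. The 10% tolerance is an assumption.
def calibration_ratio(predictions, abstain_value=None):
    """Fraction of requests for which the model actually emitted a label."""
    labelled = [p for p in predictions if p != abstain_value]
    return len(labelled) / max(len(predictions), 1)

def choose_model(live_predictions, training_ratio, new_model, base_model,
                 tolerance=0.10):
    live_ratio = calibration_ratio(live_predictions)
    if abs(live_ratio - training_ratio) > tolerance:
        # Ratio is no longer consistent with what we saw offline:
        # pull back to the base model whose behaviour we trust.
        return base_model, "rolled back"
    return new_model, "kept"
```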
And at that point in time, we actually have to go to our customer's site, because we have to get into their landscape. They give us access to a machine that's in one room, and then we're looking at the data, what the model is predicting, the confidence values, and skimming over all the configurations that we have. And then we come back with some idea of what improvements we can push, depending on what needs to be done and how quickly it can be done. And sometimes we might not have an answer, and we have to really go back to the drawing board and see what we can do.

I have a couple of questions. One of them is from one of our audience members, about how to handle the model retraining downtime. This is from Chaitanya. And potentially related to that: imagine you've just put a model into production and something is not going right. My question to you would be, at that point, what are the fail-safes that you have? Is it a function of just taking it offline, or do you have the previous version lined up, ready to go, with stakeholder management also in place, so that that is the way you will default? How do you think through that?

I mean, I think one of the things I would really like to bring up here is that this is not actually new. When the entire advent of business analytics and data warehousing and ETL pipelines and all of this came into place, this was the same problem that they tried to address. Do you want to do ETL on a transactional database with live traffic coming in, or do you want to take a different approach? And if you think along those lines, your transactional system is running as is. You never disrupt a transactional system. You let it run as it is, and you build ETL pipelines to bring all of the data from the transactional system to an analytics data warehouse. And then you do whatever you do on top of it. You use MapReduce, Spark jobs, whatever you have, right? In the same way, once that evolved, the next question was: do you want to do real-time analytics along with batch analytics? Which is how the Lambda architecture evolved, where you could do both at the same point in time.

Now, you can think about models in the same way. So you can have a system where the training runs independently and does not touch the prediction system that's running in production, right? And during a downtime, when the number of customers you have is the least, or during an actual scheduled maintenance downtime, you could, if you let your customers know beforehand, especially in an enterprise world, which you can, bring the system down and start routing, start updating the model. Of course, one of the better approaches for managing this is also what the software industry has already done, which is A/B testing. So you can route traffic, but not entirely to the latest model, and slowly increase that foothold. You have service meshes to do this right now. So if you're running everything on something like Kubernetes with Istio, or even if you have very simple rules on nginx, you can route your traffic accordingly. So you don't necessarily route 100% immediately if you're not sure; you slowly start bringing that up. Yes, you do have the potential of hotswaps, where you change the model on the fly. But you always build a failsafe mechanism.
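To make that concrete, here is a toy, hedged sketch of such a failsafe in application code. In practice the traffic weights would live in Istio or nginx configuration as Nishal says; the 10% starting weight and 3% failure threshold below are assumptions, not his numbers.

```python
# Toy illustration of a canary rollout with an automatic failsafe: route only
# a slice of traffic to the new model and roll back if too many of those
# requests fail. Weights and the failure threshold are assumptions.
import random

class CanaryRouter:
    def __init__(self, old_model, new_model, new_weight=0.1, failure_threshold=0.03):
        self.old_model, self.new_model = old_model, new_model
        self.new_weight = new_weight          # start small, raise gradually
        self.failure_threshold = failure_threshold
        self.new_requests, self.new_failures = 0, 0

    def predict(self, request):
        use_new = self.new_weight > 0 and random.random() < self.new_weight
        model = self.new_model if use_new else self.old_model
        try:
            result = model.predict(request)
            if use_new:
                self.new_requests += 1
            return result
        except Exception:
            if use_new:
                self.new_requests += 1
                self.new_failures += 1
                if (self.new_requests >= 100 and
                        self.new_failures / self.new_requests > self.failure_threshold):
                    self.new_weight = 0.0     # shift all traffic back to the old model
            # Failsafe: answer the request with the previous model anyway.
            return self.old_model.predict(request)
```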
If the model is too big to be stored in memory, or there's some other malfunction you can have. And trust me, you can have unicode, decode, encode issues when you're bringing things up, because certain components in the pipeline changed with this new model, right? So if you're doing hotswapping, you always build failsafe mechanisms where, if requests are coming in and they're failing, like two or 3% of your requests fail, you immediately roll back to the previous one. So you keep the other service up and running with no traffic coming in. And if everything's okay, you're fine. If not, you shift the traffic accordingly.

I remember, Aditya, when we were discussing some months back, that the slow incremental rollout may not be an option. Sometimes you have to have a step function, as you have to cross a certain threshold in order to be able to see a substantial sample of the traffic to run your model on. I was wondering how you take your management along on the risks associated with those kinds of jumps.

So I think it's around the A/B framework which Nishal was talking about, right? A lot of times, before designing the A/B, you have to make sure that the A/B is statistically significant, because otherwise what's going to happen is that most of the time the A/B is going to give you a neutral result. So yes, as a data scientist, you will have to take the responsibility of figuring out what that traffic needs to be. And then you also have to make sure to align the other collaborators, right? As Nishal was saying earlier as well, it's not just a data science model, it's a product which is going out, right? So you need to have everybody aligned on the expectations around it, and why you need to have a bigger A/B in a situation where you want to prove whether the model is working or not. Because otherwise what's going to happen is that you're going to be iterating very slowly, and if you keep on having these A/Bs with neutral results, it's not going to aid in any product development. So I wouldn't call it model development, I would say product development. So yeah, I think these are the initial steps you have to have, like alignment.

A couple of other questions came up, one from Kirti and one from Meeraj, both associated with the rollout of the many versions. Now there is data versioning, there is model versioning, and then there is the Docker versioning, which is the combination of these two. Now you're talking about a large number of combinations, and you have to work through all of these details to even know whether something is specific to this particular model version, this particular data version, this particular Docker build and so on. Do you have standardized practices to be able to cope with all of this complexity? And if bad things happen, Kirti was asking, do you use any kind of explainability capability in the model, or explainability tools like SHAP and so on, to even understand what has happened? How do you manage the evolution with the number of models and versions?

Okay, Aditya, can I take this first? Yeah, okay. Okay, so again, maybe because I started off more in software and then eventually moved to data science over the last eight, nine years: you have to think about the code that generates the model the way you would think about how you manage software itself. You had build configurations before, and they don't change.
So you have different levels of build configuration now, right? Your code has a build configuration that generates a Docker artifact which has the code that can train a model, right? And now, the model that is being trained with this code is not just dependent on the code, it's also dependent on the data. So you have to decide, based on the data that you're working with, what kind of snapshot you have. So if you boil it down to something like a metaphor, like Git, you're having tags for everything. Your data has a tag, your code has a tag, your model has a tag, and all of them put together is basically your experiment. And depending on the problem that you're trying to solve, you build your build configuration taking all of this into account. You have a mechanism to be able to say: I want to look at my report, and the report says this data with this snapshot, which is basically the tag, this code base and this model was the one that generated all of the inferencing that you have, right? And to be honest, this problem is harder to solve because of the size of the data sometimes, and the size of the models that you have, but not really from a conceptual point of view. From a conceptual point of view, you have a lot of frameworks that support this. For example, there's DVC, there's MLflow from Databricks, and there are a bunch of other tools and systems coming together that you can use to make this happen.

So one piece of advice that I would give is this: usually, as data scientists or engineers, we want to build a lot of this cool stuff ourselves, right? When we think about these problems, we want to try and solve them immediately with our own bare hands. Try to avoid that instinct, because I actually did that. When I started working at Omnius and we started thinking about this, I decided to come up with, and I'm saying I and not we because it was literally I, a framework using Mongo and other things with which we could do data versioning and metadata tagging and everything. And I presented this at a conference, and one of the people in the audience was like, have you looked at DVC? Have you looked at something else? Why are you doing this? Are you trying to build something big? They were genuinely interested because they thought Omnius was interested in building a model versioning and data versioning platform. And that opened up a different door, and a thought process around the fact that even though MLOps as a coined word is new, there are lots of companies doing this. So reading their blog posts, reading the open source frameworks that are out there, can help you a lot in understanding how to solve these problems.

The second part of the question, on building explainability into models: it's not that easy. I mean, if you're using deep learning frameworks, building explainability is a research project in itself, and normally companies do not have the bandwidth or the time to do this. Ideally you're standing on the shoulders of giants, right? You're expecting Google, Amazon, Facebook, LinkedIn, Microsoft, Stanford research, MIT, all of these people and more, coming up with research with their PhD candidates, doing primary research and releasing papers on explainability of models and open source frameworks that you can use, and seeing if it makes sense.
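One hedged way to realise that "everything has a tag" idea with the off-the-shelf tooling the speakers name (MLflow here) might look like the sketch below; the tag names, the data-snapshot convention, and the train_fn interface are assumptions for illustration, not Omnius's actual setup.

```python
# Sketch: record the git commit/tag of the training code, the snapshot tag
# of the data, and the resulting model metrics together in MLflow, so a
# later report can name the exact combination that produced a given model.
import subprocess
import mlflow

def current_git_tag_or_commit():
    return subprocess.check_output(
        ["git", "describe", "--tags", "--always"], text=True).strip()

def train_and_register(train_fn, data_snapshot_tag, params):
    """train_fn(params) -> (model, metrics dict); both names are hypothetical."""
    with mlflow.start_run():
        mlflow.set_tag("code_version", current_git_tag_or_commit())
        mlflow.set_tag("data_snapshot", data_snapshot_tag)  # e.g. a Git LFS / DVC tag
        for key, value in params.items():
            mlflow.log_param(key, value)

        model, metrics = train_fn(params)      # user-supplied training routine
        for key, value in metrics.items():
            mlflow.log_metric(key, value)
        return model
```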
You have to keep your eye out, depending on the problem that you're trying to solve. We do have some explainability that we use for computer vision algorithms, where we present a heat map on what...

I think we lost Nishal. Give it a second. While we wait for him, a quick plug. He talked about DVC. So two weeks out, which is on the 17th of this month, we will be talking to the authors of DVC, actually. So stay tuned for that. Yes. And this whole framing, that data science is a software engineering discipline where the lessons learned over the past 30 years can be applied, will be an entire session. Yeah, absolutely. So while Nishal comes back, Aditya, do you want to take a stab at this as well? The question around interpretability of the models. Interpretability, and the management of the versions of the data; especially at your end, you must have very large data sets.

Yeah, so again, I think interpretability of a model depends on the use case, right? For example, when I was working in healthcare, it was a very big issue. And it becomes an important thing because you are essentially making predictions on human data. That becomes a very critical decision, and giving certain insights to the people who are taking a decision based on your model is something you have to provide. So that is the first thing. But at the same time, I think if you're at the point in your journey where you're just building out your first model and putting it in production, interpretability is good, because that way you can not only align the other parts, you know, engineering and product, but you can also have a deeper debugging capability in your own code, right? Like, what is going wrong? What is going right? But as your code and your system start maturing and you move into these higher-level models, I think most of the time you're only going into interpretability if something has definitely gone drastically wrong. And usually you're trying to understand based on your logs, and you also try to understand based on your validation data set.

I have a question, Aditya. You mentioned interpretability comes into the picture if something has gone drastically wrong. I just want you to reflect on this, or rather project outwards maybe. Do you think, in more regulatory-compliance-heavy places, this is going to be more of an issue, where it's not just about whether something has gone drastically wrong, but about audits, for example? Do you think that might play a different role?

I think, more than interpretability, at that point of time the thing which the models will have to capture would be the uncertainty around the model prediction. Like, how confident you are about your own model making a prediction. And I think that is what the regulatory compliances would be asking you about. For example, say you have trained a model for loan selection, right? I think it's not just the explainability; even for you to back your model, it's the uncertainty part, right? And you should not be making a prediction like that if your model itself is uncertain. So I think, A, there would be interpretability, but B, you have to have uncertainty built around it once these regulations start coming in. Yeah.

So Venkat, shall we take a question from the group? One of the... Yeah. So there was a question about DVC; someone is very keen on understanding how you manage versions of your data. We talked about DVC.
Do you layer any processes on top? For example, one of the simplest things is that DVC will allow you to check in the data set, but you have to quality check that data set and prepare it enough for it to be usable weeks and months down the line. So can you dig in a little bit into the data versioning and tell us if you are doing something unique, different, beyond DVC itself or some of these tools? I think Nishal is back. Nishal, do you want to take this question?

Yes. Okay. Sorry guys. Of course I had to drop off. I mean, it had to happen at some point that someone drops off. Okay. Yeah. I mean, for us there are two aspects to the versioning itself. One of the challenges that we face is that the data set we deal with is not numeric or structured data. We have a lot of images, and we have XMLs where we store all of the metadata and the actual contents of the document itself. And what we ended up doing was we actually used Git LFS and put all of our page XMLs in, with different snapshots, basically created simple Git tags on top of them, and used them during experiments. Basically the experiment would check out that particular Git tag and run against it. And that worked great for us, as we're not really on the cloud and we have a mini data center in our office that we've set up. So we run experiments on all of the GPUs and everything there. So for the data transfer between something like a NAS data store and the system where the experiment is running, there's not much latency. That, for us, is a big factor in the way we do data versioning itself.

But in general, data versioning gets more and more complex as the size of the data grows, and sometimes it grows exponentially. When you're checking in all of this data and checking it out, that can take hours together, especially if the data is on the cloud somewhere and you're pulling it to your local infrastructure, or to another system, to run on. It's also very important to even have notes for the tags that are generated. One of the things that I really want to emphasize is that a lot of this thought process cannot come from one single person on the team. If you want to do MLOps, it's a mindset for the entire team. So you need champions from within the team who come up and go through this process, and the pain of doing it, because for us, we have a few data scientists who are very focused on creating instrumentation around: how do you validate a data set? How do you verify a data set? How do you tag a data set? How is it actually being used? So a lot of the ideas that I'm putting forth today are basically championed from within the team. So when you want to do this data versioning part of it, it's not just data engineers or people on the product who can decide this. It's basically everybody who has to come together to say, okay, these are the challenges that we have, and based on that: are we okay with the latency of upload and download? How quickly do we want this to be trained? How are we doing the verification? Is it even important for us to do the verification? These are questions that are answered based on the problem. And depending on that, you choose tools. I mean, we are currently using MLflow for just the model versioning, whereas we use something like Git LFS with tags for the data versioning part of it. So there's data science as a team sport, right?
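Roughly how that Git LFS plus tags workflow could look from an experiment script, as a sketch only: the tag name, paths, and the experiment_fn hook are hypothetical, and it assumes git and git-lfs are installed.

```python
# Sketch: check out the data snapshot named by a git tag, pull the LFS
# objects (page XMLs / images), run the experiment, then restore the
# previous state. Tag name and paths are hypothetical.
import subprocess

def run(*cmd, cwd):
    subprocess.run(cmd, cwd=cwd, check=True)

def with_data_snapshot(data_repo, tag, experiment_fn):
    previous = subprocess.check_output(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"], cwd=data_repo, text=True).strip()
    run("git", "checkout", tag, cwd=data_repo)   # e.g. "claims-pages-v1.3"
    run("git", "lfs", "pull", cwd=data_repo)     # fetch the large files for this tag
    try:
        return experiment_fn(data_repo)          # train against this exact snapshot
    finally:
        run("git", "checkout", previous, cwd=data_repo)
```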
And MLOps as a team sport is an interesting idea. I wish more people would talk about it. I know, Aditya, that you had a team of almost 12 people, and you were dealing with a large data engineering team and also large management teams. InMobi is a beast. So how do you take people along and align the tooling and the processes and the thinking?

I think, internally, as Nishal was saying, it's mostly around education, because not everybody is a champion of these things; everybody wants to build better and cooler models, right? I think that's the cool premise around it. So having internal champions who can actually come up and say why these things are important is really, really critical for the success of a data science team in general. And regarding the external, when I say external, I mean the other pods inside the company, you have to align the other pods, especially the people you work closely with, for example the product teams and the engineering team, and educate them about each and every thing around this journey you're taking. So what we did at InMobi, because it involved both product and business folks, is we actually kept an Excel, the most simplistic tool. We kept an Excel, and that's where we were populating our versions and our journey, because A, it became super easy to share with everybody what was happening, and B, you can catch bugs and you can report them there. So we were doing data versioning using MLflow, but in the end, when we had to open it up to the external folks, it was the Excel which came to the rescue. Because everybody can understand it, it's tabular, you can view it, and somebody else can also debug the issue, right? It's not just you then, because a product guy can come and say, hey, the metrics were really good in Feb, and you did this change in Feb, and in March this thing happened, did something happen to the data? So it actually becomes a very cross-collaborative atmosphere when they get access to all these things, because nowadays everybody has that acumen. It's not about education anymore, why these things are important. So that was a super useful tool for us in the end. Yeah, Indra.

Okay, enough is enough, Aditya. Sorry, I promised I'd say that at some point during this conversation. No, but in all seriousness, the question that I had is... look, guys, I'm sorry, my internet is not stable. Can you hear me? Yes. Yeah, we can, we can. Okay, it's back now. Okay, so the question that I had is this. We talked about it being a team sport. We talked about the kind of attention that goes into validating these models, then putting them into production, the kind of meticulous babysitting that goes into these models once they are in production, bringing the stakeholders along. All of this sounds like immensely expensive activity. When I say expensive, I mean that here is this data scientist who has both been at this organization for a while, understands the domain, understands obviously the techniques, and is able to then build models that show return on investment.
But at the same time, so much of their mental bandwidth is going into nurturing this baby. Which means I want to ask you the question: do you see any opportunity costs right now, maybe today in the middle of 2020, where the same data scientist could have ideally moved on to the next business case? You said that they want to work on cool things, move on to the next one. But the way all of this is structured right now, they do have to pay so much more attention to that first model that they built, or the second model. And in that sense, they're constricted in how much they can actually do for the organization. Am I somewhere in the ballpark?

I mean, I think I speak for both Aditya and myself when I say this: it's not just the data scientist, but everybody in a data science organization. When you're building these things, it's not that what you've done in the past is already in production, so you just move on. There are parts of it that you're always hanging onto. It's like ex-relationships that you have. You carry some parts of it and then you move on to different things, but there are certain things you just probably never let go. So for us, one of the key things that's actually helped us quite well is that we try to change the people who are working on a problem. We explicitly move people around, because when we do the hiring, of course we look at certain specific skill sets that we might need for computer vision or natural language processing or data engineering, but the way we try to do it here is we try to mix it up a little bit. So we form teams where data engineers understand what the model is trying to do and how it can be trained or re-trained, and we have data scientists who are shuffled around as well, based on different problems. Sometimes people who have been on the production line for quite some time, we move them back to a little bit of research and give them some time to breathe, look at new experiments, run a few things, present papers, speak at conferences.

I think what is not spoken about often enough is the mental pressure and the psychological pressure that is put on people who are part of data science and ML teams, right? It's a lot of pressure. It's a lot of stress, because you're changing and questioning the status quo of not just your organization, but of another organization where you're having this impact. So one of the core necessities for people who are leading these teams is not just technical acumen. I mean, everybody these days, and I say this with a lot of pride, there are college graduates, or people who have not yet graduated, who probably have a better understanding of deep learning than I do, and I've been doing it for quite some time. So it's not really just the technical aspect of it. Team leads really have to spend some time understanding the pressure, the emotional stress, the emotional trauma that everybody goes through, their failures, their ups and downs, and they have to work with it and build long-term-focused teams, rather than just look at it as: okay, the model doesn't work, the pipeline doesn't work, so you have to spend nights on it, and then once that's done, there's no room for appreciation, just on to the next one. So you really have to bring the humanity part of it, and the fun, into doing these things, which makes at least the ex-relationships not that bad.
What a wonderful, wonderful answer. I really enjoyed that, because you brought a very human element back into this mix, which is, I mean, it's great to have the hype, it's great to be able to roll these out, but at the end of the day there's a cost, there's a serious cost associated with all of this. We are at 7 p.m. right now. Are there any closing thoughts that either of you would like to share with us? Aditya, maybe you can go first.

I think it is actually part of the question which you asked, Indra. Data scientists would always like to move on to better things, because everybody has this assumption that data scientist time is expensive and so on. But I feel the more data scientists actually get entrenched in this journey of taking a model from inception to production, the better a data scientist will become, because A, he'll be able to understand the actual execution part, that it's not an offline metric like accuracy or AUC for him anymore, it's the business side, and he'll be able to understand the users better. And in the end, data science is all about understanding users. So I think, just as a closing thought on the last question, it's really critical to have every data scientist go through this journey to have a better understanding. Sorry, sorry, I cut you off. No, no, please finish your thought. I didn't realize, go ahead. Oh, no, that's all. Yeah, so that was the answer to that question.

Okay, so then I have one thought that might reflect a little bit of where what you said intersects with the way we think about our narrow lane, our lane being features, feature engineering, the feature store. Which is that, and I don't remember who we borrowed or stole this idea from, but when features are computed, the idea is to be able to put them into a marketplace so that different data scientists from within a customer organization are able to see: okay, this feature is already being computed every day, or at whatever periodicity, and I can start to use it already. But when a data scientist chooses to use a feature, we almost want them to pay a cost, pay Monopoly money almost, because we want the cost of each of these little building blocks to be felt throughout the organization, because there is that cost in keeping that feature running, keeping it alive. And I think somewhere, when you were talking about having the data scientists go through the entire process and see the cost of putting something into production, all of these things get accounted for when various choices are made. I want to experiment with this, or I want to put that into production; somewhere all of the costs have to be accounted for. That's when you start to get the return on the investment. If you don't think about what the investment is, you can't get returns on it. Exactly, it's true.

And also, extending that thought on productionization, you can choose to disagree with me, we believe that productionization fundamentally changes the economics of data science, because now you have to think about a system, a process, a model that has business impact and that has lots of operational costs over a long period of time. So you have to account for every cost that you pay every day, in every function in your organization. All right, on that happy... actually, on that down note of accounting for every last cost: model productionization is serious business.
There are many ways to get it wrong, but I really, really appreciate you sharing your stories with us about what it takes, what it means to get it wrong, how to recover from that, and how to think through things so that you actually learn and improve. I expect that some people who have questions may add them to the Hasgeek events page. We will relay those to you, so that between now and our next talk we'd be able to get answers to some of those questions. And I'm also hoping, for most people who have attended this talk, that you'd be open to connecting with them on a platform like LinkedIn. I'll just put that out there. You can violently shake your head if that's not the case. But I think we have a good audience here, and those networks can only do us all good if we enrich them. So another... I'm sorry, go ahead Venkata, something else? Yes, yes, another thing.

I know that for Nishal, creating opportunities for people with restricted mobility is very close to his heart. I mean people who, because of accidents or because of other constraints, are not able to move out of the house. These are difficult circumstances, where they don't have as many opportunities. I know that Nishal is working on that; it's a passion of his. So if the community knows about Mechanical Turk or anything else that will open doors for people to work from home, especially the disadvantaged folks, I think we as hosts, Nishal, and all of us will be better off. Anything else you want to add, Nishal, to that? Please, please, please write to me if you know anything. I mean... We have added the Twitter link on the page itself. There is also LinkedIn. If you can't reach anybody, please reach us at Scribble and we will be more than happy to connect you. Yeah, thank you very much.

Okay, thank you all. It's been fantastic. Thank you everyone for attending. Thank you, Aditya and Nishal, for being present in every way, not just because you made the time, but because you gave us this warm energy as well. So thank you very, very much. It's like a yoga class ending now. Thank you very much, guys. Have a nice day. Yeah, bye-bye. By the way, folks, don't forget to attend the next session we have with the DVC founders, Dmitry and Ivan, on 17th June. You will get the announcements on the Fifth Elephant page at Hasgeek. Awesome. Thank you. Thank you. Thank you, guys. Thank you, everybody.