Hi everyone. Thank you for the opportunity to speak to you here. A little about me: my name is Mon-Mierie Ray. I'm an MLOps and DevOps solutions architect with GitLab APJ, and today I will be talking about MLOps, which is DevOps for machine learning. So I'm going to jump right into my presentation.

As I said, today's topic is MLOps, DevOps for machine learning. I've also put in a link to a demo of what that solution can actually look like. Please feel free to fork it and use it however you would like. So for the next 30 minutes, what are we going to talk about? A little journey through history: where did this all start, and why is this topic relevant today? Then the economic lens on AI, DevOps, and MLOps; a little bit of basics on what a model actually is and what MLOps is, its principles and processes; and simplifying all of that with GitLab, specifically for the public sector.

So where did this all start, and why is this topic relevant today? It all goes back to 1956 and a summer research project on artificial intelligence, where six mathematicians came together thinking they could solve the concept of automatic computers, and how a computer could be programmed to use language, in basically three months. Pretty cool. And now, in 2018, we are still scratching the surface of AI and how we can actually use it in our everyday lives.

On that note, in 2018 there was a conference held where, on the right-hand side, I have a little picture of the psychologist Professor Daniel Kahneman, who is also a Nobel Prize winner in economics. He was at this conference with the AI advisor under Obama, the then chief AI strategist for Google, and a couple of other people, unraveling the future of AI and how we can use it everywhere. He opened the conference with a tale, and the tale goes like this. A well-known novelist wrote to me some time ago that he is planning a novel. The novel is about a love triangle between two humans and a robot, and what he wanted to know is how the robot would be different from the individuals. I proposed three main differences. The first one is obvious: the robot will be much better at statistical reasoning. We would probably agree with that. The second is that the robot would have much higher emotional intelligence; very, very controversial and debatable. And the third is my favorite one: the robot would be wiser. Wisdom is breadth. Wisdom is not having a narrow view; that's the essence of wisdom. It's broad framing, and a robot will be endowed with broad framing. And when it has learned enough, it will be wiser than we people, because we don't have broad framing. We are narrow thinkers, we are noisy thinkers, and it's very, very easy to improve upon us. So the way he looks at it, AI is a complement to humans: our noisy, narrow thinking frameworks helped along by the broad framing and wisdom of applying software and technology.

So on one hand, that was going on. On the other hand, a couple of economists came together to figure out what this actually means from a demand and supply perspective. A fundamental of economics is that a price drop in something means more consumption of that thing. For example, with a price drop in tea, people buy more tea, less coffee, and also more of the complements like sugar and milk. What does that mean for AI? Before AI, mobile phones were considered a drop in the price of communication.
For AI, it's considered a drop in the price of machine predictions. For DevOps, it's considered a drop in the price of transaction costs within products, and the complements, the sugar and milk for DevOps, are technology, automation, and human judgment. To put it all together for MLOps, the combination of AI and DevOps and the process around it: the price drop is in automated decision making through machine predictions; the substitute, the coffee, is repetitive decision making through human prediction; and the complements are human judgment, data, and automation. So to put a simple definition on MLOps: it's a movement that gets us closer and closer every day to cheap, almost automated decision making, by complementing precise human judgment with machine predictions.

Now that we have a good understanding of MLOps, we will first go into understanding what a model really is. So we're going to build one. Let's build an experiment within public health and safety: an experiment to predict the right care for acute coronary syndrome (ACS) patients within hospitals. This is an alerting system that can proactively predict, ahead of time, the care that people diagnosed with ACS will need; a very targeted way of helping the community. Now, the way it's usually done today is that a couple of data points or variables are looked at by the nurses and the doctors to determine the right care. From an AI perspective, because it is all about broad framing, the machine can look at not just three or four data points, but all eighty or so data points linked to a patient's health care and well-being, holistically, including data like what they are eating and where they are going on vacation, all put together. So for this experiment, we are going to take data from an EMR, an electronic medical record system, looking into the historic data of medicines the patient is taking, any pre-existing conditions, in-house tests, clinical notes, in-house medication, and any additional change notes. All of these are extracted as variables and features built into the model so it can predict.

Now, what is a model? A model, and the MLOps process around it, works like this: you have a target variable and a set of data variables, or features, that determine that target variable. These go into an algorithm that learns to predict the target variable, and the output, given all that data, is the prediction of the missing variable. This becomes an infinite loop: data goes into the model and the algorithm, new data comes in for prediction, the predictions are evaluated both by algorithms and by business processes to understand whether the model helps with the business outcomes, and the model is retrained on new data to keep its accuracy up from both a business and an algorithmic perspective. This whole process leads to the rise of MLOps, which is machine learning operations.

To make that concrete, here is a minimal sketch of the target-and-features idea in Python. Everything in it, the feature names, the toy data, and the choice of logistic regression, is hypothetical and purely illustrative; the real experiment would use the full EMR extraction described above.
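```python
# A minimal sketch of "target variable + features -> prediction".
# All feature names, values, and the model choice are hypothetical,
# standing in for the real EMR extraction described in the talk.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Features extracted from the EMR system: medications, pre-existing
# conditions, in-house test results, and flags mined from the notes.
history = pd.DataFrame({
    "age": [54, 67, 45, 71, 60, 38],
    "on_beta_blockers": [1, 1, 0, 1, 0, 0],
    "prior_hypertension": [1, 1, 0, 1, 1, 0],
    "troponin_level": [0.8, 2.1, 0.1, 1.9, 0.5, 0.05],
    "tachycardia_noted": [0, 1, 0, 1, 1, 0],
})
target = pd.Series([0, 1, 0, 1, 1, 0], name="acs_high_risk")

# The algorithm learns how the features determine the target...
model = LogisticRegression().fit(history, target)

# ...and then predicts the "missing" target for a new patient.
new_patient = pd.DataFrame([{
    "age": 63, "on_beta_blockers": 1, "prior_hypertension": 1,
    "troponin_level": 1.4, "tachycardia_noted": 1,
}])
print("predicted ACS risk class:", model.predict(new_patient)[0])
```

In the real system, the feature matrix would of course come from the EMR extraction and the notes pipeline rather than a hand-built table, and the retraining loop would feed new, evaluated predictions back into it.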
Now, this is an example of one of the features, which comes from the clinical notes. You can see that the data in the clinical notes looks very noisy and is really, really hard to understand, and if a human has to work through the whole process on those clinical notes, it can be really hard. These notes were taken in the ED, where people are writing and scratching things down very quickly, so a lot of the words are also truncated. So the notes go through pre-processing of the data, done through natural language processing, where the right fields and the right variables are extracted from the noise of the clinical notes. Here, through the noise, we see that the main things we need to know are "no pain symptoms" and "severe tachycardia", and if it's severe tachycardia, what its relevance is to ACS, which is the output, or the classification, of what we are doing here. For this case, the risk is low.

Below is a hedged sketch of what such pre-processing could look like in Python; the abbreviation table and the extracted fields are invented for illustration, and a real system would use a proper clinical NLP pipeline rather than simple pattern matching.
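```python
import re

# Hypothetical expansion table for truncated ED shorthand; a real
# pipeline would use a clinical vocabulary, not this toy map.
ABBREVIATIONS = {
    "pt": "patient", "c/o": "complains of", "sx": "symptoms",
    "tachy": "tachycardia", "sev": "severe", "hx": "history",
}

def normalize(note: str) -> str:
    """Lowercase the note and expand known truncated words."""
    tokens = re.findall(r"[a-z/]+", note.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def extract_features(note: str) -> dict:
    """Pull the fields we care about out of the noisy free text."""
    text = normalize(note)
    return {
        "no_pain_symptoms": "no pain" in text,
        "severe_tachycardia": "severe tachycardia" in text,
    }

# A noisy, truncated ED note, like the one on the slide.
raw_note = "Pt c/o no pain sx. sev tachy noted on arrival. hx HTN."
print(extract_features(raw_note))
# {'no_pain_symptoms': True, 'severe_tachycardia': True}
```

A production pipeline would typically use trained clinical NLP models, with entity extraction and negation detection, instead of a lookup table, but the input and output shapes are the same: noisy free text in, structured feature variables out.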
Looking more into the process of this: starting from the patient entering the emergency department and running through all the tests, and looking into all the historic data, the clinician takes the notes based on symptoms, and the structured data is sent into the electronic medical record system. Notes are extracted from this EMR system, and the model is built on top of that. This is the process all the way from the business, or the patient entering, to the building of an AI model.

When we look at the machine learning operations process, we have the problem, the definition of the problem, which in this case was prediction for ACS classification. Then we understand the data and its availability; here we've used the electronic medical record system. Then come the models and the algorithms we're going to use, the features we are going to build, and the deployment, which goes into scoring, performance monitoring, and model storage.

Going through this process, it seems pretty straightforward, but there are a lot of different reasons why MLOps was born. Firstly, it's all about experimentation, which is classically quite different to traditional software development. A data scientist spends a lot of time really understanding the data, experimenting with different algorithms, and running it against the business outcomes to see whether the output from the model is validated against the historic validation done by humans. A lot of that happens through rapid iteration cycles. Each data scientist has their own set of tools to do it and their own set of languages they play with: some love Python, some Spark, some Julia, or all of them. Serving models, because of the nature of the data and the nature of the science, all becomes quite hard. And when stuff goes wrong, because it's done in a very siloed way, it is hard to trace back. And slowly the needs of machine learning are growing, the data is growing, so the infrastructure should grow too, and the whole cohort of data science is expanding. It's not just data scientists any more: it's developers who help with the productionizing of models, data engineers who help with the data extraction, and within data science we have people focused on natural language processing, on supervised models, all sorts of people, as well as the business owner. And all of it needs, similar to DevOps, that whole collaboration and breaking of silos to make efficient, faster releases of models deployed into production, monitored and defended.

Looking into all of these problems, this is a list of the principles that an open source framework came up with, principles that make up what we call responsible machine learning. Number one is human augmentation: where humans will take the job and where the machines will take the job. Number three is explainability of the models, making the model reproducible and easy to use. And number seven, trust by privacy. These are all part of this journey of MLOps, which is DevOps for machine learning.

Looking at this triple infinite loop of ML and DevOps: the dev part of MLOps looks into algorithm training and testing, ELT pipelines, and continuous integration, mainly looking at the data. The ops part looks into the continuous delivery of models, the prediction and inference of models, and the monitoring and management of all that.

So now we look at more of the technicalities of machine learning operations, and how you actually use DevOps CI for models: continuous integration to automate and orchestrate everything from training to testing to deployment. That's basically the key shift in using CI/CD for models. The main part of being able to do that is containerizing every process of the ML model pipeline, so that every change in the source code triggers the right part of the pipeline, from model training to registering the model in the container registry, all of it automated, orchestrated, and linked through a CI service. Where GitLab plays a role is specifically in the CI service part, the source code part, and the container registry, as well as training the model on a GitLab Runner. GitLab triggers the pipeline and continuously integrates the changes in the source, from building to validating and deploying a machine learning model, in one platform, as well as the training.

Now, when we look at a CI script for ML, the jobs can be quite different compared to software applications. Here is a simple example with seven steps: first, preparing the pipeline environment for the data scientists; second, creating a workspace for the machine learning services; third, submitting the training job; fourth, registering the models to the workspace; fifth, comparing the performance of different models (as you remember, for data scientists it's all about experimentation, so the comparison phase can be very critical); sixth, creating the Docker image for the scoring web service; and finally, publishing the artifacts to the release pipeline. This is an example within GitLab of how validating, building, deploying, testing, and cleaning up the environment for a machine learning model can look. As we discussed, one of the principles is explainability. The black screen shows all the different logs that can be tracked through these different stages and pipelines of building the model. That gives the explainability of all the parts, as well as the reproducibility, where all of it can be traced, made visible, tracked, logged, and reproduced easily.

As a rough illustration of that model-comparison step, here is a small Python script that one such CI job might run; the candidate names, metrics, and the accuracy-based selection rule are all assumptions for illustration, not GitLab functionality.
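```python
"""Hypothetical CI job: compare trained candidate models and
register the best one by writing a model reference file that
downstream deployment jobs can pick up as a pipeline artifact."""
import json
from pathlib import Path

# In a real pipeline these would be metrics files emitted by the
# training jobs; here they are inlined so the sketch is runnable.
candidates = [
    {"name": "penalized_regression", "version": "1.3.0", "val_accuracy": 0.87},
    {"name": "tensorflow_dnn", "version": "0.9.1", "val_accuracy": 0.91},
]

# Pick the candidate with the best validation accuracy.
best = max(candidates, key=lambda m: m["val_accuracy"])
print(f"selected {best['name']} v{best['version']} "
      f"(val_accuracy={best['val_accuracy']:.2f})")

# Update the model reference JSON; in the workflow described later,
# approving the merge request that carries this change triggers the
# redeployment of the inference application.
Path("model_reference.json").write_text(json.dumps(best, indent=2))
```

The point is that model selection becomes a reviewable, repeatable, logged pipeline step rather than something done by hand on a laptop.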
For further enhancement of the visibility and transparency, we have the GitLab REST API, which can take a deeper dive into the data science and developer operations side, to understand insights such as which model performed better with what kind of data, or why certain models take longer to run. For example, penalized regression versus TensorFlow on a certain dataset: how long does each take to train, and how do you move a model faster from the experimentation phase to the production phase? These are all insights that, once you are using GitLab CI, can be pulled through the REST API to give you a better, healthier understanding of your machine learning life cycle, to give you transparency and visibility, and to proactively be able to predict which model is better for which kind of data.

For instance, here is a small sketch using Python against the GitLab pipelines API to compare training-job durations across recent pipelines; the project ID and token are placeholders, and the job names depend on your own pipeline configuration.
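```python
"""Hypothetical sketch: use the GitLab REST API to compare how long
training jobs take across recent pipelines. PROJECT_ID and the token
are placeholders; job names depend on your own .gitlab-ci.yml."""
import requests

GITLAB = "https://gitlab.com/api/v4"
PROJECT_ID = "12345"                      # placeholder project ID
HEADERS = {"PRIVATE-TOKEN": "<your-access-token>"}

# Fetch the most recent pipelines for the project.
pipelines = requests.get(
    f"{GITLAB}/projects/{PROJECT_ID}/pipelines",
    headers=HEADERS, params={"per_page": 20}).json()

# Collect the duration of each training job in each pipeline.
durations = {}
for p in pipelines:
    jobs = requests.get(
        f"{GITLAB}/projects/{PROJECT_ID}/pipelines/{p['id']}/jobs",
        headers=HEADERS).json()
    for job in jobs:
        if "train" in job["name"] and job.get("duration"):
            durations.setdefault(job["name"], []).append(job["duration"])

# e.g. compare penalized-regression vs. TensorFlow training times.
for name, times in durations.items():
    print(f"{name}: avg {sum(times) / len(times):.1f}s over {len(times)} runs")
```

From there, aggregating by branch or by data version would let you start answering the "which model works best on which data" question from your own pipeline history.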
So the workflow of ML operations goes something like this: a data scientist develops the source code for the training model, as well as the inference part, and submits a merge request with the changes; an approver can be nominated. Once it's approved, it triggers the CI/CD pipeline, in which the new training code can be deployed, whether on a GitLab Runner, in your own environment, or on Kubeflow, whatever it may be. The final steps of the pipeline generate new model artifacts, and those automatically update the model reference JSON files. The approval of that merge triggers another deployment, redeploying the inference applications. This whole infinite loop can be traced over time, and it goes on and on.

So, to end with where we began. On one hand, we obviously have Professor Kahneman, who clearly has a very strong vision of where AI can take people. In some areas we've seen this concept proven: autonomous cars, Siri, and Alexa all use AI. But today, in our daily lives, we still find a dissonance. We still find a sort of gap in how we can actually use these applications and processes in our everyday routines and needs, whether it's about the public sector getting better at fraud detection, or community health and safety, or really understanding your own performance across different departments in the public sector. So there is still this dissonance in how you can use these tools and applications in your daily life.

To wrap up with an understanding of how that plays out over time, I'm going to go back to the original story. We looked at the original process of the ACS prediction, where the process went very simply like this: the patient enters the emergency department, runs through the tests, the clinician takes the notes, the unstructured notes get sent to the EMR database, the notes are extracted, and the recommendation is built. There comes a time when the data is getting better and better, and we get better at predicting the right care for all the ACS patients. We started with maybe 5% accuracy with the model; with the human and the model interacting, it gets to 10%, 20%, 30%, and the model gets better and better every day, using the right, better operations around machine learning and fast releases of the model. So, slowly, it gets better. There may then come a point when the whole process gets reversed: the patient enters the emergency department, the tests are all done, and the machine, based on them, right away predicts the right care for ACS; the clinician then just approves it, and that decision is sent back to the database to validate whether it was accepted or not.

Now, that is really a thesis about time. The more these practices are embraced, the more the data is used in the right way, and the more machine learning operations is used in the right way to bring more AI into your systems, the more you dial up the accuracy and the cheaper the current strategy becomes. So it's not just for the Googles, Teslas, Apples, and Netflixes. You get better and better, until the time comes when you not only carry out your current strategy almost for free, but also reinvent a new way of operationalizing the needs of the community.

Another example to wrap this up is the Amazon recommendation engine. Right now, when I would like to buy shoes, or I did buy a shoe, it recommends maybe two out of twenty shoes that I might consider trying. And Amazon is obviously taking this data, and it keeps dialing up the accuracy, so there comes a time when they recommend four out of twenty that I would consider buying. Then it goes to sixteen, eighteen, and the accuracy keeps climbing all the way to, let's say, 95%: it's 95% accurate that the next shoe it recommends is one you would actually buy. Amazon at that point may then decide to actually ship the shoe before you buy it, and you may then decide whether you want to keep it or not. Now, that's a very theoretical way of looking at it, but that is just the beauty of using machine learning operations to embrace machine learning and the human-computer interaction, to make the community better, cheaper, and faster at solving problems and making the right decisions.

So thank you. I hope you've enjoyed this small talk on DevOps for machine learning, MLOps, and that you enjoy the rest of the talks. Thank you.