To solve a problem with machine learning, data scientists and other practitioners use a workflow that looks like this. They start by formalizing the problem they're actually trying to solve and identifying the metrics that imply success. They then collect, clean, and label the training data to use as examples before converting these data into a format that's suitable for a machine learning algorithm. The next step is training a model, that is, identifying good tradeoffs in learning a function to map from examples to results, and validating that model to ensure that it performs as well on data it hasn't seen as on data it has. Finally, putting a model into production means deploying it as a service and continuously monitoring its performance over time.

This workflow has a lot in common with a conventional software development workflow, and the same DevOps practices that support conventional software development can also support the machine learning workflow. We'll see how OpenShift supports this workflow from end to end using the concrete example of text classification. That is, we want a service that accepts a document and returns a label of either legitimate or spam.

The first concern is having an environment for machine learning discovery. In the past, practitioners faced an unpalatable choice: either use only the tools and frameworks supported by their IT department, or manage their own infrastructure. OpenShift and the Operator Framework make it possible to support truly self-service discovery environments on shared infrastructure. The Open Data Hub operator, which is a community project sponsored by Red Hat, lets practitioners deploy and manage a machine learning discovery environment, including interactive development, storage, compute, and visualization components.

We're going to use the Open Data Hub to request an interactive development environment with JupyterHub now. We'll start by logging in, and then we'll see the JupyterHub launcher, which lets us choose a container image to run our development environment in, with some libraries preinstalled. It lets us choose a t-shirt size for the resources we'll need for our environment and whether or not we want to schedule a GPU. JupyterHub creates and maintains a persistent volume to hold our work, and we can tell it to clone a Git repository into that persistent volume before we start up. Finally, we can specify some other environment variables as well. We're not using Ceph storage in this demo, but if we were, our credentials would be preloaded in the environment along with any other environment variables we specify.

Now we'll go into the development environment and look at our work. Again, we've preloaded some code from a Git repository. This code is a set of Jupyter notebooks, which are interactive documents that include prose, code, pictures, and output. They're a great tool for experiments and a great communication tool. In this demo, in the interest of time, we're not going to develop new techniques in notebooks. We're just going to use Jupyter to run and evaluate the existing notebooks that we cloned from Git, as if we'd gotten them from a colleague.

This first notebook shows a feature engineering technique. Its goal is to turn text documents into vectors of numbers so that we can pass them to a model training algorithm.
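The demo doesn't show the notebook's code, but as a rough illustration of this step, here's a minimal sketch using scikit-learn's TfidfVectorizer. The actual notebook may use a different vectorization technique entirely, and the documents here are made-up placeholders.

```python
# Minimal sketch of the "documents to vectors" step, assuming scikit-learn.
# The real notebook may use a different technique (hashing, embeddings, etc.).
from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "Claim your free prize now!!!",          # made-up spam-like example
    "Meeting notes from Tuesday's standup",  # made-up legitimate example
]

vectorizer = TfidfVectorizer(max_features=1024)  # cap the vocabulary size
vectors = vectorizer.fit_transform(documents)    # sparse matrix, one row per document

print(vectors.shape)  # (number_of_documents, number_of_features)
```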
What we want to see in that notebook is some separation between the vectors corresponding to legitimate documents and the vectors corresponding to spam documents, and that's what the visualization at the bottom of the notebook shows us.

The next notebook actually trains a model. That is, it finds good trade-offs in separating the vectors corresponding to legitimate documents from the vectors corresponding to spam documents, and thus learns a function to label documents. When we're evaluating this model, we want to look at some of its performance metrics on data it hasn't seen. We'll look at two in this notebook. The first is a confusion matrix, which shows us how often we predicted the right labels along one diagonal and how often we predicted the wrong labels along the other. The second is the F1 score for this model, which is very good.

So these notebooks seem to have reasonably good results, but notebooks are interactive documents; they don't look all that much like software artifacts. In a traditional setting, a data science team would throw some of these over the wall to an application development team, who'd reimplement the feature extraction code and model training code and build a production service. But there are all sorts of reasons why the app dev team might not even be able to reproduce the results in a notebook. Ideally, we'd be able to streamline the process of going to production, increase the velocity of these cross-functional teams, ensure reproducibility, and get some ongoing assurance that our code and models are behaving once they're in production. If we were dealing with conventional software, we'd use CI/CD and other DevOps techniques to achieve this, and things are no different with machine learning.

Let's see how to use OpenShift's developer experience to improve the efficiency of our machine learning team. We're going to set up an OpenShift build corresponding to our whole machine learning pipeline here. We have a source-to-image builder that knows how to take a Git repo of Jupyter notebooks and post-process and execute them to make a model service. In this case, we're going to specify our machine learning pipeline in terms of notebooks: the first is the feature engineering notebook that goes from text documents to vectors, and the second is the model training notebook that goes from labeled example vectors to a function that labels new vectors as either legitimate or spam. We'll train a model in the build, and then create a service that takes text, uses the feature extraction technique from the feature engineering notebook to produce a vector, and then uses the model to make a prediction based on that vector.

We've already launched a build here, so we actually have an internal service running in our OpenShift project to make predictions. Let's see what it looks like. We'll interact with this service from a Jupyter notebook to interactively demonstrate how other services in our application might call out to it. In this first cell, we're just defining a very basic client function for our service. This function takes some text and uses it to make an HTTP POST to our service, returning the predicted value. We can see how it works by comparing some nonsense words, which we expect to be labeled spam, with the first few words of Pride and Prejudice, which we expect to be labeled legitimate. We can also use our service to make predictions about some of our training data and sanity-check that it's performing adequately. It looks pretty good.
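Two of the steps we just walked through lend themselves to short illustrations. First, a hedged sketch of what the model training and evaluation notebook might do, again assuming scikit-learn; the model family and metric details in the real notebook may differ, and `vectors` and `labels` stand in for the output of the feature engineering step.

```python
# Sketch of the model training/evaluation step, assuming scikit-learn.
# `vectors` comes from a feature engineering step like the sketch above;
# `labels` is the matching list of "legitimate"/"spam" strings.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    vectors, labels, test_size=0.3, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
predictions = model.predict(X_test)

# Correct predictions land on one diagonal of the matrix, mistakes on the other.
print(confusion_matrix(y_test, predictions))
print(f1_score(y_test, predictions, pos_label="spam"))
```

And here's roughly what that client cell might look like. The service URL and the JSON request/response shape are assumptions for illustration; the real ones depend on how the model service built by the pipeline exposes itself.

```python
# Sketch of a minimal client for the prediction service.
# SERVICE_URL and the payload shape are assumptions, not the demo's actual API.
import requests

SERVICE_URL = "http://pipeline:8080/predict"  # hypothetical in-project service address

def predict(text):
    """POST some text to the model service and return the predicted label."""
    response = requests.post(SERVICE_URL, json={"text": text})
    response.raise_for_status()
    return response.json()["predictions"]

print(predict("colorless green ideas sleep furiously"))           # expect "spam"
print(predict("It is a truth universally acknowledged, that a "
              "single man in possession of a good fortune..."))   # expect "legitimate"
```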
However, one of the big challenges of machine learning systems is concept drift. The nature of the data we see in the real world can gradually drift away from the nature of the data we trained the model on, meaning that our model no longer faithfully models reality. Put another way, while conventional software often breaks in obvious ways, machine learning models can break silently: they'll continue giving you answers, but those answers will just be wrong more often than you'd like.

We can actually detect concept drift by tracking metrics about the predictions we've made. While we don't know in general whether our model is making correct or incorrect predictions, we do know that we expect roughly the same proportion of legitimate and spam messages over time. We certainly don't expect sudden, drastic shifts; a shift like that indicates either that the distribution of messages has genuinely changed or that our model's performance has degraded.

In this dashboard, we're looking at the predictions our model has made over time while scoring synthetic messages streaming in on a Kafka topic. We're plotting the logarithm of the count of each label, which means that the slope of each line is proportional to its rate of growth and is independent of how large the quantity is. If the ratio of legitimate messages to spam messages stays the same over time, these two lines should be roughly parallel. We've actually set up our synthetic message generator to start producing more legitimate messages over time, and we can see where the drift occurs: the lines are no longer parallel, indicating that legitimate messages are growing at a higher rate than spam messages. This sort of divergence would be obvious to a data scientist, but we don't need to tie data scientists up looking at dashboards. We can define Prometheus alerting rules for this, or even train machine learning models to identify this sort of drift in our systems. (A small sketch of this kind of ratio check appears at the end of this section.)

In this short demo, we've seen how OpenShift supports a data scientist's entire workflow, from self-service provisioning of a self-contained discovery environment with storage, streaming, compute, and interactive development, to model deployment and production monitoring. We've seen how the benefits of OpenShift for contemporary software development teams also apply to the teams building machine learning systems and intelligent applications.
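To close, here's a small Python sketch of the kind of ratio check described above: it tracks the legitimate-to-spam ratio over a sliding window of recent predictions and flags a drastic shift relative to a baseline. The window size, threshold, and function name are arbitrary assumptions; in the demo itself this signal lives in Prometheus metrics, dashboards, and alerting rules rather than application code.

```python
# Crude sketch of a concept-drift check on prediction counts.
# WINDOW, THRESHOLD, and the baseline are arbitrary assumptions for illustration.
from collections import Counter, deque

WINDOW = 1000      # how many recent predictions to consider
THRESHOLD = 2.0    # flag if the ratio moves by more than 2x in either direction

recent = deque(maxlen=WINDOW)

def record_prediction(label, baseline_ratio):
    """Track one prediction; return True if the legitimate/spam ratio has drifted."""
    recent.append(label)
    counts = Counter(recent)
    spam = counts.get("spam", 0)
    legit = counts.get("legitimate", 0)
    if spam == 0 or len(recent) < WINDOW:
        return False  # not enough data in the window to say anything yet
    ratio = legit / spam
    return ratio > baseline_ratio * THRESHOLD or ratio < baseline_ratio / THRESHOLD
```

A production setup would more likely export per-label prediction counters as Prometheus metrics and express the same comparison as an alerting rule, as the demo suggests.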