My name is Jochai and I'm CEO and co-founder of Converge. Converge is a machine learning platform built on top of OpenShift and Kubernetes. We help teams manage, build and deploy machine learning all the way from research to production. We help bridge data science and engineering teams, and we provide IT with an environment to manage all machine learning resources, utilization, infrastructure and more. We started Converge because data scientists are spending 65% of their time on DevOps, and 85% of models don't get to production. This is happening because there are two different audiences in the machine learning world. First is the IT side, focused on production machine learning: infrastructure, OPEX, CAPEX. On the other side you have the data scientists, a team that's usually focused on algorithms and insights, and they spend 65% of their time on DevOps. Now, what Converge and Red Hat provide is a solution to solve exactly that. We provide everything data scientists and DevOps need out of the box: we manage Kubernetes deployment on any cloud or on-prem environment, with fully automated installation and life cycle management for the application, plus all the tools data scientists need for machine learning and AI, from research to production, in an open, flexible, container-based, code-first data science platform that integrates with any kind of tools you already have in your ecosystem. Machine learning today is fragmented, broken between a lot of different tools, scripts, plugins and unconnected stacks. On the left side you have the MLOps and DevOps work: configuration, installation, scheduling, resource management, life cycle, collaboration and a lot more. On the right side you have the whole data science workflow, from data selection to data preparation to model research, which probably needs to be versioned in the middle.
Then you have a lot of experimentation: training different models, visualizing models, validating models, tuning and deployment. Once you deploy the model, it doesn't stop there; you also need to monitor and proactively iterate. So if there is some sort of model decay, or new data coming into the model, how do you retrigger this kind of pipeline? A pipeline that involves both research and a production deployment in a single continuous training and continuous deployment mechanism. Because of this complex environment and these procedures, 85% of models don't get to production. This is a problem identified by a lot of different companies. There's a paper published by Google a few years ago describing the hidden technical debt in machine learning systems. What happens when a company tries to take machine learning from prototyping to real production scale? Suddenly they face a lot of different challenges: challenges around resource management (who's using what?), GPUs, serving infrastructure, monitoring of models both in training and in production. A lot of work, and very little of it is the actual magic, the machine learning code, which is what really gives the competitive advantage: the algorithms, the cool stuff. Red Hat and Converge accelerate and automate data science all the way from research to production, providing a really fast way to get from start to finish, from data to production. It's a platform based on top of OpenShift that uses all the scheduling, resource management, life cycle, collaboration and other features OpenShift provides, together with a suite of everything you need to build and deploy a model: data selection, data versioning (version control for data science), data preparation, model research, running a lot of different experiments on the OpenShift compute, validating and tuning models, and eventually also deploying.
Converge is a code-first, full-stack, container-based platform, just like OpenShift, and it's open, meaning you can use any kind of framework and any kind of container to build your models. Converge accelerates everything from research to production across any infrastructure you have. The OpenShift and Converge solutions work side by side: Converge uses OpenShift to distribute jobs across the different compute resources. Think of it as one control plane for all AI, to which you can attach different compute resources. So if you have OpenShift on-prem, OpenShift in the cloud, or a hybrid mix of the two, you can have all the different AI clusters unified in one environment. Then your data scientists can run machine learning workflows on any of the workers, on any of the pods or containers they have access to. You get one platform to manage training of models and to manage research in Jupyter notebooks, VS Code or something else, with autoscaling, cloud bursting and a lot of other features built in. In terms of pipelines, Converge provides a solution for building machine learning pipelines around all your different machine learning compute and jobs. Each component here in the graph, built with Converge, can run on a different OpenShift cluster. So you can have this preparation step running on a Spark cluster on-premise, a CPU cluster built on top of OpenShift. Then you can have GPU training in the cloud, or GPU training on-prem, also using the OpenShift platform. And lastly, the deployment can be on a public cloud cluster. Flows also serve as an automation tool for building models: you can run this pipeline of pre-processing, model selection and model deployment every day, every week, or even based on new data coming into the flow. So whenever there is a new version of the data, the flow can be triggered automatically.
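A flow like the one described, with each step pinned to a different cluster and the whole pipeline re-triggered when a new data version lands, can be sketched roughly like this. The step names, cluster labels, and runner are illustrative assumptions for the sketch, not the actual Converge API:

```python
# Minimal sketch of a multi-cluster ML flow: each step declares which compute
# it should run on, and the flow is (re)triggered on every new data version.
# Names and structure are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    name: str
    cluster: str                      # e.g. "spark-onprem", "gpu-cloud"
    run: Callable[[dict], dict]

def run_flow(steps: List[Step], data: dict) -> dict:
    """Run steps in order, passing each step's output to the next."""
    for step in steps:
        print(f"[{step.cluster}] running {step.name}")
        data = step.run(data)
    return data

flow = [
    Step("preprocess", "spark-onprem",  lambda d: {**d, "clean": True}),
    Step("train",      "gpu-cloud",     lambda d: {**d, "model": "v" + str(d["data_version"])}),
    Step("deploy",     "serving-cloud", lambda d: {**d, "deployed": True}),
]

def on_new_data_version(version: int) -> dict:
    """Trigger the whole flow whenever a new data version arrives."""
    return run_flow(flow, {"data_version": version})

result = on_new_data_version(2)
```

The key design point is the separation between the pipeline graph and the compute it runs on: each step only names a cluster, so the same flow can mix on-prem Spark, cloud GPUs, and a serving cluster.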
Flows are versioned and tracked both at runtime and while the flow is being built. So for every model you build, you can always see exactly how it was built: with what data, what metrics, hyperparameters, algorithms, everything centralized. Now, one of the nice, unique things we have together with OpenShift is that besides the fact that each node here on the graph, each component, can run on a different compute resource, Converge automatically scales up the cluster and frees the resources when the job is over. In this case I'm using multiple clusters, but think of it as having one cluster for Spark, one for deep learning, and one for classical machine learning, and Converge will orchestrate all the different jobs with the help of OpenShift. Converge is certified on the OpenShift Container Platform: you can quickly install it and get failure recovery, lifecycle management, cluster health, cluster monitoring, upgrades, and everything else you can enjoy from the OpenShift portal. We'll now go to a product demo. So this is the Converge UI. You can think of the Converge UI sort of like GitHub designed for data science. It makes everything really simple: you can share models, resources, experiments, research; everything can be shared, and you can have your whole data science team, data scientists, data engineers, and IT, in one single platform. Now, before we dive into one of the use cases here: Converge relies on OpenShift compute. You can attach multiple clusters; Converge itself can be installed on OpenShift, and then you can attach OpenShift and Kubernetes clusters directly to the platform from the UI. One of the nice things here is that you can track utilization, so you can make sure you're using all the resources, and if you're not, see exactly where the holdup is. We also provide Grafana, Kibana and other nice open source tools built into the platform and into each cluster you connect.
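The per-run tracking described above, where every execution records its data version, hyperparameters, and metrics so runs can be compared side by side, can be sketched with a toy registry. The registry and its field names are assumptions for illustration, not Converge's actual data model:

```python
# Toy sketch of run tracking and comparison: each flow execution logs what it
# was built from, and the best run is picked by a chosen metric. Hypothetical
# structure, for illustration only.
runs = []

def log_run(data_version: str, hyperparams: dict, metrics: dict) -> None:
    """Record one execution: its input data version, settings, and results."""
    runs.append({"data_version": data_version,
                 "hyperparams": hyperparams,
                 "metrics": metrics})

def best_run(metric: str = "accuracy") -> dict:
    """Compare all tracked runs side by side and pick the best by a metric."""
    return max(runs, key=lambda r: r["metrics"][metric])

log_run("v1", {"epochs": 10,  "lr": 0.01},  {"accuracy": 0.91})
log_run("v1", {"epochs": 100, "lr": 0.001}, {"accuracy": 0.95})
winner = best_run()
```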
You can also specify compute templates, so you can have your own instance list. I can create, let's say, a template that runs on half a CPU with one gig of RAM, or a really large one with four CPUs and one or two GPUs; I can specify everything and build my own custom compute templates. This makes it really simple for IT to manage machine learning compute templates and provide data scientists with an easy way to spin up resources. All right, we'll go into MNIST to show you how you build and deploy a model in Converge. Converge supports a lot of different ways to start building models. We even support the most basic way, which is spinning up a Jupyter notebook or even VS Code. So you can choose to run VS Code on any of the OpenShift clusters you have, choose any compute template you want, choose any data set you want, and choose any Docker container you want: it can connect to your own private registry, and we also provide some pre-built containers as well. Once you spin up a resource, Converge will allocate all the CPU and memory you need, spin up the container, get your code from Git or from Converge, and you'll have a working environment up and running in a few seconds instead of a few hours of setup. This is all with the high security standards that OpenShift provides, and it's the fastest way to get VS Code running on a remote machine. All right, next up is Flows, and this is from the slides I showed earlier. This is a really fast way to build any kind of machine learning pipeline you want. In this case, I'm loading data from object storage; it could be MinIO, your own object store, a cloud object store, anything you want. Then I'm running a Spark pre-processing step. This is a simple Python script that you can run on your own; you can choose the compute, and you can choose any Docker container you want to run this job on.
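The compute templates idea, where IT defines named resource shapes and data scientists just pick one, can be sketched as a mapping from template names to Kubernetes-style resource requests. The template names and fields here are assumptions for the sketch, not Converge's actual schema:

```python
# Sketch of IT-defined compute templates: a named shape (CPUs, memory, GPUs)
# that translates into a Kubernetes-style resource request when a job or
# notebook is spun up. Hypothetical names and schema, for illustration only.
from dataclasses import dataclass

@dataclass(frozen=True)
class ComputeTemplate:
    cpus: float
    memory_gb: float
    gpus: int = 0

TEMPLATES = {
    "small":     ComputeTemplate(cpus=0.5, memory_gb=1),       # half a CPU, 1 GiB
    "large-gpu": ComputeTemplate(cpus=4,   memory_gb=16, gpus=1),
}

def request_resources(template_name: str) -> dict:
    """Translate a named template into a container resource request."""
    t = TEMPLATES[template_name]
    return {"requests": {"cpu": str(t.cpus),
                         "memory": f"{t.memory_gb}Gi",
                         "nvidia.com/gpu": str(t.gpus)}}

req = request_resources("small")
```

Centralizing the shapes this way is what lets IT control costs while data scientists only ever see a short menu of template names.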
Then I'm going to do a model selection process. Each of those components, like I said, can run on different compute: this can run on a Spark cluster, this can run on a GPU in the cloud, this can run on my on-premise GPU, this can run on a CPU. So I get the flexibility to run any kind of task on any kind of compute node I want, and I can have one pipeline spread across all the different compute resources I have, which guarantees high utilization and the best tool for each job. Here I have three different models, and each of them can have multiple experiments. Here I have two different numbers of epochs, 10 and 100, and I can also have different learning rates, for example. What happens is that once I trigger this specific component, Converge will automatically run it four times, once for each permutation it calculates. And this is really cool, because the nature of machine learning is running a lot of different experiments to optimize for the best result you can get. Converge makes it extremely simple with OpenShift to just specify a grid of parameters, and then Converge will take care of resource management, tracking of the different models, and eventually also deploying the model to a remote OpenShift cluster. All right, so what's going to happen here is that I'm loading data from an object store, then running a pre-processing step using Spark. Converge automatically passes the output data as input to those three different models. Each of the models will run as many times as I set in its internal parameters. Then I'm going to automatically pick the best model I have and deploy it as a web service. This web service picks the model based on accuracy; of course, I can customize it based on any metric I want. And I'm going to pick the best model and deploy it as a web endpoint.
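The grid expansion described here, two epoch values times two learning rates giving four runs, is just the Cartesian product of the parameter lists. A minimal sketch of that expansion (the parameter names follow the example above; the submission step is hypothetical):

```python
# Sketch of hyperparameter grid expansion: every permutation of the declared
# parameter values becomes its own experiment run.
from itertools import product

grid = {
    "epochs": [10, 100],
    "learning_rate": [0.01, 0.001],
}

# One run per permutation: 2 epoch values x 2 learning rates = 4 runs.
runs = [dict(zip(grid, values)) for values in product(*grid.values())]

for run in runs:
    # In the platform, each dict would be submitted as a separate experiment
    # on whatever compute the component is pinned to.
    print(run)
```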
This is really cool because in one pipeline, I've been doing pre-processing on Spark, training on GPUs or CPUs, and then deploying to a remote OpenShift cluster hosted by one of the cloud providers. Once this pipeline is triggered, Converge automatically tracks everything: all the input and all the output are automatically versioned, and data is versioned across the different components. You get a table you can track in real time, where you can see all the different executions of the graph. You can go into a specific experiment of the graph and see resource utilization; hyperparameters and metrics are automatically tracked. You can see metadata about the run, and we also automatically plot your accuracy and metrics. So it's great for research too: you can track and monitor the different models and see what kind of hyperparameters work best for what kind of models. You can even select a few models like this and click compare, and then you see all the different models side by side and exactly what happened in each model at every step. Now, in terms of serving: we did model selection, and the best model was automatically deployed as an endpoint. We use OpenShift on the back end for this as well; we deploy your file and function as a web service on OpenShift. And the cool thing is, besides the fact that we've completely automated the DevOps, we help data scientists and data engineers get sort of an x-ray of the model. You can see everything that's happening: all the input and all the output are automatically tracked, so you can build new data sets from this data, or see the activity of the model and what happened recently. You can also deploy new versions of the model using a canary release, which helps you gradually roll out new models and continuously test them as they're being deployed. We also have Kibana and Grafana, great tools for IT and DevOps to maintain and monitor the endpoint.
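The canary release mentioned above boils down to sending a small, fixed fraction of traffic to the new model version while the rest stays on the stable one. A toy sketch of that routing logic (real gateways usually do weighted random or header-based routing; the deterministic counter here is an assumption to keep the example testable):

```python
# Toy sketch of canary routing: canary_percent of requests go to the new
# model version, the rest to the stable one. Illustrative only; not how the
# platform's gateway is actually implemented.
class CanaryRouter:
    def __init__(self, canary_percent: int):
        self.canary_percent = canary_percent  # 0..100
        self._counter = 0

    def route(self) -> str:
        """Return which model version should serve the next request."""
        version = ("model-v2" if self._counter % 100 < self.canary_percent
                   else "model-v1")
        self._counter += 1
        return version

router = CanaryRouter(canary_percent=10)
versions = [router.route() for _ in range(100)]
```

Gradual rollout is then just raising `canary_percent` in steps while the monitoring tools (Kibana, Grafana) confirm the new version behaves.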
And the last part is continual machine learning. This allows data scientists or engineers to monitor models in production. Consider that today, let's say you have five or ten data scientists, and each of them has five models in production; very soon you'll have around 50 models in production to monitor. That's quite hard, and each model requires a different kind of monitoring. So what we did is make it extremely simple for engineers to add alerts to their models. You can track model confidence, data quality, or any kind of parameter you want, and then trigger an email to the data scientist. Let's say your model receives bad input: you can automatically send an email to the data scientist saying, hey, you should be looking at this model, because something bad is happening in production. Another cool example we see a lot with our customers is tracking the confidence of the prediction, and if that prediction confidence drops below 0.5, for example, over a specific period of time, automatically retraining the model. It will basically trigger the pipeline we just built: make sure new data is fetched, retrain the model using model selection, and deploy the new model gradually, with A/B testing and canary deployment, into the production endpoint. This is how we offer continuous training and continuous deployment on a platform built on top of OpenShift, using the scalability, security and other great features OpenShift provides, all delivered in a way data scientists and engineers can consume quickly and easily, so they can be more productive at work and put more models into production. So this is Converge. Thank you very much.
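The confidence-based retraining trigger just described, where average prediction confidence over a window dropping below 0.5 kicks off the training flow again, can be sketched like this. The monitor class and its interface are assumptions for illustration, not the platform's actual alerting API:

```python
# Sketch of a continual-learning trigger: track prediction confidence over a
# sliding window and signal a retrain when the average drops below the
# threshold (0.5, as in the example above). Hypothetical interface; in
# production the True result would trigger the training flow, an email, etc.
from collections import deque

class ConfidenceMonitor:
    def __init__(self, threshold: float = 0.5, window: int = 5):
        self.threshold = threshold
        self.scores = deque(maxlen=window)

    def record(self, confidence: float) -> bool:
        """Record one prediction's confidence; True means 'retrain now'."""
        self.scores.append(confidence)
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and sum(self.scores) / len(self.scores) < self.threshold

monitor = ConfidenceMonitor()
# Confidence decays over time (model drift); the alert fires once the
# windowed average falls under 0.5.
alerts = [monitor.record(c) for c in [0.9, 0.8, 0.4, 0.3, 0.2, 0.2, 0.1]]
```

Averaging over a window rather than alerting on a single low score is the design choice that keeps one noisy prediction from retraining the model.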