Hello, and welcome to our demonstration on building an intelligent fraud detection solution using the Red Hat portfolio and open source machine learning tools. Financial institutions have an incentive to modernize their fraud detection solutions for a number of reasons, not least the growing financial and reputational losses caused by fraud. There are several contributing factors. First, money is moving significantly faster, so reactively addressing fraudulent transactions reported by customers means a lot of damage is done before the problem is detected. Money is also moving over more channels, so there is much more to monitor. And fraud methods have become so sophisticated that they are beyond the capability of human experts to analyze. In response, financial institutions must be proactive and analyze transactions in real time, predicting whether each one is fraudulent or legitimate. In this demo, we will show you how to build a prediction service that does exactly that. Financial institutions must also integrate and analyze data from various structured and unstructured sources and use it for those predictions; we will show how transactional data can be analyzed to build this prediction service. And finally, financial institutions must replace human-designed rules with insights and data-driven models, so we will show you how to build the machine learning model that powers the prediction service.

We will build our fraud detection solution on Red Hat OpenShift as the foundation of a modern, hybrid cloud-ready AI platform that connects data scientists, application developers, and IT operations in a seamless, unified platform built with the Red Hat portfolio. This is made possible by Open Data Hub, a community reference architecture project that serves as a blueprint for building an AI-as-a-service platform on OpenShift. It uses a meta-operator that integrates open source AI/ML and data engineering community projects to provide a single, unified way to install the entire AI platform in one click. It has been production ready, running in Red Hat's data center and used by teams, for over a year.

In summary, here is what we will see. We will do exploratory data analysis and build a model for fraud detection using JupyterHub on OpenShift. We will set up a CI/CD process that takes the data scientist's work and turns it into a running service that can make predictions, using OpenShift Pipelines, and deploy it as a serverless service using OpenShift Serverless. And finally, we will monitor the model for drift using tools like Prometheus.

First, we will access the data science environment as a data scientist would, to perform exploratory data analysis and build the model. We will do this in Jupyter notebooks, which run on a Jupyter notebook server. In this case, we are using JupyterHub, deployed through Open Data Hub, to manage and proxy multiple single-user Jupyter notebook servers. This is integrated with OpenShift security, so as an OpenShift user my notebooks and my work are protected, and all I need to do is log in with my OpenShift credentials to get access to the JupyterHub server and start spawning Jupyter notebooks. I will authorize access and am presented with different spawner options. JupyterHub lets me select a notebook image that provides the libraries I need for my data science work, and it also offers different deployment sizes and access to specialized hardware such as GPUs.
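As an illustration of how such spawner options are typically described, here is a minimal sketch of a JupyterHub configuration using KubeSpawner profiles. This is not the demo's actual configuration; the image names, sizes, and GPU resource name are assumptions.

```python
# jupyterhub_config.py -- illustrative sketch only; image names, sizes, and
# the GPU resource name are assumptions, not the demo's actual configuration.
c.JupyterHub.spawner_class = 'kubespawner.KubeSpawner'

c.KubeSpawner.profile_list = [
    {
        'display_name': 'Small: data science notebook image',
        'kubespawner_override': {
            'image': 'example.com/odh/notebook-datascience:latest',
            'cpu_limit': 2,
            'mem_limit': '4G',
        },
    },
    {
        'display_name': 'Large with GPU: training workloads',
        'kubespawner_override': {
            'image': 'example.com/odh/notebook-cuda:latest',
            'cpu_limit': 8,
            'mem_limit': '16G',
            'extra_resource_limits': {'nvidia.com/gpu': '1'},
        },
    },
]
```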
This way, data scientists can get the right amount of resources and access to the specialized hardware they need to run their experiments. The spawner also accepts other variables; in this case, it will preload the notebooks we will work in. We will not write our own notebooks here; we will just walk through a few prepared examples. So what I am going to do next is spawn a Jupyter instance.

Once the Jupyter notebook server has started, data scientists can get to work. In this example, we already have some pre-created Jupyter notebooks that I am going to walk through to illustrate the work that goes into building the model. First comes exploratory data analysis, where data scientists look at the shape of the data and try to understand the distributions of fraudulent and legitimate transactions and how they correlate with different characteristics of the data, things like whether a transaction is foreign or, as we have here, the inter-arrival time between transactions. Once they have visualized and understood the data, data scientists move on to identify the characteristics that are relevant for telling fraudulent and legitimate transactions apart, a process known as feature engineering. In this process, they transform the data so that it better suits the algorithms they intend to apply. Finally, once the data is understood and the relevant characteristics have been identified, the data scientists write the code that builds the models and trains them on the existing data. As part of that, they apply further transformations, for example checking whether the data is imbalanced and trying to reduce the impact of that imbalance; we will see a sketch of what that kind of code can look like in a moment. All of this code, written in Python in this example, goes into these notebooks, which codify how the model should be built from the data we have.

Now, once we have developed the model in JupyterHub, we would like to turn it into a containerized service that can serve predictions through an endpoint. Traditionally, this would have been done by handing the work off to an application developer, but we are going to use OpenShift tools to do it for us automatically. Going back to the OpenShift console, this is a build pipeline that does what is needed to turn the code in the notebooks into a running model. It consists of two steps: the first takes the notebooks, extracts the code, and builds a container image containing the model and the serving code; the second takes that image and deploys it as a serverless service that can be queried at an endpoint. So I am going to start this pipeline. You can see the different parameters here: the notebooks, the source repository they come from, and of course some information about the builder image and where the resulting image is going to be deployed. So I am going to start the pipeline.

Once the build process is complete, we can go in and take a look at the logs, for example, and see that everything has worked correctly: the images have been built and pushed to the local image registry. We can also see that a serverless service has been deployed, and we now have a Knative service to which we can connect to make predictions about transactions. We are going to do so from another notebook. You can see that I have already set up the service URL here, and we are going to start the prediction process. This notebook contains functions that wrap the calls to that URL and display the results.
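Before we look at that client code, here is a minimal sketch of the kind of training code the notebooks described above might contain. It is an illustration only: it assumes a pandas DataFrame of transactions with a numeric `fraud` label column and uses balanced class weights to compensate for the imbalance between legitimate and fraudulent transactions; the demo's actual notebooks may use different features, algorithms, and imbalance-handling techniques.

```python
# Illustrative sketch of notebook-style training code; the file name, column
# names, and choice of model are assumptions, not the demo's implementation.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("transactions.csv")            # hypothetical training data
X = df.drop(columns=["fraud"])                  # engineered features
y = df["fraud"]                                 # 1 = fraudulent, 0 = legitimate

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Balanced class weights reduce the impact of the heavy class imbalance
model = RandomForestClassifier(n_estimators=200, class_weight="balanced")
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```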
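And here is a minimal sketch of what the wrapper functions in this last notebook might look like. The URL and the JSON payload format are hypothetical; the actual request format depends on how the prediction service was built.

```python
# Illustrative client-side wrapper; the URL and payload shape are assumptions.
import requests

PREDICTION_URL = "https://fraud-model.example.apps.cluster.local/predict"  # hypothetical

def predict(transaction: dict) -> dict:
    """Send one transaction to the prediction endpoint and return its verdict."""
    response = requests.post(PREDICTION_URL, json=transaction, timeout=10)
    response.raise_for_status()
    return response.json()  # e.g. {"prediction": "fraud", "probability": 0.93}

result = predict({"amount": 211.37, "foreign": True, "interarrival": 42.0})
print(result)
```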
Of course, this could be invoked from any other kind of application; in your case, it would probably be part of a Java business application making these predictions. As you can see, the serverless service starts up and begins making predictions, and we can see the results here. Scrolling towards the bottom, this notebook also contains what we call experiments: ways of exercising the endpoint with different distributions of data so we can see the impact and confirm that the service returns the proportions we expect.

Another benefit of deploying the model this way is that we can collect metrics from it using Prometheus, which is already integrated with OpenShift. We can go to the Prometheus console and see the metrics collected from the model, in this case the distributions of legitimate and fraudulent transactions detected. This also helps data scientists monitor the model: a change in the distribution of the predictions could, for example, indicate model drift. And of course, Prometheus offers the option of alerting automatically when the model has drifted and needs to be retrained.
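To make that concrete, here is a minimal sketch of how a prediction service can expose such per-class counters to Prometheus using the prometheus_client library; the metric name, label, and port are illustrative assumptions, not the ones used in the demo.

```python
# Illustrative sketch of exposing prediction counts to Prometheus;
# the metric name, label, and port are assumptions, not the demo's actual code.
from prometheus_client import Counter, start_http_server

PREDICTIONS = Counter(
    "fraud_model_predictions_total",
    "Number of predictions served, by predicted class",
    ["predicted_class"],
)

start_http_server(8080)  # Prometheus scrapes :8080/metrics

def record_prediction(predicted_class: str) -> None:
    """Call after each prediction with 'fraud' or 'legitimate'."""
    PREDICTIONS.labels(predicted_class=predicted_class).inc()
```

A shift over time in the ratio between these per-class counters is exactly the kind of change in prediction distribution that would prompt a drift alert and a retraining cycle.

And that concludes our demonstration. I would like to thank you for your attention.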