Okay, welcome everyone. It is 12:05 and we're going to begin. My name is Jonathan Gershardt. I'm at Red Hat, and I'm co-presenting with Boris here from Lightbend on use cases and an introduction to Kubeflow. We'll begin with a short introduction to Kubernetes and Kubeflow, which I will give, and then Boris will dive into the meat of the presentation, which is an example of a product recommender. If you've ever gone to the supermarket, which I'm sure you have, and you buy maybe Cheerios cereal, or diapers for a baby, or canned meat of some kind, then when the receipt prints out it suggests: why don't you go and buy these items? That's an example of what a recommender might do based on your shopping pattern. So let's do a short review of containers. It may be a repeat for some of you, it may be new. Virtualization started, I don't know, 20 years ago, when a group of academics got together and said: well, we have lots of physical servers out there; let's try to reduce that footprint, virtualize them, and share the storage, networking, and compute in software on one or more physical servers using something called a hypervisor. But every VM repeats the guest OS every single time, and that becomes somewhat of a waste, because why do you need an entire operating system hosted repeatedly in every single virtual machine? Let's try to share some of those operating system processes. That's where containers come in: you have one Linux runtime with shared processes, and it spawns off processes for each app running on that one Linux host. Since we've now got lots of containers, or Linux processes, running around, we need something to orchestrate them, and that's what we call Kubernetes. Sometimes people may have heard the abbreviation k8s, pronounced "kates." Does anyone know why k8s is an abbreviation for Kubernetes? Because there are eight letters between the "k" and the "s."
Just a fun fact. So you need to orchestrate all these containers. You need to create them, you need to kill them, you need to respawn them when they die, and they need to be replicated onto other hosts. So you need a way to orchestrate all that. Think of an orchestra conductor. If you've ever been to a classical concert, you have a group of violins, you have cellos, you have oboes, various instruments, and the conductor is bringing them in at certain times and telling them when to drop out, in order to create that overall concert experience. Similarly, the Kubernetes orchestrator brings in certain containers as they're needed and kills others, or, in the case where a container dies, it will respawn and recreate a new one. When you need to scale up, because you're running an application that grows in its usage and you need many more of these containers on other physical or virtual servers, Kubernetes will go out and create more containers, and similarly it will scale down when your usage level drops. So you get to a desired state where everything is humming along just fine. Now we come to machine learning. Machine learning is a hot topic in the industry today: you have example data, you train a model, and then you deploy that model. It's the ability of machines to learn from data. You'll fetch the data, clean it, prepare it, train a model, and deploy the model. There are many examples of this. One of my personal favorites, which is not actually presented today, but I'll share it anyway, is the ability to analyze healthcare data. There are several use cases where you take thoroughly anonymized data, which you can download today from Medicare and other sources, and you say: here are 10,000 patients with these symptoms.
How can we predict which ones will get better, using maybe their particular gene pool and the medication applied to them, and from this, can we learn how to treat patients who come into the doctor's office with similar complaints and similar diagnoses, and come out with a better treatment outcome? So we're training our computer model to understand all of these patient symptoms and the medications that were applied to them; let's even personalize the medicine and look at the gene pool of these patients. Are they of a particular race? A gender? An age group? An ethnicity? A demographic in a more developed country or a less developed one? Take all of that data into account, train the model, and then deploy it, so that a doctor can say: if a patient walks in with these symptoms, have the computer generate a suggested diagnosis based on all that data. But there are various challenges. You have to do the collection, the configuration, the extraction and verification; data can be dirty and has to be cleaned. So we're going to go into more detail on how you can actually do this well. Now, Kubeflow. We have Kubernetes, which we said orchestrates containers, and now we have Kubeflow, which runs machine learning on top of it. I mentioned earlier an example of gathering lots of data and running a model on it. That creates the need to have lots of processes running, to scale them out while you're running your model, and then scale them down when you're no longer running that scenario. That gives birth to a system that can spawn lots of containers, maybe add more physical and virtual machines to run all of those containers, and then scale them down. And that's what Kubeflow allows.
It allows TensorFlow machine learning to run on containers; anywhere you can run containers you can run Kubeflow, which means public cloud, private cloud, virtual machines, or bare metal. These are the components of Kubeflow. I think Boris will go into more detail on them, but there are pipelines, experiments, model serving. Kubeflow is an open source project in the cloud-native ecosystem around the CNCF, the Cloud Native Computing Foundation. The GitHub repository is there; you can contribute to it, you can fork it, learn it, and use it. You'll see TensorFlow on the left, as we described. I think I'm going to skip this one and hand over to Boris. Thanks. So Kubeflow is progressing very fast. If you saw the wonderful presentation about machine learning operations yesterday, it was Kubeflow 0.5; now it's 0.6, so it changed in one day, as most of you heard. There is new extended security, extensions to the pipelines, new metadata components, and a lot of documentation updates. As a point of reference, I started this example on Kubeflow 0.4, and when I was converting it to 0.6 I had to make quite a few changes. So, meet Kubeflow. This is the new look and feel. By the way, you don't have to take pictures; all the slides are in the project that I'll reference at the end, so you can just grab them from there. Kubeflow gives you many of the general operations that machine learning practitioners will do. In our installation I was using OpenShift, which adds a little bit of complexity to the overall solution, but it works just fine, and this is how the installation itself looks on OpenShift.
A new and important addition to Kubeflow is Istio, and there were a couple of excellent presentations on the Istio mesh yesterday, so I'm not going to go into details. But it is an important addition to Kubeflow: not only is Kubeflow using Istio as a service mesh, it's also using it as a security mechanism, because the clever guys from Arrikto integrated Istio with Dex, which allows you to use Istio to enable user access to specific applications based on their permissions. It also allows you to see a lot of information about Kubeflow directly in the Istio console, and this I found the most interesting part, because Istio allows you to do tracing and to build invocation graphs. But again, this is mostly Istio; Kubeflow is just leveraging what Istio has done so far. As Jonathan was saying, our motivating example is a recommender. It's a fairly typical example that a lot of people in machine learning use. It is based on collaborative filtering, which means that if Jonathan and I are buying similar products, we have similar interests, and if I buy a new product, it will be recommended to Jonathan, because we are like identical twins separated at birth. This is based on a rating matrix, where for every user and every item you calculate a preference. So now let's dig into the implementation. We have our example, we have Kubeflow. The first question is: where do you start? You probably need to start with the data storage, and this is very important. When I look at Kubeflow examples, they are typically very specific to where they're running. The majority of the examples assume that you are running in GKE, and the problem you typically hit when trying to port these examples is that they leverage GKE storage. There are quite a few examples, though, that leverage S3. So portability, although theoretically possible, becomes a little bit of a problem.
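To make the collaborative-filtering idea concrete, here is a toy sketch (not the code from the talk): a tiny user-item rating matrix, cosine similarity between users, and a similarity-weighted score for items a user hasn't bought yet. All the names and ratings are invented for illustration.

```python
from math import sqrt

# Toy rating matrix: one row of ratings per user, one column per item
# (0 = not purchased). Everything here is a made-up illustration.
users = ["jonathan", "boris", "alice"]
items = ["cereal", "diapers", "canned_meat", "coffee"]
ratings = {
    "jonathan": [5, 3, 0, 0],
    "boris":    [4, 3, 0, 1],
    "alice":    [0, 0, 5, 4],
}

def cosine(a, b):
    """Cosine similarity between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user, k=1):
    """Score unrated items by similarity-weighted ratings of other users."""
    scores = {}
    for i, item in enumerate(items):
        if ratings[user][i] != 0:
            continue  # already purchased, nothing to recommend
        num = den = 0.0
        for other in users:
            if other == user:
                continue
            sim = cosine(ratings[user], ratings[other])
            num += sim * ratings[other][i]
            den += sim
        scores[item] = num / den if den else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("jonathan"))  # -> ['coffee'] (Boris bought coffee, Jonathan didn't)
```

Real systems factor the rating matrix instead of comparing raw rows, but the "similar users imply similar interests" intuition is the same.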
To solve this problem, Kubeflow leverages MinIO, and MinIO can be used in two different ways. It can be used as storage itself, or it can be used as a conduit, implementing S3 APIs on top of a different storage mechanism. This becomes very convenient if you are trying to build something portable. So now that we've solved the problem of storage, the next question is: how do we start? Kubeflow provides multiple options for both machine learning and model serving. TensorFlow was the original one, but it is quickly expanding to MXNet, PyTorch, and Chainer, and for model serving there is TensorFlow Serving, Seldon, TensorRT, and a lot of others coming up. For what I'm showing here, I'll be using TensorFlow, the old but good one, for everything that I do. Now that I've decided what I want to use, I have to figure out which tools to start from. I'm not a data scientist, but I have quite a few friends who are data scientists, and they swear by notebooks, Jupyter notebooks. They typically start everything they do in Jupyter notebooks. And Kubeflow, especially in version 0.6, allows you to start multiple Jupyter servers. In the interest of time I'm not going to repeat the details; the presentation yesterday from Microsoft showed step by step how to create a Jupyter server. So you have the Jupyter server, and now you can actually start writing your code. At this point I will just show you how it looks. You can see it, more or less: this is the code for the recommender, written in a Jupyter notebook. It does everything that we need. Wonderful, now we have code that we're pretty happy with. In the Jupyter notebook we experimented with the datasets. What do we do next? Of course, we can manually run Jupyter notebooks once in a while, but nobody wants to do that, because it's basically manual labor.
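The "MinIO as an S3 conduit" idea boils down to pointing a standard S3 client at the MinIO endpoint instead of AWS. Here is a minimal sketch of that configuration; the endpoint, bucket, and credentials are placeholders, and the boto3 usage is shown in a comment because it requires a running MinIO service.

```python
def minio_s3_client_kwargs(endpoint, access_key, secret_key):
    """Keyword arguments for an S3 client that talks to MinIO instead of AWS.

    With boto3 installed and MinIO running, you would use them like:
        import boto3
        s3 = boto3.client("s3", **minio_s3_client_kwargs(...))
        s3.upload_file("saved_model.pb", "models", "recommender/1/saved_model.pb")
    """
    return {
        "endpoint_url": endpoint,              # e.g. the in-cluster MinIO service
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }

# Placeholder values; real deployments keep the keys in a Kubernetes secret.
kwargs = minio_s3_client_kwargs("http://minio-service:9000", "minio", "minio123")
print(sorted(kwargs))
```

The key point is `endpoint_url`: everything else in your S3-based code stays unchanged, which is exactly what makes the examples portable across storage backends.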
So in order to automate execution, one of the things that Kubeflow provides is... Are you trying to ask a question, or are you just scratching yourself? Sorry. I mean, feel free to ask questions. We don't have a lot of time, but I will try my best to answer all of them, even if it's after the presentation. So once we have the code: notebooks have a wonderful feature that allows you to easily export the code as Python. Once you export the code as Python, you can convert your implementation to a TFJob. A TFJob allows you, again, I'm repeating what was said yesterday, to run TensorFlow machine learning in a distributed fashion. In my example I was doing something very simple: a singleton. To do this, you export the notebook as Python code, you build a Docker image, which is fairly simple and straightforward, and in the latest version, 0.6, you basically have to write a YAML file that describes everything you need to run it automatically. Once you do this, you can use the facility provided by Kubeflow to just run these TFJobs. You can see in the screenshot here that the job was running, and at the end of the day it has executed. And you can look at the logs right there: you don't have to go to the pod and try to figure out where the pod was running; you have a convenient screen inside the Kubeflow UI. Now we have everything running and the model is built. Now we need to use it somehow. One of the model serving options is TensorFlow Serving; this is basically how TensorFlow Serving works. To do this, if you need access to MinIO, you need to create a secret. And as I said, Kubeflow is evolving very fast: in 0.4 and 0.5 there was support for TensorFlow Serving, but so far it hasn't been moved to 0.6. So I just wrote my own Helm chart doing things similar to 0.4 and 0.5. Once I have it running, I can validate it with curl to make sure that everything works, and now I can start using it.
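For reference, a single-worker ("singleton") TFJob manifest has roughly the shape below; it is built here as a Python dict so the structure is easy to see. The job name and image are hypothetical placeholders, and note that the `apiVersion` of the TFJob CRD has varied across Kubeflow releases, so check the version your cluster actually serves.

```python
# Sketch of a minimal single-worker TFJob manifest for the Kubeflow
# training operator. Names and image are hypothetical placeholders.
tfjob = {
    "apiVersion": "kubeflow.org/v1",      # varies by Kubeflow release
    "kind": "TFJob",
    "metadata": {"name": "recommender-train"},
    "spec": {
        "tfReplicaSpecs": {
            "Worker": {
                "replicas": 1,            # the "singleton" case from the talk
                "restartPolicy": "Never",
                "template": {
                    "spec": {
                        "containers": [{
                            # the operator looks for a container named "tensorflow"
                            "name": "tensorflow",
                            "image": "example.registry/recommender:latest",
                            "command": ["python", "recommender.py"],
                        }]
                    }
                },
            }
        }
    },
}

print(tfjob["kind"], tfjob["spec"]["tfReplicaSpecs"]["Worker"]["replicas"])
```

Serialized to YAML, this is the file you `kubectl apply` (or submit through the Kubeflow UI); scaling out the training is mostly a matter of raising `replicas` and adding PS/Chief replica specs.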
One thing about TensorFlow Serving is that it is HTTP or gRPC request-reply, while typically your data is streaming data. So you need to build a little bit more software, which reads from Kafka, invokes the service, and then writes the results back to Kafka. And here is a completely shameless plug: the company I work for, Lightbend, introduced a product called Pipelines that makes it easier to write, deploy, and use Kafka-based applications. But going back: at the point when we were clapping and saying that we've solved the problem, there is one little thing that creeps up, and that is concept drift. If you think about it, a recommender is based on purchasing history, and guess what, your purchasing history changes over time, which means that recommendations computed six months ago might not be valid anymore. So what you really want is continuous model updates, which means that, in the most simplistic case, on a schedule, let's say every month, you rerun the training and use the new model. Well, part of the problem is that TensorFlow Serving is not very good at this. What happens is that the moment you update the model, you create a new version, but it takes TensorFlow Serving some time to switch between the old version and the new version. And although technically it guarantees that there will be no downtime, I don't trust them. So one of the things we've implemented for this example is a very simple dual serving approach, where you have two TensorFlow Serving instances, and every time you update the model, you switch which one you're using. This becomes a fairly safe bet, and it means that our serving client now has to take an additional input telling it which URL to talk to. To complete the example, I also had to implement some additional custom components that allow me to do the full end-to-end. Now, here comes the most important thing.
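To make the dual-serving idea concrete, here is a small sketch (not the talk's actual code): a router that flips between two serving URLs on each model update, and a scoring loop where the record source, the HTTP call, and the result sink are passed in as plain functions, so the same logic could be wired to a Kafka consumer/producer and a TensorFlow Serving REST endpoint. All URLs and names are hypothetical.

```python
class DualServingRouter:
    """Flips between two model-serving endpoints on each model update."""

    def __init__(self, url_a, url_b):
        self.urls = [url_a, url_b]
        self.active = 0  # index of the instance currently serving traffic

    def active_url(self):
        return self.urls[self.active]

    def standby_url(self):
        """Where the *new* model should be deployed before switching."""
        return self.urls[1 - self.active]

    def model_updated(self):
        """The standby instance has loaded the new model; route traffic to it."""
        self.active = 1 - self.active


def score_stream(records, predict, publish, router):
    """Score each incoming record against the active endpoint, publish results.

    `records` is any iterable (in production, a Kafka consumer),
    `predict(url, record)` wraps the request to the serving endpoint,
    `publish(result)` wraps the Kafka producer.
    """
    for record in records:
        publish(predict(router.active_url(), record))


# Usage with stubs standing in for Kafka and TensorFlow Serving:
router = DualServingRouter("http://tf-serving-a:8501", "http://tf-serving-b:8501")
out = []
score_stream([1, 2], lambda url, r: (url, r * 10), out.append, router)
router.model_updated()              # retraining finished, flip instances
print(router.active_url())          # -> http://tf-serving-b:8501
```

Because the switch is an explicit, atomic flip on the client side, you never depend on TensorFlow Serving's internal version swap to be gap-free.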
So, in order to implement our complete example, we need to update the data, we need to retrain the model, and we need to switch our server to the new model. Of course, we can do it manually, but who has the time? One of the things that is great in Kubeflow is Kubeflow Pipelines, which allows us to automate this. Kubeflow Pipelines lets you leverage either Argo Workflows or the latest thing they've introduced, an Apache Airflow implementation. In what I was doing, I was leveraging Argo, because I happen to know it, so it was much easier. One of the facilities that is very nice here is the fact that you can design your pipeline directly in the notebook. I'm not going to go through the details of the code, but if you look at the pipeline definition, it is fairly simple and straightforward: you basically define the steps and you define the sequencing, so you can say that step A can be executed only after step B. The next thing you do is very similar to what we did with TensorFlow Serving and model creation: you export the notebook as Python code, and now you can run the pipeline based on the execution of your local Python code. And because we have about five minutes left... first of all, I'm afraid this is the last presentation before lunch, so you guys are hungry and I can't run over time. The second thing is I want to give you some time to ask questions. So this is how it all fits together. If you get anything from this presentation, this is probably the most important slide. You first decide on the common shared data storage. You use Jupyter as your exploration tool and as a tool for creating your workflows. You export all of this and use TFJobs for running the training jobs. You use TensorFlow Serving to do the serving; I've added pieces to switch between the servers. And you use a Kubeflow pipeline to orchestrate the whole execution. You can try it yourself.
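The "step A only after step B" sequencing can be sketched in a few lines of plain Python. This is only the scheduling idea, with hypothetical step names standing in for the pipeline's components; in Kubeflow Pipelines itself you would express the same dependencies with the kfp SDK (roughly `train.after(update_data)`) rather than run them in-process like this.

```python
def run_pipeline(steps, deps):
    """Run steps so that each executes only after all of its dependencies.

    `steps` maps a step name to a callable; `deps` maps a step name to the
    names it must wait for. Returns the order in which steps executed.
    (A sketch: no cycle detection, no parallelism, no retries.)
    """
    order, done = [], set()

    def run(name):
        if name in done:
            return
        for d in deps.get(name, []):
            run(d)                  # execute dependencies first
        steps[name]()
        done.add(name)
        order.append(name)

    for name in steps:
        run(name)
    return order


# Hypothetical steps mirroring the talk's pipeline:
log = []
steps = {
    "update_data":  lambda: log.append("update_data"),
    "train_model":  lambda: log.append("train_model"),
    "switch_model": lambda: log.append("switch_model"),
}
deps = {"train_model": ["update_data"], "switch_model": ["train_model"]}
print(run_pipeline(steps, deps))  # -> ['update_data', 'train_model', 'switch_model']
```

Run on a schedule (say, monthly), this chain is exactly the continuous-update loop described above: refresh data, retrain, flip the serving instance.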
The full code, including some notes on the Kubeflow setup for OpenShift, is in this GitHub project. The presentation itself is also in the GitHub project, and the project contains the complete code. So technically, if you're taking pictures, that's the only picture that you need. Too fast. Thank you very much. You've been a great audience. Any questions that we can try to answer? Yeah. Excuse me. That was the only question. Either everything is very clear, or I confused you even more. Oh, one more thing while you are thinking about questions. There is a gentleman sitting there in the back. He paid me 20 bucks to introduce him, because he's doing another Kubeflow presentation later today: Trevor Grant. So come listen to his presentation. Okay. Yeah. I need some exercise. I'm just curious about the considerations in deciding on the storage service you actually displayed. What is the benefit, and what are the trade-offs, of that particular storage service versus something else you could use with a TensorFlow data pipeline? Yeah, that's a great question. TensorFlow, by default, only supports local disk, S3, or of course Google Cloud Storage. So one of the considerations I had, the reason I like MinIO, and they didn't pay me to say this, is the fact that you can use MinIO as a generic API on top of the data storage that you already have. You pay a little bit of performance overhead, but it's not really something prohibitive. Otherwise you have the problem that if you decide to switch clouds, and people for some reason like to talk about that, you will have to move all of your data from one storage to another. Another option: I'm here with Red Hat, and they have GlusterFS, which you can run on the cluster itself, and you can use MinIO to provide S3 APIs on top of GlusterFS. You can do the same thing with Ceph. You're too fast for me. Anything else? Yeah. Live video? I probably wouldn't do it.
TensorFlow Serving has a lot of interesting options. It has batching, and that is probably what I would use if I were in a situation like that, because doing real-time video is scary. Yeah. Thank you very much again, then.