Hello, everyone. Welcome, welcome. This session is about computer vision with convolutional neural networks, TensorFlow and Kubeflow. I'm Dorothea Kaliora, and I'm a software engineer at Arrikto. So, what will you see in this session? Today we will create a Jupyter notebook that uses Kubeflow, TensorFlow, CNNs and transfer learning, to see how we can predict the breed of a dog. Then we will see how to run that notebook as a Kubeflow pipeline. Kubeflow is the platform we will be using. For those who are not familiar with it, Kubeflow is an open source project that makes deployments of machine learning workflows on Kubernetes simple, portable and scalable. Let's see how Kubeflow fits into the stack. This is a more in-depth view of the Kubeflow stack, from the tooling down to the infrastructure. The stack begins with the machine learning tools and frameworks that data scientists are familiar with, such as PyTorch or TensorFlow, which we are going to use today to perform machine learning tasks. Then Kubeflow encapsulates these applications and services to run those machine learning tools on top of Kubernetes clusters. Lastly, Kubernetes runs on prem or on any cloud. Now let's have a closer look at the machine learning workflow as a data scientist would experience it. Data science begins with identifying the problem and collecting and analyzing the data. Then the data scientist has to choose a machine learning algorithm and code their models. Subsequently, they experiment with data and model training. After they have built a good enough model, they can optimize it with hyperparameter tuning. And of course, at the very end, they can serve the model that produced the best result. As we see, Kubeflow provides various components that implement all the aforementioned stages. In this talk, we will focus on Jupyter notebooks, Kale and Kubeflow Pipelines. So how can someone interact with Kubeflow? Let's have a closer look at some key components for this session. Here you can see the Kubeflow central dashboard. It is a graphical user interface to manage notebooks, volumes, snapshots and various other components. Kubeflow also has its own APIs and SDKs that you can interact with. One of them is the Pipelines SDK, which we will see in the next slides. We also saw that Kubeflow integrates with Jupyter notebooks. These notebooks run as containers inside Kubernetes pods and enable users to run web-based development environments inside the Kubernetes cluster. They expose a JupyterLab IDE to the user's browser, used for interactive data science computations. Lastly, they have persistent volumes attached to them, which store installed libraries and data, and enable snapshotting and reproducibility. Another basic component of Kubeflow is Kubeflow Pipelines, as we said. Here is what the Kubeflow Pipelines interface looks like. Kubeflow Pipelines is a platform for building and deploying portable, scalable ML workflows, and it consists of a UI, as you see in this slide, for managing training experiments, jobs and runs, a stable SDK for creating pipelines and components, and lastly, an engine for scheduling multi-step machine learning workflows. We also saw that Kubeflow integrates with Kale. So what is Kale? Kale simplifies the use of Kubeflow, giving data scientists the tools they need to orchestrate end-to-end machine learning workflows.
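For readers who want to see what the Pipelines SDK mentioned above looks like in practice, here is a minimal sketch, assuming the KFP v1 Python SDK (pip install kfp); the step names, parameters and file name are hypothetical, chosen only to show the shape of a multi-step pipeline and how it compiles into a package you can upload through the UI.

    import kfp
    from kfp import dsl
    from kfp.components import create_component_from_func

    def load_data(samples: int) -> str:
        """Hypothetical step: pretend to fetch a dataset and return a reference."""
        return f"dataset-with-{samples}-samples"

    def train_model(dataset: str, epochs: int) -> str:
        """Hypothetical step: pretend to train a model on the dataset."""
        return f"model-trained-on-{dataset}-for-{epochs}-epochs"

    # Turn the plain Python functions into pipeline components.
    load_data_op = create_component_from_func(load_data)
    train_model_op = create_component_from_func(train_model)

    @dsl.pipeline(name="demo-pipeline", description="Two dependent steps.")
    def demo_pipeline(samples: int = 1000, epochs: int = 5):
        data_task = load_data_op(samples)
        # Passing the output of one step into the next creates the dependency
        # edge that shows up as an arrow in the KFP execution graph.
        train_model_op(data_task.output, epochs)

    if __name__ == "__main__":
        # Compile to a YAML package that can be uploaded via the KFP UI.
        kfp.compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")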
Let's talk more about Kale's main features. Kale offers a UI in the form of a JupyterLab extension, as you see in this slide. With Kale, you can enable it and annotate cells within Jupyter notebooks in order to define the pipeline steps, perform hyperparameter tuning, use a GPU, and track metrics. Kale also provides APIs that enable users to create pipeline components and KFP DSL, resolve the dependencies, inject data objects into each step, deploy the data science pipeline, and also serve models. We saw how Kale converts Jupyter notebooks into pipelines, but we haven't yet addressed the importance of pipelines in the machine learning workflow. Well, if you really think about it, when working in a Jupyter notebook, you already follow a workflow that looks like a pipeline. Cells can be viewed as separate steps that depend on each other. Pipelines follow the same logic. They consist of clearly defined steps that can even run in parallel as isolated code execution units. This enables hyperparameter tuning of a notebook and its steps. Also, it becomes very easy, when running pipelines in Kubeflow Pipelines, to apply data versioning and reproducibility to these independent steps. The various cells of your notebook that become independent steps might have different requirements. For example, the training of a deep learning model might need a GPU, whereas another cell might not; see the sketch after this section. All of the above are feasible with Kale and Rok. So, let's talk about how Kale and Rok can dramatically simplify the workflow of building a Kubeflow pipeline. Here is the machine learning development process without these components. As you can see, you need to follow various steps. First, you need to write your machine learning code, then create Docker images, write the KFP DSL code, compile it, then upload the pipeline to KFP, and lastly, run the pipeline. So, what happens if you make a mistake and you need to amend your machine learning code? You would have to start from creating Docker images again with your new code and follow the next steps, as you see here, every time you make a mistake. When you introduce Kale and Rok, this procedure is simplified to three steps. You write your machine learning code, you tag your notebook cells with Kale, and you run the pipeline with a click of a button. If you make a mistake, this time you just edit your notebook. So, this reduces the iteration time by 70%. Now that we have introduced all the basic components, we will talk about Kubeflow as a Service. We will use Kubeflow as a Service to run our dog breed demo. Kubeflow as a Service is the fastest way to deploy Kubeflow. Here, you can see the central dashboard. You can get started with Kubeflow in minutes with just a click of a button, like I'm going to show you now. So, give me a second. Okay, thanks. So, we're going to open a new window and type this URL, kubeflow.arrikto.com. You can sign up here with your credentials and create a new account. I already have an account. So, this is my email. Let me type my password. It's okay, I can change it later. Okay. So, here's the central dashboard that you saw earlier. I already have a deployment, but if you click here, you can create a new Kubeflow deployment just by clicking this button. This procedure will take under 30 minutes. We can click create. You can also watch a video on how to use Kubeflow while you're waiting. So, this is creating. And we can view the ready deployment. Here are the credentials. We have a username and a password.
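As a companion to the point above about steps with different requirements, here is a small sketch, again assuming the KFP v1 SDK, of how a single heavy step can request a GPU and extra memory while the rest of the pipeline keeps default resources; the component bodies are hypothetical placeholders.

    from kfp import dsl
    from kfp.components import create_component_from_func

    def preprocess() -> str:
        """Hypothetical light step: produce some features."""
        return "features"

    def train(features: str) -> str:
        """Hypothetical heavy step: train a deep model on the features."""
        return f"model({features})"

    preprocess_op = create_component_from_func(preprocess)
    train_op = create_component_from_func(train)

    @dsl.pipeline(name="per-step-resources")
    def pipeline():
        features = preprocess_op()            # light step: default resources
        train_task = train_op(features.output)
        train_task.set_gpu_limit("1")         # heavy step: request one GPU
        train_task.set_memory_limit("8Gi")    # and more memory

Because each step runs as its own isolated pod, only the training step is scheduled onto a GPU node; the rest of the pipeline stays on ordinary workers.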
We can copy our password and go to this URL. We will use "user" as the username and this password in order to log in. And here you can see the Kubeflow central dashboard that we saw before, with the various components. For example, we can go to Pipelines, and you can see some ready pipelines that we have prepared. We can go back to the slides. And now I'm going to hand it over to Konstantinos to show you the demo. One, two, one, two. Yeah. So, hello everybody. I'm Konstantinos Andreopoulos. I'm a software engineer at Arrikto, along with Dorothea. And you can call me Costis, you know, later in the questions. It's much easier. Dorothea showed how we can easily create a Kubeflow as a Service instance and access Kubeflow really easily. So, in this part of the presentation, we are going to run a demo for you. We have prepared a Jupyter notebook. We are going to convert it into a pipeline. And you'll see how we will use Kubeflow as a Service to do that. So, let's jump right in and talk specifics about the example itself. So, as the title says, today we're going to talk about the dog breed classification problem. It's based on a really famous Udacity project. Some of you may know it. And the purpose, as the name suggests, is to classify images of dogs according to their breed. So, it accepts any user-defined input image. And you can easily, you know, go ahead and take a picture of your dog, upload it, use our trained models, and get a prediction of your dog's breed back. So, for the purposes of the demo, we are going to use CNNs, Convolutional Neural Networks, and transfer learning. So, let's jump right in and talk a bit about this part of machine learning. So, as I said, Convolutional Neural Networks are the type of deep networks that we are going to use. They are a huge class of artificial neural networks, and they have countless applications in image classification and image analysis. So, an interesting feature of CNNs is that their architecture consists of, let's say, windows that are, in a way, sliding across the image, and they detect different types of interesting features in that image, such as corners, such as edges. So, that leads them to understand the contents of the image, like items that are contained in the image, or dogs, in our case. And the framework that we are going to use in order to implement all this in this demo is TensorFlow. I'm sure you know TensorFlow; we chose it because, well, of course, it's open source, it has a huge range of applications, it's excellent for training and inference of deep neural networks, and it is also natively supported in Kubeflow. So, the first pre-trained CNN that we are going to use is called ResNet-50, and as the name suggests, it's 50 layers deep. It's trained on ImageNet, which has over a million different images and about a thousand categories of different objects contained in these images, such as keyboard, mouse, pencil, and many animals, such as dogs. So, the next one that we are going to use is called VGG16. As the name suggests again, it's 16 layers deep, and it's a CNN with countless applications in image analysis. It was also trained on the ImageNet visual database, and, well, as you probably guessed, it's really, really popular. So, let's jump into the sections of the notebook that we are going to use for this demo. In the beginning, as any data scientist would, we are going to do a couple of downloads and imports, and define some pipeline parameters, because this notebook is going to turn into a pipeline in the end.
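To make the pre-trained model discussion concrete, here is a minimal sketch of loading the ImageNet-trained ResNet-50 through TensorFlow's Keras applications API and classifying a single image. The file name my_dog.jpg is a placeholder, and this uses the stock 1000-class ImageNet head rather than the demo's fine-tuned models.

    import numpy as np
    from tensorflow.keras.applications.resnet50 import (
        ResNet50, preprocess_input, decode_predictions)
    from tensorflow.keras.preprocessing import image

    # Downloads the pre-trained ImageNet weights on first use.
    model = ResNet50(weights="imagenet")

    # Hypothetical input image, resized to the network's expected 224x224 input.
    img = image.load_img("my_dog.jpg", target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

    preds = model.predict(x)
    # decode_predictions maps the 1000 ImageNet class scores to readable
    # labels, many of which are dog breeds.
    print(decode_predictions(preds, top=3)[0])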
We are going to load our data in memory, and then we are going to create three different types of CNNs. The first one we will build with our own bare hands using TensorFlow. The other two are pre-trained, so we are going to use transfer learning on top of them in order to apply them to our specific problem, dog breed classification. Then we are going to test them and see a couple of visualizations in order to see where we are at. So, we can now jump into the demo. Let me change the screen here. Let's go here. So, as Dorothea showed you, we have an instance ready to use, so we are going to go this way, we are going to copy our password, and we are going to connect to it, right? So, let me move that a bit so it's easier. So, as you can see, we have the Kubeflow dashboard here, the one that we saw in the slides. Let's go ahead and create a new notebook server in order to run our examples. So, I am going to go ahead and hit New Notebook. I am going to give it a name; let's call it KubeCon. All right. So, as you can see, I have lots of other notebooks here. We are going to add a couple more gigabytes of volume, let's say 15, and I think we are ready. We are going to click Launch. Essentially, what we are doing now is creating a notebook server that runs a JupyterLab image, but this image not only exposes a JupyterLab UI; this JupyterLab UI is also extended by Kale's lab extension. So, we are going to see how Kale goes over the JupyterLab UI and gives you the opportunity to create a whole bunch of different annotations in order to translate cells into actual pipeline steps. So, we are connecting to it. You see the JupyterLab UI pop up. Yeah. So, this is empty. We need to download our notebook. So, let's get a terminal. Our notebook is hosted on our GitHub organization page. It's under the examples repository. So, I'm going to go ahead and clone it. Let's see how that goes. It goes okay? Yeah. Oh, yeah, I forgot "clone". Thank you. Thanks for the help also. So, if we refresh here, we can see a new examples folder in the file viewer here. We are going to go ahead and pick the dog breed one, and we need this notebook. So, we have prepared this for you. Let me go ahead and download this so that we don't do that later on. And we also need to install a couple of requirements. So, what I've downloaded there is the datasets and some of the pre-trained features of the models that we are going to use to classify dogs, the ResNet-50 and the VGG16 ones that we mentioned before. So, let's go ahead and take a look at the notebook, okay? And see here that we have a new icon on the left. It's called Kale. And we can hit enable, and you see the difference in the UI. Now, Kale gives you the opportunity, as I said before, to annotate notebook cells and define them as pipeline steps or pipeline parameters that you can tune each time you run a new pipeline. Also, you have import cells, and you have a whole bunch of different cell types, actually. So, now that the notebook is annotated, let's go ahead and see what each cell does. So, essentially, in this step we load our dataset. This is not that important. We can go ahead and see what we do with our models. So, you can see here that we are annotating a cell as "CNN from scratch". I mean, this is the cell that defines our newly created CNN model. And you'll see that we are using the Sequential API from TensorFlow in order to create it.
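Here is a minimal sketch of what a CNN built from scratch with TensorFlow's Sequential API can look like. The exact architecture in the demo notebook is not shown in the transcript, and the 133-class output (the number of breeds in the Udacity dog dataset) is an assumption for illustration.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    # A small CNN built by hand with the Sequential API: stacked
    # convolution/pooling blocks followed by a softmax classification head.
    model = models.Sequential([
        layers.Conv2D(16, 3, activation="relu", input_shape=(224, 224, 3)),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dense(133, activation="softmax"),  # assumed 133 dog breeds
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()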
It's not a very deep model, so don't expect really fascinating results. So, we go ahead and compile it, we train it, and then we do a couple of tests. We report test accuracy, and we also create a couple of visualizations for you to see. So, in another step, or rather in another bunch of steps that are all annotated with the same annotation, you can see that we are using the VGG16 model. And, essentially, what we are doing is adding another layer at the end of the model, and this layer is not a pre-trained layer, as the rest of the model is. We are going to use this layer in order to do classification on top of our specific dataset, the dog breed dataset. So, we compile the model, we train it, and the same thing, we do a couple of tests, and a couple of visualizations and predictions. The third model, and the last one, is the ResNet-50 model. We do exactly the same thing as we did with VGG16. And, well, I think that these are the important parts of the notebook. We can go back and see how the downloads are going. Okay, yeah. We have a couple of installations pending. This takes like a minute. So, essentially, what we are going to do now, now that we have set up our environment, installed everything, downloaded our dataset, loaded our models, et cetera, is to hit Compile and Run. And when we do that, Kale will automatically validate the notebook, see if there are any missing dependencies between the steps, and it is going to take a snapshot of the volume. And this snapshot is essentially a picture of the volume, let's say, that we can reproduce whenever we want. And Kale is going to create a new PVC out of that snapshot. And this PVC will find its way into every step of the pipeline that's going to be created. So this way we persist all of our user environment to the pipeline that we are going to create. So it seems like it's working. We can go ahead and view it. And you'll see the execution graph that KFP exposes, with its every stage. The first step is creating a volume. So, because this is going to take a bit of time, we can actually show you, oh, see, next step. We can actually show you a pipeline that we ran, well, yesterday. And you can see here how these three steps are the ones that we use to create our models and train them. And you see how they run in parallel, but they all depend on the same previous step. You see, you know, all the dependencies using the arrows. So let's go ahead and click, you know, on ResNet, and go to the Visualizations tab. And you'll see that we have this HTML window here. So as you can see, the test accuracy of this model is 82% on our test dataset. And we have decided to show a visualization of a specific part of the dataset, this dog. This is a Dalmatian. This is the ground truth. And this is what the model predicted. So we see really, really good results early on. The next model that we are going to see is the VGG16. Again, the Visualizations tab. You can see a whole bunch of visualizations that you can do. Not that good, not that good of an accuracy: it's 72%. But again, it predicted correctly this specific Dalmatian dog. We can do the same thing for our CNN from scratch. And we have some things here too, if we go and scroll right back like this. You'll see that the test accuracy is not that good at this point. I mean, we expected that; it's a really, really not that deep model, so it cannot be compared to ResNet-50 and VGG16.
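Here is a sketch of the transfer-learning idea described above: keep a pre-trained VGG16 convolutional base frozen and train only a newly added classification head on the dog breed data. The input size and the 133-class head are assumptions for illustration, and the training call is commented out because it needs the actual dataset.

    import tensorflow as tf
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    # Pre-trained convolutional base, without the original ImageNet classifier.
    base = VGG16(weights="imagenet", include_top=False,
                 input_shape=(224, 224, 3))
    base.trainable = False  # the pre-trained layers stay fixed

    # Only the new head is trained on the dog breed dataset.
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(133, activation="softmax"),  # the only trainable layer
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=5,
    #           validation_data=(valid_images, valid_labels))

Because only the final layer's weights are updated, this trains in a fraction of the time of the from-scratch CNN while benefiting from features the base already learned on ImageNet.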
So yeah, this is it from the demo. We can actually jump back to the slides. And you know, I want to end this by talking a bit about our community. I would like to conclude by saying that at Arrikto we are really, really passionate open source contributors. We have contributed to many components in Kubeflow. We've contributed to the Jupyter notebook management UI, to Kubeflow Pipelines, to MiniKF, our own opinionated distribution of Kubeflow, and even up to the Linux kernel itself. So a bit more about Kubeflow. Kubeflow, as you may know, is a great community made up of many, many big companies that are actively contributing. And it is a community that is always, always trying to find new members, new contributors who are passionate users. So please feel free to get involved. And here you can find a lot of pointers, from our GitHub organization to our Slack community. We are really, really happy to support you and to onboard you. So without further ado, I would like to say thank you. Here's a couple of pointers, and I hope you enjoyed it. Now, do we have any questions? Can't see any hands up. Anybody have any questions? I have one question. So I found out earlier today that we have quite a few dog lovers in the room. Yeah. Some of these people might have dogs that they need classifying. Can they drop an image into the Slack channel, you classify them, and we come back to that later on? Yeah, yeah. As we said, it's a public example. So if you just go ahead to our GitHub account, download it and run it, you can actually take a picture of your dog and see what the actual breed is. We'll see that later on. Oh, so you do have a question? Okay. First, congratulations on the presentation. Second, from your point of view, is it possible to predict a mixed breed, something like percentage of... Yeah, yeah. A combination of two or three breeds, you say. Well, it actually depends on the dataset that we used. I think that the dataset that we used, although I have not seen all of the targets, is all unique breeds. So if you create a dataset that has different targets, you know, one that extends the one we used and has combinations of breeds, and you also provide pictures of these so that you can train your model, then sure. But I think this one doesn't. We have another question. About the Kale project. It looks very cool, but it looks like kind of a deprecated project. It has just 500 stars on GitHub, and the last commit is, like, from October last year. Yes, yes. So what we showed you here is actually our own private version of Kale that comes with Kubeflow as a Service. So it isn't very in sync with what you see in our upstream repo on GitHub. It has a whole bunch of new features and it's, you know, in a really, really updated state, not the one that you see in the open source. Any more questions? Jessica, do we have any questions in the Slack channel? No, no new questions. Okay, great stuff. Okay, thank you very much.