Thank you very much, Anand. Before we start presenting, we would like to share one thing with you that will be used during the workshop. I will share it with you in the chat: it's an open spreadsheet, so you should be able to edit it without problems. There are basically two sheets in it. The first one is just for tracking your tasks, so we know where we are and where you are; if you have any problem, you can tell us what the problem is and which task you are on, and we can continue or try to solve it. There are two requirements for the workshop, and I hope everyone has them: the first is to have a GitHub account, and the second is to have a token from GitHub. If you don't know how to create one, I will share a link about it in a moment, or someone can share it in the chat. Meanwhile, you can start putting your name or GitHub account in the spreadsheet, so we can speed up some of the steps, and please let me know if you don't know how to create a GitHub token. Oh, thank you, Pat, for sharing the link instead of me, so I don't need to share it again. When you're ready, the first two things we would like you to do are to enter the tutorial repo, which is this one, and to fork it. Once you have the GitHub account and the token, you can fork the repo and copy the URL of your fork; we're going to use it in the Meteor platform. You will learn what Meteor is in a moment, after the presentation, but to speed up the process we are going to do this immediately. This is the link to Meteor: you just enter the link, paste the URL of your fork, and let the magic happen. Tom will also explain what happened in a moment. If you have any problem, or there are still some things you don't know how to do, please let us know; otherwise, we can start with the presentation. Please fill in the spreadsheet when you're ready. The second sheet, just to tell you right now, has all the links and all the things you will need in the workshop, to make things easy for you: here you have the tutorial repo, for example, and the link to Meteor; the rest will come when we start the workshop. So thank you very much, and I think we can start with the presentation. Vasek, let me know when I can move the slides.

Yeah, you can move the slides right now. So I'm going to talk a bit about Open Data Hub, which is the part of this whole workshop where you are going to be doing the AI. Obviously, it's on OpenShift. Francesco, if you can give me the next slide. I will first try to tell a story. And sorry for the orange color; I couldn't change it to red to match the rest of the presentation, as these are slides I created before in a different template. Obviously, technology always works, just not for me. So I would like to tell you this story. I guess most of us started our IT, software engineering, or data science journey, whatever journey you are on, on our own machine: you got a machine, or you bought one.
It was, or it is, sitting in front of you, and it's great: you can do whatever you want with it and configure it the way you want. And that's how most of the people we talk to are working on AI, machine learning, and data science these days. If we talk about big companies, they go to their IT and say: I'm doing this data science project and I would like to get a machine, meaning this amount of RAM and this amount of CPU power. They get some machine and they work on it, and then their dataset grows. It gets bigger and bigger, and they need more memory and more resources. They also need to set up their machine: install a new version of the software, figure out which version works with which other version, things like that. And that's kind of where we are now, or where we were, let's say, a couple of months back. The other issue is that when you finish something, you want to show it to others, and now it's sitting on your machine: you have to figure out how to share it, how to move it to other people, how to deploy it somewhere else. I already mentioned that you will ask for more memory and more CPU, and it's not that easy. If it's a laptop, there is probably not much to do. If it's a workstation, maybe you can buy more memory or a better CPU, but it's not going to happen in minutes or hours; it's probably going to be a week-long process. And we also get into "works on my machine" issues. We have all experienced that: I'm developing something, it works great, I give it to someone else to try out, and it just doesn't work, because something is different.

Francesco, if I can get the next slide. So there is a second option, and that's shared infrastructure. We've been seeing the move to shared infrastructure for a long time for services. A lot of people are no longer managing their own small server for a lot of things; they just go and buy a service, or they buy a larger set of machines, some kind of cluster, where they run everything and share it with other people. And this is the same for data science and AI/ML, right? The benefits are what I said before, but in the opposite direction. It's much easier to collaborate: if it's already on shared infrastructure, it should be much easier to give Tom access and say, hey Tom, can you look at this? I think it works, but can you check some things? There is a much easier way to reallocate resources: the resources are already in the infrastructure, so project A, which was popular last week, can be downsized and get less memory, and project B, which is very important this week, can get more memory, more resources, more CPU. Or it can be based on demand, depending on what you need. And the "works on my machine" problem kind of goes away, because if we are all on shared infrastructure with very similar tools, using the same tools for deployment and development, hopefully it's very similar to production. We are trying to get close to production approaches and processes for development, so that if we deploy to production or anywhere else, it's going to work.
And we don't have to solve a ton of issues just because it's a different machine, right? Why am I talking about this? We are talking about AI on OpenShift, and OpenShift is kind of this picture: a shared infrastructure where a lot of people can come and use the resources assigned to them, or assigned to groups they are part of, things like that. That's kind of normal now for developing web services, APIs, databases, and all these things, but it may not be that obvious for AI and machine learning. Francesco, go to the next slide, please.

The reason we are talking about this is that we have been working on Open Data Hub for the last, I don't know, three years probably. It started as an internal project at Red Hat. We needed something like it for our own needs, because a lot of data is produced every single day at Red Hat, and nobody was able to collect it in a single place or process it; there were no tools, no standardized approaches for how to process the data. So we started building Open Data Hub. The core component is JupyterHub, which we will see later. What else is it? We call it a blueprint, because when people started to call it a reference architecture, that often means something different from what Open Data Hub is. A reference architecture often means something set in stone: this is how you should use things, how they should be tied together. Open Data Hub tries to be flexible and give you the freedom to try things your way rather than prescribing how things should be done, while still letting you use the defaults as configured in Open Data Hub, or change them.

So what is Open Data Hub? It's an AI-as-a-service platform on top of OpenShift. It is an open source project, so we have an open source community: all the code lives on GitHub, and we have community meetings, public mailing lists, all these things. It's also a meta operator. Probably not everyone is familiar with operators; you might hear more about them during the workshop, but basically an operator is a way to encode operational knowledge into software so that you don't have to do all the steps manually, and the software takes care of the operations for you. An operator normally operates one single application that it understands well. A meta operator operates operators: we deploy various operators that in turn deploy user applications like JupyterHub or Spark or whatever else. Open Data Hub is also production ready. It is deployed as part of Red Hat infrastructure, it is used by many Red Hatters, and it's now also being used by Red Hat customers and some enthusiasts at universities and the like. Next slide, please.

If we dig a bit deeper into Open Data Hub, the idea is to cover the whole data science flow. You need to be able to store your data, so we work with the Ceph project and the teams around it. Then you need to be able to transform and process the data and create a model, so we have JupyterHub, which allows us to create Jupyter notebooks, use various libraries like TensorFlow, PyTorch, PySpark, all these things, and work with the data stored somewhere. Then there is the Kubeflow box here; it's just thrown in there. Kubeflow is our upstream: we use the Kubeflow operator as the way to deploy things.
We use the same structure for our deployment manifests and all these things, and you can also use components from Kubeflow in Open Data Hub; you can mix and match them pretty well. So if you look into Kubeflow, which is another AI/ML project on Kubernetes and OpenShift, you can see a lot of interesting and useful projects, and you can use them as part of the workflow. Then you want to deploy your model. You can use pure OpenShift for that and just build a container and deploy your model, or you can use things like Seldon or KFServing, where you get automatically built APIs on top of it, metrics, and all these things. Speaking about metrics, we also need to monitor what we deploy and review whether it still works or not, so we have Grafana and Prometheus as part of Open Data Hub. And then all this new information needs to be stored somewhere, so we tie the whole thing back to the storage, to a kind of central distributed storage, if that term makes sense. I've already said that Open Data Hub is there to be used as it is, or extended, or modified. Basically, the idea is that you can deploy it on OpenShift with three or four clicks and start using it. If you find that something is not right for you, you can take it out, change the configuration, or even bring your own components by following the approach we have there. I think Tom can talk about that more with the Operate First initiative, because they are doing exactly that. Next slide, please. I'm not sure if there is more. Right.

So this was a very brief introduction to Open Data Hub. If you want to learn more, you can go to opendatahub.io. We have community meetings on Mondays; I'm not sure if we changed the schedule, so maybe it's no longer regular every other Monday; check the calendar, I guess. All the repositories are on GitHub, as I said. We have plenty of presentations and demos recorded on the website: go there, go to docs, and there is a video and presentation section. And as I mentioned, there is documentation and examples for how to use it and how to try it. I guess that's all from me now. I'll stick around; if you have questions, paste them into the chat and I'll try to answer them.

Thank you, Vasek. So we can move to the next part. Please, Tom, go ahead.

Thank you. Hello, everybody. I didn't have a chance to introduce myself before: I'm Tom, part of the AICoE, the AI Center of Excellence at Red Hat, and one of the initiatives we work on is called Operate First. Vasek already kind of hinted at what this initiative is about, so let me continue with the story time that Vasek started. On the next slide (Francesco, thank you), we're going to jump into a different story: a story of operating workloads. We need a bit of history for that. This is not a history lesson, all right? This is a workshop, so I'm going to keep it very brief. Just a few years ago, before we had things like open source software, the code was what brought value to companies, what made companies successful. The code was proprietary, and that's how it was developed. Then, on the next slide, something called open source was brought to light, and at that point we got software with its source code available. And at that point, it became just as important how you operate the code as how you can hack on the code.
So now, once you have access to the source code, the differentiator, the factor behind how companies set themselves apart, is in the operations: how you handle the applications, how you deploy them, how you manage them. At this point in history, we're seeing a kind of balance. On the next slide: this is where we are now. With cloud computing, as Vasek told you before, we are seeing a shift from local infrastructure to shared infrastructure. And shared infrastructure is something that is usually just consumed by users; it's not something you usually get to hack on or try to operate yourself. The operations themselves became more valuable as an asset, as knowledge, as intellectual property for companies than the actual application code. The operations, as you can see with public cloud providers, are usually closed source, and you don't have access to the knowledge these companies acquire in operating and managing the workloads we all know. Imagine software like Grafana, or, for example, software-defined storage and whatnot: you know there's a project called this and that, it's probably open source, so you can look at the source code behind this public cloud offering, but how are you going to deploy it, and how are you going to scale it massively to handle thousands and hundreds of thousands of users per minute or hour? You don't know that. This knowledge is proprietary now.

So, on the next slide, this is what Operate First would like to solve. We would like to level this out and balance open source code and operations by making operations open source as well. That brings cooperation on operational experience into software development, bringing developers closer to SRE folks: the people who operate applications, test them, and work on deployment and scaling. This is the concept we're trying to implement: we are running operations in an open source way, collaborating in an open manner, and managing cloud infrastructure using open principles.

On the next slide, you can see a funnel graph of contributions, a scale from users to contributors. This is also a very important aspect of operating, and it basically illustrates where the problem with operating software is. If you have open source software, it's very easy to figure out how to solve a problem: the journey from "I have this problem" to "here's the fix for it" is pretty straightforward, and it's possible for you to do it yourself. But when you have software as a service, and there is some entity outside of you that manages the software, how do you know how to contribute, how to fix your issues, how to deploy your fixes? It's not possible currently. This is something we try to solve with Operate First, and we will be using Operate First infrastructure in this workshop: we will be working on an OpenShift cluster provided by Operate First, and everything running on that cluster was set up using this open idea of operations. Next slide, please. So, just to go back to Operate First: it is trying to solve a problem we see between open source code and operations in a cloud world, and we have a full community going around this initiative.
On the next slide, you can see a couple of links for how to join us and contribute to this community, and you will see whether this approach actually works or not. We're trying to prove that it does, so we will be glad if you join us there. There are also other talks on Operate First during DevConf, so feel free to join those as well; Marcel Hild will be talking about some interesting Operate First work there too. So this was a short introduction to Operate First, and back over to you, Francesco.

Thank you very much, Tom. Before we go ahead with Project Thoth, I wanted to check if there are any questions for Vasek or Tom, or if you have any problems with the first task I gave you. I think there is one question in chat. Yes, one of the users was not able to start their video. Okay, I see Tom is helping them, so we can continue.

So let me start talking about Project Thoth. This will also be a very short introduction. What is Project Thoth? You heard Vasek and Tom talking about Operate First and ODH, and one important concept was mentioned: if I want to share my code and allow others to rerun it without any issues, we need a way to hand this code over in the best and most secure way. This should allow reproducibility and shareability for everything you do with your code, whether it's Python code or a Jupyter notebook. Everything you use, you should be able to just give to someone else, and if they have all the information about the dependencies, the runtime environment, and so on, they should be able to run it again without any issues.

Project Thoth has three main goals, I would say. The first is to help developers select dependencies. As you know, when you start working on a project, one of the first things to do is select dependencies. If I am a data scientist and I want to start a project in my notebook, I need some dependencies; for example, I need TensorFlow. If I run pip install tensorflow, that works, of course, but if I give this notebook to someone else and they try to run it again in one month, something might not work, because maybe there is a new release of TensorFlow that breaks the notebook. Even if you state the specific version of TensorFlow you use, TensorFlow itself depends on other packages, which are called transitive dependencies. One of them, for example, is NumPy. If any of these versions change, and the main dependency does not pin all of them to specific versions, you might have a problem: it will work on one machine and not on another. What we want to do is help developers with this task, the management of dependencies, and give them more degrees of freedom. Choosing dependencies is not just about wanting the latest version: maybe I'm interested in performance, maybe I'm interested in security. I might want a software stack with no vulnerabilities, no CVEs. Or I want performance because I need to train my model, so I want to know which version of TensorFlow actually gives the best performance, and that does not just mean the TensorFlow version itself, but also all the dependencies behind TensorFlow.
NumPy, for example, has many different versions, and each of them can impact your performance too. This is what Project Thoth tries to do. There is a service that actually uses reinforcement learning: we learn about all the dependencies and give recommendations to developers based on their requirements. If you are interested in performance, we will give you a software stack focused on performance. I will go a little deeper in a moment.

The other two goals are more related to images. One is that we want to deliver optimized images. As you can imagine, if you have a specific application for computer vision or natural language processing, there are different stacks, and all these stacks can be optimized, not just at the level of the software stack itself, but across all the layers below your code. We're talking about the Python ecosystem, so you have the Python interpreter, the runtime environment below it, the operating system you are using, and the CPU or GPU, so the hardware is also something that can affect things. Thoth takes all these inputs into account and can provide you with a software stack that runs on a specific runtime environment and meets specific requirements in terms of performance, security, or whatever your interest is. The last goal is that we want to automate all of this. We don't want developers to take care of dependencies; we want bots to automate this and pipelines to create images for them, so they can focus on their specific project and problem and don't need to think about these things.

Just to give you a view of the knowledge we have and how we can provide these recommendations: we have build-time and runtime environment information, so we have specific runtime environments that we use to install packages, and we learn whether a package can be installed and run on a certain machine. We have, of course, all the dependencies. We run performance measurements for specific stacks, for example TensorFlow, PyTorch, the ones used for ML models. We have application binary interface data, to know whether that code is actually going to run. For security, as I mentioned before, we have CVE data and analyzers for vulnerabilities. And we also take into account source code meta-information. When you use an open source project, some communities are, let's say, well established: the project is well maintained, there is a history behind it, and there is a good chance the project will keep going. But some projects are not like this. So what we do with this meta-information is analyze open source projects and gather information about the level of maintenance, the level of security, whether there is a community behind it, and whether they follow specific policies. We provide this information to users, so they know that maybe this is not the best library to use if the code is going to move to production one day and the project might just disappear. All this information is stored in the Thoth knowledge graph, and the recommendations are what I already mentioned before.
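To tie the dependency discussion together, here is a minimal sketch of the transitive-dependency problem described above; the version numbers in the comments are made up for illustration, and it assumes TensorFlow is installed:

```python
# Why pinning only the direct dependency is fragile: TensorFlow pulls in
# transitive dependencies such as NumPy, and an unpinned transitive release
# can break a notebook even though the direct pin never changed.
import numpy as np
import tensorflow as tf

print("tensorflow:", tf.__version__)  # e.g. 2.4.1 on both machines, pinned
print("numpy:", np.__version__)       # e.g. 1.19.5 here, 1.20.x elsewhere

# A full lock file (a Pipfile.lock produced by Thoth or Pipenv) records every
# transitive version plus artifact hashes, so a second machine resolves
# exactly the same stack instead of whatever happens to be newest that day.
```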
So there are different types of recommendations, depending on your requirements. Regarding integrations, of course, we want to integrate with most day-to-day developer tools. We have a CLI that you can easily install with pip install and then just run thamos advise on your software stack. We have the jupyterlab-requirements integration for data scientists who use JupyterLab and Jupyter tools; this is what we will actually use today during the workshop. We have an integration for GitHub repos: there is a bot that you can install from the GitHub marketplace, and this bot will watch your repository and can provide you with updates on the dependencies. If there are any issues with your dependencies, they will be solved; if a new CVE comes up and Thoth learns about it, the bot will immediately open a pull request for you to update the dependencies. We also have source-to-image: if you are familiar with container builders, Thoth is integrated into this kind of tool as well. This is important, for example, for pipelines. If you want to make sure that no security vulnerabilities get into the code that goes to production, you can set up pipelines to build with Thoth as a service, and the build will fail if there is a security issue in your dependencies, if that is what you care about. And we also have pipelines that optimize the builds and allow you to reproduce all your code, because thanks to Thoth and the configuration you provide, we can state specifically what is to be used in your code. This is just a summary of it. If you're interested in Project Thoth, we have a YouTube channel, there is a GitHub repo where you can open issues, there is a website, and Twitter. On YouTube you will find all the projects we are working on, and if you want to know in more detail what the Thoth service is and how we provide the recommendations, there is a lot of material there. And with that, I close the Project Thoth part. We can move to the last, I think, two slides; it's not going to be long, I guess. Tom, if you want to continue with Project Meteor, then we can move to the workshop.

Sure. Next slide, please. So you've heard about Project Thoth, about Operate First, about Open Data Hub. We have quite a few data scientists in our team and in our organization who are using all these components, all these projects, all these initiatives. And we found out that, as is usual in every team, every organization, every company, we've built a lot of tooling around those applications, around those frameworks. At some point we realized we should share this tooling. Don't get me wrong, the tooling was open source, still is open source, and all of it is accessible. But again, reiterating the Operate First principles: we made the code available for you to consume and deploy, but how do you interface with those applications? How do you use them? That was the issue we found we needed to solve once again, and that brought Project Meteor to light. It's a project that tries to bundle all the tools we have into a nice package that is easy to consume. And you probably remember from before, as Vasek was talking about ODH, Open Data Hub, and operators.
So Project Meteor is trying to package this knowledge, this tooling, this operational knowledge about the various data science aspects of the workflow into a nice package that can also be operationalized, operated, and automated. The Meteor website is the user interface for Project Meteor; it's the site you accessed at the beginning of this workshop. It is trying to prove that we can enable every user on the Internet, anywhere in the world, to use the Thoth Station projects, the Thoth advise service, thamos, and whatever other tools Project Thoth provides; that we can make it possible for you to use the CI tooling that we have; and that you can have something integrated with Open Data Hub and use a very simple, easy-to-use interface to bring your own workloads into Open Data Hub. So, next slide, please.

What we do in Project Meteor is consume a GitHub URL, a URL to any repository on GitHub. We initiate a couple of Tekton pipelines on top of it that consume this repository and build different artifacts out of it. One of them can be a Jupyter Book website that is deployed as part of our infrastructure, so you will be presented with a Jupyter Book site with statically rendered content from your repository, showcasing your data science findings in a consumable manner, so you can share your work with your colleagues and teammates, and they can view what kind of analysis you did. And if they are very interested in your analysis, we have another pipeline running on top of the same repository that creates a different container image. This container image is capable of being spawned in JupyterHub, which is a component of Open Data Hub. This gives you an interactive way to work with the notebooks in any GitHub repository on shared infrastructure; depending on the cluster where we're running this, the clusters can have GPUs plugged in, huge memory pools, and whatnot. So right now, this is a very easy way, by pasting a URL somewhere, to get access to shared infrastructure with cloud computing capabilities and cloud computing scale, basically. And if you pass a URL to Meteor, the resulting image is publicly consumable by any other Meteor user, so anyone can go and spin up this image in JupyterHub and work with it interactively, as we will see in the workshop in just a couple of minutes.

That is the idea behind Project Meteor. We're still in the very early stages of this project. We will be extending it with other tools that we have developed in the AICoE or in Thoth Station, and anybody is welcome to integrate their tooling into this cloud computing user interface for data science. Next slide, please. So please feel free to visit our repository, file an issue, or spark a discussion. You can also join us on the Operate First Slack and in the Operate First community; it's all under the AICoE umbrella. I think that's about it. Let's get to showing things and actually doing the workshop. Francesco, back to you.

Thank you, Tom. So I hope you have started to enter the URL into Meteor as Tom just described. As he mentioned, there are two images that will be created, so two links will appear as available.
Now I can open my environment in JupyterHub, and I can also open the web page of the Jupyter Book that describes all the steps we're going to follow. Let me check the status of where we all are; I see some of you are already ahead of me, which is also great. Okay, so first things first. I see that some images are not ready yet, but if your image is not ready, you can just go to the main page. As Tom mentioned, all these images are available on JupyterHub, and you can reach all of them here, so you can pick any of the images that are already finished if you want, and we can go ahead like that. You can use mine, for example, if you want; that's okay. In this way, we are all on the same page and can go ahead from here. Once you are here, you can open the Jupyter Book if you want to have a look; this is something we will follow during the workshop, with everything explained about the steps we're going to take. The spreadsheet is more to have an overview of what you are doing, so we know if you're going too fast or need some help. And as I already mentioned, if you have any issues you can also talk, not just write; feel free to interact with us as much as you want. We are also here to answer any question about anything presented today.

The other link is the Jupyter environment. As you saw, one of the next tasks for you is to open this environment and select the large size for the resources. As you can see, I have the name of my ID, Meteor Br5RG, so I should be able to find it in this list of images. Yes, here it is. Then I just select large, and I can start my server. Now it's JupyterHub's task to spawn my image that was created by Meteor. I'm already logged in via Operate First, and I'm starting the image; it should be ready in a moment.

Meanwhile, I can start talking a bit about the tutorial we're going to see today. The purpose of this tutorial is to show most of the tools that were described and that are available in Open Data Hub. In particular, it's going to describe the interface between data science and DevOps. Thanks to Operate First and the open way of working, you can see what is happening and check everything: the data scientists can see what DevOps is doing and vice versa, so they can share and learn from each other, as we do in a community and in an open environment. The application we're going to build today is a simple one, MNIST image classification; I think everyone knows it. We will focus more on the tools and on how we can develop an AI project using Operate First, Open Data Hub, and all the tooling on top of them.

So the first thing is the environment description, if you want to know what we're going to use: Open Data Hub, of course, on top of OpenShift, which is running on Operate First. We use cloud object storage, in this case MinIO. Tekton pipelines are what the AICoE-CI uses, so that's what is building your images. And Argo CD is something we will talk about a little later.
Maybe Tom can say more about Argo CD later, because he's more of an expert on it. As for the tooling itself: JupyterHub for spawning the images. Elyra, for those of you who don't know, is an extension for JupyterLab that lets you create AI pipelines that can run on top of different engines, so you can have Airflow or Kubeflow Pipelines, and a Kubeflow pipeline can run on different tooling like Tekton or Argo; we will see this in a moment. Then we will use Project Thoth: the extension I showed before for Jupyter notebooks is the one we are going to use today, together with Kubeflow Pipelines to run these pipelines through Elyra. The concept we always want to share is that we want to allow everyone to reproduce their work. If I develop a project today and I want others to reuse it and repeat the experiment, they should get the project with all the pieces required to rerun the same experiment; that way they shouldn't encounter any issues, and they can repeat, share, and show this example to someone else. Automation, as I mentioned before: we're going to use bots and pipelines to do most of the work. Some of the steps I will just show you rather than have you repeat, because we probably don't have time, but you will see how we usually work in the AICoE. These tools are of course available if you want to install them, and I will show you how to do that if you want to reuse them.

We can start, I think, with the prerequisites. I hope everyone was able to log in. You have a GitHub account, and remember that you need a GitHub token, because we need to push the changes you're going to make in your fork from JupyterHub back to GitHub. OpenShift and everything else is already set up, because Meteor has already prepared everything for you: the resources, the environment, the images we're going to use have already been created, and the dependencies for your specific image have already been set up by Meteor. So we can move to the first part, which is something we already did: if you don't use Meteor, there are some steps to follow to go to JupyterHub, select images, and go ahead. In our case, Meteor automates everything and creates the image, so we don't need to do anything else.

One thing I want to share with you is the importance of having a common structure for your projects. In the AICoE, we follow a structure that was created by one of our teams, the AIOps team, and we reuse this structure in every data science project we have. This way, we can immediately find what others are working on: if we look for something specific in a project, we can find it immediately. If I want the notebooks, I know where to find them; if I want the manifests, I know where to find them; the same for the models, whether they are in the repo or linked from some specific location. This lets everyone see how the project is structured and what you are doing, and it makes it very easy for anyone else to pick up the project, learn about it, and repeat the same experiment. As you see, the dependencies are there; everything required for the project to be shared is there.
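As a rough illustration (a hypothetical layout in the spirit of the AIOps template, not the exact tree of the tutorial repo), such a project looks something like this:

```
project-repo/           # hypothetical layout
├── notebooks/          # the Jupyter notebooks (download dataset, training, ...)
├── manifests/          # OpenShift/Kubernetes deployment manifests
├── models/             # trained models, or pointers to where they are stored
├── Pipfile             # direct dependencies
├── Pipfile.lock        # fully pinned dependency stack, with hashes
└── README.md           # what the project does and how to reproduce it
```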
And this is something we already did. So the first thing we're going to do, if the image is ready, is clone the repo. I hope everyone is on the same page; let me know if you're already in the JupyterLab image. Remember that if your Meteor pipeline is not finished, you can always pick any of the other images that are already created, so we can move ahead, because in this case we are all using the same environment. As Tom described, Elyra comes with a set of extensions for JupyterLab, and one of these is the Git extension. I hope everyone is familiar with Git. This extension facilitates all the Git operations directly in the UI. So there is a Git clone action: everyone can go to their own fork, go to Code, and clone their own repo. You just take the link from your own fork, insert it here, and hit Clone. After a few seconds, the extension will have cloned the repo. As you see, I have two, but the one we just cloned is this one. All of you should now have this repo in Jupyter, and as you can see, it has the same structure as your fork.

Let me update the spreadsheet; I also cloned it. Someone may have an issue, because I don't see all of you at that step yet. Please let us know in the chat. Hi, Gianni. Let me know also if you are stuck on any of the tasks, and remember that if you don't remember any of the links, you can find them all here, so we don't lose track; we will use each of these in a moment. So, have all of you cloned the repo? I see about half of you have. Are any of you having issues? Tom is talking to those users.

Can you show the JupyterHub spawner UI and basically clear your selection? Yes, I guess I can go here. Or you would need to shut down your server, so I can probably... Oh, yes, please. Should I stop? Go ahead. I also have a running server, but I'm stopping it right now. So the problem is that the page is blank. I'm going to do the full flow now: I'm going to log out. You go to this JupyterHub URL and you're presented with this screen. You can click on Operate First, which logs you in through GitHub; it's just authentication to the OpenShift cluster, nothing else, basically forwarding your GitHub username to Operate First. And you should be presented with... okay, it's blank for me as well. Let me take a look at...

Hey, Tom, can you try clearing your cookies and refreshing? Let's see if that works. Tom, can you show me the developer web console and go to the console tab? Is there any error? "Mixed content, unable to fetch the config." Yeah, we have seen this error in the past, but it's basically impossible to reproduce; it just happens completely randomly, and it hasn't happened to us for a long, long time. The only way we figured out how to fix it is to dump the JupyterHub database, because it seems the issue is that JupyterHub stores either cookie sessions or something similar in the database, and for whatever reason, when it tries to re-log-in for the service, it just can't, there are some redirects, and it ends up like this. Okay, so should we...?
I think the 100% sure way to fix this is for an admin to go to the OpenShift console, or use the CLI, and basically scale down JupyterHub, scale down the database, recreate the database PVC, and scale it all up again. Or, if you know off the top of your head how to delete the Postgres database and let JupyterHub recreate it, that works too; for me, it was always faster to just delete the PVC. It should allow everyone to jump back to where they were; you shouldn't lose the running pods or anything. It's just that the user information stored in the database sometimes, very rarely, gets stale, and this happens. Since we don't know how to reproduce it, it's very hard to fix. We had like three different fixes in Open Data Hub that all seemed to fix it, because after each fix we couldn't reproduce it anymore, even though we previously had a 100% sure reproducer; and then a couple of days later, it showed up again. Now you have it on video. Yeah, awesome. Exciting. Okay, I'd like to see a live debugging session. And we're starting the pods now, so it should be back up again soon. We joked around like, are you ready? We are not ready. Well, I mean, we are kind of ready, but we were not ready for this. It's okay. It takes a few minutes in any case.

So we should all log in again, from Meteor or from JupyterHub directly, right? Yeah, once JupyterHub is back up, you can just reload the page and you should basically be where you were. It shouldn't kill any pods; it should just pick them back up, and if you have an environment running, you should get back there very quickly. So now I'm waiting for the ODH operator to reconcile the PVC, and then we should be back up. Okay, thank you. That's why you bring two experts along when doing a workshop. And I'm not sure what I'm doing is more exciting than watching a blank screen. Yeah, we did not talk about OpenShift yet, so if you want to show something meanwhile... So right now, I'm scanning through, because you will get access to it in a moment as well. Since everything was reconciled here, let's take a look at JupyterHub. Okay, we don't have the PVC yet; we can just create it manually, we don't need to wait for the operator. Let me go to JupyterHub and create the database... it's creating. Now I was able to attach the PVC we just recreated, and it should be starting any moment now; the hub should pick it up. This will time out; I'm going to speed it up. JupyterHub is starting. Now it's up. Now I would like to see whether users see the same, or whether I fixed it just for myself. It seems to be working fine for me now. Francesco, you're muted. I'm sorry; I said, yeah. Thank you, Tom; thank you, Vasek.

I think we can go ahead if everyone is able to spawn. Just select your Meteor: search for your Meteor ID string to find your image, select the large size, and hit Start server. This will bring up a JupyterLab environment using this Meteor image right here. And if you already forked and cloned the repo, it should already be there, right? The repository folder, the one we just cloned, because I see it. So it should be there for everyone who already cloned it in the JupyterHub environment; if they did not, they will need to clone it again. Did you do it? Let me see, please. Okay, you have it, it seems. What should I do now? Now I should proceed.
Can I share again? Okay, so now, back to where we were. No, no, we use the... yeah, we actually use the Git extension. I guess you may already have it or not; it doesn't matter. I want a new one. Yeah, well, it may conflict if it's the same thing. But okay. Great. Thank you, Tom. Thank you, Anand. Thank you, Vasek.

So hopefully we are all on the same page now; I will check here that everyone was able to clone the repo. You should have one repo called elyra-aidevsecops-tutorial, without any date suffix; just make sure you have this repo. And remember how to clone it: you just go to your fork, copy the link, and enter it in the JupyterLab Git extension. I see a few of you still did not clone it, am I correct? I think most of us are still spawning JupyterHub images. Okay. I see one, two, three, four, five, six new JupyterHub servers. If anyone has any problems, please let us know; otherwise, we can do the next step. We have basically finished this step, so meanwhile we can go ahead and start talking about the next one.

The next one is what you typically do in your project: start creating your notebooks. The notebooks for this tutorial have already been created, and if you open the repo that was cloned, because of the structure we use for all our projects, it's very easy to find the content we want. In this case, we want the notebooks, and we know where to find them. In particular, we're going to use two notebooks for this first part of the tutorial, which is more focused on the machine learning model development. These are two typical steps, very simple in this case: download the dataset and train the model. It's MNIST image classification, as we mentioned before.

If you go to the download-dataset notebook, you will see that there are simple steps: importing the libraries, loading the data, and storing it on MinIO. As you can see, first of all, there is no cell that mentions pip. We don't want anything like that in the notebooks, because that is not something you can reproduce; that is why we created the jupyterlab-requirements extension. This is a very important point for us. We don't want people to use pip install in cells, because it's not safe, and it's not safe to share notebooks that pip install things, because others cannot reproduce them. So we created this extension, and what it does is take care of the requirements. If you have requirements for your specific software stack, say TensorFlow, Matplotlib, Boto3, very common libraries in machine learning projects, we also store the Pipfile.lock. So, as I mentioned before, you always have the direct dependencies, which are TensorFlow, Boto3, and Matplotlib in this example, but then you have all the transitive dependencies, all the ones that come with those packages, and these are important because a change in any of them can break your code. Each of their versions is stated, actually locked, in that Pipfile.lock, and you also find the hashes, so you know where each specific package came from, which is of course important for security. And for each of them, you also know which kind of resolution engine was used.
You can use Thoth or Pipenv at the moment; we support both of them, of course. And we also have a configuration file that records the runtime environment you use. This way, we know that your notebook was created on Fedora, or RHEL, or UBI, or Ubuntu, or whatever operating system you are using, in which version, and with which Python interpreter. And how do we see this? This notebook has already been created with this in mind, and the extension is present in all the images created through Meteor and available on JupyterHub.

There are three ways to interact with this extension. The first, and the most common, is the one that lets you work directly in the notebook cells; we will see in a moment what you can do with that. The second, if you want to integrate this in pipelines for specific steps, or you want to check whether your notebook has its dependencies in order, is the CLI. And then there is also the UI, if that is what you prefer when working on the notebook cells. If you want to have a look or try it, this is the UI. The UI shows the libraries I wanted to have: as you can see, there are TensorFlow, Boto3, and Matplotlib. This is what you state, what you would otherwise do with pip in a cell, which is not something you want, because you cannot save that information. Here, when the dependency management extension locks the dependencies, we store everything in the notebook. You can see this, in a maybe not very friendly way, in the notebook metadata: there you will find all the information about the dependency resolution engine that was used and the requirements, the ones I showed a few seconds ago, but also all the locked requirements. This way, any other developer or data scientist who wants to try this notebook knows exactly what needs to be used and which dependencies need to be installed.

Another easy way to try it is %horus check; horus is the set of magic commands for the jupyterlab-requirements library. %horus check just verifies that your notebook has everything it needs from the dependency point of view: that a dependency resolution engine was used, that the requirements and the requirements lock are there, that the Pipfile and Pipfile.lock correspond (they have a hash that is matched in order to have the correct one), and that the kernel you're using is present. If you want to see the content in a more friendly way, you can use %horus show: it prints the Pipfile that is saved in the notebook metadata, or the Pipfile.lock, so you can see exactly what is stored in the notebook. This way, the notebooks can be shared safely with others, and any of you can reuse the same notebooks and run them without any problem.

I'll just go and see if Karan has a question. Yeah, just a quick question: how do I set my resolution engine to Thoth using horus? And what benefit do I get if I use the Thoth engine over regular Pipenv? Thank you. Thank you for the question, Karan. If you use the UI, there are two options.
From the UI, when you hit the install button, both resolution engines are considered: it first tries Thoth, and if Thoth has issues locking the dependencies, it can fall back to Pipenv. If you go through the horus magic commands instead, there is %horus lock, which does basically the same thing, but by default uses the Thoth service to create the software stack for you, and you have all the options related to the runtime environment: if you want to receive a recommendation for a specific operating system or Python interpreter, that is something you can do. With Pipenv, it's just a plain lock using Pipenv as the engine. And whatever is created by Pipenv or Thoth is stored in the notebook. The advantage of Thoth, of course, is that it knows about the runtime environment. Thoth itself can discover where your notebook is running, so it can identify the operating system your notebook runs on and what kind of hardware is used, and this information is stored only when Thoth is used, because Thoth can give you a recommendation specific to that runtime environment. That is not something you can do with Pipenv. Thoth can specifically say: this is the package you should use on this operating system, because we already tested it. The Thoth knowledge graph already has this knowledge, so it knows whether you can or cannot run a specific version of TensorFlow on Ubuntu, on UBI, on Fedora. We have already analyzed all of this, and Thoth can share it with all the developers that use the service. Gotcha, so basically the optimal set of dependencies. Yes, basically a software stack optimized for your runtime environment; and if you are interested in specific requirements, like performance or security, the software stack will be slightly different because of the specific information stored in the knowledge graph. Thank you. Thank you for the question.

So this is how we created these notebooks. In this case, we don't need to recreate the dependencies, because the notebooks have already been created with this in mind, and the tutorial is a good example of that: when we started working on it, my colleagues and I created the dependencies for the notebooks up front, so we know this is what has to be used for them. In particular, Meteor is able to identify these environments and create them automatically, so the notebook can simply run, because the environment already uses the optimized software stack that was provided and installed directly by Meteor. So let's leave the notebooks clean; I will tell you later why. If you just want to try these commands, you can, but please leave the notebooks as they are for now.

The other notebook is the training one; these are the two notebooks we're going to use. This notebook retrieves the dataset that was downloaded and trains the model. Here we set the parameters, we divide the dataset into train and test, we create the convolutional neural network, and we train it. Once everything is done, we store the model on the cloud storage, in this case MinIO. So let's go back to the book; I think we were just following this section.
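Before we move on to pushing changes, here is roughly what the dependency cells from this part look like in a notebook. This is a hedged sketch: the %horus commands are the ones described above, but the exact behavior and output may differ from your version of jupyterlab-requirements.

```python
# Run inside a notebook cell in the Meteor/ODH image (sketch, not verbatim).
%horus check   # verify the notebook metadata: resolution engine, Pipfile,
               # Pipfile.lock with matching hashes, and a usable kernel
%horus show    # print the Pipfile/Pipfile.lock stored in the notebook metadata
%horus lock    # resolve dependencies, by default through the Thoth service
               # (a plain Pipenv-based resolution is also available, per the talk)
```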
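And to make the two notebook steps concrete, here is a compressed, hedged sketch of what "download the dataset and train the model" amounts to. The endpoint, credentials, and bucket name are hypothetical placeholders; the actual notebooks in the tutorial repo are the reference.

```python
import boto3
import tensorflow as tf

# Train a small CNN on MNIST (the dataset used in this tutorial).
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
model = tf.keras.Sequential([
    tf.keras.layers.Reshape((28, 28, 1), input_shape=(28, 28)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train / 255.0, y_train, epochs=1)
model.save("model.h5")

# Store the trained model on MinIO via its S3-compatible API.
# Endpoint, credentials, and bucket below are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.example.com:9000",
    aws_access_key_id="minio-access-key",
    aws_secret_access_key="minio-secret-key",
)
s3.upload_file("model.h5", "my-bucket", "models/model.h5")
```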
The next step is how to push changes: we are in this environment, and I want to share my work back. This is typically important when you work on a project with other people in a team. If you're working on the same project and you want to share, what we do, of course, is push the changes to GitHub, so that everyone always has the most up-to-date version of the project; we can keep recreating these environments with the latest versions, and everyone stays up to date. Let me check where we are at the moment. We can consider the previous step done; it was meant more for you to understand the concept, why we want this to be present in every project, and why we want everyone to start using these kinds of tools: they are important for reproducibility, and especially when we work in open source and want to share a project, we need to allow others to repeat the experiments. So if everyone is happy and has no more questions about the dependencies, we can move to the push-changes part, and this is where you will actually need your token.

Let me go back to the book so we can follow along. As you see, Meteor gives you these two environments; it is very easy to move between them, so you can follow the tutorial and at the same time apply what you're learning. There are two ways to push changes. The JupyterLab Git extension is the one we are using at the moment. If we go here, to the Git tab, you can see what happened: the notebook has been modified, because the datetime and some initial configuration from Jupyter changed. We didn't change the content itself, but of course there are some differences in the timestamps from the moment we opened them, and because this information is saved automatically, the notebooks appear as modified. With this extension, it is very easy to push the changes. Say I want to push my two notebooks, or just one of them: I go to the plus button, which stages the change; this is basically what happens when you run git add on a specific file. Once they are staged, you want to commit them, so you commit these specific changes and just write something describing the change. In this case, "update download dataset" is my commit message: I modified a specific notebook and now I want to contribute back to my fork, and if I'm happy with it, I can open a pull request to the main project. So I just hit Commit. The first time you use the Git extension, it asks you for a name and an email, the same as the first time you use Git on your own machine. This is personal to you; I don't mind sharing, so here I put my name and the email linked to my GitHub account. As you see, once this is done, the commit has been created, but it is still not pushed. To push, you just click this button whenever there are changes that need to be pushed. Then a dialog opens asking for your credentials. Maybe I don't want to share my token with everyone, so I will add it in a second, off-screen; please set your token there. Let me share my screen again: you see I have the username of my GitHub account, and my token is already there.
I hope everyone was able to create it and can use it now. So you just press OK, and then it pushes the changes. And now, if we go back to our repo, we should see, as you see, that the commits changed, because changes have been pushed to your fork. And now you can contribute, if you want, to the main repository. If we go here, you will see the changes I just made to the notebook. This is basically how we share our work and how we contribute back to the main project. And as you see, this is all very easy and automated using these tools. You can keep working on your notebooks, modify things, move them to staged, and do everything through the UI. There is another way to push, because there is actually a limitation of this extension: it can handle only one repository at a time. Later, I will show you how to do this directly from the JupyterLab terminal, because you also have the possibility to open a terminal and do things there, so you have one environment for everything. But we will see that later, when we need to deploy the model we're going to create. So now for the next step. I would say this part can be considered done: you know how to push, and I hope everyone was able to push. If you had any issue with the token, username, or email, please let us know. Okay. Thank you, Urvashi. So if anyone has another question, please let us know. Otherwise, I think we can move through this next part very quickly. We won't go through all of these steps, because as I said we don't have the time for all of it, but I will show you very quickly how you can enable the pipelines, and we will trigger one so you see what happens. Let me go back here. We already talked about some of this in the presentation: we have some tooling, and we use it to automate continuous integration and the delivery of images. This is done through AICoE-CI. There are different checks that run, and AICoE-CI is also able to build the images and push them to the registries. How does it work? AICoE-CI can be installed easily from the GitHub Marketplace: you just go and install it on your repo. And all you need to configure in your repo in order to run it is one configuration file. If we go back to the repo itself, you will see there is a configuration file for AICoE-CI, and in it you state the types of images you want to build. In this case, for example, we created one image for the download-dataset notebook and one image optimized for the training one. And now you see the importance of having a specific software stack built for each notebook: if you want to use these notebooks in pipelines, you can create images optimized for each of them. You can imagine that the download-dataset step may just want the latest recommendations, so you use that and optimize for it, while the training one might require performance: you want to train your model on GPU, or you want a specific version optimized for that. Thanks to the services that Thoth provides, you can do this, and if we integrate those services with AICoE-CI, you have an automated tool that builds the images for you.
So as you can see, what AICoE-CI expects is just the base image you want to use for each specific image you build, the type of build strategy to use, and the registry where we want to push. We use Quay as the registry where we store all these images, and that is the only thing required from the AICoE-CI point of view. Of course, you can use your own registry; this is just what we use. If you go here, you'll find more information about how to create your secrets, how to add them to AICoE-CI, and how to use the release pipelines. The second thing you need, if you want to automate this as in the presentation before, is the bots, because we don't want to take care of it ourselves. So we have another bot on the GitHub Marketplace, called Khebhut. Khebhut is the GitHub integration and interface to the actual Kebechet bot, which is Thoth's GitHub application. It takes all the information stored in your repo, and one thing it can do is make releases for you: you can simply open an issue, and this is what we are going to do in a moment, so you will see how it behaves. I hope everything is already running in the background. Basically, I modified something in the project this morning: I changed just one image, because I want it to be rebuilt in one of the steps of the pipeline. And all I do is open one of these issues: patch release, minor release, or major release. Let's say I want a patch release; I just submit the issue. You see that one of the bots is already assigned to it. What happens in the background is that the GitHub app picks up this event, and it knows this repo has a specific configuration, the Thoth one in this case. Thoth also uses a configuration file, and that file states the names of the specific runtime environments that are going to be used, because they are connected: this is the software stack optimized for this step, and this is the optimized image that gets built from that software stack. As you see, there are entries for the download-dataset and the training steps, and there you can state the operating system used to create that specific software stack and the recommendation type you want. For download-dataset we want latest, for example, and for training we want the performance one. And the whole process is completely automated for us: as soon as we open this issue, the bot takes over, looks at this configuration, and starts creating the release. As you see, it's already done: it opened a pull request for us and says, okay, this is the changelog, this is what changed; well, I made more than one change in the last few days. So now I'm making a release and the bots are doing it for us. When this is merged, and the bots will merge it automatically just to speed up the process, though we can also merge it manually, the bot will create a tag on this repo. The tag will be version 0.12.1, and we should see it here, I hope; otherwise we can create it manually. And there, it was created. What you can see now is the Tekton UI, and here AICoE-CI is the one doing all the work for us. The tag is the release tag I just opened.
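For reference, the kind of Thoth configuration just described might look roughly like this; the exact keys and values here are assumptions for illustration, not a copy of the tutorial's file:

    # Illustrative sketch, in the spirit of a .thoth.yaml file, of per-step
    # runtime environments with different recommendation types; the exact
    # schema and values here are assumptions, not the tutorial's real config.
    import yaml

    thoth_config = yaml.safe_load("""
    runtime_environments:
      - name: download_dataset
        operating_system:
          name: ubi
          version: "8"
        python_version: "3.8"
        recommendation_type: latest       # freshest working stack is enough
      - name: training
        operating_system:
          name: ubi
          version: "8"
        python_version: "3.8"
        recommendation_type: performance  # optimize the stack for training
    """)

    for env in thoth_config["runtime_environments"]:
        print(env["name"], "->", env["recommendation_type"])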
So the pipeline itself is checking what is happening and all the requirements in the repo, and whether there is an overlay build; an overlay build is when you have more than one software stack used in your project. And what it does now is start building the images for all the different software stacks. It looked like there was some issue, but we can check later. This is basically how the pipeline takes over. And actually, no, it worked, it already started: as you see, there are different images, one each for download-dataset, experiment, inference, and training, which is exactly what we saw in the configuration file. So there are four images that are going to be built, and this is the overlay strategy we use. And as you see, I didn't do anything. What I do day to day is just look at the project, modify things, create notebooks; the rest is completely automated. Once I push, the bots take over: they maintain my dependencies and they create the releases for us. I don't know if you have any question related to this part; it's the one we skipped today, because it requires a little bit of installation and you would need your own registry for this kind of thing, so we just created the images for you. And now we move to the part where we create the pipeline. So if you have any question, please let me know. Otherwise, we can move to the more interactive part: you're going to use Elyra, you're going to create AI pipelines, then we're going to deploy the model on the cluster, and you're going to see what happens in OpenShift. If there are no questions, I will go ahead. I hope everything is clear and you got all these concepts; they are important for working on a project that is not just you working by yourself, but something you work on with many other people. It is important to have an infrastructure and automated tasks that take over some of your work, so you can focus on specific things and allow others to reproduce your work, add features on top of it, or modify things. As you can see, with all the tooling available on OpenShift and Open Data Hub, this is quite easy to do. So if everyone is good, I'll go ahead and start with the AI pipeline. Let's see what Elyra is and what we want to do. Elyra, as I said, is the JupyterLab extension that allows you to create pipelines. These pipelines can run on Kubeflow Pipelines, which is usually backed by Argo or Tekton; in this case we use Tekton. And creating pipelines in Elyra is quite easy. Once your notebooks are ready, if you want to create a pipeline (we have the pipeline already here, because the tutorial ships with it), what you do is usually open a pipeline editor. Let's rename it as you want, say devconf-us-2021. And what you put in this pipeline is steps, several steps, which can be not only notebooks but also Python code. So if you want to insert Python code, you can; if you want to insert notebooks, you can. And you can also link these steps. In this case we don't require this specific step, but just to show you that you can and that it's very flexible: you can say, okay, I want to download the dataset, which is the first notebook I created, and then I want to train the model.
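A step like that first one can be a plain Python file. A minimal sketch of a "download the dataset" step, assuming the Keras MNIST helper and a local output path; the tutorial's own notebook may fetch and store the data differently:

    # Minimal sketch of a "download dataset" pipeline step; assumes the Keras
    # MNIST helper and an illustrative local path, which the tutorial may not use.
    import pathlib
    import numpy as np
    import tensorflow as tf

    OUTPUT_DIR = pathlib.Path("data/raw")   # illustrative path
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    # Store the arrays as one file that the training step can pick up; in the
    # workshop, declared output files like this move between steps via MinIO.
    np.savez_compressed(OUTPUT_DIR / "mnist.npz",
                        x_train=x_train, y_train=y_train,
                        x_test=x_test, y_test=y_test)
    print("dataset saved to", OUTPUT_DIR / "mnist.npz")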
So as you can see, this allows you to have pipelines that you can rerun. As you know, the application lifecycle is not static: every time you create an application there will be maintenance to be done, there will be, say, a new vulnerability in the software stack that requires changes. Or, if you have a machine learning model, there is the concept of data drift, and in that case you need to retrain the model with new data. These kinds of pipelines allow you to automate all of this, because you can create specific tasks for exactly that, and then the pipeline can be rerun whenever you want, or whenever a trigger you set fires. So this is the easy way to create the pipeline. What is important now is that, as I mentioned before, for each of the notebooks we created an optimized stack. Why did we do that? Because we want to use images specific to each step. How do we do that? In Elyra, you go to the top, to what is called runtime images. These are the images you can use to run the notebooks: environments that the notebooks run on top of. Now you see the importance of having notebooks with pinned dependencies, and why you want these environments to be reproducible: this way you don't break the steps, and in fact it becomes very hard to break them. So let's add the runtime images. Now go to the spreadsheet, to the second sheet I showed you before. As you can see, there is a runtime images section, and it shows the two runtime images we're going to use. Take the first link, go to Elyra, call this one, say, "download dataset step", put the image name, this one, and save it. So this is the first runtime image I created. Then I repeat the same for the other step. As you can see, the images that the pipelines are going to use are something we already provided for you. For the second link, let's take the training one, the image here, and now you also have the training step. So why do we want to do this? Because Elyra requires a specific runtime image for each of the steps. It is something you configure from here, and you can also configure the resources you need for your task, environment variables, and, if you have output files, what to call them and where to store them. And you need to do this for each of the steps: if I have 20 steps, I need to configure these things 20 times, and the UI, of course, will tell you if some inputs are missing. So, just to speed this part up, let's save it, and let's go and look at the actual pipeline that was already created, because some configuration was already set for you, so we don't need to redo it. As you can see, the resources are already set, the environment variables are set, the output files are set; what we need to do is select the images. So we select the "download dataset step" runtime image we created, and we do the same for the training one, because we just added that image. As you can see, there are different resources configured for the different steps. The only thing we need to modify in this case is the environment variables: this is set to use S3, but for this workshop we are going to use MinIO.
And what you have to insert is the object storage endpoint URL, which you can find in the spreadsheet. So let's copy it, go back here, make sure to remove all of this part, and just paste the MinIO URL. Then the object storage bucket name: we can create one, let's call it devconf-us-2021. We can have a look at MinIO and create it there; actually, I already set it here. Yes, it's here. So let me create it for all of us. We now have this new bucket, and we need to add its name here; make sure there are no typos or anything. And now we can save it. So what is missing? We have the images, and we know which image and resources each step requires. Now we need an engine to run this pipeline. How do you do this in Elyra? This is also, of course, in the Jupyter Book. As you see, what we did just a moment ago was add the runtime images, the tutorial download-dataset step and the training step; there is also a CLI if you prefer that. But what we want to do now is set the engine that is going to run the steps. So we go here, to runtimes. How do we create these runtimes? Elyra offers you two types of engines, and in this case we select Kubeflow Pipelines, which is what we're going to use today. Let's call it devconf. Then there is the Kubeflow Pipelines API endpoint: this is something you find in the runtime section of the spreadsheet, just take the first link. This is the link to the pipelines on Kubeflow; you can also open it if you want, and it will redirect you to the Kubeflow UI, where you'll see that some pipelines have already been created. For the Kubeflow Pipelines engine, as I said, it can be Argo or Tekton; in our case, we use Tekton. Now we need to set the cloud object storage. The endpoint is again the MinIO one, so you can just take it and add it here. The credentials are also here: the username is minio and the password is minio123. It's a very secure password. And the object storage bucket name is the one we just created for the workshop, devconf-us-2021. If everyone has set everything already, we can just save and close. And let me recap the steps so we are all on the same page: we created the runtime images, and we created the runtime for Kubeflow. The pipeline itself, we know how to create it, and it is already created. So let me know when you have all three of these, and then we can move on to actually running this pipeline, and we will see what happens. Please let me know when you are done or if you have any issues; we are here for any problems. If you have any question about Elyra, about the object storage, or anything else you want to know more about, including the resources, please don't hesitate to ask. Everything we just did is described in the Jupyter Book in any case, so you can find it all there. And if you are happy and you reached this part, I will wait a few minutes; if you have any questions or issues, let me know. Just to repeat what we did: we created the runtime with all the information, and if you need to modify anything, you can. Remember, we need Tekton, and you choose the Kubeflow engine at the beginning, so remember to select Kubeflow Pipelines when you create the runtime. The other thing we did was add the runtime images for the steps.
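To make the object storage side concrete, here is a minimal sketch of talking to a MinIO instance from Python via boto3; the endpoint URL is a placeholder for the one in the spreadsheet, and the credentials are the ones given above:

    # Minimal sketch of using the workshop's MinIO from Python via boto3;
    # the endpoint URL is a placeholder, the credentials are as stated above.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://minio.example.com:9000",  # replace with the URL
                                                       # from the spreadsheet
        aws_access_key_id="minio",
        aws_secret_access_key="minio123",
    )

    s3.create_bucket(Bucket="devconf-us-2021")

    # After the pipeline runs, the dataset and the trained model end up here:
    for obj in s3.list_objects_v2(Bucket="devconf-us-2021").get("Contents", []):
        print(obj["Key"])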
This is quite straightforward; as you can see, it's easy to add a runtime image. And then we adjusted the options for each step: for the download-dataset step, the runtime image is selected and everything else is already set, while in the training notebook, remember to adjust the two environment variables to use MinIO, and the name of the bucket, devconf-us-2021. In this case, of course, we use the training step image. So I see some of you are there; I don't know about the others. Please let me know if you are stuck somewhere or there is something we have to do. Don't worry if you missed anything here, because these two parts are, let's say, not linked: we can still continue and we can still finish; missing this part is not a problem. So if you're stuck anywhere, just let me know and we can redo or repeat anything here. I'm sorry, Thorsten, I hope you solved the connection issues; please let me know if you have any other problem. Otherwise, I would go ahead, because we have 40 minutes left, I think. Well, if you have any problems, please let me know. I cannot see your environments or what you're doing, unfortunately, so if you can give me a yes or no, we can move ahead. Meanwhile, what we want to do now is run the pipeline, and as described here, once everything is set, running the pipeline is quite straightforward. You have a play button. Call the pipeline whatever you want; I can use my username. For the runtime platform you want Kubeflow Pipelines, and of course the runtime configuration is the one we just created, the devconf one. Once you are ready, you just click the button, and it will say that the pipeline has been submitted to Kubeflow. If you go to Kubeflow itself, to Experiments, and maybe reload, you see, I start to see some of the pipelines; this is the one I just submitted. What it will do is just run the two notebooks we showed before, so these two steps, but as you can imagine, you can build very complex structures for your pipelines. The first step just downloads the dataset, and if you go to MinIO, we should already see some of its output: as you see, it stores everything in the object storage, so there is the dataset, the data we need for the training. The second step takes the data that was just downloaded, trains the model, and at the end produces the model for us. So as you can see, this is again all automated. I see only two pipelines running for now. So if you have a problem. Francesco, there's a question in chat by Pat. Where is it? Here? No, the one about manually maintaining pipelines. Did you answer that one? No. Sorry. "By manually maintaining the info about the runtime images, aren't we breaking the automation chain?" Can you elaborate a bit? Pat, would you like to join us here and maybe explain the question a bit better for Francesco? Hello, Pat, we cannot hear you. Hello, Pat. Alman is having an issue making the request. Not yet. So, Alman. I had the same problem at the beginning. For running the pipeline? With the mic. No, no, with the mic. Okay. How about now? No, we can't hear you. Go ahead. Yes, that's good. Okay. So yeah, I was asking: in the pipeline, you have to go to each of the notebooks and manually configure the specific runtime image you want that notebook to run on, one by one.
And this means this is stored in the pipeline configuration and it will not change. Let's say the runtime image is updated later at some point: how will the pipeline itself get updated? Automatically, or do you have to do it? That's the question: is there a way to keep the pipeline updated automatically as well? You mean the AI pipeline? Yeah. So currently, in theory, you reach a certain stage of your project where you need to release, and once the tag is created on Quay, the pipeline is able to automatically update the tags present in the manifests in the repo. What we want to do in the future is automate this part as well, because we already know the tag, and the source of truth is always the GitHub repo. So if we released a new version, as I did today, it means this pipeline should use the new version of the image we just created. Of course, if you modify anything related to the software stacks or the notebooks; in this case we're not modifying anything, so we are using the version we know is working, and we know we are not breaking the pipeline. But in the future we may want to automate the tag bump too. In theory, the pipeline we're using right now is on version v0.11, and since then we didn't modify anything specific to the notebooks or the code; we were just updating documentation, so we're not breaking that. But this is a good point; it could actually be a good feature request, if you want, for automating that. So thank you, Pat, for the question; I hope that answers it. Thank you. Thank you very much. Thank you, Pat. Alman, so, an error compiling the pipeline in Elyra, okay. I don't know if you can come here, Alman, if you want, or share your screen if that's okay for you; it would make it a bit easier for us to see where the issue is. If not, there are a few things I would check, because a few things could be off at the moment. I would recheck the runtime: that you added the correct Kubeflow Pipelines API endpoint and that you selected Tekton, because maybe the compilation failure really is because you didn't select Tekton; if you're using Argo, we cannot run that at the moment, because we only have Tekton here. Please also check the URL for MinIO and that the username and password are all set accordingly, and let me know if there is any issue there. If not, please tell me more, because I would need to look at what you have configured. Meanwhile, if you were able to submit the pipeline, you should have your pipelines; I see just two, but here you have the two steps we just created, and this is what happened: at the end, we just created a model for this application. So if you go here, we are basically at this step: we were able to run the pipeline. I know some of you were not able to. Ah, okay, so I hope it works now, Alman. And if you were able to run it, now that you have a model, what you typically do is, let's say, create an image for it and deploy it. As you know, there are different types of deployments for models, and I think we can go to the deploy-the-model part. As you see, with this tutorial we want to be quite flexible, so we want to show that there are different tools that can be used: Seldon, TensorFlow Serving, KFServing.
So there are different types of deployments. For this specific workshop, we are using a simple Flask application; that is also something you can do. So let's go to "deploy your model as a Flask application". In order to deploy it, we had to create a Flask app, and as you see here in your folder, there is a file that states what the application does and the endpoints we are going to expose: there is the predict endpoint, and the metrics endpoint, because we want to see what is happening with your model. The model itself is created in this specific repo; everything is stored in a specific folder, so you can find it immediately. And here it loads the model: either from Ceph, if you want to use the model created on Ceph, or the one stored locally. As you see, the models are all stored here; a minimal sketch of such an app follows in a moment. So, how do we deploy the model? Maybe, Tom, you want to step in now and talk a bit about Argo CD, what you did, and why we want to use Argo CD. I can leave you the stage, or I can do the steps while you explain, as you prefer. Yeah, maybe you can start doing the steps as I talk. Perfect. So again, in the initial presentation, if you remember, I was talking about Operate First and how we're trying to operate things in the open, automatically, and in a way that is accessible to anybody and transparent. That means we're using tools familiar from the development world and repurposing them for operations. This is a very big trend currently in the cloud-native computing world: using things like Kustomize, YAML manifests, Argo CD, Helm, and other tools that facilitate declarative deployment for you. Argo CD is a tool that manages the application lifecycle for you based on GitOps principles. So you have a Git repository where you track your manifests; in your case, this is the Elyra tutorial repository you've forked. If you go to the GitHub page of your fork, you can see a manifests folder in your repository, and this is where this particular repository stores its deployment manifests, its deployment specification. So you see Kubernetes resources, entities that can be deployed to a cluster, and the cluster will automatically detect their type and act accordingly. And since we have those manifests defined as code, as software, we can deploy them automatically through Argo CD. That's what we have our Argo CD instance for. And we have this workshop-apps repository; we will use it to create our own application for this workshop, and then we will see how the application gets automatically detected and deployed for your particular fork. So, Francesco, you can go ahead and maybe comment on what you're doing for the demo. Thank you, Tom. So as Tom explained, this repository was created by Tom, and it is the one we're going to use for the next step, which is to fork the workshop-apps repo, and then, yes, we need to clone it. So what we're going to do is go to this specific repo, which is mentioned, as always, in the data you have here; I will paste it here, maybe it's easier. So this is the repo for workshop apps, which is managed by Argo CD.
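Before moving on, here is the shape of the Flask app described a moment ago, as a minimal sketch; the model path, input format, and toy metric are illustrative, not the tutorial's exact code:

    # Minimal sketch of a Flask inference app with the two endpoints described
    # above; model path, payload shape, and the toy counter are illustrative.
    import numpy as np
    import tensorflow as tf
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    model = tf.keras.models.load_model("models/mnist-cnn")  # local model

    REQUEST_COUNT = 0  # toy metric; a real app exposes proper Flask metrics

    @app.route("/predict", methods=["POST"])
    def predict():
        global REQUEST_COUNT
        REQUEST_COUNT += 1
        # Expect a JSON body like {"image": [[...28x28 pixel rows...]]}.
        image = np.asarray(request.json["image"], dtype="float32")
        probs = model.predict(image.reshape(1, 28, 28, 1) / 255.0)[0]
        return jsonify({"prediction": int(probs.argmax()),
                        "probability": float(probs.max())})

    @app.route("/metrics")
    def metrics():
        # Real deployments expose Prometheus-format metrics here.
        return f"predict_requests_total {REQUEST_COUNT}\n"

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)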
So go to this link and fork this repo. Once you have this fork, as always, we know what to do. Remember, at the beginning I mentioned that the Git extension can only manage one repository at a time, so now we are going to see how to do this directly from the CLI. What you do is just open a terminal and run git clone on your fork: once again, fork the repo, take the URL here, and then just clone it. Now, as you see, I also have workshop-apps. And the only step described here is to run a specific script that was created for this task; if we go here, we can also see it. As you can see, this is a very simple script that creates a manifest resource. It is an Application resource, which is consumed by Argo CD, and Argo CD will deploy an application based on it and keep reconciling it. As you can see, the source repository URL in the manifest points to your repository fork, and one line above there is a path: Argo CD will resolve this path and look for Kubernetes or OpenShift manifests in it. And we keep just a flat list of applications in the workshop-apps repository. So if you run this script with your GitHub username, as Francesco is doing right now, you get a manifest for your application under the apps folder for devconf.us 2021. This Application resource points to a repository, and what happens when we create a pull request with it is that we run some CI checks on it, as we usually do in Operate First, and then we merge the PR. Once the PR is merged, Argo CD picks up this Application resource and deploys the application for us, and we don't need to care about anything else. If we then want to change something within our application, we just go to our pacospace/elyra-aidevsecops-tutorial repository and change the manifests there, and everything else is taken care of by Argo CD and by the cluster itself. So we have automated our application deployment fully: we just change the application manifests in the GitHub repository and nothing else; no other intervention is needed. Thank you, Tom. So as Tom mentioned, we just need to run the script. Once you clone your repo, you move into the repository, so you run cd and the name of the folder, and then you run the script with your specific username. Once that's done, you can see with git status that there are some changes in this Git repository; this just shows you the content and how it was modified with respect to what was there before. Then, as before, we do the same steps: we move the change to staged, and once that is done we run git commit; in this case we are repeating the same thing, adding the message from here. And as you see, in this case too this was never done before: because this is a clean environment, you need to add the Git identity configuration again. And then, once you're done, you run git push origin, and you need to enter your credentials, which is why you need to enter your token. So I will stop for a second and get my token.
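While that happens, the Application resource the script writes has roughly this shape; the names, namespace, and URLs here are illustrative placeholders, not the script's exact output:

    # Rough shape of the Argo CD Application resource the script generates;
    # names, namespace, and URLs are illustrative, not the exact output.
    import yaml

    app_manifest = yaml.safe_load("""
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: elyra-aidevsecops-tutorial-yourname
    spec:
      project: default
      source:
        repoURL: https://github.com/yourname/elyra-aidevsecops-tutorial
        targetRevision: HEAD
        path: manifests          # Argo CD resolves manifests from this path
      destination:
        server: https://kubernetes.default.svc
        namespace: elyra-aidevsecops-tutorial
      syncPolicy:
        automated: {}            # keep the cluster reconciled with the repo
    """)

    print(app_manifest["spec"]["source"]["repoURL"])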
So in the meantime, if you are just watching our workshop right now, you can go to this link, and there you will be able to see applications popping up once we merge those PRs from the workshop-apps repo. I see, Alman, you're having some issues; let me check where all of you are. I hope you were able to do this and push your changes. I did. So, Alman, don't worry; this environment will stay open in any case, right, Tom? Yes. So you can still do this later, and if you stay on afterwards, we can also continue a bit if necessary. So now you see that I pushed: the same thing we did before with the Git extension, we now did from the terminal, so you know how to do it both ways. Now I go to my fork, and I should see that there are changes and I can open a pull request. This is the manifest we just created, so let's open a pull request. Now I've opened the pull request and someone should approve it; let me see if there are more pull requests. I see there are two pull requests at the moment. So right now we're running some CI checks on the pull request. If you take a look at the pull request details, you see we're checking a Kustomize build, which is not strictly necessary for this particular pull request, and we're also running pre-commit, which validates your manifest against OPA policies and checks whether you're trying to deploy something malicious, that is, something other than an Argo CD Application resource and so on. Once these checks pass, we will auto-approve this PR and it will get merged, and then we should see the magic, right? The PR needs an okay-to-test? No, okay. Diego, I can tell you why: you installed Khebhut, right? And then I guess you were following these other sections, about adding the bot as a collaborator. That is actually not going to be required. It's still here in the docs, but our team is working on it, and by the end of this week this requirement will be gone, because we don't want users to have to wait for us and for the bot invitation to be accepted; so you can also skip this step. And if you have everything configured for your registry and the rest of the things, there is also the issues feature that needs to be enabled if you want to use the bot: you will see that the bot can open issues for you, to tell you whether the dependencies are up to date, whether you are missing any configuration file, and everything related to the dependencies. But this specific collaborator step is something we will actually remove very soon, so don't worry about it; it will not be required anymore. You're welcome. So in the meantime, CI passed on Francesco's app, and I've approved CI on all the other PRs we have here, so there is CI still running. Now it passed, so it will be auto-approved any minute. Yes, now it is auto-approved and the bots will act upon it. And now, if you refresh your page, I think you have a stale application in there. Yes. Oh, yes. So I should see my application appearing. And this will sync up automatically, so it will create the application resources; this is what's happening right now. And when you click on the little button beside, yes, beside the checkbox, you get redirected to your application, which creates all these resources that were found in your fork of the elyra-aidevsecops-tutorial repository. And that basically means all these resources get deployed into OpenShift. So, can we see this?
So if you want to see what is happening, you also have the link to OpenShift. Just open the link; you will probably need to log in, through Operate First as always. And we can go to the namespace, which is called... devconf.us, or no, this one, right? Yes. So if you go to Pods, you see that all the things described here are actually present here, right? We have the two deployments, one for Tom and one for me. Sorry. You see that there are two deployments, and this is exactly what we find inside the cluster. So as you see, we didn't do much: the deployment part is completely automated. All we had to do was provide the manifests the first time; after that it is completely automated, and when the manifests are updated, Argo CD will redeploy the application. So if we change the tag in your repo, this will be redeployed. So, if you go to Routes... let me see where everyone is, or whether you're still stuck somewhere; please let me know. Okay, thanks, Urvashi, sorry, I didn't see it. So we have two applications waiting on CI; one will be merged very soon, and the other is pending the first CI checks, so let's give it a few minutes. Yes. So basically, now that we did this part and verified that Argo CD deployed it, what we want to do is test the application: we want to test that the deployment is working, that we are able to receive actual predictions, and we also want to have a look at the metrics, to see that everything is working. Actually, yeah, this is something we already... now we have one more PR merged, so in Argo CD we will see another application popping up. It already synced before I was able to refresh the page on my end. Refresh; that was very fast. So we are able to log in, we can check that the deployment is successful, and now we will move to the last part. You were able to deploy, and I see that others were able to open pull requests too; I guess Diego is next. You see that we have three deployments now, and the deployment is almost ready, so let's wait for Diego to be there, and then we can move to the last step. Let's see where we are here: we were at "deploy your model", the Flask application, and now I want to test the model. To test the model, I will just explain a few things, and we do have a notebook for it; we can go directly there, maybe it's easier to explain. So if you go here, to the notebooks, we're going to use this notebook now, the "test deployed model" one. As you see, the first part just imports some libraries, so you can start and run it. Then we need some specific configuration for logging into OpenShift, because we want to reach those deployments, and as we are in a different namespace, there is no service account or other way to talk to the other namespace from here at the moment. So what we do is log in, and then we can run the commands. In order to log in, let me see if Diego is also in now. Not yet. Okay, but Diego, you can already go there, because that's something we don't manage. Yes, okay. So meanwhile, you open this "test deployed model" notebook, you go to the workshop links, and we go to the OpenShift API. We can copy this; you go to the notebook and change this to the API of this cluster. Then we need the token. Where do we get the token? If you were able to log in, you can just go here and click "Copy login command".
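What the notebook runs at this point amounts to a few oc commands. A minimal sketch of the same thing from Python; the cluster API URL and token are placeholders for the values you copy from the OpenShift console:

    # Minimal sketch of the login-and-inspect steps from Python; the server
    # URL and token are placeholders for the values copied from the console.
    import subprocess

    def oc(*args):
        # Run an oc command and return its textual output.
        return subprocess.run(["oc", *args], check=True,
                              capture_output=True, text=True).stdout

    oc("login", "https://api.cluster.example.com:6443",  # cluster API URL
       "--token=sha256~REPLACE_ME")                       # your copied token
    oc("project", "elyra-aidevsecops-tutorial")           # switch namespace
    print(oc("get", "pods"))                              # the deployments' pods
    print(oc("get", "routes"))                            # find your route here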
You will see my token now, but that's fine. So you take your token and you put it here. It's not something you would normally share with others, but for the purposes of the workshop it's okay. So you are logging in, as you see. Yes, I don't want this, so let's add the login here, and we should be logged in in a second; let's try again. Okay, now it worked, so now we are logged in. We can see which namespace we are in: we're in the default namespace, and now we want to move to the project one. It's called, we can check it here: the project is elyra-aidevsecops-tutorial, like the repo. So we go here and enter elyra-aidevsecops-tutorial. Now we want to see the pods: the same thing you see here, you can look at from the OpenShift CLI. And now we want to get the routes. Each of us has a different route; if you don't want to take it from the CLI, you can also find it in the console: go to OpenShift, then Routes, and you see there are different routes for you. Diego's is coming up; the route will be available as soon as the deployment finishes. But meanwhile, if you want, you can just take one of these and change it to your name, because that is the only part that is going to be different. So you can do it like this. Now we should be able to test it. Let's close this, let's see where the deployment is; let me log in again and see if Diego's finished. Yes, the routes should be available right now. Do you know what happened? Diego, you can just take any of the other routes; it will work, because we are all able to log in to this project, so you can also copy mine. I can share it with all of you. And then we can just send something... and something's wrong. That is great. I think I got this error the other day as well, but the others were working. Has someone tried it? Please let me know if it's working for you; meanwhile, I will paste it here so you can find it. Has anyone tried the model? Because this might need some extra debugging and time to look at. But what I wanted to show you is another thing that is important for your deployments, which is the metrics. If you take the endpoint, as we showed before, this API we created has two endpoints: one for prediction and the other one for metrics. But if you try to open it and add /metrics, there is some issue there; I'm not sure why we cannot reach them. If you're talking, I cannot hear you. So let me try again. I cannot hear you, Tom; you are muted. Sorry. I think this is caused by the Service and the Service selector in OpenShift, because it does not have the GitHub account prefix in the selector. So maybe we can solve it by unifying the deployments without the prefix. I've created a Service and a Route without the prefix of your GitHub username, so now all the routes will point to this particular Service. It's basically what Pat mentioned: if you take a look at this screen right here, all the routes were pointing to a Service, and that Service didn't exist, because everything was deployed with your name prefix; that's a mistake on our end with prefixing the names. And the same happens with the Service itself: each Service has a selector, and this selector pointed to a non-existent Deployment. So I think right now it should start working. Yes. Let me share, if I can. Do you see my screen? Yes. So I tried the route that you showed me, and now it's available.
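With the route working, the test notebook boils down to roughly this; the route URL is a placeholder, and the payload shape matches the illustrative Flask app sketched earlier, so the real notebook's request format may differ:

    # Minimal sketch of testing the deployed model; the route URL is a
    # placeholder, and the payload shape follows the illustrative app above.
    import numpy as np
    import requests

    ROUTE = "http://your-route.apps.cluster.example.com"   # your app's route

    # Send one test digit; an all-zeros 28x28 image works as a smoke test.
    image = np.zeros((28, 28), dtype=float)
    resp = requests.post(f"{ROUTE}/predict", json={"image": image.tolist()})
    print(resp.json())          # e.g. {"prediction": ..., "probability": ...}

    # The second endpoint exposes metrics about the running model server.
    print(requests.get(f"{ROUTE}/metrics").text)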
And if I go to the metrics endpoint, you see that there are metrics. This one will usually show the version, which is the version of the deployment we are running, and there are all the Flask metrics for the endpoints: you can see whether the predict endpoint is receiving predictions. And if we go to the notebook now, as you see, I just modified it with the correct route and now it works. So as you see, we always interact with the DevOps team as well; the data scientists interact with them too. And as you see, it was basically a problem with the route, and now you can get the prediction: the input image was a zero, and the model predicted zero, as expected. And you also have some metrics on the latency and the probability that was produced. And you can play with it a bit if you want to try other types of inputs, or if you want to run more. Sorry, this is the correct one; it was predicting six. So this is basically all, I think, for the workshop. Once you're able to use the route, you can run the test, and we also had a look at the metrics, and that's basically it. We basically showed how to take a project from zero, how we moved everything, how we work daily with all these tools, and how we automate most of the things, so you can focus on your specific work and the rest is pretty much automated by all the tooling we use. So now we can answer all the questions you have, if you have trouble with anything or you want to go deeper into any of the steps. Pat, are you using this one? I will also add it to the spreadsheet; please use this one if you want to test the model with the notebook. And what you have to do in the notebook, I will just repeat: you modify the cluster URL with the one provided in the spreadsheet, and then you take your token; this is the one I have. To get your token, you just go to OpenShift and click "Copy login command". You log in, and the token will be displayed for you; you just click "Display token". The only thing you need to modify afterwards is to make sure you are in the correct namespace: if you are not there, you can just move to the specific project with the CLI. And then, in the test notebook, you just need to modify this URL, which is the one Tom just fixed. So now the endpoint is available; there was some friction between the Service and the Route, but now it's working. So if you want to test, you can just modify it and send, say, a two, and you will see the model providing you with the prediction. And this also shows up in the metrics: you will see them changing, because we are using the endpoint more. So, any questions, or is mine not working? Diego, oh yes. Okay, Diego, you are using this one, right, the one I pasted, the last one? Are you able to access the metrics, or do you see the same thing that I see here? So, Diego, did you add the cluster URL, and did you put your own token? Okay. Does it say that you are in the project? What else do we need? It's exactly this one, right? And yes, the first cell is always something we need to run. Thanks also, Pat. Let me know, Diego, if it works. And Alman, you want to explore this topic; do you mean just the deployment part, or in general all the things we did, or Argo CD? Which topics do you mean? In the repo, actually, where is it? So in the repo, you have all the steps that we did today.
So if you want to repeat, you can always reuse the same URL and repeat all the things we did today. And regarding deployment, if that's what you're referring to, feel free to open an issue and let us know what you would like to find in the tutorial, so that others can benefit from it too: if you want us to add more resources about, I don't know, Flask, how to build this application, or other types of deployments. Okay, that's great, Diego. So, Alman, let me know, or feel free to just open an issue here and tell us what you want to see more of, or which part you want more resources linked to: deployment, Argo CD, or anything else; we can improve this tutorial, and that would be great feedback. Are there any other questions? I hope most of you were able to reach the end, or at least to work through different parts, because we built it so that you don't depend on the part before: you can learn the different sections without being blocked by any of the previous steps. And yeah, please let us know if you have any other questions, if you are stuck in any of the steps, or if you want us to tell you more about a specific step. We can stay here another five minutes if you don't have more questions. If you have issues with the tutorial, or you want us to improve a specific section, please let us know; we are always happy to receive contributions, and the tutorial is open, as you see, everything runs in the open. Otherwise, we can only thank you for your patience, and I hope you enjoyed the workshop. I know it's not the same experience as when we are all together, but hopefully next time it will be face to face, and if you have more questions we can also get a coffee or talk about other things. It's always nice at these conferences to meet new people and see what you do, because we are also interested in what you do in your daily jobs. So I would say thank you. Sorry you also saw a debugging session; it was awesome, I think Tom and Vasek solved it quite quickly. And yeah, do you want to add something more, Vasek or Tom? I hope you had fun. We can't quite hear you again, Tom. Yeah, on behalf of Tom and me, I can just say thank you very much for being here with us. It was an interesting workshop, definitely including the debugging part; it actually sparked a new conversation in our team about fixing that part, so good to know, and if anyone wants to contribute to it, all the better. From my end, thank you everybody for attending. It was a great pleasure talking to you and presenting to you. I would love to see you in any of our communities; please engage with us, engage with the communities. We are here for you, for data scientists in the open source space. I'd love to meet you one day in person, and if not, let's meet online on some GitHub issues or whatnot. Thank you very much. So thank you all. Actually, I think we can share the slides; maybe we can put them in the repo of the tutorial, in the documentation section. I think that's okay, we are open, so we can add them there. So if you want to read the documentation and the slides, we will put them in the docs of the repo. So thank you again. Thank you very much. Thank you, Vasek and Tom; it was a pleasure and an honor to do this with you again. Otherwise, we will see each other soon, hopefully. So thank you. Thank you everybody. Thank you everybody. Bye, Tom. Bye, Vasek. Bye.
Bye bye.