Welcome back, everyone. Thank you for sticking with us to this moment. We have two more sessions left. Next, we'll be having Olaide Joseph, and he'll be talking about transforming your Jupyter notebook into a reproducible machine learning pipeline using Kale. That sounds interesting. Let's bring Olaide up. Hi, Olaide. Can you hear me? No, I can't hear you. Let me see. Hello, can you hear me? Yes, I can hear you now. There seems to be some kind of echo, not sure where it's coming from. Can you mute the one with the camera? Yeah, it's not appearing. It's muted. OK, I think it's better now. Thank you.

So let me introduce you first. Olaide Joseph is a machine learning engineer with a current focus on enterprise MLOps. He currently works as an independent MLOps consultant for cloud-native MLOps organizations, and he has worked with MLOps organizations like MavenCode and Arrikto. Outside of machine learning, Joseph comes from an energy background; his interest in energy lies in the adoption of technology to foster energy-efficient operations. Finally, Joseph likes playing games and listening to dancehall music. That's interesting. We should talk more about the dancehall music after the session. Over to you, Joseph.

OK, thank you very much. So I'll be going over transforming your Jupyter notebook into a reproducible machine learning pipeline using Kale, and I look forward to this session with you guys. My name is Olaide Joseph, as the host has said. I currently work as a contractor, an MLOps consultant, with Arrikto, so I work with a cloud-native MLOps company with products like Kale, among others. I'll be showing how the Kale platform can be used to create reproducible machine learning pipelines.

To start with, let me just go over who this session is for: this session is for data scientists and DevOps engineers with little or no experience with Kubeflow. As for what this session covers, depending on the time, the plan is to cover the digit recognizer notebook and possibly a second notebook; the focus for now is on the digit recognizer. We're going to look at Kubeflow notebooks, specifically JupyterLab, and then we're going to look at the Kale Jupyter extension together with Kubeflow Pipelines. What this session is not going to cover: this is not an in-depth machine learning or Kubernetes course, and it is not a Python programming course.

The notebook that will be converted into a reproducible machine learning pipeline is the popular MNIST, or digit recognizer, notebook. It's on Kaggle, so everyone can have access to it. About the notebook: the creation of this notebook was actually sponsored by my organization, and it's an open source notebook, so you can have access to it while I go over this session. If you can see, this is the link: github.com/kubeflow/examples, under the digit recognizer Kaggle competition. I'll just wait a few seconds for people to look at this link. Under the Kubeflow organization, you go to the examples repo, and under the examples repo, you go to the digit recognizer Kaggle competition directory or folder.
So the prerequisites for this session: at least a little knowledge about cloud computing environments like AWS, GCP, or Azure; a basic understanding of cloud-native architectures and Kubernetes concepts like pods, controllers, nodes, container images, volumes, and so on; and familiarity with ML concepts and algorithms. I expect those attending the session to have a good knowledge of the machine learning workflow. By machine learning workflow, I mean you should be able to use pandas to read CSV data, do feature transformation, hyperparameter tuning, model training, and the like. As a bonus, you'll get Kubeflow as a Service and an introduction to Kubeflow fundamentals along the way.

To start with, I'm just going to give an overview of what Kubernetes is all about and why Kubeflow and Kale are built on top of Kubernetes. The key benefits Kubeflow and Kale leverage from Kubernetes are portability, microservices, and scaling. When we talk about portability, we're talking about being able to create a program somewhere and reproduce the same program elsewhere, without running into environment or dependency issues. With microservices, an ML platform needs to interact with multiple services, and Kubernetes and cloud services make that possible. Then we have scaling: being able to scale the computing resources up or down depending on your workload. So that's a brief overview of what Kubernetes offers Kubeflow and Kale.

Let's talk about Kubeflow itself and what Kubeflow does. Kubeflow is an open source project at the convergence of cloud native and ML. Kubeflow is an easy MLOps platform for data science and operations teams. It's a complete toolkit for the ML workflow, including data, model training, hyperparameter tuning, model serving, and model monitoring. Kubeflow was launched by Google in 2017; it started at Google as a way to externalize the TensorFlow Extended (TFX) experience, and it is Apache 2.0 licensed.

So what kind of needs does Kubeflow aim to solve? Basically, Kubeflow aims to make the deployment of machine learning workflows on Kubernetes simple, portable, and scalable. The idea is that Kubernetes already has this orchestration power, this portability power, and this auto-scaling power, so how can we leverage that power to make the deployment of machine learning workflows very easy? That's where Kubeflow comes into play. Kubeflow makes data loading, verification, splitting, processing, feature engineering, hyperparameter tuning, serving, observation, and monitoring very, very easy. We're going to go over the different tools, how Kubeflow is the all-in-one platform to do this, and why it's best to use Kubeflow. Basically, if you do your model building on Jupyter, Databricks, RStudio, or Visual Studio Code, Kubeflow offers you all of this in one place. If you do your model training, probably with Python, SQL, and TensorFlow, Kubeflow also offers this to you. If you need to do hyperparameter tuning with whatever tool you currently use, Kubeflow also does this, along with AutoML, analysis, ML pipelines, metadata tracking, and deployment. Kubeflow offers all of these all in one.
So it's just like your all-in-one ML platform. Now let's talk about the Kubeflow architecture. In this session, this workshop, we're going to use the following Kubeflow components: Jupyter notebooks, Kubeflow Pipelines, operators, and we'll be looking at the central dashboard.

On notebooks: Kubeflow has a notebook feature which gives you the same Jupyter notebook feel, the same JupyterLab you have on your local computer, the same JupyterLab feel you have in your cloud-hosted IDE. The Kubeflow notebook offers you the same set of features. We have JupyterLab, RStudio, and Visual Studio Code embedded in the Kubeflow notebook server, so you can run any of these IDEs to carry out your machine learning workflows using Kubeflow notebooks. We probably all have an idea of what JupyterLab is, so I won't really go into detail about it; it's basically just a web-based environment for creating Jupyter notebooks, that's it. So that's JupyterLab in Kubeflow.

Now to the pipeline basics. The core focus of this session is reproducible machine learning pipelines. But before I go into reproducible machine learning pipelines, what exactly is a pipeline? A pipeline is basically a description of an ML workflow that includes all of the components in the workflow and how they combine in the form of an execution graph. Basically, a pipeline is your end-to-end machine learning process broken down into tasks or steps. For example, if you go through a typical model development cycle, you'll probably start from data ingestion, or call it data loading. After data loading, you might want to clean the data. After cleaning the data, you do feature transformation. After feature transformation, you want to do model training. After model training, you probably want to do hyperparameter tuning, or you could do hyperparameter tuning alongside model training, depending on whichever works for you. And then you want to do model serving and lastly model monitoring. In Kubeflow Pipelines, you get to define all of these as steps. At the end of the day, when you define all these different steps, they come out in the form of an execution graph where the steps are connected to each other: from the data loading step, you get to your clean data step; from the clean data step, you get to your feature transformation step; from the feature transformation step, you get to your model training step; after your model training step, you get to your model serving step, depending on what you would like to do. Kubeflow Pipelines offers you this feature.

In Kubeflow Pipelines, each step is defined as a component, a pipeline component. And what is a pipeline component? Basically, a pipeline component is just a set of code designed to perform one standalone task. An example of a pipeline component could be a data loading component; what the data loading component does is simply perform data loading operations.
Another example of a pipeline component is the model training component; what the model training component does is carry out model training, that's all. So pipeline components are self-contained sets of user code, packaged as a Docker image, that perform one step in the pipeline. Components may run serially or in parallel based on the way the pipeline is defined, and we'll get to see all of this in the hands-on example. Kubeflow Pipelines' design goals are orchestration, experimentation, and reusability, and we'll get to experience all of this in the pipeline. Let me check the time. Okay.

So what's included in the Kubeflow Pipelines UI? You have a UI similar to what is up on my right-hand side here. You have your notebooks, TensorBoards, models, snapshots, volumes, experiments, pipelines, runs, recurring runs, artifacts, and executions. With the help of the Kubeflow SDKs, we're able to create pipelines that can run and appear in the UI.

I did talk about components. This is an example of a Kubeflow pipeline. You can see this is for the Chicago Taxi Trips dataset, a popular dataset used in a lot of Google tutorials. Basically, there's a data ingestion step, there's a step that transforms a pandas DataFrame into CSV format, and so on. All of these are steps, or components, each performing a particular task or operation. So what the XGBoost train step does is train an XGBoost model; what the XGBoost predict step does is predict with an XGBoost model; what the calculate-regression-metrics step does is basically calculate regression metrics. We'll get to see all of this live in the hands-on session. Let me check the time, see how far we can go. Okay, about 14 minutes.

Okay, let me talk about experiments, Kubeflow experiments. I hope I'm communicating, can you hear me? Basically, an experiment is a workspace where you can try different configurations of your pipelines. Those of us coming from MLOps already know this, and for those that are not familiar with MLOps, I'll try to explain it. Typically, let's say you train a model with certain parameters; say I train a model with A equal to 5, B equal to 6, and C equal to 10. The moment I change it to, say, A equal to 7, B equal to 9, and C equal to 20 and run it again, I've generated a new experiment. So basically, an experiment is where you're able to try new sets of ideas while keeping the pipeline workflow the same, and you get different results. An example is this image: this is the OpenVaccine project, where different experiments were created for the same workflow.

Then we have pipeline runs. Basically, a run is a single execution of a pipeline. The moment you're done creating a pipeline and you compile and run it, you get a pipeline run. So a pipeline run is a single execution of your pipeline. All of this might seem a little bit overwhelming at the moment, but you'll get to see it hands-on with this simple digit recognizer notebook.
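To make the idea of a component concrete, here is a minimal, hedged sketch of how a single step can be written by hand with the Kubeflow Pipelines (KFP v1) SDK. The function name, base image, and CSV URL are placeholders of mine, not code from the talk or from the digit recognizer notebook.

```python
# A toy "data loading" component: self-contained code packaged to run in its
# own container, forming one step of a pipeline's execution graph.
import kfp
from kfp.components import create_component_from_func

def load_data(csv_url: str) -> int:
    """Everything the step needs is imported inside, since it runs in its own container."""
    import pandas as pd
    df = pd.read_csv(csv_url)
    return len(df)  # a tiny output, just to show data flowing between steps

# Package the function as a component backed by a container image.
load_data_op = create_component_from_func(
    load_data,
    base_image="python:3.9",
    packages_to_install=["pandas"],
)

@kfp.dsl.pipeline(name="toy-pipeline", description="One-step example pipeline")
def toy_pipeline(csv_url: str = "https://example.com/train.csv"):  # placeholder URL
    load_data_op(csv_url)  # each call becomes one node in the execution graph
```

Each additional component you call inside the pipeline function becomes another node in the graph, and passing one step's output into another is what creates the dependencies between them.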
Then recurring runs. With recurring runs, you're basically trying to repeat a particular run. Let's say you ran it last week, or you want to schedule a run: okay, I want this run to execute weekly. With the recurring run feature, I get to implement that. This is actually very important in continuous training projects or continuous training workflows. Let's say you expect new sets of data to come in weekly; with the help of recurring runs, I don't need to recreate new notebooks or new pipelines. The recurring run handles this for me.

To quickly go over this slide: each of these is a step, or you can call it a pipeline component. There's the load data step, the preprocess data step, the model training step, and the model evaluation step.

So I'll be going over the hands-on session now. First of all, before you can actually create this type of pipeline, you need to have Kubeflow installed, and there are different ways to install Kubeflow: you can install Kubeflow via a packaged distribution, you can install Kubeflow via manifests, or you can use Kubeflow as a Service, which is what I use to run my Kubeflow pipelines. For those of us that are new to Kubeflow and have no idea what Kubeflow as a Service is all about, you probably just need to go to kubeflow.arrikto.com. Kubeflow as a Service, okay, this is it. You can launch a free Kubeflow cluster using Kubeflow as a Service, and you can use it to try your hands on some cool notebooks that you want to convert into pipelines. For me, I've done that already, and the website has a very good UI and UX that explains things as well, so you should be able to set up this free service yourself.

Like I said, there are different packaged Kubeflow distributions, whether you install Kubeflow on AWS, on Google Cloud, on Azure, or elsewhere. The idea is that since Kubeflow leverages Kubernetes for its operation, before you can install Kubeflow, you need Kubernetes to be installed. We have AKS, Azure Kubernetes Service; we have GKE, Google Kubernetes Engine; and we have EKS for AWS. If you go to the Kubeflow documentation, I think it's quite easy. I don't know the Kubeflow documentation URL by heart, so I'll just Google "Kubeflow documentation", and there it is. On that documentation site, I'll click on "Installing Kubeflow", and that should get me started. Yeah, these are the different ways you can install Kubeflow, and there are different docs that can guide you through it. So if you don't want to use the Kubeflow as a Service platform, you can go through these routes.

So this is Kubeflow as a Service, which I'll be using now. Let me just go back and sign in myself. Get started with Kubeflow, yeah, learn Kubeflow free. If you have your system with you, you can follow this process as well. If you don't have an account, you can sign up; I have an account, so let me sign in. Yeah, this should work, we'll see.
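Circling back to the recurring runs mentioned above: here is a hedged sketch of how such a weekly schedule could be set up programmatically with the KFP client, assuming an already compiled pipeline file. The experiment name, job name, file path, and cron expression are placeholders of mine; in practice the Kale UI and the Pipelines UI can set this up for you without writing any of this.

```python
# Schedule a weekly retraining run of an existing, compiled pipeline.
import kfp

client = kfp.Client()  # connects to the Kubeflow Pipelines API in the cluster
experiment = client.create_experiment(name="digit-recognizer-kale")

client.create_recurring_run(
    experiment_id=experiment.id,
    job_name="weekly-retraining",
    cron_expression="0 0 6 * * 1",            # every Monday at 06:00 (seconds field first)
    pipeline_package_path="pipeline.yaml",    # compiled pipeline definition (placeholder)
    max_concurrency=1,                        # don't start a new run while one is active
)
```

This is the kind of mechanism that makes continuous training workflows practical: new data arrives on a schedule, and the same pipeline re-runs against it without anyone recreating notebooks or pipelines.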
Let me check my phone. Okay, I need to verify; let me see if I can do a voice call instead. Okay, I think I got the voice call, I just got the code. Can you guys still hear me? Okay, thank you.

So once you've signed up for Kubeflow as a Service, it brings you here, and all you need to do is create a new Kubeflow deployment. I do have one running already; the website is quite self-explanatory and you should be able to create this. Currently I have seven days left; Kubeflow as a Service gives me about two weeks free to try out my pipelines and so on. My username is user and my password is my password; you'll get yours as well and log in. Once you do that, it brings you to the Kubeflow UI. I would like to open this in full screen, but it doesn't quite work with my setup, so I can still see the UI like this.

Okay, so this is the notebooks page. I do have a notebook server created already, but what I'll do is go through the process and create a new notebook server. So I'm going to delete the one I had before, and I'm going to delete some experiments I had from previous runs to create space. Okay, let me also delete some snapshots I had. Yeah, delete some of these.

So now I'm creating a new notebook server. For the name, let me just call it KCD. For the IDE, I'm using JupyterLab, so I'm going to leave it at JupyterLab; if I wanted to use Visual Studio Code or RStudio instead, I'd pick whichever one works for me. Then I set my CPU requirements, let's say I want two CPU cores, and for RAM I could set this to 10 gigabytes. And I think that's all, so I launch. This shouldn't take up to a minute.

While we wait for this to load, if you're with your laptop, you can just go through this with me if you like. So I'm going to go to this GitHub repo and clone it. Let me just show you what I'm cloning: this is an open source repo, and I contributed to it. What I want to work with is the Kale JupyterLab notebook, so I'm just going to clone this repo; copy the link address. This is it, okay. Once the notebook server is connected, we're going to open a terminal. This is the same JupyterLab environment you get on your local computer. So, git clone, for example, then I'll go to the digit recognizer notebook directory and open the Kale notebook, which is this: the digit recognizer Kale pipeline notebook.

So I'll walk us through the notebook. It starts with a markdown cell that talks about what the Kaggle competition is all about. Now, to be able to create these pipelines: basically, this notebook server I created via the cluster is just an empty environment; it only has a plain Python environment installed, it doesn't have the libraries installed. So I need to go through the process of installing the libraries I want to use.
The libraries I want to use for this notebook include TensorFlow, wget, pandas, and a few others. Since that's the case, let me work through it. I've listed the requirements in the requirements.txt file, so I'm going to install from it, and I'm going to pass the quiet flag so I don't see all those standard outputs. It should take probably two minutes to install. After installing, I'm going to import all the necessary libraries.

Now, to work with Kubeflow and Kale, this is where the magic comes into play, and if you've not been paying attention before, please concentrate here. While I'm waiting for this to install, I'll click on this Kale extension here. If you are using Kubeflow as a Service, you'll have the Kale deployment panel here. You just click on it, it opens, and you enable the panel. After enabling the panel, if you've not created an experiment before, you click on new experiment and pass in your experiment name; in my case it is digit-recognizer-kale. The pipeline name is digit-recognizer-kale as well, and for the description: performs preprocessing, training, and prediction of digits.

The moment I enable the Kale deployment panel, you'll see the notebook changes: you're seeing some tags like skip, imports, pipeline parameters, and the like. I'm going to explain what these are. The beauty of Kale is that it actually gives you the opportunity to create components straight from your notebook. You don't need to write extra code or do anything extra; you just enable the Kale deployment panel, create an experiment, and come here. You see this pencil sign, you click on it, and it asks you what kind of cell this should be.

Basically, since I've already installed the libraries, I don't need to put this install cell into my Kubeflow pipeline, because I don't want to install these libraries every time I run the pipeline. Since they've already been installed in my notebook server, Kale is going to pick them up, and you don't have to stress about installing them every time you want to create a pipeline. So since I've installed them, I'm going to skip the cell: you click on this and click on skip cell. Like the name says, skip cell just skips the notebook cell: don't perform this operation as part of the pipeline, just skip it.

Then I'm going to mark my imports. You see all your imports are in one place, which helps Kale understand the libraries better when you call them or use them down the line in your machine learning development process. Basically, you click on this, you put all your imports there, and you mark the cell as imports. Like the name says, imports is for everything you're importing into the notebook; just put it here.

Then there are my pipeline parameters. Pipeline parameters are basically parameters that the pipeline, or several components, can use. For example, I could use the batch size in multiple places; as long as I've passed it as a pipeline parameter, several components can use it. It's like a global variable in Python.
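To make the tagging concrete: under the hood, the choices you make in the Kale panel are saved as tags on each notebook cell's metadata. Here is a rough, hedged illustration of that scheme, written as a Python dict for readability; the exact tag strings follow Kale's documented conventions as I understand them, and were not shown on screen in the talk.

```python
# Illustrative only: roughly how Kale-style cell tags might look if you peeked
# at the notebook's cell metadata after using the deployment panel.
cell_metadata_examples = {
    "install cell":    {"tags": ["skip"]},                  # excluded from the pipeline
    "imports cell":    {"tags": ["imports"]},               # shared imports for every step
    "parameters cell": {"tags": ["pipeline-parameters"]},   # globals every component can read
    "load data cell":  {"tags": ["block:load_data",         # the step this cell belongs to
                                 "prev:download_data"]},    # the step it depends on
}
print(cell_metadata_examples["load data cell"]["tags"])
```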
So basically, I mark my pipeline parameters cell; these are the parameters I want my pipeline to always have access to. Every component should always have access to them: if I choose to use them, fine, and if I don't, I still want them to be there. My pipeline parameters are generally my hyperparameters. I have the learning rate (LR), epochs, batch size, convolution dimension one, and convolution dimension two, so I mark that cell.

I also set a random seed, but in this case I set the cell to skip, because I don't want it to show up in my pipeline graph. Imagine having a graph with a step that says "set random seed"; it doesn't really look nice. I mean, you could do it if you want to, I just didn't; it's a choice. So I skip that cell.

Now we get to creating a pipeline step itself. Everything we've talked about so far has been add-ons around Kale; now let's talk about the code itself that performs these tasks, which is the pipeline component. Creating a pipeline component in Kale is very, very easy, and it's as easy as this. You have specific code that performs the download data task. What I want to do is download this data from the Kubeflow GitHub repo, so I pass in the data link, I import wget and zipfile, I come to this pencil button and click on it, and I pass in the step name, which is download_data. When I click on pipeline step, it asks me for the step name and its dependencies. What it means by dependencies here is: which other steps does this step depend on? Since this is the first step in my pipeline workflow, it doesn't depend on anything, so I just leave that empty for this one. This is my source code that downloads the train data, the test data, and the sample submission data. In case you're wondering where the source code is, you can check it here in this GitHub repo; you can study it and create new ones yourself. Everything that you're seeing me do is in this GitHub repo for your use. So yeah, this is the download data step.

Next up is the load data step. After downloading the data, I want to load the CSV data into pandas and output it as a DataFrame. To do that, I click on the pencil sign, click on pipeline step, and pass in the step name, load_data. For the dependency, it depends on download_data. What this just means is that it's going to load what the download data step downloaded; you see how these are interconnected. The download data step basically downloads the data, so that's like an ingestion process. Typically, if you're working in an enterprise, there's usually some storage, probably AWS S3 buckets, Azure Blob Storage, or Google Cloud Storage buckets, or maybe a feature store somewhere that you load this data from; more like a data ingestion process. So this load data step depends on what the download data step downloaded: I read the CSV data from what the download data step downloaded, and that's it.

The beauty of Kale is that multiple notebook cells can be under a single pipeline step. For example, the case here is the load data step.
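For a sense of what a download_data cell like the one described might contain, here is a rough sketch; the URL and directory name are placeholders of mine, and the real notebook in kubeflow/examples uses its own links and file names.

```python
# Sketch of a "download_data" step: fetch a zip of the competition CSVs and
# unpack it so later steps (load_data, preprocess_data, ...) can read them.
import os
import zipfile
import wget

DATA_URL = "https://example.com/digit-recognizer-data.zip"  # placeholder, not the real link

os.makedirs("data", exist_ok=True)
archive = wget.download(DATA_URL, out="data")   # downloads the zip into ./data
with zipfile.ZipFile(archive, "r") as zf:
    zf.extractall("data")                        # leaves train.csv, test.csv, sample_submission.csv
```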
If I come here, this is also the load data step, and this is also the load data step. If you can read this, it says: leave the step name empty to merge the cell with the step above. So basically, you can make multiple cells belong to a single pipeline step.

The next pipeline step I have is preprocess_data. It's as easy as before: I come to the pencil icon, click on pipeline step, come to the step name, enter preprocess_data, and set its dependencies. To preprocess data, I need what has been loaded, so it depends on load_data; it doesn't depend on download_data, it depends on load_data. So the load data information gets passed to the preprocess data step, and the preprocess data step makes use of the loaded data, carries out transformations, and so on.

After the preprocess data step, it's time for me to do my modeling. For the modeling, just like I did previously, you come to the cell type, click on pipeline step, come to the step name, and type in modeling, or whatever name you feel like giving it. It depends on preprocess_data, so the preprocessed data gets passed into the modeling step. What I just did is create my model; it's a TensorFlow convolutional model. This is my modeling step, and I have several notebook cells under my modeling step. This is my compilation; because of time, I set the number of epochs to just two, take note. Basically, for the parameters I passed in as pipeline parameters, before I use them in my notebook cells, I tell the notebook what data type each one is. In this case, while compiling, my learning rate is of data type float, and for those of us with a Python background, we know what a float is in general: the decimals. So my learning rate is of type float, my batch size is of type integer, my epochs are of type integer. I just set that there and the notebook understands it.

The next step after modeling is the prediction step. I'm trying to make this as simple as possible, and when it's time for the Q&A you can ask questions. The prediction step depends on modeling, so what gets passed to the prediction step is the model that has already been fitted, the fitted model, and I use the fitted model to carry out predictions. After my predictions, I get a confusion matrix out of this; I believe we all have an idea of what a confusion matrix is all about. After that, I plot my confusion matrix so I'll be able to visualize it in the UI. So that's all about the creation of the pipeline; let's see it come to life.

Once you've done all this — this is also the submission code, which I have skipped because I'm not submitting anything to Kaggle, so you can just skip that cell — when you're done, you can compile and run. What Kale does is validate the notebook and check that your code is correct, and it takes snapshots. Basically, with the snapshots Kale offers you, once you've created the notebook, it saves all the information, the working environment, and the storage inside your notebook server; it saves it inside Kale.
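As a rough idea of the kind of modeling cell being described — a small Keras convolutional model driven by the learning rate, epochs, and batch size pipeline parameters — here is a hedged sketch; the layer sizes and parameter values are illustrative, not the notebook's exact code.

```python
# Sketch of a "modeling" step for the digit recognizer: a tiny CNN whose
# hyperparameters come from the pipeline parameters (float, int, int).
import tensorflow as tf

LR, EPOCHS, BATCH_SIZE = 0.001, 2, 64  # example pipeline-parameter values

def build_model() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),  # 10 digit classes
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LR),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# model = build_model()
# model.fit(x_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE)  # x/y come from earlier steps
```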
In the event that, say, you want to use it next month, or in two months' time, this information is already backed up for you. So once you're done, you get this view where you can see how your pipeline is running. I would want this to actually display properly; to view the actual pipeline, I think this is it, it should be there.

So what it does is create a volume first. After creating the volume, you see the download data step: the download data step is running. When the download data step is done, you're going to get the load data step. When the load data step is done, you're going to get the preprocess data step. When the preprocess data step is done, you're going to get the modeling step. When the modeling step is done, you're going to get the prediction step, which is the final step, and you have your pipeline graph. Once I see this, it means the pipeline is now running, the download data step is now running, and this should take about ten minutes or so. So I think I can take questions now while we wait for this run to finish.

Basically, the idea of this pipeline is that, since it's built on Kubernetes, in the event that you are training something with a very huge workload, it has the auto-scaling feature because it leverages Kubernetes, so it can scale the computing resources up or down based on your computing needs. Another feature of this pipeline is continuous training: for those of us from an MLOps background, or trying to get into MLOps, you can create a pipeline whose output is a model that goes into a model registry. So depending on how you set it up, this is like enterprise MLOps.

The download data step is completed; this is the load data step, and we have to wait for the load data step as well. So basically, like I was saying, the output of your pipeline could be a model that goes into a model registry. You could also look at the hyperparameter tuning features; however, the focus of this session is not on hyperparameter tuning or model serving, it's just on pipeline creation. Possibly some other time we can talk about model serving with Kale or Kubeflow, or hyperparameter tuning with Kale or Kubeflow. So if you have a question, you can start asking now; before we're done, the pipeline run should have completed. I think that's all, so if you have a question, you can start asking now.

Hello, sorry, can I come in? Yes, we can hear you. I think we have a question: is there a Kale extension for Colab? Okay, so Colab is not built on top of Kubernetes, so that wouldn't be possible; there's no Kale extension for Colab. Okay. I think there's another question on how you can join the community, although I'm not sure if it's the KCD community or the Kale community. Okay, so the Kale community is still within the Kubeflow and CNCF community. Typically there's a Kubeflow Slack; let me see if I can Google this, "Kubeflow Slack channel", and see if it directs me to the right place.
Okay, so this is the Kubeflow Slack channel. You can Google this, and the first thing that comes up is the Kubeflow community page; click on it and it should direct you to how to join the official Kubeflow Slack channel. I hope this is helpful; you should be able to get all of that information there.

Let's go back to the pipeline. The load data step is completed; we're now on the preprocess data step. So, back to the demo. At the end of the day, your pipeline should look something like this, which we'll get to see: you have the create volume step, download data, load data, preprocess data, modeling, prediction. Okay, preprocess data is completed, so modeling is next, the second to the last step. Let me copy this so you can see the results live. If you can hear me, I'm trying to log in on my private browser so the display can come out in full; I'm going to stop sharing now and request access there, so everyone can see the full pipeline. Can you see my screen? It's coming up, okay. Okay, yeah.

So this is the full pipeline. The beauty of Kale, again, is that you can actually see some of the things you're doing; you get visualizations, like the model training step. Can you see? Sorry, give me a minute. Olaide, if you are speaking, I'm not sure we can hear you. The output is just as it is in your notebook: when I call the .head() method, for example, it's still the same thing I'm going to see. And for the modeling step, I should see the model training output; probably when it's done, you'll get to see it. So it's just your entire notebook. You don't get to stress over "I need to do some things to turn this into a pipeline"; it's your entire notebook, the same way you have it in Colab, the same way you have it in JupyterLab. You just upload it here and turn it into a pipeline. So it's really helpful to be able to see what you're doing.

Typically, if you do this with the KFP SDK, the traditional Kubeflow Pipelines SDK, you would need to write the components individually, and you would need to run some methods, create clients, and compile. With this one, I just have my notebook; I don't stress about that. I just click things, type things, click things, type things, and yeah, I have a pipeline. And a pipeline is actually very critical when you want a reproducible ML workflow, because every time I run this, I'm always going to get the same steps. I could take this to someone else's Kubeflow cluster, different from mine, and I'm still going to get the same thing. I could pass the output of my steps to a model registry, and it's still going to be the same thing. So this is the beauty of machine learning on Kubernetes: machine learning leveraging Kubernetes' orchestration capabilities, scaling capabilities, and portability capabilities.
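For contrast with the click-and-tag workflow just described, here is a hedged sketch of the manual KFP SDK route the speaker alludes to — writing a pipeline definition yourself, compiling it, and submitting it through a client. The pipeline contents, image, file name, and run name are placeholders; with Kale, all of this is generated for you from the tagged notebook.

```python
# The "traditional" Kubeflow Pipelines SDK route: define, compile, submit.
import kfp
from kfp import dsl
from kfp.compiler import Compiler

@dsl.pipeline(name="digit-recognizer", description="Hand-written pipeline definition")
def digit_pipeline(message: str = "hello"):
    # A single container step, just to have something to compile; a real
    # pipeline would wire several component ops together here.
    dsl.ContainerOp(
        name="echo",
        image="alpine:3.18",
        command=["sh", "-c", "echo $0", message],
    )

Compiler().compile(digit_pipeline, "digit_recognizer_pipeline.yaml")  # Kale does this step for you

client = kfp.Client()  # talks to the in-cluster Pipelines API
run = client.create_run_from_pipeline_package(
    "digit_recognizer_pipeline.yaml",
    arguments={"message": "hello"},
    run_name="digit-recognizer-manual-run",
)
```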
So this is the output from my notebook: the model summary, and this is the training step. The last one is supposed to show me my confusion matrix.

There are still other features which I didn't get to show in this session. There are ways in which you can output pipeline metrics for experiment tracking: you want to be able to compare different pipeline runs and their metrics. Let's say in my first experiment I got 90% accuracy, and now I run it again and I'm getting 89%; I want to be able to track these changes. These are some of the features you can also get from your pipelines and from Kubeflow in general. Let me see if I have something I've trained in the past. Okay, let me go to experiments. This is the runs UI, and these are some of my past experiments. If I go into an experiment, you get to see the metrics, you're able to compare these different metrics, and you can rerun the desired pipeline. Let's say you run about ten different pipelines and you want to pick the best one based on your runs; these are some of the things you'd take into consideration, but that's not within the scope of this session, so we won't be going over it. But these are some of the things you can do: experiment tracking, pipeline orchestration, model serving with Kubeflow, and you can run distributed training with Kubeflow. Yeah, I think that's it.

So we have our pipeline run completed. That's how to create a reproducible machine learning pipeline using Kale. And this is the visualization, this is my confusion matrix. That's all.

I don't think we can take more questions because of the duration. Thank you, Olaide, for that awesome session. If you have any questions for Olaide, you can either drop them in the chat or reach out to him on his handles, where you can ask your questions just in case you're unable to get a response here.