Right, we're going to go ahead and get started now. Thank you, everyone who is joining us today. Welcome to the CNCF webinar, Taming Your AI/ML Workloads with Kubeflow: The Journey to Kubeflow 1.0. My name is Daniel Oh. I work for Red Hat as a technical product marketing manager, and I'm also a CNCF ambassador. Today I will be moderating this webinar, and we'd like to welcome our three awesome presenters: Johnu George, technical lead at Cisco; David Aronchick, head of open source machine learning strategy at Microsoft; and Elvira Dzhuraeva, technical product manager for AI/ML at Cisco.

A few housekeeping items before we get started. During the webinar, attendees are muted, so you won't be able to talk or raise a hand. There is a Q&A box at the bottom of your screen; please feel free to drop your questions in there, and we will get to as many as we can at the end. This is an official CNCF webinar, and as such it is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of the code of conduct; basically, please be respectful of all of your fellow participants and the presenters. Please also note that the recording and slides will be posted later today to the CNCF webinars page at www.cncf.io/webinars. So I'm going to hand it over to Johnu, David, and Elvira. David, take it away.

Thank you so much for the wonderful introduction, and thank you so much to the whole community that has come together to hear about Kubeflow. It has been quite a journey for us, and I'm really excited to have the opportunity to take you through it now.

I'd like to start by looking back to the past. Does anyone know what this is? Normally this works a little better in a conference room, but this is in fact the SNARC maze solver, widely credited as the first machine learning implementation. It came out in 1951, and as you can see, it's all a hard-wired circuit. The reason I like to show this is that I want to stress how machine learning really isn't a new thing. People have been thinking for a long time about taking raw input data, where you don't have a clear direction on how to solve things, and using that data to make predictions.

More recently, machine learning has accelerated quite a bit. Around 2000, some of the first Python libraries were added to make machine learning easier. In 2006 we had the advent of NumPy, then along came Theano, pandas, scikit-learn, Caffe, DL4J, and then in 2015 we had quite an explosion of open-sourced machine learning frameworks: TensorFlow, PyTorch, Chainer, and so on. So there have been quite a few advances from a software perspective, and today we have even more, of course.

So the question is: do we need one more machine learning solution? The answer, I would say, is yes, and the reason is quotes just like this one. This is from a very sophisticated organization, presented about two years ago at Strata Data. Basically, what happened was that the enterprise developed the model very quickly, but eleven months later it still hadn't been rolled out to production. And the reason is that developing a model does tend to be pretty easy: you're able to get area under the curve, you have very clear metrics, and things like that. But all the other elements of machine learning can take quite a while. This is a more recent example, from GitHub's natural language search.
They were able to demo in Jupyter in just two weeks, and they were able to post a front-end mock-up in another three days. But three months later, experiments at github.com were still going on because, again, rolling it out to production was so challenging.

And when I say all the elements of machine learning, I'm really talking about this. On the left-hand side, you have the building of the model, using the tools that you see there or many more. In the middle, you have the evaluation steps of the model: tracking, comparing to previous rollouts, how you do deployment, how you tune to make sure it meets your SLOs, and so on. And on the right-hand side, you have more of the infrastructure layer: resource provisioning, dealing with your enterprise access controls, and so on. Unfortunately, prior to Kubeflow, these things exploded into an exponential test matrix, where every change in any one of the steps required massive changes and comparisons all over the map.

So in 2017 (you can see me there on the left-hand side at KubeCon) we introduced Kubeflow. The idea behind Kubeflow was to provide a simple layer in between many of these pieces, as well as many of the core functions you would be looking for, so that all of those elements on the left- and right-hand sides could use clean, consistent, cloud-native interfaces to input, output, and run common machine learning tasks. At the time, I presented this slide and said we want to make it easy for everyone to develop, deploy, and manage portable, distributed machine learning on Kubernetes, which prior to Kubeflow was challenging, and in many ways still is; we can do a lot better. But the reality is that we just wanted to simplify it so that data scientists and machine learning engineers could operate at the next level up.

A year later, I presented this same slide to the KubeCon Community Summit, because our mission hadn't changed. It was 2018 and we still wanted to do this. In 2019, I did the exact same thing, presenting that slide, and our mission still hadn't changed. And next year I'm going to say the exact same thing, because this is a persistent mission: we really want to make it much, much easier for people to engage with and use machine learning at scale.

When people ask why we went about it with Kubeflow, and what that mission drills down into, it really looks like the following. First, we want it to be declarative. Most data scientists today will go through an iterative loop on their laptop and get great success. Unfortunately, many of the configurations and choices they made are lost, so when it comes to bringing those even to experimentation in the cloud, let alone development, staging, and production, lots of things can be lost. If you can do things in a declarative format, if you can say, hey, I'm setting up a pipeline and it requires these four services, go figure out how to roll them out using constructs in Kubernetes, that can be extremely powerful, and it gives you portability to any place that Kubernetes runs: your local laptop, your on-prem cluster, and any hyperscale cloud. Second, we want it to be abstracted.
Data scientists understand many, many things about building a model, but they don't need to understand what has to happen to get the master node to sync with the parameter nodes and the worker nodes for TensorFlow, right? That's obviously quite complicated and unnecessary. They'd much rather operate at a higher level. With Kubeflow, we're giving them an abstracted layer so that all they have to do is understand how to interact with a notebook, and then they can hand those problems off to the framework. And finally, scalable. A lot of machine learning takes place on single nodes today, not because people want it that way (they obviously want their machine learning models trained and rolled out faster), but because setting up distributed training, distributed data, pipelines, and things like that is very complicated. If we can unlock a lot of scalability by giving people standard constructs, we should do that.

And I like to say this as a joke, but it's only half a joke: we really want to get to machine learning without the letter K. If you're a machine learning engineer or a data scientist, you should never have to know that you're running on top of Kubeflow, let alone Kubernetes. You should interact with the tools that you are familiar with and that make sense to you, but magically be given all this additional framework and tooling so that your applications and training pipelines roll out quickly.

You can see here that we've gone through quite a journey. In 2017, when we first got going, we had just three services: Jupyter, TFJob, which allowed for distributed TensorFlow training, and TensorFlow Serving. By May of 2018 we had quickly added our first pipelines, as well as Seldon for serving non-TensorFlow models and Ambassador for front-end security, because, to say the least, Jupyter and TensorFlow Serving and so on were not built for a lot of enterprise requirements or for being surfaced to the internet, so the extent to which you could start hiding them is better. Then we continued to move through January of 2019; Pipelines was an enormous release there, where you can now write entire workflows using nothing more than Python. Very impressive stuff. In the middle of 2019, we introduced KFServing, which gave you a clean control plane for serving and rolling out your models, as well as Fairing, allowing you to declare, using metadata, what your infrastructure should look like. Through the rest of 2019 we continued to add more features, multi-user support, and deeper integration with Pipelines. And finally, now in March of 2020, we are very proud to announce that Kubeflow 1.0 is here.

You can see how this breaks down. At the start, we were really just trying to define the individual applications. The second phase was connecting the apps and pulling the metadata together. And finally, we landed on productionization and hardening. We've had enormous momentum. You can see the number of PRs that we've had roll out, commits from all over, including all of these communities and all of these companies and many, many more. And when I look at what makes a great community, I think about my experience with Kubernetes; I was one of the first PMs on Kubernetes. You saw an enormous amount of contributions there, but what it really comes down to is: is this controlled by a single company? A lot of people complained about Kubernetes, saying, oh, it's all Google.
No, it's not. In fact, more than half of the contributions are not from Google. Kubeflow looks the same. Google obviously makes very significant contributions to this, but the majority of our contributions come from outside Google. And again, this is something we're really proud of; it keeps us all honest and makes sure that things work properly. And with that, I'd like to hand it off to Elvira to talk about some of the elements that went into Kubeflow 1.0.

Thank you, David. Let me ask Johnu to switch the slides. David, can you stop sharing? Oh, yep, sorry. Okay, great. So last year we conducted a couple of user surveys and interviews that showed us that our main target audiences are machine learning engineers, data scientists, and DevOps engineers. They come from two different types of organizations: big enterprise organizations with more than 5,000 people, as well as small companies with fewer than 500 people. Interestingly enough, both kinds of organizations are using both on-premise and cloud infrastructure for their machine learning workloads. With this being said, one of our goals for Kubeflow 1.0 was to run on both cloud and on-premise. If you can go to the next slide. So you can see that with one simple command, kfctl apply, you can now deploy Kubeflow on any Kubernetes cluster, whether it's a public cloud, a private cloud, or even on-premise. So it's very easy to get started with machine learning today and deploy it anywhere. Can you go to the next slide, please?

Right. So what is Kubeflow 1.0? With Kubeflow 1.0, we are graduating a core set of stable applications that, taken together, deliver our core critical user journey: to develop, build, train, and deploy machine learning models on Kubernetes efficiently. The focus of 1.0 was really about production readiness and stability. So here's the set of applications that we consider production ready. We have a central dashboard UI that provides quick access to the Kubeflow components deployed in the cluster. It's a really great tool; it houses the UIs of the running components, including Pipelines, Katib, notebooks, and many more. The training operators cover the most popular training frameworks, including the TF operator, PyTorch operator, MPI operator, and XGBoost operator, with which you can create, manage, and run distributed jobs for that particular framework. Jupyter notebooks are the most used Kubeflow component; this is the best way for a data scientist to develop their models. With the profile controller and its UI, you can manage your multi-user environment. And kfctl is there for deployment and upgrades.

Apart from the stable components, you can also see beta components. That includes KFServing, for easily serving models in production. Pipelines is an end-to-end orchestration tool for machine learning workflows. It's a really great tool; you can use it to get a big picture of what is going on with your machine learning workloads, as well as to connect all the steps together. Katib, a tool provided by Kubeflow, is a hyperparameter tuning tool for data scientists. It's usually very hard to find the best parameters to get better accuracy for a model, so Katib helps data scientists automate this process. And we also have Fairing, a tool that helps port your machine learning code from a local environment to a server or to the cloud.
And we also have a metadata store that stores all the artifacts collected from experiments and model development, so you can easily come back and track what you changed last time.

Okay, let's go through the critical user journey. Here you can see two users, both data scientists, Bob and Liz. Both of them have access to their personal notebook servers, with different Jupyter notebooks that they have created for a variety of machine learning model development. And the cool thing is that only Bob has access to his notebooks, and Liz has access only to her notebooks, stored in the Liz namespace. But what if they want to work together on the same model and the same data set? In this case, they can have a shared namespace where they both have access to the notebooks, and now they can collaborate. Next slide, yes.

So, once a machine learning model has been developed, there is source code for it, but usually what happens is that the data scientist developed this code on a local machine and now wants to train it on a server. In most cases, this is a manual operation where the data scientist needs to manually port the code to the machine. That requires knowledge of the infrastructure and the need to create images manually all the time. In order to reduce these complex steps, Kubeflow provides Fairing. It's a great tool that basically takes your machine learning code, builds an image for you automatically, and ports it to the server or to the cloud.

In the next step, once you have the code ready on the server, we can go to training. Before you start training, as I said, you might want to have your hyperparameters tuned, so you might want to use Kubeflow Katib to find the best parameters for your model. And once those parameters are set, you can run your machine learning training either as a single run or in a distributed way with multiple GPUs. As I said, we ran a couple of user surveys and asked what the most used frameworks are, and it appears that TensorFlow, PyTorch, and XGBoost are among the most used frameworks, as well as scikit-learn. They are all available in Kubeflow.

And the last step is the deploy step. This is the step where you actually have a model ready and now you want to deploy it to have it available in production. Kubeflow provides this capability with KFServing. KFServing is a custom resource on top of Knative. It really helps both data scientists and IT operators or DevOps engineers put models in production, because it hides the complexity of Kubernetes and Knative; you don't need to know much about them. On the right-hand side, you can see configuration files, and the cool thing is that you don't need to do a lot. With this simple configuration file, you only need to specify the framework you use, like scikit-learn or TensorFlow, and point to the storage where your trained model is stored. If you use a different framework, you only change these two lines, and you never see the whole complexity of Kubernetes or any infrastructure layer. For IT operators and DevOps engineers, there are also additional advanced features that help manage GPUs, do automated scaling based on load, and handle deployments and rollouts.

And on this note, I'd like Johnu to give a demo of Kubeflow. Thanks, Elvira. Okay. So this is the Kubeflow dashboard, deployed with the latest stable 1.0 release manifest.
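For reference, here is a minimal sketch of the kind of KFServing configuration Elvira just described, expressed as a plain dict and applied with the standard Kubernetes Python client. The service name, namespace, and storage bucket are illustrative, not from the demo, and the v1alpha2 InferenceService schema is assumed to match the KFServing version mentioned later in the Q&A.

```python
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running inside the cluster

# A minimal v1alpha2 InferenceService: pick a framework, point at the model storage.
# Swapping TensorFlow for scikit-learn only means changing the "tensorflow" key
# (to "sklearn") and the storageUri, which is the two-line change from the talk.
inference_service = {
    "apiVersion": "serving.kubeflow.org/v1alpha2",
    "kind": "InferenceService",
    "metadata": {"name": "ble-localization", "namespace": "kubeflow"},  # hypothetical names
    "spec": {
        "default": {
            "predictor": {
                "tensorflow": {"storageUri": "gs://my-bucket/ble-model/export"}  # hypothetical path
            }
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.kubeflow.org",
    version="v1alpha2",
    plural="inferenceservices",
    namespace="kubeflow",
    body=inference_service,
)
```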
This is a standard installation, with all the default components there. I'll give a brief walkthrough. On the left, you see Kubeflow Pipelines, with which you can create end-to-end machine learning workflows; notebook servers, which can be used to create Jupyter notebooks; Katib, which is a scalable hyperparameter tuning service, as already discussed; and the artifact store, for storing various machine learning artifacts. There are many other components, like KFServing, as discussed earlier, for serving machine learning models, and distributed operators like the TF operator, PyTorch operator, MPI operator, and XGBoost operator, which can be invoked via the SDK or standard Kubernetes client tools. They don't have a specific dashboard right now.

Now let's start with a data scientist's perspective. A data scientist has a problem statement and wants to experiment with various solutions, try building various machine learning models, find the best model, and then later put it into production. He's not aware of any underlying infrastructure. So the question is, how does Kubeflow help here? As we know, data scientists are comfortable with Jupyter notebooks, so here we have already created notebook servers. I'll give a quick walkthrough of the options in the notebook server configuration. Here you can specify a custom image if required, or you can use the built-in TensorFlow images for various versions. You can specify CPU and memory requirements, add GPUs if you need them, and add persistent storage if you need long-term storage. Then just click launch, and you have a notebook server of your own. Once the notebook server is created, just click connect, and that takes you to the Jupyter notebook interface. I've created a sample notebook; let's go to that.

Right. Let's talk about one interesting use case. What we have taken is an indoor localization problem: the objective is to accurately locate a person in a building based on various measurements. This was a Kaggle competition, and the dataset was created using Bluetooth Low Energy (BLE) RSSI readings from 13 beacons located on different parts of the floor of a library; the data was collected using a phone. These RSSI measurements are negative, so larger measurements mean closer proximity to a beacon. The data looks something like this: the input columns are the RSSI readings of the 13 beacons, one per column, and the output column is the location, which is the grid number. For example, P01 is the first grid in column P.

So once the problem statement is given, the data scientist wants to try out this problem. Using the Jupyter notebook, he can start with the preprocessing part, which includes data cleaning to remove columns that are not needed, and then normalizing the data. Once the initial preprocessing is done, he thinks about the model to be used. Here we have used the base model that was given in the Kaggle competition, which is a DNN classifier. Now, when it comes to neural networks, another question that comes to a data scientist's mind is: how do I find the hyperparameters for my neural network?
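Before turning to that question, here is a rough sketch of the preprocessing and baseline DNN classifier step just described, under stated assumptions: the CSV file name, column names, and network shape are guesses based on the description of the dataset, not the actual demo code.

```python
import pandas as pd
import tensorflow as tf

# Load the BLE RSSI dataset: 13 beacon columns plus a grid-location label.
df = pd.read_csv("ble_rssi.csv")  # hypothetical file name
beacons = [c for c in df.columns if c != "location"]

# RSSI readings are negative; rescale each column to [0, 1] so larger still means closer.
df[beacons] = (df[beacons] - df[beacons].min()) / (df[beacons].max() - df[beacons].min())

# Encode grid labels such as "P01" as integer class IDs.
df["label"] = df["location"].astype("category").cat.codes
n_classes = df["label"].nunique()

def input_fn():
    features = {c: df[c].values for c in beacons}
    ds = tf.data.Dataset.from_tensor_slices((features, df["label"].values))
    return ds.shuffle(1000).batch(32).repeat()

# Baseline model from the Kaggle competition: a simple DNN classifier.
feature_columns = [tf.feature_column.numeric_column(c) for c in beacons]
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32],  # illustrative layer sizes
    n_classes=n_classes,
    model_dir="/tmp/ble-model",
)
classifier.train(input_fn=input_fn, steps=1000)
```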
For people who are not familiar with hyperparameters: these are parameters set outside the actual model training; they are fixed values. For example, in the case of a neural network, the learning rate, the momentum rate, et cetera. So how does Kubeflow help there? It provides a component called Katib, which takes care of the hyperparameter tuning process. Let's see how it works. In the main dashboard, there is a Katib option; just go and click on submit for the hyperparameter tuner. This is a sample form for starting a simple hyperparameter tuning process. It basically has four major sections: the objective, the algorithm, the parameters, and the trial spec. The objective specifies whether we have to maximize or minimize a particular objective metric, which can be, for example, accuracy, as here, or a loss function, along with the final goal we have to reach. The second section is the algorithm to use for hyperparameter tuning; it can be random search, grid search, hyperband, or Bayesian optimization. Those are provided by default in Kubeflow Katib, and we can add initial settings for the algorithm if needed. The third section is for adding the parameters we want to tune. This lists each parameter and its type, whether it is double, int, or categorical, and its corresponding feasible space: the minimum and maximum or, if it is categorical, the list of items. And finally, we specify the image with the real training code to be optimized.

We ran this beforehand, since it takes more than five minutes, on exactly the same code we have here. Let's see the experiment settings we used. As discussed, we set the objective metric to L2 loss; since it's a loss function, it has to be minimized. The parameters we are tuning are the learning rate and beta one, and the corresponding feasible spaces are provided. We provided the trial template, which specifies the image to be tuned in the process, the image corresponding to our code. Once it is completed, it lists the best trial we got, the corresponding parameter assignments, and the L2 loss value, which is the minimal value across all the trials. We have a visual interface too. Here you can see that the leftmost column is the L2 loss, which is the objective metric, and then the two different hyperparameters that were tuned; for each combination, you have an L2 loss value. There were 18 trials, and the best one was the first trial, with an L2 loss of 2.07 and the corresponding learning rate and beta one. Just to demonstrate the power of hyperparameter tuning: these results beat the existing Kaggle model by 20% with just five minutes of tuning.

Now let's go back to our classification code. We've got the best hyperparameter values, right? So we set those, and now the model classifier is specified, along with the exporter and the training and evaluation specs. The evaluation spec contains the exporter function, which specifies where exactly the model is exported, and finally the estimator's train and evaluate function is called from there.
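As a reference for this wiring, here is a minimal sketch of the estimator train-and-evaluate setup being described. It continues the earlier sketch (reusing the hypothetical beacons, input_fn, and classifier from there); the exporter name, serving signature, and step counts are illustrative.

```python
import tensorflow as tf

# The exporter saves the trained model in SavedModel format so it can be served later.
def serving_input_fn():
    # One float feature per beacon column; names must match the training features.
    inputs = {c: tf.compat.v1.placeholder(tf.float32, [None]) for c in beacons}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

exporter = tf.estimator.FinalExporter("ble-model", serving_input_fn)

train_spec = tf.estimator.TrainSpec(input_fn=input_fn, max_steps=5000)
eval_spec = tf.estimator.EvalSpec(
    input_fn=input_fn,
    exporters=[exporter],  # the evaluation spec carries the exporter, as described above
)

# The same call works unchanged in local mode and under a distributed TFJob,
# which is the point made in the next section.
tf.estimator.train_and_evaluate(classifier, train_spec, eval_spec)
```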
Now, one important thing to note is that we use the estimator class's train and evaluate function to enable model export and saving. The advantage is that there is no code change if you want to run this in local mode or, in the future, move it to a distributed setup. As discussed earlier, we can use the Kubeflow Fairing component to wrap the notebook code and run it in a distributed manner on any remote Kubernetes cluster, on-prem or in the cloud. Here the Kubeflow operators come in handy, setting all the required distributed configs automatically for you in a distributed setup. For example, to run a distributed TensorFlow job you have to set TF_CONFIG, and in PyTorch you have to set the rank for each of the nodes. These things are done automatically by the distributed operators in Kubeflow. So we just move to the distributed setup and run exactly the same code, and we basically get a scaled-up setup. It all works like magic when you are moving from local to remote. For now, we'll skip the Fairing part, since it takes a little time; we'll try to cover it at the end. So we start the training, and once training is done, we just plug in the values we got from the hyperparameter tuning process.

Once we have done the training, the data scientist wants to try sending some sample custom inputs for prediction. How can Kubeflow help here? As discussed earlier, with a simple KFServing config the model can be deployed. It just needs parameters for the model storage location as well as the framework type, TensorFlow. You just apply that and wait for the serving pod to come up and the model to deploy. Once the model is deployed, a route URL is automatically created, and this serves as the prediction endpoint. Now that we have the prediction endpoint, you can send custom inputs to see the prediction. For instance, you can send different RSSI measurement values for each of the sensors as the custom input, and as a result we get the class ID it belongs to, which corresponds to the grid location. Here it is the 13th grid location that is predicted, with the corresponding highest probability value.

So far we have covered one complete local training run with hyperparameter tuning and then local deployment. Once this is verified, the data scientist wants to move it into production. How can you do that? The same code can be packaged, built, and uploaded to a remote repository, such as a Docker registry or GCS, and then you can start training and serving from there using the Fairing library. Just to show it here, since the build takes time, I'm not going through the actual build process, but you can see that with the simple Fairing SDK, which specifies the actual training job, once it is submitted the image is built and the deployment process is started in the remote cluster. And once the training is completed, an endpoint can be created using Fairing itself, and we can run exactly the same prediction example against the remote cluster. So scaling up from local deployment to remote deployment takes just about three lines of code using the Fairing library. This covers the basic journey of a data scientist, from local training to the final production-ready cluster.
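For readers who want a feel for the Fairing step that was skipped in the demo, here is a minimal sketch, assuming the kubeflow-fairing package's config API. The registry name, base image, and training function are placeholders, and the exact builder and deployer options may differ by version.

```python
from kubeflow import fairing

DOCKER_REGISTRY = "index.docker.io/your-user"  # placeholder registry

def train():
    # The same notebook training code from the sketches above would go here.
    ...

# Roughly the "three lines" mentioned in the demo: choose how the image is built,
# choose where the job runs, then wrap the training function and call it remotely.
fairing.config.set_builder(
    "append",
    base_image="tensorflow/tensorflow:1.15.2-py3",  # placeholder base image
    registry=DOCKER_REGISTRY,
)
fairing.config.set_deployer("job")  # run as a Kubernetes Job in the remote cluster

remote_train = fairing.config.fn(train)
remote_train()
```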
So, Elvira, do you want to take it from there? Yes, sure. Right, so all this would be nothing without feedback from our users and collaborators. We're really glad to see that Kubeflow is helping people manage machine learning workflows at scale these days, and you can read these nice quotes and feedback from our users. Now, what is next? If you can go to the next slide. Yes, what is next? You saw the variety of Kubeflow components, and as I said, we have beta components that we're currently working on to eventually make production ready as well. That includes Pipelines, a tool for orchestrating complex machine learning workflows, as well as metadata, Katib, and many other components still in beta. And of course enterprise readiness, which involves a lot of work on security and vulnerabilities. We have already started working toward these action items, and of course managing Kubeflow upgrades is a main topic too. You can see these different resources here: you can go to the Kubeflow website, and you can use the next link to run the demo yourself. I also ask you to please take the survey; the survey results really help us prioritize features in Kubeflow. We really listen to our users, and you saw some user survey results today; I update them constantly once we have enough respondents. So please take it. I'd like to thank everyone for joining today. And now we're open for the question and answer session; you can post questions in our Q&A box. Thank you very much again.

Awesome. Thank you, Johnu, David, and Elvira, for the great presentation and a really awesome demo. We now have some time for questions. If you have any questions you would like to ask, please drop them in the Q&A tab at the bottom of the screen and we will get to as many as we have time for. We actually have six questions, and two or three have already been answered. I think we can go through all the questions, including the ones already answered, in case other people would like to know. Yeah, that's a really good idea.

So, first question. David, you already added some comments: why did we decide to invest effort in KFServing rather than helping Seldon, and what is the future of serving on Kubeflow? David, can you say a little more about that? Sure; you can see it right there. The net is that KFServing is just an abstraction interface, and it doesn't in any way handle the serving itself. It relies on underlying servers to do that, because what we heard from the community is that people want maximum options and choice when it comes to the actual server used for their particular solution. So now, with a single interface, you're able to interact with Seldon, Triton, TF Serving, hyperscaler serving options, you name it.

Got it, thank you. The next question we already answered: how is the notebook-sharing feature coming along? For example, as a modeler, can I share the notebook server with another user, a group, or even a team? The answer was that typically you create a shared namespace where you can add a user, and they will have access to all notebooks stored in that namespace. Any other comments, Elvira? Yes, they will have access to all the notebook servers in that particular namespace, yes.

All right, perfect. Next question: what is the direction for managing Kubeflow without being a cluster admin? The way it works is that a cluster admin adds users, and users are given a separate namespace.
After that, it is kind of on autopilot; the cluster admin doesn't need to worry about anything the individual users try to do. All right, cool, thanks for that. The next one is: I probably missed this part, but how are the hyperparameters from the hyperparameter tuning injected into the notebook? In the demo I just used a shell command, but there is a Python SDK for Katib, which means you can programmatically pick values from the output of the Katib experiment and then inject them directly into whatever code you have in the notebook.

Okay, thanks. Next question, from Harry: will the notebook in the demo be made available, something anyone can access and replicate? Yes, right, so you can use the link that I posted here on the slide, the demo source. You can find these notebooks and many other notebooks under that link, and you can use them. Nice, perfect, I love it.

Next question: is there a notion of doing things interactively using Eclipse Che or VS Code, rather than doing this via a notebook or jobs? I have seen very similar issues on GitHub, but I'm actually not sure about the progress on them. I've seen people asking for it; we would ideally like to have community contributions for that, and I can track the changes and updates happening there, but I'm not currently up to date on that. One thing that we should stress, I would say, is that we've really tried to be clear not just about what Kubeflow is good at, but about what it's not good at. Though we certainly offer an interactive experience in Kubeflow, we do expect the vast majority of people to continue using whatever IDEs and tooling they're familiar with for development, experimentation, and so on. In this particular case, our recommendation would probably be to use a standard GitOps or MLOps pipeline: code in the language that makes sense to you and explore locally, and that includes VS Code or Eclipse Che or whatever you like, and then check those artifacts in. That really is the best practice. Though we offer Jupyter as a way of hosting, I wouldn't say that solves everything, nor would I expect us to figure out how to do that. That said, we're always open to additional plugins; there could be a VS Code extension and so on, and we'd really love to see that.

Thanks, David and Johnu. Next question: are there any plans to support GenCine in the future? Again, I'm not aware of that, but given the composable nature of Kubeflow, I don't think there is any problem supporting it in the future; I just don't see any particular effort on that point right now. If you don't mind, I'd love to hear what you'd like to see supported. I agree with Johnu; the extensibility of Kubeflow is such that you can basically build any package you want and add it as a pipeline component or whatever you like. But I'd love to hear more; maybe we can do it offline. Just email us (you can see all our email addresses down there) with what you're looking for support for, or better yet, file an issue in Kubeflow and let us know about it, and mail to kubeflow-discuss. All right, yeah, send it to us directly.

Okay, next: one pain point we have in production is pre- and post-processing of the data during inference. Does Kubeflow help with that, and if so, how? Here, the v1alpha2 version of KFServing actually supports that; there is a pre-processor and a post-processor.
In the example we did, we only showed the prediction part, but there is an option in the current version to have a pre-processor and a post-processor for the data, so based on your needs, it is possible to do it. Okay, thank you. And again, this is another area where I would probably expect the underlying serving tool to take care of the work, more than something at the Kubeflow layer. That said, you should look at KFServing for the interfaces used to declare what pre- and post-processing you need.

All right, cool. Next question, from Jeffrey: how does version management of the code fit into this flow, in terms of reproducibility? For example, all the Python code in a notebook? I would say, for example, in the case of using Fairing, you can actually build the entire code into an image, and then it becomes immutable, so that is definitely possible. And ultimately, the files, the training code that you have, can be versioned in the same manner as in a normal development process. Yeah, and again, let me stress: I think the best organizations in the world solve this (I know I'm highly biased here) using things like GitOps and GitHub Actions as your CI/CD. What would happen is that a data scientist iterates on their inner loop, on their laptop or on their private cluster, and continues to iterate. Once they're done, they commit that into a GitHub repo, and in that repo you build a set of actions that process it into a production-ready or training-ready form where things can be distributed at scale, using something like nbdime or other automated tools to strip out and clean and so on. By making Git the center point, the source of truth for everything, we are not trying to reinvent the wheel when it comes to all the details around merge conflicts, single sources of truth, immutable data structures, tracking over time, and things like that.

All right, perfect. This question is from Dimitri: as I understand, KFServing is based on Knative, but Knative's Python support is not very good, because the Knative developers focus on Go. Do you have any plans to contribute better Python support for Knative? Okay, this looks a little out of context here, but again, happy to hear if you have any particular use case that is not supported; we can chat offline to see if you have a different set of requirements. Yeah, just to follow on exactly what he said: let us know what's working or not working. Just to show you, if you go to the KFServing repo, you can see (I can show you in a screen share or whatever) that the stack is basically completely flexible. At the top layer, you should be able to declare just about any framework you like, throw it in a container, and everything else should cascade down. If that doesn't work, then that's absolutely a bug, and you should file it in the KFServing issues on GitHub, mail out to kubeflow-discuss, and post in the Slack and so on. But there's nothing about Python that should slow anything down here. Let us know what you want. You can see right there in the repo that PyTorch is up there, XGBoost, which is obviously Python based, and scikit-learn. Cool.
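To make the earlier pre- and post-processing answer concrete, here is a minimal sketch of a custom Python model using the kfserving SDK's KFModel hooks. The class name, the rescaling, and the label mapping are illustrative, not from the demo, and the hard-coded prediction stands in for a real model call.

```python
import kfserving

class BLELocalizer(kfserving.KFModel):  # hypothetical model name
    def __init__(self, name: str):
        super().__init__(name)
        self.ready = False

    def load(self):
        # Load the trained model from storage here (omitted in this sketch).
        self.ready = True

    def preprocess(self, request: dict) -> dict:
        # Example pre-processing: rescale raw negative RSSI readings to [0, 1].
        instances = [[(r + 100) / 100.0 for r in row] for row in request["instances"]]
        return {"instances": instances}

    def predict(self, request: dict) -> dict:
        # Call into the loaded model; hard-coded here to keep the sketch short.
        return {"predictions": [13 for _ in request["instances"]]}

    def postprocess(self, response: dict) -> dict:
        # Example post-processing: map class IDs back to grid labels such as "P13".
        return {"locations": ["P%02d" % p for p in response["predictions"]]}

if __name__ == "__main__":
    model = BLELocalizer("ble-localizer")
    model.load()
    kfserving.KFServer().start([model])
```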
So, next question: I'm doing some edge computing implementation; has anyone run Kubeflow on Raspberry Pis? Boy, I haven't heard of it. Because all we require is a conformant Kubernetes cluster, there's absolutely nothing preventing you from doing that. What I would recommend, however, is that it's really about building that last pipeline step and thinking about how you do distribution of your model. In that particular case, I would strongly recommend, if you can, using either a service that you built yourself with one of the IoT frameworks out there or, ideally, a hosted IoT framework. Distribution of models across edge devices is a particularly challenging problem: loose connections, irregular updates, low bandwidth, and things like that. There are many, many problems you can run into. So if you go out and use Azure IoT or Google IoT or Greengrass from Amazon, whatever it might be, and build that into a pipeline step, but then ultimately use their service, that would be the recommended path, because they have already discovered and figured out a lot of the problems you would otherwise run into.

Right. The other question is: is it good practice to separate training and serving in Kubeflow into two separate clusters? Generally speaking, yes, because the uptime requirements are quite significantly different. If a training cluster goes down, it's fairly easy to just flatten it and restart. If the serving cluster goes down, you're in for a world of hurt. The general best practice in the cloud-native world is to install only what you need on the clusters where you need it, and then, any time you have a blast-radius change or a latency or other requirement change, to have a different cluster specific to that. So in this particular case, what you'd do is build two Kubeflow clusters side by side: on one you would install training (and potentially serving, if you want to do smoke testing), and on the other you would install Kubeflow with serving only, and then you would communicate between them. You do get into new problems, of course, like how you're going to deal with sharing credentials and things like that, but those are solved problems through things like Vault and so on.

Thanks, David. And I think the last question is about cold-start latency, for example with eventing in Python and scale from zero. Any comments? That is an interesting problem. I'm going to have to plead ignorance as to exactly what issue you're running into. I'd appreciate you following up; please drop this in kubeflow-discuss or, better yet, like I said, file an issue in the KFServing repo, and the right people will look at it.

Perfect, all right. Okay, great. Thanks, Johnu, David, and Elvira, for the great presentation and awesome demo; it was really practical. These are all the questions we have time for today. Thanks again for joining; the webinar recording and slides will be online later today. We are looking forward to seeing you at future CNCF webinars. Have a good rest of your day. Thank you.