Hello, everyone. I'm Kimonas Sotirchos, a software engineer at Arrikto, a lead of the Notebooks and Manifests working groups, and the release manager for Kubeflow 1.4. And I'm Andrey Velichkevich. I'm a software engineer at Cisco and the lead of the AutoML and Training working group in Kubeflow. We're really excited to be part of Kubernetes AI Day, and thank you for joining our talk. Today we're going to talk about how we went about developing and scaling Kubeflow's web applications.

But before we start actually discussing the applications, we need to learn a little bit more about the users. In our case, that's an entire ML team, with different roles that need to play together to have an efficient workflow. We'll have data engineers, who handle the initial stages of the data. We'll have data scientists as well as ML engineers, who take this data, analyze it, create the models, and train them in a distributed fashion. And then, because we're on Kubernetes, we'll also have DevOps engineers, who help along the process and also take care of model serving. All of this needs to happen in one cohesive platform with intuitive UIs. In this presentation, we're going to see how we used Kubeflow to build such a platform on three pillars: team isolation, scalability, and user experience.

The first one is team isolation. Since we are building a platform on top of Kubernetes, the solution here is to rely on namespace isolation, which means that different users and teams have their own namespaces, and we use RBAC, which is Kubernetes native, to control which entities have access to which resources. But let's take it one step higher and see how this works from a user's perspective when they're at their browser.
A user sends a request, for example to delete a notebook, to a web application living in the Kubeflow namespace. The web application then performs the action, deleting the actual notebook, on behalf of the user, with the permissions it has under its own service account. But here comes the first question: how can I ensure that namespace isolation and RBAC are being respected if the web app has permission to do anything it wants?

The answer is SubjectAccessReviews. SubjectAccessReviews are the same mechanism that kubectl auth can-i uses under the hood, with which we can ask Kubernetes whether a user has permission to perform an action on an object in the cluster. So the backend can create a SubjectAccessReview and be sure that the user is indeed authorized by RBAC to do the action they want.

This raises the next question: how can the web app know which user is actually connected and making the request? The answer here is HTTP headers. The web apps expect that a specific authentication mechanism, for example Google IAP, sits in front of the cluster, ensures that there's an authenticated user, and sets this user's identity in specific headers. To put this all together, we can also do it with a completely open source stack by using Dex. A user connects to Istio's Ingress Gateway, which works alongside Dex for OIDC, ensuring that an authenticated user is doing all of the actions with the web apps while respecting RBAC.

The three things to remember here are: we have namespace isolation for users and teams; we always have authenticated users; and we are using SubjectAccessReviews to ensure we comply with Kubernetes RBAC.

The next part is scalability. The easy answer here, since we're at KubeCon, is that we're relying on Kubernetes for a lot of the scalability.
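To make the SubjectAccessReview idea concrete, here is a rough sketch of the request body a web app backend might build and POST to the Kubernetes API with its own service account credentials. The field names follow the authorization.k8s.io/v1 API; the user, namespace, and resource values are illustrative.

```python
import json

def build_subject_access_review(user, verb, resource, namespace, group=""):
    """Build the JSON body of a Kubernetes SubjectAccessReview.

    The backend POSTs this to /apis/authorization.k8s.io/v1/subjectaccessreviews
    and Kubernetes answers (in status.allowed) whether RBAC permits `user`
    to perform `verb` on `resource` in `namespace`.
    """
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": {
            "user": user,
            "resourceAttributes": {
                "group": group,
                "resource": resource,
                "verb": verb,
                "namespace": namespace,
            },
        },
    }

# Illustrative example: is alice allowed to delete notebooks in her namespace?
sar = build_subject_access_review(
    user="alice@example.com",
    verb="delete",
    resource="notebooks",
    namespace="alice",
    group="kubeflow.org",
)
print(json.dumps(sar, indent=2))
```

Because the review names the end user from the headers, the web app's own broad permissions never leak to the caller: the action only proceeds if the cluster says this specific user is allowed.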
This holds both for the actual ML workloads and for the web applications: if we have a lot of requests, then we just spin up more pods for our web apps, which are stateless. But at this point I'd also like to discuss scalability from the web apps' point of view, and what we had to do to actually scale them when we have hundreds of requests coming from hundreds of users that want to do things on top of thousands of Kubernetes objects. How can we do this efficiently? I'll focus on how we keep fetching the data and have an auto-refresh mechanism.

What we currently do is polling with exponential backoff. We ask for new data every one, two, four, eight seconds, as long as the data is unchanged; once we have new data, we reset the counter and start asking more frequently again. This has some drawbacks: the user might wait up to 32 seconds if there's no activity on the list of objects, or, on the other hand, if there's a sudden change, then hundreds of users will all be asking for the same data with consecutive requests, which can result in a lot of load.

One future improvement we've considered is to keep a simple polling mechanism, a request every four seconds for example, but make it a lightweight request that only fetches the resourceVersion of the latest list object in Kubernetes. Only once the resourceVersion has changed does the UI do a GET request to fetch the latest data. And we can do a slightly better optimization still, because even in that case we might be getting thousands of objects in one request; we can use Kubernetes API chunking to cut these requests into smaller pieces, so that the load is lower on all of the components and we can get the data to the browser more easily.
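The exponential-backoff polling just described can be sketched in a few lines. This is a simplified model, not the actual web app code: `observe` would be called with each poll result (or, in the future improvement, with just the list's resourceVersion), and it returns how long to wait before the next poll.

```python
class BackoffPoller:
    """Poll for new data, doubling the wait while nothing changes.

    Waits 1, 2, 4, ... seconds up to `max_interval` while the data is
    unchanged, and resets back to `base` as soon as new data arrives.
    """

    def __init__(self, base=1, max_interval=32):
        self.base = base
        self.max_interval = max_interval
        self.last = None
        self.interval = base

    def observe(self, data):
        """Record a poll result and return the next wait in seconds."""
        if data != self.last:
            self.last = data
            self.interval = self.base  # activity seen: poll eagerly again
        else:
            self.interval = min(self.interval * 2, self.max_interval)
        return self.interval

poller = BackoffPoller()
waits = [poller.observe("v1"), poller.observe("v1"), poller.observe("v1"),
         poller.observe("v2")]  # a change resets the interval
print(waits)
```

Comparing resourceVersions here instead of full list payloads is exactly the cheap-check optimization mentioned above: the expensive GET only happens after `observe` detects a change.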
We've also evaluated other solutions like WebSockets or server-sent events, but since we are a little tight on time, we can discuss those more in our Q&A session.

Last but not least is the user experience part. It's really important for us, because in Kubeflow our targeted users, while they are working on top of Kubernetes, do not necessarily have Kubernetes experience, or at least we cannot assume that they have. Our users, the ML scientists, know how to do their specific ML tasks and use their algorithms, but we cannot expect them to run kubectl logs to see the logs, or kubectl describe if something goes wrong, in order to debug it. So our web applications need to provide all of this information in a user-friendly way, while at the same time layering it, so that the machine learning information is exposed to the user first, and the more advanced Kubernetes concepts are also shown but kept for advanced users.

Because I can't just describe a UX with words, you need to actually see what we're talking about, and to do that you also need to understand the requirements of an ML scientist and what they actually need from such a platform. To explain this, Andrei is going to walk you through what an ML scientist does, and specifically how they can do AutoML on top of Kubernetes with Kubeflow. So Andrei, over to you.

Thank you, Kimonas. I'm going to tell you a little bit about what AutoML is and how we solve this problem in Kubeflow. AutoML is the process of automating machine learning tasks. On this slide you can see the AutoML landscape, where users provide the training data and a configuration space, that is, a search space description. A search space can differ depending on your task.
It can be architectures, it can be hyperparameters, it can be features, and all of this configuration is passed to an optimizer based on some algorithm, for example Bayesian optimization or a neural architecture search algorithm, and the optimizer provides a configuration back to you. Then we pass this configuration to the model with the test data and push the model to production. AutoML covers a lot of different aspects, such as feature engineering, model compression, neural architecture search, and hyperparameter tuning, and we try to support all of these features in a cloud native way in Kubeflow.

So how do we solve this in Kubeflow? We have a project called Katib, which is a Kubernetes-native open source platform for AutoML, and this project is included in the Kubeflow distribution. A very nice part of this project is that, since it is built on top of Kubernetes, it is agnostic to machine learning frameworks and programming languages: you can tune your hyperparameters in any language. We also have support for any sort of Kubernetes CRD, so Katib can be used as an orchestrator on top of your Kubernetes custom resources. We support well-known optimization frameworks such as Hyperopt, Optuna, scikit-optimize, and many more. We are continually evaluating and adding support for new algorithms, and researchers can also use our platform to develop and evaluate new AutoML algorithms. Since the platform is language agnostic, and since it is built on top of Kubernetes, we can deploy Katib locally or on a public or private cloud. We also have native integration with Kubeflow components such as notebooks, pipelines, and the training operators.

Moving on to Katib's architecture, it's quite straightforward. Since Katib's experiment is a Kubernetes custom resource, we have three different controllers. One is the experiment controller, which takes the user's experiment YAML, processes it, and parses the information about what the user needs.
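For orientation, an experiment YAML of the kind the experiment controller consumes looks roughly like the sketch below. The field names follow Katib's v1beta1 API, but the concrete values, names, and the trimmed trial spec are illustrative.

```yaml
apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  name: bayesian-optimization-example   # illustrative name
  namespace: alice
spec:
  # The experiment budget
  maxTrialCount: 12
  parallelTrialCount: 3
  maxFailedTrialCount: 3
  # The objective
  objective:
    type: maximize
    goal: 0.99
    objectiveMetricName: accuracy
  # The suggestion algorithm
  algorithm:
    algorithmName: bayesianoptimization
  # The search space
  parameters:
    - name: lr
      parameterType: double
      feasibleSpace:
        min: "0.01"
        max: "0.1"
    - name: optimizer
      parameterType: categorical
      feasibleSpace:
        list: ["sgd", "adam"]
  # The trial template: any Kubernetes resource, e.g. a batch Job
  trialTemplate:
    primaryContainerName: training-container
    trialParameters:
      - name: learningRate
        reference: lr
    trialSpec:
      apiVersion: batch/v1
      kind: Job
      # ... container spec referencing ${trialParameters.learningRate}
```

The budget, objective, algorithm, and search space sections map directly onto the pieces of the architecture described here.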
Then we have a suggestion controller, which receives the search space description and the algorithm for an experiment and creates the suggestion service, where the actual AutoML algorithm is running. Our AutoML algorithm can produce hyperparameters or a new architecture, depending on the need. And then we have a trial controller. The trial controller spawns the trials, where the actual training job is running. Since a trial is an abstraction and can be any type of training job, we can even define a Tekton pipeline or an Argo workflow in our trials. Katib is then able to parse the training results from the trials: we send these metrics to the DB, send the evaluation metrics back to the suggestion service, and produce new hyperparameters.

Speaking of Katib, here you can see the YAML structure of one example experiment. Our users define the experiment budget with the specific trial information, then they define an objective, then an algorithm, and then the search space. We will see more of this in the demo that I will show you later. And here you can see the trial template. Since, as I said before, a trial can be anything, you can define your own specification here for your custom resource. You can define a whole Argo workflow here, and you can set up your experiment with a preprocessing step and a post-processing step, so it's very powerful if you want to have some sophisticated hyperparameter tuning experiments.

With that said, I will jump directly to our demo, where I'll show all of the UIs that we built in Kubeflow that Kimonas mentioned before, and how AutoML works in Kubeflow. This is the Kubeflow central dashboard. I think all of you are familiar with it, because we try to show our UI every time.
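To make that suggestion-and-trial loop concrete, here is a minimal sketch in plain Python, with random search standing in for the real suggestion service. All names here are illustrative, not Katib code, and the "training job" is a toy function.

```python
import random

def suggest(search_space, rng):
    """Stand-in suggestion service: sample one configuration at random."""
    return {name: rng.choice(values) for name, values in search_space.items()}

def run_trial(config):
    """Stand-in trial: pretend to train and return an objective metric."""
    # Toy objective: a smaller lr and 'adam' score better, just for the demo.
    return 1.0 - config["lr"] + (0.1 if config["optimizer"] == "adam" else 0.0)

search_space = {
    "lr": [0.1, 0.01, 0.001],
    "optimizer": ["sgd", "adam"],
}

rng = random.Random(0)
best_config, best_score = None, float("-inf")
for _ in range(12):                      # the experiment "budget"
    config = suggest(search_space, rng)  # suggestion controller proposes
    score = run_trial(config)            # trial controller runs a training job
    if score > best_score:               # metrics flow back; best trial tracked
        best_config, best_score = config, score

print(best_config, best_score)
```

In the real system each `run_trial` is a Kubernetes workload spawned by the trial controller, and the metrics flow back through the DB to a smarter suggestion service such as Bayesian optimization.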
Here the DevOps engineers can always see the CPU utilization, the pod CPU utilization, and all of the monitoring information that they want for the data science team. Of course, the UI is split across namespaces, so, as Kimonas mentioned before, you can easily adopt this UI for a huge ML team with namespace isolation, and everything will work.

Here you can see the notebooks UI, where you can define a notebook. You can choose from the predefined images, and you can also define the CPU, the GPU, and the volume where you want to keep your data. Once you create the notebook, you can click connect, and you will be in the familiar JupyterLab environment, all built on top of the cloud. This is very powerful, because from your notebook you can run predictions, run pipelines, run the training operators, or run a Katib experiment. You can also create a TensorBoard using the Kubeflow UI; TensorBoard is a very powerful tool for data scientists developing algorithms in a cloud native way, and you can deploy your own TensorBoards. We also have a UI for the volumes, where you can monitor your volumes. And we have the pipelines UI, which is basically the Kubeflow Pipelines project; here you can see all of your pipelines, pipeline runs, recurring runs, and all of the functionality from KFP.

Now, if we jump to the Katib experiments UI, I will try to submit a new experiment to show how this works from the UI perspective. First of all, we need to define the experiment name; we will use Bayesian optimization for this demo. This is the budget that I mentioned before, so you can define the number of trials that you want to run. Then you define the objective with the metric that you want to tune, and you can also define all of the additional metrics that you want to parse from the training containers.
Then you select the search algorithm. Katib supports a lot of algorithms; we will choose Bayesian optimization for this demo, but you can also choose from the list of NAS algorithms as well. So let's select Bayesian optimization. Here you can see the algorithm settings, which users can also modify and adapt to their needs. This is the early stopping section, which can help you to avoid overfitting; it's very powerful and you should definitely use it in your hyperparameter tuning experiments, but for this experiment we will simply run one algorithm.

And this is the search space. In this example, we will tune the learning rate, the number of layers, and the optimizer. You can, of course, add new parameters with a particular distribution; we support integer, double, discrete, and categorical parameters. You can edit the current parameters, you can delete them, and you can define your search space for the experiment. For the metrics collector, this is the list of metrics collectors that Katib supports; we will use StdOut.

And this is the trial template. We will use the default settings, so we are not going to change anything here. This is the template where our training process will run, and since we are going to tune the learning rate, the number of layers, and the optimizer, we just need to define the references for those particular parameters. For advanced users, we have a button which brings up the whole Kubernetes YAML structure, where you can define the object metadata and some API parameters that the form doesn't surface, which provides more functionality in a more Kubernetes-native way.

Once this experiment is running, you can see that some experiments have been created before: you can see the time when each was created, the number of successful trials, and the optimal trials.
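The StdOut metrics collector works by scraping metric values out of the training container's logs. The sketch below illustrates the idea with a "name=value" line format; the regex and helper are assumptions for illustration, not Katib's actual implementation.

```python
import re

# Convention illustrated here: training code prints "name=value" lines.
METRIC_LINE = re.compile(r"^\s*([\w-]+)\s*=\s*([-+0-9.eE]+)\s*$")

def collect_metrics(log_text, wanted):
    """Scan training logs and keep the last value seen for each wanted metric."""
    metrics = {}
    for line in log_text.splitlines():
        m = METRIC_LINE.match(line)
        if m and m.group(1) in wanted:
            metrics[m.group(1)] = float(m.group(2))
    return metrics

logs = """
epoch 1
loss=0.91
accuracy=0.62
epoch 2
loss=0.40
accuracy=0.88
"""
print(collect_metrics(logs, wanted={"accuracy", "loss"}))
```

This is why the objective metric name and the additional metric names configured in the form matter: they tell the collector which log lines to parse and report back to Katib.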
You will also see the parameters that the best trial produced. If we click into a particular experiment, you will see this plot, which shows the distribution of the different hyperparameters and the results those parameters achieved. At the bottom you will see the overview of the experiment, with the current status, the best trial, the trial parameters, and the current conditions that the experiment has met. If you click to the trials page, you will see all of the trials that the current experiment has been running, and if you click into a particular trial, you will see the metrics that the metrics collector gathered, so you can always monitor what's happening with your metrics in a particular trial. Also, which is very powerful, we highlight the best trial, and you can see its results by clicking on its name. Of course, we have more concrete details for the experiment, for example the objective, the trials, and the parameters, and, as I said before, Kubernetes users can always see the YAML structure of the experiment with all the required information. I think our experiment is running; we're not going to wait, but here you can see live how the parameters are changing, and also which trials are currently running and which trials have succeeded.

Last but not least, I want to briefly show the models UI, which is related to KFServing. Here I have already deployed one inference service to show what kind of parameters you will see. This is the inference service that I deployed before, and if we click into it, you will see its overview and its current statuses. You can see the details of this KFServing instance, you can see the logs from KFServing, which is very powerful, and you can see the metrics.
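Invoking such an inference service amounts to POSTing a small JSON payload to the model's predict endpoint. For KFServing's v1 data plane the request looks roughly like the sketch below; the model name is made up, and the image is a zero-filled stand-in for a real flattened 28x28 MNIST digit.

```python
import json

# KFServing v1 data plane: POST to /v1/models/<name>:predict
# with a body of the form {"instances": [...]}.
model_name = "mnist"                       # illustrative model name
url = f"/v1/models/{model_name}:predict"   # served behind the cluster gateway

payload = {
    "instances": [
        # One flattened 28x28 image would go here; zeros as a stand-in.
        [0.0] * 784,
    ]
}

body = json.dumps(payload)
print(url, len(payload["instances"]))
```

Each such request shows up in the inference service's metrics, which is what the Grafana panels in the next part of the demo visualize.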
These metrics are produced via Grafana, and we can see the current status of our inference service from the KFServing side. For example, if I invoke this service with some predictions, and this is just a simple MNIST example, we will see that the metrics actually change, and we will see the information that Grafana produces. So yeah, as we can see here, the request has come in, and DevOps engineers can monitor this page to do this kind of analysis. With that said, if you have any questions regarding the demo, feel free to ask them in the Q&A session, because we don't have much time right now; I will jump back to our presentation.

The last but very important thing that I really want to mention is our community. We're growing very fast: we have more than 12,000 commits and 22,000 GitHub stars on Kubeflow, which is very exciting, and we want to say thank you so much to our contributors. Kimonas and I are part of the AutoML and Notebooks working groups, and we would really like you to attend our working group meetings; we have regular meetings every week. Please join the Slack channels that we mention here. If you're using Katib in production, please add yourselves to the adopters list; this is very valuable for us to make the connection with our users. Also, please feel free to check out our latest presentations and demos from our recent summits. If you want to contribute, please check out the developer guide and the help-wanted issues, and submit new proposals. And, which is very important, please feel free to check out our roadmaps for AutoML and for Kubeflow in 2021. With that said, thank you so much for listening to us today. We're happy to answer all of your questions.