Good afternoon. I'm William Wang from Huawei Cloud, and I'm the manager of the Volcano community. In the last 10 years I have been working on traditional software, HPC software development, and AI and big data on Kubernetes. So today my topic is how to leverage Volcano to improve resource utilization for Kubernetes clusters. Firstly I'm going to talk about the Volcano project, and then I'd like to talk about the architecture of Volcano and how it works in Kubernetes. After that I'm going to talk about the new challenges and some new features. And finally I will show you several use cases to show how Volcano helps users improve their job performance and resource utilization. Okay, so the Volcano project was open sourced in 2019 and donated by Huawei to CNCF in 2020. Currently we have more than 500 contributors all over the world, and more than 50 enterprise users have adopted Volcano in their production environments. And this month we will release version 1.8. Here we can see the Volcano project has a strong relationship with the upstream computing frameworks, such as Spark and Flink in big data, and Kubeflow, MPI, PyTorch, etc. Currently we support most of these computing frameworks efficiently. And here we can see Volcano is not just a scheduler. It has a job controller for enhanced job lifecycle management and support for multiple pod templates. It also has the queue to help users share resources between multiple tenants. We also provide an enhanced scheduler that supports things like topology-based scheduling, gang scheduling, preemption, and backfill. For the underlying hardware, we are working with Kubernetes to support more kinds of heterogeneous devices like x86, Arm, GPU, and Kunlun. And for the command line, we have also developed a variety of command-line tools to help traditional HPC users migrate from Slurm or other HPC schedulers to Kubernetes more smoothly.
Here we can have a look at the Volcano scheduler architecture. Volcano supports AI and big data through job-based scheduling and a plug-in mechanism. Basically there are three parts in the Volcano scheduler. The first part is the cache. The cache watches the Kube API server and builds the job and task relationship based on the PodGroup and pods. As for the scheduling cycle: as we all know, in a distributed system it's very difficult to keep real-time consistency when making decisions. So Volcano schedules jobs based on a snapshot taken at a certain point in time, which makes sure each scheduling cycle is always consistent. In each scheduling cycle, we have the open session, multiple actions, and the closed session. In the open session, users can register a variety of algorithm plug-ins, shown in the red part. We support four actions by default, and these actions are executed in sequence. Let's take the allocate action as an example. The allocate action defines the resource allocation process. It calls the job order functions of the algorithm plug-ins to sort the jobs, and then it calls the node order functions to score the nodes. Finally, it checks whether the job is ready and submits the decision to the API server. For the algorithm plug-ins, we support more than 10 of them in Volcano, organized into first-level and second-level plug-ins. With this kind of design, Volcano has strong flexibility to support customers' scenarios. So this is the journey of Volcano. At the beginning, we developed a variety of scheduling policies and integrated with the TF operator, the PyTorch operator, and Argo to help users improve training workload performance on Kubernetes. And then we enhanced the queue, the resource reservation, and the throughput.
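The actions and two-level plug-in tiers described above show up directly in the scheduler's configuration. As a rough sketch, a volcano-scheduler.conf ConfigMap looks something like the following (the exact action and plug-in set is illustrative; check the Volcano documentation for the defaults of your version):

```yaml
# Actions run in sequence each scheduling cycle.
actions: "enqueue, allocate, backfill"
# Plug-ins are grouped into tiers (first level, second level).
tiers:
  - plugins:
      - name: priority     # job ordering by priority
      - name: gang         # all-or-nothing gang scheduling
      - name: conformance
  - plugins:
      - name: drf          # dominant resource fairness
      - name: predicates   # node filtering
      - name: proportion   # queue-based resource sharing
      - name: nodeorder    # node scoring
      - name: binpack      # pack pods to reduce fragmentation
```

Each action calls into the registered plug-in functions (job order, node order, and so on) during the open session, which is what gives the scheduler its flexibility.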
And also we integrated with the Spark operator and Flink operator officially to help users migrate their workloads from the Hadoop ecosystem to Kubernetes. Then we found that job management in Kubernetes is very difficult for users to maintain. So we enhanced job management to support TensorFlow, PyTorch, and MPI in the Volcano job plug-ins, so users no longer have to install all these operators anymore. Currently there are more and more users running their AI workloads and big data workloads, together with microservices, on Kubernetes. But what most users are most concerned about is resource utilization, so we have been exploring this area. Here are the new challenges around resource utilization. As we all know, AI technology has grown very fast in recent years, and it has now entered the stage of commercialization. According to an analysis report from OpenAI, since 2012 the computing power used in AI training has doubled every three to four months, and computing power is becoming the bottleneck of batch computing. Let's take GPT as an example. One GPT-3 training run requires about 10,000 GPU cards based on the V100 type of GPU, and ChatGPT training requires about 10,000 cards based on the A800 GPU. So computing power is the biggest bottleneck. But on the other hand, from third-party surveys, we found that the overall CPU utilization is less than 15%. There are many reasons for this. From the figure, we can see that long-running services have peaks and troughs; especially at night, the resource utilization is really low. Also, there's a big gap between the requested resources and the used resources. If this part of the resources can be reused, the utilization will be greatly improved. So in Volcano, we are going to support colocation and oversubscription to help users improve their resource utilization.
The first solution: if we want to improve resource utilization, we need to break up the isolated resource pools to serve different kinds of workloads. And secondly, we are going to deploy multiple kinds of tasks in the same cluster. Here is an example. The MPI job is a big job; it is like the stones. The big data ETL or transcoding workloads are like the sand; we can scatter them into the bottle. And the function services, like Monte Carlo, we can pour into the bottle like pouring water. In this way, we can improve the resource allocation greatly. And also, we are going to support oversubscription. In this figure, we can see that there is a big gap between the requested resources and the used resources. So the scheduler can oversubscribe this part of the resources to run some lower-priority tasks. Next, I'd like to introduce several specific features in Volcano that help users share resources. Three years ago, we added the queue to Volcano. For resource sharing between multiple tenants, the queue is decoupled from the namespace. That means different namespaces can submit jobs to the same queue, and a namespace can also submit jobs to multiple queues. It's very flexible. Here is an example. There are two queues, Q1 and Q2. At the beginning, there are no jobs in Q2, so the jobs in Q1 can borrow the resources from Q2, and all six pods get running. Then a new job is submitted to Q2, so the scheduler reclaims two CPUs from the resource pool, gets the new job running, and keeps the ratio at 2 to 1. With this mechanism, multiple users can share resources with each other. And the second use case is for users who have urgent jobs and want to make a resource reservation for them. In Volcano, we support the guarantee: users can configure the guarantee field to make a reservation.
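The borrowing, reclaiming, and reservation behavior above is all driven by the Queue resource. A minimal sketch of a Queue with a weight (for proportional sharing) and a guarantee (for reservation) might look like this; the values are illustrative, and field names should be verified against the Queue CRD reference for your Volcano version:

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: q1
spec:
  weight: 2            # proportional share relative to other queues (e.g. 2:1)
  reclaimable: true    # resources borrowed from this queue can be reclaimed
  capability:          # upper bound of resources this queue may consume
    cpu: "8"
    memory: 16Gi
  guarantee:           # reserved resources, kept available for urgent jobs
    resource:
      cpu: "2"
      memory: 4Gi
```

Jobs then reference the queue by name, and the scheduler enforces the weight ratio and reclaims borrowed resources when the owning queue needs them.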
Also, if users submit multiple jobs to the same queue, the basic requirement is how to ensure the SLA of each job. Here we support two levels of fair share. The first one is sharing resources between jobs in the same queue. Here is an example. User 1 and User 2 submit a big job and a small job in Q1, and the Volcano scheduler allocates resources fairly to these two jobs. In Kubernetes, as we all know, the more pods a job submits, the more likely it is to get resources. So the second use case is about namespace-level fair share, say namespace 2 and namespace 3. As we can see, namespace 3 has submitted a lot of jobs, but the Volcano scheduler is able to control the namespaces so that they share the resources fairly. The second one is about big data. As we all know, Spark has supported Kubernetes since Spark 2.3 in 2018. But for a long time, there was no batch scheduling for Spark on Kubernetes. So in 2019 and 2020, Volcano integrated with the Spark operator and the Flink operator to support batch scheduling on Kubernetes. And in 2022, Volcano contributors and Spark contributors worked together to support Spark batch scheduling in the Spark native community. Users can use this feature since the Spark 3.3 version. And last week, Spark published 3.4, where this feature entered GA. With this kind of integration, users get the following benefits: job-based scheduling, priority, fair share, queues, and resource reservation. At the same time, we improved the throughput of Spark on Kubernetes; currently, we support about 1.5k pods per second. Enabling this feature in Spark is very easy. When you submit a Spark job, you need to specify the scheduler name and then prepare a PodGroup template. Within the PodGroup template, you can configure a number of batch-related parameters, like the queue, the priority, and so on. So the next one is about colocation.
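Concretely, on the Spark side you point spark-submit at Volcano with `spark.kubernetes.scheduler.name=volcano` and pass a PodGroup template via `spark.kubernetes.scheduler.volcano.podGroupTemplateFile`. A minimal template might look like the following; the queue name, priority class, and resource values are assumptions for illustration:

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
spec:
  queue: default              # Volcano queue to submit into
  priorityClassName: high     # assumes a PriorityClass named "high" exists
  minMember: 1                # gang scheduling: minimum pods to start together
  minResources:               # minimum resources the group needs before starting
    cpu: "4"
    memory: 8Gi
```

Spark creates a PodGroup from this template for each job, and Volcano then applies its queue, priority, and fair-share policies to the driver and executor pods as a group.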
As we discussed earlier, long-running services have peak and trough periods. In Volcano, we are going to support colocation. The upper-right figure shows the basic model. The scheduler is able to dynamically calculate the oversubscribed resources and schedule low-priority tasks to use them. At the scheduler level, we will support QoS-aware scheduling for online services and offline workloads. On the node, we have the SLA agent and the enhanced OS working together to ensure the SLA of the online services. There will be a bunch of technologies to ensure that SLA, with CPU, memory, cache, network, and disk isolation at the OS level as the solution. The next scenario is about global scheduling. Late last year, we had a number of users from the community whose business is in AI, biomedicine, autonomous driving, and the like. In these fields, the workloads require massive computing power. Generally, one region's resources are not enough, so users have to manage a lot of clusters, and all these clusters are distributed in different regions, so they are very difficult to maintain. The scheduling is also very difficult. We are going to launch a new sub-project in Volcano to handle this. The following features will be supported in this sub-project. The first one is managing batch workloads across clusters. Second, the scheduler will schedule each workload to the proper cluster for better performance or better utilization. The third one is fair-share scheduling and cost-aware scheduling. If you are interested in this sub-project, you are welcome to visit our GitHub and work together with us. Next, I will show you several use cases. The first one is about ING. ING provides services in more than 40 countries. Its core business is banking, insurance, and asset management.
For them, the biggest challenge was that when they introduced cognitive technology to create their next-generation data analysis platform, they had interactive services, REST services, and offline analysis services, and they wanted to unify all of them into one platform. They also wanted fair resource allocation to ensure the SLA, and job preemption for quick response to high-priority tasks. With Volcano's rich scheduling policies, they succeeded in migrating their workloads from Hadoop to Kubernetes smoothly, and the number of projects running on the data analysis platform has increased to more than 450. Another use case is about a new platform. This enterprise uses AI and molecular simulation algorithms to create a next-generation micro-scale industrial design and simulation platform for energy, materials, and research. Their goal is to achieve high-performance computing based on Kubernetes clusters and traditional Slurm clusters. The challenge is that all of these Kubernetes clusters are distributed in different regions, which makes them difficult to maintain, and the drug discovery workloads require massive computing power. So we worked with them to develop global scheduling based on Volcano, and also provided cluster load balancing, bin packing, cluster affinity, and features like these. Users also use the Volcano job to run their TensorFlow, PyTorch, and MPI workloads uniformly. So here is part of our user base. Currently, we have a lot of users, especially in the AI and big data area. As for the contributors, we have a diverse contributor base, and more than 50% are independent contributors. Here are the Volcano resources; you can connect with us via Slack, GitHub, or our website. That's all from me. Thank you so much. If someone has questions? Is there any similar project in the community? There are some multi-cluster projects in the community.
But different projects are focusing on different areas. Currently, the existing multi-cluster projects are focusing on high availability, failover, and recovery, things like that. But there's no project focusing on batch scheduling across multiple clusters. Can Volcano work together with Airflow? Airflow... currently I have no idea about an official integration. But as far as I know, there are some customers that combine Volcano and Airflow together in their platform, and I think it works. Their platform means their own Kubernetes cluster, right? Yeah, a pure Kubernetes cluster. Okay, thank you. Hello. Okay, so a follow-up on the Airflow one. Our team uses Argo Workflows, so I'm not sure how the integration with Argo works. Is that with Argo Workflows or another Argo sub-project? Yeah, Argo is one of the projects we integrated with several years ago; we support that. So how does the use case fit? There is some documentation we contributed to the Argo community. Users can use Argo to orchestrate their Volcano jobs, configure the dependencies, and control the pipeline. Okay, so it's like Argo Workflows schedules things on top of Volcano? Yeah. Okay, thanks. But in this area, we also have JobFlow in Volcano. As you know, Argo is currently very heavy, so a lot of Volcano users want more lightweight job dependency management. One of our contributors contributed a sub-project named JobFlow to Volcano. Maybe this month the project will be merged into the Volcano release version 1.8. Okay, that's good. You can also have a try with JobFlow. Yeah, we'll do some research. Thank you. Hello. I'm interested more in using Volcano with Spark. I was wondering if there is on the roadmap, or already exists, a declarative way to handle my Spark jobs, because now I see it as imperative: I do spark-submit, as you showed us in the example.
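To give a feel for the lightweight dependency management JobFlow aims at, here is a rough sketch based on the JobFlow proposal at the time of this talk; since the sub-project was still being merged, the exact schema (API group, field names) may differ in the released version, and the job names here are hypothetical:

```yaml
apiVersion: flow.volcano.sh/v1alpha1
kind: JobFlow
metadata:
  name: demo-flow
spec:
  jobRetainPolicy: delete     # clean up finished jobs
  flows:
    - name: preprocess        # refers to a JobTemplate of the same name
    - name: train
      dependsOn:
        targets: ["preprocess"]   # train starts only after preprocess succeeds
```

Each entry in `flows` references a Volcano job template by name, and `dependsOn` expresses the DAG, which is the lightweight alternative to a full workflow engine like Argo.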
But I'm thinking maybe the future would be to go in a declarative manner, where I have a manifest, if that makes sense. So behind the scenes, in a GitOps manner, I change the manifest to say I want to submit a job, and the manifest triggers the job. I don't know if that makes sense. So do you see having a manifest to control the spark-submit? Okay, maybe there are already some tools to do this kind of thing in the Spark community, but in the Volcano community we don't cover this area. Okay, so it's more of a Spark issue, not a Volcano one. Understood. Yeah. Okay, thank you. Hello. Thank you for the presentation. I was wondering, for the architecture you showed about training large machine learning models, is it something you tried? Is there any documentation about it for the training process, not for the inference process? Yeah, we have documentation for every kind of training operator in the Volcano repo. Users can use the YAML we prepared for TensorFlow, PyTorch, MXNet, MPI, Horovod; all these kinds of training workloads have samples there. But for inference, currently we haven't done anything, because inference is more like a microservice; it's deployed as a Deployment. Yeah. And for the model, do you have a model feature, or a model store, or something like that? You mean the upper layers, such as the TensorFlow model? Yeah, where do you store the model after training? Store the model, yes. On our platform, users often store the models in object storage, remote storage. Yeah. One last question from my side. Regarding day two, how do I monitor how Volcano schedules? I'm still interested in Volcano with Spark. How do I monitor? Do I get logs and metrics about the scheduling of jobs? How do I debug? How to debug? Yeah, I put it in production and it doesn't work as expected. What do I do? Where do I go? Okay, debugging.
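The training samples mentioned above all follow the Volcano Job shape: multiple pod templates per job, gang scheduling via `minAvailable`, and job plug-ins that wire the pods together. A minimal sketch of a distributed training job follows; the image, command, and replica counts are placeholders for illustration, not taken from the official samples:

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: train-demo
spec:
  schedulerName: volcano
  queue: default
  minAvailable: 3             # gang scheduling: all 3 pods start together or not at all
  plugins:
    env: []                   # inject task index environment variables
    svc: []                   # headless service so pods can resolve each other
  tasks:
    - name: ps                # parameter-server pod template
      replicas: 1
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: trainer
              image: tensorflow/tensorflow:2.13.0   # illustrative image/tag
              command: ["python", "/train.py"]      # hypothetical entrypoint
    - name: worker            # worker pod template
      replicas: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: trainer
              image: tensorflow/tensorflow:2.13.0
              command: ["python", "/train.py"]
```

The multiple `tasks` entries are the "multiple pod templates" feature from the talk, and this is why users no longer need a separate operator per framework.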
Currently, we have some tools to help users find issues from the logs: the Volcano log, the Spark log, and the Spark history server. All of these are common tools and common approaches. Yeah. But metrics, do we get any metrics from Volcano? Metrics, yes. We have a set of metrics we can collect from the Volcano scheduler and feed those metrics to Prometheus. Okay, so do we have a Prometheus integration already built in, or how do we plug it into Prometheus? Is it a pull or a push model? How does it work? Push. You can deploy Prometheus with Volcano and configure the statistics there. You can see how many jobs are in the queue, how many tasks are running, and how many tasks are pending. Yeah. Okay, thank you. Okay, if there are no questions, thank you so much. Thank you for your time.
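As a sketch of how the metrics hookup might look, the Volcano scheduler exposes a metrics endpoint that Prometheus can be pointed at. The service name, namespace, and port below are assumptions based on a default install and should be verified against your actual deployment:

```yaml
# Fragment of a prometheus.yml scrape configuration
scrape_configs:
  - job_name: volcano-scheduler
    metrics_path: /metrics
    static_configs:
      # Assumed service name/namespace/port; check your Volcano deployment
      - targets: ["volcano-scheduler-service.volcano-system.svc:8080"]
```

Once scraped, the queue, running, and pending counts mentioned above can be graphed or alerted on like any other Prometheus metrics.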