Hi, everyone. Welcome to the North America virtual KubeCon. Today I'm going to give you a deep dive on SIG Scheduling. A little bit about myself: my name is Wei Huang, I work at IBM, and I'm also a co-chair of SIG Scheduling.

Today's content is separated into three parts, and in each part we want the content delivered to the right audience, so we've split it across different personas, from the Kubernetes beginner to the scheduling expert, so that each specific audience can absorb it easily.

OK, the first audience we're targeting is regular users who just write and deploy their applications onto the Kubernetes platform. They just want to know some scheduler basics so they can use the scheduler's features efficiently and correctly, and maybe do some basic troubleshooting to identify whether an issue belongs to the scheduler or to some other component. And for sure, they don't want to go too deep into the scheduler internals.

So for them, the first thing is to answer what kube-scheduler does. In one sentence: kube-scheduler finds all the pending pods and assigns each pod to the best-fit node. That's it. From an API perspective, "pending" means the pod's spec.nodeName field is empty. As you can see from the picture, the scheduler goes through some internal logic, finds the best node for the incoming pod, and assigns the pod to that node; at that point spec.nodeName gets set to the chosen node.
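To make that concrete, here is a minimal sketch of what the scheduler sees; the pod and node names are just illustrative:

```yaml
# A pending pod: spec.nodeName is empty, so the scheduler will pick a node.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: nginx
# After scheduling, the scheduler has bound the pod, and you would see e.g.:
#   spec:
#     nodeName: worker-2
```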
So that's basically what the scheduler does. With that being said, there are some things I need to mention that kube-scheduler does not do, which are often misunderstood by users. Quota enforcement is something the admission plugins do at pod creation time, not at scheduling time. Spinning up and scaling down the replicas of a Deployment is the job of the controller manager, because it is responsible for making the number of replicas match the desired number. And it's the kubelet that is responsible for evicting running pods, for example because the node is running out of memory or disk. So I've listed a few of the common misunderstandings where people think something belongs to the scheduler but it doesn't. Understanding this will help you better identify whether an issue belongs to the scheduler or not.

Next, a very high-level introduction to the scheduling flow, just the most basic version. Once the scheduler gets a pod, it goes through an internal scheduling cycle; the scheduler schedules pods one by one, and one scheduling cycle can be divided into several phases. The first one is called filtering. Filtering tries to satisfy the pod's hard constraints. It answers questions like: as a user, I need my pod to be scheduled to a node that has a GPU or a certain amount of memory, or I need my pod to be scheduled to a node so that it co-locates with certain existing pods. I call these requirements hard constraints: they must be satisfied, and all hard constraints are ANDed, not ORed. In terms of where the hard constraints come from, I'd say 99% of the cases are in the pod spec, specified directly by the user — for example the pod's resource requests and the pod affinity. Notice that here the primitives start with "required".
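For example, a hard constraint in the pod spec might look like this (a sketch; the `disktype` label key is illustrative):

```yaml
# A hard constraint: this pod may ONLY run on nodes labeled disktype=ssd.
# If no node matches, the pod stays Pending.
apiVersion: v1
kind: Pod
metadata:
  name: needs-ssd
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
  containers:
  - name: app
    image: nginx
```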
So if you see a primitive whose name starts with "required", it's usually a hard constraint, because it's required. Some other primitives use other terms: the pod topology spread feature uses whenUnsatisfiable: DoNotSchedule to identify a hard constraint — if the constraint cannot be satisfied, do not schedule this pod. So you can usually tell from the literal meaning whether something is a hard constraint or a soft constraint, which I'll mention later. This is important because only an unsatisfied hard constraint will put your pod into Pending status.

Next phase: suppose you have 100 nodes and only 10 of them pass the filtering phase. Then we go to the next phase, called scoring. Scoring, in contrast to filtering, is best-effort. It tries to answer questions like: I prefer my pod to be scheduled to a node which has an SSD, but if no node has one, it doesn't matter — I just want the scheduler to try its best. Based on these soft constraints, each filtered node (filtered meaning the node has passed the filtering constraints) gets a score; we aggregate those scores, pick the node with the highest score, and assign the pod to that node. I want to highlight that a soft constraint doesn't block scheduling of your pod; it just describes a preference — I prefer my pod to be scheduled onto certain nodes. And some soft constraints may conflict with each other; that happens. Soft constraints come from two sources.
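Contrasting with the hard-constraint form above, a soft constraint in the pod spec might be sketched as follows (the label key and weight are illustrative):

```yaml
# A soft constraint: prefer SSD nodes, but schedule elsewhere if none qualify.
apiVersion: v1
kind: Pod
metadata:
  name: prefers-ssd
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50        # 1-100; a higher weight favors this preference more
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
  containers:
  - name: app
    image: nginx
```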
The first source is the pod spec. You can specify soft constraints in the pod spec, like the pod's node affinity — the primitives starting with "preferred...". These are soft constraints, and they usually come with an additional parameter like weight, so the user can give a customized weight: I favor this constraint more than other soft constraints. Also, some features use other primitives which are usually self-descriptive, like ScheduleAnyway — when the constraint cannot be satisfied, schedule the pod anyway. That's a soft constraint too.

There is another kind of soft constraint which is not specified by the user but usually by the cluster admin at scheduler startup. I call these the implicit scheduler config. These are scheduling policies; for example, the default policy tries to use the node with the least allocated resources, so it will try to keep cluster utilization balanced rather than packing all the pods onto a single node. So that's it for soft constraints.

Now you may be thinking of a question: what if no node can satisfy all the hard constraints in the filtering phase? What should we do? We have a special phase called preemption.
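The priority API that preemption relies on can be sketched like this (the class name and value are illustrative):

```yaml
# A priority class; the larger the value, the more important the pods using it.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "For critical workloads that may preempt lower-priority pods."
---
# The pod references the class by name; the scheduler resolves its value.
apiVersion: v1
kind: Pod
metadata:
  name: critical-app
spec:
  priorityClassName: high-priority
  containers:
  - name: app
    image: nginx
```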
Preemption is only triggered when no node can satisfy the hard constraints. It checks whether there are low-priority candidate pods which can be sacrificed — preempted — to make room for the high-priority pod. If that's possible, the scheduler will preempt the low-priority pods so that the high-priority pod gets scheduled. In terms of API, you define a PriorityClass and specify its value; the higher the value, the more important the priority. Then in the pod spec you set spec.priorityClassName to refer to that PriorityClass, which the scheduler will honor at runtime.

All right, that's pretty much what a regular user needs to know on day one about the basics of the scheduler. You just need to learn that scheduling has basically three phases — filtering, scoring, and preemption — and if you see a pod stuck in Pending status, first ensure that spec.nodeName is not set, so that it's in the scope of the scheduler, which is still trying to schedule the pod to a node. Then use kubectl describe on that pod to check which hard constraint is not satisfied. So for a pod in Pending status, go through those simple troubleshooting steps. But if a pod can be scheduled, just not onto the node you desired, there is usually something off in the scoring phase — maybe some default scoring policy set by the cluster admin doesn't fit your workloads. You need to check with them, or do some debugging to see why one scoring policy weighs much higher than another. So that's basically it for regular users.

All right, for day two we're talking to the audience of cluster admins or DevOps. The goal here is a little different.
You're no longer satisfied with treating the scheduler as a black box; instead, you want to know more about the scheduler internals, get to know some configuration best practices, and understand how scheduling behavior correlates with the internals we call the scheduling plugins, so that you can make the most of the scheduler and expose that to your end users. And of course, you don't want to write additional code — no scheduler extender or plugin; you just want to customize the scheduler using the official scheduler image.

The first thing you can look into is configuration. Configuration has changed a lot across the recent Kubernetes releases. The first thing I want to mention is: don't use command-line arguments anymore, because we have a better-designed and better-organized object called KubeSchedulerConfiguration. You can pass all the individual command-line arguments through that config object, via the --config parameter. But if you're still on an older Kubernetes release, you may have to stay with the old style of kube-scheduler configuration.

The second thing: even inside KubeSchedulerConfiguration there are different versions, varying from v1alpha1 to v1beta1, and the difference is that before v1alpha2 we were still using the old execution paths and old terms — predicates, corresponding to filtering, and priorities, corresponding to scoring. They're a little bit different in terms of naming and execution paths. If you're still using the older config, you may be using something called the policy file. The policy file works like this: you define the kube-scheduler config with the v1alpha1 version and point it at a policy file, and inside the policy file, if you want to customize some behavior, you have to list all the predicates or priorities even if you just want to change one. That's not a good user experience, right?
On top of that, you have to provide two files — one is the kube-scheduler config, and one is the policy file itself — which is also not user-friendly. So over time we want to deprecate this policy-based config, and instead we are transitioning to the plugin-based config. A plugin is simply a functional unit written to satisfy one specific constraint, either from the user's input or from some implicit general policy. Right now, if you're using v1alpha2 or v1beta1, you can use the latest KubeSchedulerConfiguration: under the profiles field there is a plugins field, so you can enable or disable any plugin there by writing the minimum YAML snippet possible. It's very user-friendly.

More importantly, you can see that we now have a profiles field. That means, starting with 1.18, your scheduler is not a single-flavor scheduler; you can build a multi-flavored scheduler. I'll give an example. Using the v1beta1 configuration, here I define four profiles — default-scheduler, bin-pack, skip-score, and so on — and they map to different behaviors at runtime. Among these scheduling profiles, skip-score is one where I just disable the scoring phase entirely, because I don't care which node has the highest score; I just want the pod scheduled as fast as possible. Bin-pack, on the other hand, tries to favor the node with the most allocated resources. That fits the requirements of the autoscaler quite well, because the autoscaler wants to save the cost of running machines: it wants to pack all the workloads onto as few machines as possible. Then at runtime, in your workload, you just specify spec.schedulerName corresponding to the profile. So in the example we have four workload YAMLs with different scheduler names, and at runtime the scheduler knows which specific flavor you want your workload to be scheduled with.
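A sketch of such a multi-profile config (plugin names as they existed in the v1beta1 era; profile names are illustrative):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler      # vanilla behavior
- schedulerName: bin-pack
  plugins:
    score:
      enabled:
      - name: NodeResourcesMostAllocated   # pack pods onto fewer nodes
      disabled:
      - name: NodeResourcesLeastAllocated  # the spreading default
- schedulerName: skip-score
  plugins:
    score:
      disabled:
      - name: "*"                          # turn off scoring entirely
---
# A workload opts into a flavor by name:
apiVersion: v1
kind: Pod
metadata:
  name: fast-pod
spec:
  schedulerName: skip-score
  containers:
  - name: app
    image: nginx
```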
So that's, I think, the key change in recent Kubernetes releases. Back to this picture: as a cluster admin you should know exactly what each plugin does, not only by understanding its logic but also by knowing the internal details — its arguments, how to configure them, and how those arguments are reflected in behavior. We have a page listing all the default plugins and which extension points each belongs to; for example, TaintToleration implements several extension points, in the filter phase, pre-score phase, and score phase. So as a cluster admin, if you want to enable or disable things or do some customization, I do recommend checking out the official website.

Also, there are some plugins that are compiled into the scheduler image but not enabled by default, so you can choose to enable them. But again, you should understand their real semantics, and some of them conflict with the default ones — if you want to enable some of them, you may need to disable some of the default plugins. That's the basics a cluster admin needs to understand.

In addition, there are some other global settings in the kube-scheduler config, like percentageOfNodesToScore, which is very useful if you're running 1,000 or 5,000 nodes. You probably don't need to score all the filtered nodes; scoring maybe 10% or 20% of them is enough, because scoring is only about favoring one node over another — some workloads just care that the hard constraints are satisfied and don't much care about the soft constraints. So that's a very useful feature.

As a cluster admin you may also want to catch up with the latest progress upstream; I've listed a few items here. The eventual goal is — suppose you're running a burger shop: before, you provided just one flavor of burger, with meat, onion, and tomatoes, that's it, and your users could take it or leave it.
But now you have multiple ingredients, and you can compose them into different flavors: a veggie one, one with beef, one with chicken; the beef one you can further divide — with or without onion, with or without tomatoes, and so on. The image is still the same image, but you can provide more flavors to adapt to your users' workloads. That's basically, I think, what a cluster admin wants to learn.

OK, let's go to day three. Here we're talking to the audience of scheduling experts or innovators. It's like you're still running a burger shop, but you're not satisfied with the existing ingredients: you want to bring in some innovative ingredients from outside to provide more flavors than the default scheduler image provides. In this case these users don't mind writing some code, and the ultimate goal is to fit diverse workloads, such as batch workloads, which the current scheduler doesn't support well. But they don't want to start from scratch to write a secondary scheduler, because running multiple schedulers inevitably causes conflicts in pod scheduling.

For this kind of user, going back to this picture: you should not be satisfied with only knowing that filtering has plugins X, Y, Z and scoring has plugins X, Y, Z. You need to understand, for example, why we need pre-filter and how to use it. Pre-filter is a phase designed especially for requirements where you need to consider cross-node state: you do some pre-calculation, put the result into a cache, and that result can be used later in the filter phase. That's pre-filter. You also need to understand what permit is and what reserve is. Reserve is a phase where, before your pod actually gets scheduled, you can make an optimistic assumption that this pod has already been scheduled.
You assume the pod's resources in the scheduling cache. And permit is a very useful extension point where you can wait until a certain criterion is met — for example, until a whole group of pods has arrived and all of them can be satisfied — and then approve all of those pods in a batch. So permit is a useful extension point for batch workloads. Also, post-filter is a new extension point we introduced in 1.19. It replaces the old hard-coded preemption logic, so the preemption logic is more extensible: before, the default preemption only considered the strategy of preempting pods on a single node, but sometimes that's not enough — you may want to extend the behavior to preempt a whole group of pods, and that group may be scheduled across different nodes, etc.

You also need to follow the latest changes in the scheduling framework — read through the release notes, and if you have any questions, go to the Slack channel or the mailing list to raise your concerns. We are also going to GA the scheduling framework in 1.20; we started building this framework in 1.15, and now we're able to say it's stable enough to go to the next stage.

All right, it's still the same topic: if you want to implement one specific feature, you should know which specific extension points you need to extend and write the corresponding logic there. For example, if you want to do gang scheduling you may need to implement certain extension points, and if you want custom scoring you need to implement the pre-score and score extension points. So with the default scheduler as the base, you build something on top of it, compile them together, and you get a unified scheduler binary.
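Once such a plugin is compiled in, wiring it into a profile might look like the following sketch; the Coscheduling plugin here is from the scheduler-plugins project, and the exact extension points it registers should be checked against that project's docs:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  plugins:
    queueSort:
      enabled:
      - name: Coscheduling   # sort the queue so pod groups stay together
      disabled:
      - name: "*"            # only one queueSort plugin may be active
    preFilter:
      enabled:
      - name: Coscheduling   # pre-compute pod-group state
    permit:
      enabled:
      - name: Coscheduling   # hold pods until the whole group can run
```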
It's a new binary which has 100% of the vanilla scheduler functionality, as well as the additional functionality you built into it. And because building a scheduler plugin — I won't say it has a high bar, but it has a bar — we initiated a sub-project called scheduler-plugins, where we've written some examples and guidance for you to start with. Right now we already have some plugins there, like coscheduling (gang scheduling) and capacity scheduling — capacity scheduling is about elastic quota — and also some ongoing work, like a plugin that favors using real-time metrics of the cluster to make scheduling decisions, etc. We've also done some CI optimizations for automatic image builds, so you can just download the image from the registry.

All right, that's pretty much it for day three. Sticking with the analogy that you're running a burger shop: now you can use the default ingredients from upstream, and you can also buy other ingredients from the local market — some special cheese, some special ingredients, hazelnut, who knows — and combine them to provide additional flavors for your users.

Last slide: feel free to contact us. You can find all the information here. We have weekly meetings — raise your topics there — and help us build this community better. All right, thank you. I think that's pretty much it for this session. Thank you.