Right, let's get started. Welcome, everybody. We're going to talk about a policy-driven platform for the Nova scheduler. I'm Ramki Krishnan from Dell, where I head NFV technology strategy. With me are Adrian Hoban from Intel and Tim Hinrichs from Styra. I'd also like to thank the core team behind us who made this all happen, and the whole set of contributors towards this big effort. In the interest of time I'm not going to spell out all the names, but you have all the details here.

So first, let's look at the challenges with the current OpenStack Nova scheduler, from both an admin perspective and a user perspective. By admin we really mean the OpenStack Nova admin; the user could be any orchestrator or just any other user.

Where are we with platform features beyond compute? So far the talk is always around compute. Take a software-defined storage use case: you want to add high-performance storage, and you also desire compute isolation, because storage and compute are running on the same node. In the current paradigm as we see it, you have to wait for the next OpenStack release to make that happen; basically you're waiting out six months.

The next point we see is around ease of use. If you're looking at a generic use case, say determining a highly loaded or unusable host, today I need to build custom tools for all this, right?
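As an illustration of the kind of one-off tooling admins end up writing today, here is a minimal hypothetical sketch (field names and thresholds are invented for this example, not part of any Nova API) of a script that flags highly loaded or unusable hosts from a metrics snapshot:

```python
# Hypothetical one-off admin tool: flag highly loaded or unusable hosts
# from a metrics snapshot. Field names and thresholds are illustrative.

def classify_hosts(hosts, cpu_limit=0.9, ram_limit=0.9):
    """Return (overloaded, unusable) host-name lists from raw metrics."""
    overloaded, unusable = [], []
    for h in hosts:
        if not h.get("reachable", True):
            unusable.append(h["name"])
        elif h["cpu_util"] > cpu_limit or h["ram_util"] > ram_limit:
            overloaded.append(h["name"])
    return overloaded, unusable

snapshot = [
    {"name": "host1", "cpu_util": 0.95, "ram_util": 0.40},
    {"name": "host2", "cpu_util": 0.30, "ram_util": 0.35},
    {"name": "host3", "cpu_util": 0.10, "ram_util": 0.20, "reachable": False},
]
print(classify_hosts(snapshot))  # (['host1'], ['host3'])
```

The point is not the script itself but that every operator has to reinvent something like it, with no tie-in to the placement attributes the scheduler already tracks.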
Can the admin write their own tools, or the user write their own tools? Right now you basically need to write code to develop these tools.

The last interesting part we're seeing is that all the work has been focused on initial placement. But what about the other functions? What we're really seeing from NFV use cases is a very strong need for dynamic monitoring and violation detection. One of the strong uses there is dynamic monitoring around workload consolidation. The way things are right now, you have to design a one-off monitoring framework which is not closely tied to placement: you define your own framework, decide what type of attributes are useful for placement, and go about recreating the entire effort you already put towards placement.

Now that we've talked about the challenges, let's talk about certain use cases which deliver value to the community. The first one, I pride myself on saying, is really the road to 5G. The key idea here is application performance awareness: how you deliver certain challenging workloads, such as those needing low latency or reliable delivery. The most challenging ones, great examples, are broadcast video, distance learning, and augmented reality in a telco cloud. The scope of this examination is how you deliver these applications; we're not talking about the cloud infrastructure for actually hosting these applications here.

To slice and dice the problem: when you are trying to deliver these applications, there is an NFV orchestrator involved which takes care of the end-to-end aspects, which means what needs to happen across data centers in the WAN, right?
Or even within the data center, when you're creating a service chain to deliver functions, which involves even selecting the right network functions for delivering the solution you want; that is in the scope of the NFV orchestrator. Our scope is really what it means from the infrastructure perspective, the OpenStack Nova perspective, to deliver these: basically what is needed in the hardware, and tying that to what we need to do with respect to placement and other functions.

In the specific example, a good set of VNFs which we see as very useful are a stateful firewall; crypto for encryption and decryption of videos or any other function; and, when it comes to something like wireless networks, a wireless video proxy is a critical function. Essentially you're sending video at one rate, but then you're actually transcoding the video to the desired rate for the mobile user; depending on where the mobile user is and how mobile it is, the quality of video they can receive is limited, and we need to do the transcoding. Those are good examples of network functions and what happens when you virtualize them.

For delivering these challenging use cases around these virtual network functions, what we see from the compute side is that we need fine-grained resource partitioning for the VMs belonging to the network functions. What is interesting is that so far there is always talk about prescriptive methods of delivering these applications: this is the one and only way to do it. But what we are seeing is that's not necessary. It's all about meeting the application's needs.
In this case, that's low latency and reliable delivery. That means if you are able to build the right amount of isolation and security infrastructure, then you can indeed deliver the application. By that, what we're really saying is that one method is to use dedicated cores, NUMA awareness, L3 cache partitioning enabled by Intel RDT (Resource Director Technology), and SR-IOV. But what is also interesting is: if you don't have SR-IOV, that's fine. You could even use a DPDK vSwitch as an alternative, as long as you're able to dedicate separate cores for the DPDK vSwitch and also partition L3 cache for it, in the sense that you build isolated pools of resources for it. You can still deliver this application efficiently, except that perhaps in the SR-IOV case you will consume fewer CPU cores and get the hardware acceleration benefit. That's the key point we want to hone in on. And if none of these options are available, you could use dedicated physical servers, which could even mean micro servers. So basically these are the choices which will deliver the right amount of isolation infrastructure.

When it comes to the network, the real focus we need is not just overlay QoS; the underlay also becomes critical, the data center fabric, end to end. What we see as a real need is high QoS and minimum buffer depth on all the switches in the underlay.

When it comes to storage, high-performance logging is a critical need, and NVMe-based SSD storage would be desirable compared to plain SSD storage. There's a great correlation here, tying it to end-to-end service assurance: logging and log analysis are a critical piece to make sure these flows are indeed following the same path they should, and if you're doing real-time log analysis, then storage becomes a critical piece.

As you can see from the picture on the right side: if I messed up and started dropping packets, you have only one opportunity to transmit these packets; there is no retransmission opportunity because of the low-latency requirements. Even a single packet drop could potentially result in poor-quality video, as you can see below: an unhappy customer versus a happy customer, where you delivered the right amount of security and isolation.

Now moving on to some other use cases. What does it take to deliver a classic enterprise type of workload, not as challenging as what we talked about so far? Say email and CRM in the telco cloud. Here again, exemplary data-plane VNFs are a stateful firewall, IDS/IPS, WAN optimization, or IPsec-based crypto. From a compute perspective, what we really see is that deterministic performance is the key need. There you could use NUMA awareness and SR-IOV.
Or just NUMA awareness and more cores, kind of the same concept we talked about. From a network perspective there are no specific QoS requirements, and from a storage perspective, again, high-performance logging.

The last, least challenging of the use cases: what if I want to deliver cost-effective residential broadband through the telco cloud, where cost is key? NAT is one of the popular network functions here. What we're really seeing is that for compute and network we'll institute some max capacity limits, but not min guarantees, because for min guarantees you need to pay more money; basically it's a tiered service model. And for storage, not even SSD: HDD for low-cost storage. Again, just to clarify, this is all exemplary; there are different ways of slicing and dicing these based on what the operator wants to do.

Now, mapping what you'd like to accomplish to the policy-driven approach: essentially our goal is to minimize any vendor lock-in and dependency while we maximize feature velocity, beating these release cycles. It's not going to be six months; it's an agile delivery model. How do we get there? The number one point is extensibility, right?
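As a recap, the three exemplary tiers described above could be captured as plain data. This is purely illustrative (the keys and capability names are invented, not an OpenStack schema), but it hints at what a declarative description of the tiers might look like for a policy-driven scheduler:

```python
# Illustrative recap of the three exemplary service tiers as data.
# Keys and capability names are invented for this sketch.

SERVICE_TIERS = {
    "low_latency_reliable": {           # e.g. broadcast video, AR
        "compute": ["dedicated_cores", "numa_aware", "l3_cache_partition",
                    "sriov_or_dpdk_vswitch"],
        "network": ["overlay_qos", "underlay_high_qos_min_buffer"],
        "storage": ["high_perf_logging", "nvme_ssd"],
    },
    "enterprise": {                     # e.g. email, CRM
        "compute": ["numa_aware", "sriov"],
        "network": [],                  # no specific requirements
        "storage": ["high_perf_logging"],
    },
    "broadband": {                      # cost-effective residential
        "compute": ["max_capacity_limits"],   # caps, no min guarantees
        "network": ["max_capacity_limits"],
        "storage": ["hdd_low_cost"],
    },
}

def requirements(tier, domain):
    """Look up the exemplary requirements for one tier and domain."""
    return SERVICE_TIERS[tier][domain]

print(requirements("enterprise", "storage"))  # ['high_perf_logging']
```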
So how can the admin or the user add new capabilities, not just compute but also Cinder or Neutron constraints, on the fly, and get them to deployment quickly? And in the whole process, our goal is to minimize the additional code we write.

The next point is around understandability. For example, scheduling policies should all be human-readable, so that analysis tools can be built on a need basis without any issues; essentially, that means there is no need for custom analysis tools.

The last point is around monitoring. We talked about the NFV example and the need for workload consolidation, among other use cases. Our goal is to make sure that while we deliver the rich policy, the admin or the user has a single representation for a monitoring type of framework. That means that, just like placement, I'm not waiting to deliver my monitoring feature; it can be made available pretty smoothly, in an agile fashion.

With that, to sum up: how do you go about it? Our approach is really best of breed, a combination of imperative and declarative choices. When it comes to imperative interface choices, we'll be extending the current JSON filter which is there in OpenStack Nova. What this enables is, basically, it empowers the user to customize for specific applications.
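For reference, the existing JsonFilter accepts an and/or/not query over host properties, passed as a scheduler hint. The `$free_ram_mb`-style variables and the operator set below follow Nova's JsonFilter syntax; the tiny evaluator is just an illustrative sketch of how such an expression is matched against host state, not Nova's actual implementation:

```python
import json
import operator

# A JsonFilter-style query: and/or/not over host properties. The syntax
# mirrors Nova's JsonFilter scheduler hint.
query = json.loads(
    '["and", [">=", "$free_ram_mb", 4096], [">=", "$free_disk_mb", 10240]]'
)

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}

def matches(expr, host):
    """Sketch of evaluating a JsonFilter expression against host state."""
    op, args = expr[0], expr[1:]
    if op == "and":
        return all(matches(a, host) for a in args)
    if op == "or":
        return any(matches(a, host) for a in args)
    if op == "not":
        return not matches(args[0], host)
    # Resolve "$name" variables against the host's properties.
    lhs = host[args[0].lstrip("$")] if isinstance(args[0], str) else args[0]
    return OPS[op](lhs, args[1])

host = {"free_ram_mb": 8192, "free_disk_mb": 204800}
print(matches(query, host))  # True
```

In Nova itself the query is supplied at boot time as the `query` scheduler hint and evaluated by the scheduler against each candidate host.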
We see some challenging cases where the NFV orchestrator or the operator wants complete control; then opening up all the knobs is useful. People are familiar with networking, they know about OpenFlow; this is very similar to that approach.

But we also realize it's equally important, in fact even more powerful, to introduce declarative interface choices. One option is JSON filter extensions to the current Nova flavors, which is quite a popular deployment model today. The other option is Datalog embedded in YAML. Datalog is a simplified variation of Prolog, but still very powerful, and it can directly access SQL database tables. It can be used for flexible constraint specification, whatever type of new constraints you want, and also direct database manipulation. Again, the goal here is to address understandability and make it really simple for the user to use, while also being an extensible framework for the admins.

With that said, I'd like to hand it off to Adrian, who will go through some more examples and give a very quick overview of where OpenStack Nova is and some of the specific challenges around placement.

Okay, thanks, Ramki, for teeing up that vision and some of the benefits that we see can come from investing in policy-based scheduling. I wanted to talk first about the imperative-related benefits. We already have an imperative-related scheduler in OpenStack today, and I'll talk more about that in a moment. But in this example, looking at what an imperative request means when you start to roll in policy, the idea that we're putting forward here is that you can combine multiple sets of requests into that imperative set. In the example shown here, let's say you've got a particular service level you're trying to achieve, and based on that, you know that you either allocate a NUMA-enabled platform and an SR-IOV device, possibly for networking, possibly for acceleration; or, if you don't get that, NUMA and the possibility of adding extra cores. Maybe in the first case you just needed one or two cores, and in the second case ten would deliver the same service level, but obviously with different constraints and consumption of your resources.

So what policy-driven scheduling allows you to do is make that request: I need to fulfill a service, and I need to do it in any of these particular ways. What we can do with the policy, then, is take both of those, or however many requests you want to roll into it, look at the entire set of hosts you've got in your environment, and figure out which of those can fulfill the imperative asks. Based on that, you can determine a list. Typically we'd look at weighting to kick in next, but with policy-driven scheduling we can prioritize the parts of the ask that will work out better for some type of parameter. We're not saying what that is right now; it could be that you want to prioritize based on power consumption or resource utilization. So in this case you may say host two, because that has NUMA plus SR-IOV and I only need two cores, so I can potentially consume less power to fulfill this service requirement, and that gets weighted out on top.

When you look at a declarative example, and this is more forward-thinking I think, you describe more in terms of what the workload really needs from the infrastructure. In this example, imagine you need a workload that's looking for a low-latency, reliable delivery mechanism. In that context, the admin, through this declarative method, can have already specified in the policy spec in the policy store what that type of categorization means.
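One way to picture such a policy-store entry (an entirely hypothetical structure, since the specs are still at the concept stage) is a table mapping a declarative category to its alternative imperative asks in admin-defined preference order:

```python
# Hypothetical policy-store entry: a declarative category mapped to
# alternative imperative asks, in admin-defined preference order.
POLICY_STORE = {
    "low_latency_reliable": [
        {"numa": True, "sriov": True,  "extra_cores": 2},   # preferred
        {"numa": True, "sriov": False, "extra_cores": 10},  # fallback
    ],
}

def resolve(category, host_caps):
    """Return the first (most preferred) ask the host can satisfy, or None."""
    for ask in POLICY_STORE[category]:
        if ((not ask["numa"] or host_caps["numa"]) and
                (not ask["sriov"] or host_caps["sriov"]) and
                host_caps["free_cores"] >= ask["extra_cores"]):
            return ask
    return None

# A host with NUMA but no SR-IOV falls through to the ten-core fallback.
print(resolve("low_latency_reliable",
              {"numa": True, "sriov": False, "free_cores": 16}))
# {'numa': True, 'sriov': False, 'extra_cores': 10}
```

Resolution falls through the alternatives in order, which is exactly the indirection being described: the user asks for a category, and the admin's table decides what that means on a given host.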
So, one example: it could be ephemeral local storage backed by SSDs, the non-persistent store, because you've got a workload that needs to access the SSDs really quickly and is dealing with a large data set. To cater for the reliable delivery piece, you could look at how you might want to allocate caches on the platform, or CPU pinning, or lots of different properties. The point of the declarative path here is that you're making it an admin thing, and potentially in time a user-driven thing, to declare what that type of categorization represents. With the policy-driven scheduler we can take that as an input, which is now no longer imperative, figure out what the imperative-related asks it really represents are, and use that to both filter and, if you have a selection, weight hosts.

So that's where we want to get to. I guess it's important to say where we're at right now. If you look at the way the Nova scheduler is constructed, it's got this mix of filtering capability; a really expansive list has been developed there, 30 plus, I think even 32 right now. Through the config spec the admin defines in Nova, you can combine these different filters and set the order in which you want to run them. Ultimately that will get you the subset of hosts in your environment that meets that imperative request, so whichever one of these hosts you decide to land the workload on is going to meet the needs of that particular specification. What kicks in next is the weighting functionality, and weighting figures out in what order you should try to deploy a virtual machine on those hosts.

To look a little bit more at the filters, I'm just showing some examples of filtering here. You've got, as a default list, compute capabilities and availability zones. Not shown in the list, you've got things like NUMA filters.
There are aggregate instance extra specs filters, PCI filters; like I said, a really comprehensive list. The key point here is that it's generally, well, certainly admin-driven for configuring which filters you're going to run, and for many of them the admin defines what metadata or what attributes that filter is going to process. For instance, if you're looking at flavors, the admin creates the flavor and the admin tags that flavor with extra specs saying what that flavor really represents. I'm just giving some examples here that we know are applicable in an NFV context: you may say you need a certain amount of memory, of course; you may look at huge pages and the amount you want; you look at the NUMA topology; you're looking for CPU pinning. There's a whole set of things related to the network you might want to go configure in there, and a set of properties you may want to be able to parse that come in through the image metadata service. Based on all of that, we'll figure out what the right set of hosts that can comply with that imperative ask are.

Moving on then to the weighting. Four, maybe five, I think, weighting methods exist. There's the RAM-based one, which by default will try to spread the VM allocation across the entire set of hosts that the filter scheduler determined meet the imperative ask. There's a metrics method, with which you can look at various host state metrics, combine those, and put different weights around them. There's IO ops per second, and then soft affinity and soft anti-affinity; I think there's also one related to the disk. You can define the sort of normalization properties for each of these weights and then end up with this kind of one-to-N ordering of the applicable hosts. The example I'm showing here is when you've just got, let's say, the RAM weigher involved.

So, given that's how it works today, and we know it doesn't yet meet the vision that Ramki teed up first, the problem statements that we're looking at in the shorter term start with the fact that it's the administrator that's required to specify all of these imperative asks, and they must do it in a Nova-centric way. We don't have a method of reaching out to the other systems in the cloud to say I also need to consider the network- or storage-related properties. It's not possible today to make different resource specification asks in a single request; you can't, like I mentioned, go from "I need to deliver service X" to multiple alternative ways of delivering that, under policy governance. And the third piece is that we can't define different weighting methods for different parts of the cloud. As you define different regions within your cloud, possibly through host aggregates, you can't say: in those regions I'm going to RAM-stack, while in another region
I'm going to go with a really dispersed type of model. So with that, I'm going to hand over to Tim, who's going to talk in more detail about the steps that we'd like to take to try and close some of these gaps.

All right, thank you, Adrian. So I think at this point the thing to keep in mind is that we're actually targeting two kinds of folks using Nova. We're targeting the end users, and what we want to do is enable them to write policy statements that actually control what kind of scheduling decision gets made. But we're also targeting administrators, and we want to give them policy control over what scheduling decisions get made.

So I'm going to start with an approach to giving end users the ability to write policy statements to make scheduling decisions. On this slide, the blue stuff is what already exists in Nova, and the green stuff is what's new, what we're introducing. If you're a user today using Nova and you want some sort of rich control over which hosts your VM gets assigned to, what you can do today is use JsonFilter. Think of this as an and/or/not expression, where the tests that you're running are things like: how much RAM does it have, is the disk larger than this, and so on. So you've got ANDs, ORs, and NOTs over the properties of any host, and using JsonFilter the end user has a great deal of control over which hosts they eventually select for scheduling.

What you don't have today, though, is the ability for the end user to provide weights, to provide preferences about which hosts they would prefer to have. So in the example that Adrian talked through earlier:
Let's say the end user wants to say: I prefer NUMA plus SR-IOV, but I'll also take NUMA and ten extra cores. There's no way today within Nova for the end user to say those are my preferences, go off and give me a machine that satisfies those preferences. So JSON weight is a new weigher that we're proposing, and the idea there is that the user can use a language very similar to JsonFilter's, but describe the weights they want to assign to each of the hosts. So from the end user's point of view: I've got hard constraints, I can express those with JSON filter; I've got soft constraints, I can express those with JSON weight; and now I as a user have a great deal of control over the actual host my VM ends up being scheduled on.

Okay, so that's the end user. Now, we have a number of options for the administrator to take control and provide policy statements dictating how hosts should actually be scheduled. The thing to keep in mind about the administrator is that it's quite different from the end user. The end user has total knowledge, total control over exactly what hosts they want. The administrator, on the other hand, when they're trying to write policy, really cares about governing how each user request gets mapped down onto the hardware. So fundamentally, what an admin wants to do is write a policy expression that represents a map from a user request to the collection of hosts that satisfy that request.

Okay, so one way of doing this is to introduce two new filters. These filters are very much analogous to the JSON filter and JSON weight I spoke about a moment ago for the end user; the difference with admin JSON weight and admin JSON filter is that it's the administrator writing the policy, not the end user. And if the administrator is writing this policy, remember this policy is something to map user requests down to the hosts that satisfy them, there has to be some policy store for that administrator to put that policy into. But just like JSON filter and JSON weight, the admin versions of those filters give the administrator the ability to write both hard constraints and soft constraints that describe how to map any particular user request onto a host.

In this example, the way we see this being different is that the user request is a description: it says I need low latency and reliable delivery. The administrative policy actually dictates exactly how that kind of request gets mapped down onto hosts; in particular, the administrator might decide that what this request actually needs is NUMA plus SR-IOV preferentially, or NUMA and ten extra cores. So there's this level of indirection that the administrative weights provide. The main motivation here is that this kind of administrative interface allows the administrator to adapt scheduling decisions as external data, such as from Neutron and Cinder, becomes available, which is something the Nova team is working hard on this cycle. The con, of course, to this kind of approach is that it's yet another filter on the long list of filters we already have.

The alternative, another approach to empowering administrators to write policy controlling scheduling decisions, is to modify an existing construct. What I've shown here on the slide is that we could add a policy field to either the current Nova flavors or the current host aggregates. In the flavors case, that policy is really sort of a souped-up version of the extra specs, for those of you who are familiar with them. Remember that a flavor is something that basically implicitly describes a whole collection of hosts, and what this policy field would allow you to do is get even finer-grained control over exactly what hosts belong to this particular flavor. In the
host aggregate case, if we extend that to include policy, the power you end up with is that for a given host aggregate, the administrator can define weights over how the hosts within that aggregate actually get scheduled. So if in one case the administrator decides he wants to distribute workloads as evenly as possible across all those hosts, he can define weights to do so; and if for another host aggregate the administrator decides he wants to bin-pack as many of those VMs onto a single host as he can, he can likewise define weights for that particular host aggregate. In both these cases, what we're seeing is that the administrator has the ability to control scheduling while taking the user request into account, and in so doing we're moving towards the vision that Ramki and Adrian discussed. Pros: this is extensible, again, by the admin, and the nice thing about extending existing constructs is that everybody already knows what a flavor and a host aggregate are, so it's easier for people to understand. The downside, of course, is that it adds complexity to the existing constructs. Okay, and with that I'll give it back to Ramki to wrap up.

Thank you, Tim. So just to give a status of where we are: we're at a concept stage with several draft specs. The imperative part is around JSON weight, and for the declarative part we have several options: a new scheduler, which is the policy-based scheduler; a new filter plus weigher, which is the admin JSON filter option; and modifying the existing flavor- or host-aggregate-based policies. In fact, it's not just this presentation; we're going to have three more sessions at the summit: one is a Nova scheduler working session, another is the Congress integration session, and there's the NFV orchestration BoF. Please join us.

What are the key takeaways?
To summarize: we have contributors from ten-plus companies, and we really have what I'd like to call an operator advisory board; thanks for all the input. Our goal is policy-driven scheduling towards delivering service assurance and addressing complex scenarios, the road to 5G and IoT. And our approach is indeed a best-of-breed combination of declarative plus imperative. In the imperative case, the user describes the desired hardware policy in the policy language or JSON weight, a completely open interface. In the declarative case, the user describes what the application needs, and the admin maps the application to the hardware capabilities, host aggregates and all those, which sums up as admin JSON filter and weight, and also enhancements to flavors and host aggregates. With that said, we do have a weekly meeting at 8 a.m. Pacific; please join us regularly. Thank you so much, and we're open to questions.

[Question] Could you provide anti-affinity with the scheduler via a method like this?

I think so. Right now, definitely, as we see it, there are affinity and anti-affinity rules, but the way we see it, they're very simplistic; with this paradigm they can certainly be more enhanced. Right now the rules are basically just whether I belong to a particular server or not; that's the level of affinity and anti-affinity. With this extensible framework, our goal is to make sure you can have much more complex rules. Basically, you could craft a rule based on distance, network distance.

[Question] Are there going to be any standard terms, like the low latency or resiliency you mentioned, that can be mapped by users, also for cloud interoperability?

Good question. That's why it's a combination of both terms. If you even go back to the 5G white papers and all those, there is never one particular value for low latency; it's always a range, right? Correct.
That's why we are modeling it this way, correct. But it's also about a combination of separate, different properties: low latency and reliable delivery, zero packet loss. That's the challenge you're trying to meet, right?

[Question] I agree on the value of low latency, but for terms like low latency and reliability, are there going to be some fixed terms that people can use to describe these?

Even there, just like the term "real time," there is going to be quite a bit of variability; it all comes down to the specific applications you're talking about. I think that's the best way to look at it. That's why it's "low" with a range, not an exact value. And we're not going to codify low latency and build it into the code; in fact, it's quite the opposite. The administrator in that case would actually define what low latency means. That's the point: you can write an expression that says, here's what low latency in my data center means for me as the administrator, and then you've got your user who just says "I need low latency" without really knowing or caring what that means.

[Question] Can you say a little about the integration with Congress? Congress is also a policy-driven engine.

Yeah, that's a good question. So one of the things that we're exploring now is how do you make policy support
possible within Nova. One of the longer-term things that we've talked about in Congress for a while is that it would be great if all of the other projects in OpenStack had policy capabilities, because if they did, then you could take policies written in Congress, which span all the silos of compute, networking, storage, and so on, have Congress do some analysis over those policies, pick out the compute-related portion and hand it off to Nova, and pick out the networking-related portion and hand it to Neutron. So you could see this kind of distributed policy-based enforcement. But of course that assumes that all the different projects within OpenStack have policy support, and so what we're focused on here is providing policy support that's valuable in and of itself for Nova's end users, but could also perhaps eventually be used by Congress to do distributed enforcement.

Thank you all so much.