Thank you for coming to the Neutron quality of service features and future roadmap talk. I'm Victor Howard, a principal engineer at Comcast.

Hello, I'm Sławek. I work at OVH as an OpenStack operator and developer.

I'm Miguel Angel Ajo; I work at Red Hat on the Neutron team. Let me start with the agenda we have for today: we're going to talk a little bit about what we did in Liberty, where QoS for Neutron started, what we did in Mitaka, and what we have planned for Newton and beyond.

During Liberty we built a model for QoS in Neutron that is based on policies, and policies are built from rules. At that time we only had one rule type, the bandwidth limit, but the idea is that you can build quality of service policies out of rules and then attach those policies to networks. If you attach a policy to a network, all the ports on that network that belong to instances get configured with that specific quality of service policy, and if you modify any rule of the policy, those ports are updated in real time. You could also attach specific ports to a specific policy, as long as your tenant had access to that policy or you were the administrator; that was also allowed.

To sum up: Liberty was the first release with quality of service in Neutron. There was only one type of rule available, the egress bandwidth limit. It was supported only by the ML2 plugin and by two of the three main L2 agents, the Open vSwitch agent and the SR-IOV agent; there was no support for the Linux bridge agent for QoS in Liberty.

A short reminder of how quality of service works in Neutron overall. If you want to use quality of service, first you need to create a policy; a policy is a kind of container for rules. The second step is to create a rule, for example a bandwidth limit rule. For a bandwidth limit rule you need to provide the max bandwidth limit value, in kilobits per second, and optionally a max burst, in kilobits; the third parameter here is the policy to which the rule should be associated. The last step is to apply this policy, with its rules, to a port or a network; if it is applied to a network, it will automatically be applied to all ports in this network.

On this picture you can see, in general, how Open vSwitch connects a virtual machine to the network. In such a case the Open vSwitch agent will apply a bandwidth limit rule by executing two commands with the ovs-vsctl tool: the first command configures the bandwidth limit, and the second configures the burst value, if it is provided by the user.
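To make that concrete, here is a minimal sketch of the workflow using the Mitaka-era neutron client; the policy name, network name, and the 3 Mbit/s / 300 kbit values are illustrative, not taken from the slides:

```
# 1. create a policy (a container for rules)
neutron qos-policy-create bw-limiter

# 2. add an egress bandwidth limit rule to it: 3 Mbit/s with a 300 kbit burst
neutron qos-bandwidth-limit-rule-create --max-kbps 3000 --max-burst-kbps 300 bw-limiter

# 3. attach the policy to a whole network, or to a single port
neutron net-update private --qos-policy bw-limiter
neutron port-update $PORT_ID --qos-policy bw-limiter
```

And on the hypervisor, the two ovs-vsctl calls just described amount to something like the following; the tap interface name is hypothetical, the rate is in kbit/s and the burst in kbit:

```
ovs-vsctl set interface tap0 ingress_policing_rate=3000
ovs-vsctl set interface tap0 ingress_policing_burst=300
```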
Now, as I said before, in Mitaka we also provided support for these quality of service rules in the Linux bridge agent, and I want to tell you a little bit more about how it was done.

The Linux bridge agent uses the traffic control (tc) mechanism from Linux directly. Open vSwitch, for example, also uses traffic control underneath, but it is a kind of wrapper for the tc API; in the Linux bridge agent we use traffic control directly, and we use policing on the ingress qdisc. A qdisc is a kind of queue for packets: whenever the kernel wants to send packets to the network, the packets are enqueued in this qdisc, and some algorithm decides which packets, and how many, can be sent. There are plenty of algorithms available in traffic control for qdiscs, classless and classful, with different configuration options and so on.

One thing which can be confusing, and in fact was for me at the beginning when I started working on it: why, when we want to limit outgoing traffic, do we use the ingress qdisc? The answer is quite simple: it depends on the point of view. Traffic which is outgoing from the virtual machine is, from the bridge's point of view, in fact incoming on the tap interface, and if we are talking about traffic control in Linux, we need to look at it from the bridge's perspective. That's why there is this change of direction, let's say. It's quite important to remember, because I made this mistake and the first version was made in the wrong direction.

As I said, there are many qdisc algorithms available, but for policing on the ingress qdisc, traffic control uses TBF, the token bucket filter; it is one of the simplest classless algorithms. How it works, in general: there is a kind of bucket which collects tokens, and tokens are produced by the kernel on each tick. There is also a queue of packets: packets which the kernel wants to send are enqueued in this line, and for a packet to be sent to the network, it needs to obtain tokens from this bucket. If there are not enough tokens to send a packet, the packet waits in the line; it can only wait for some configured time, and after this time, packets that still have not been sent are simply dropped.

The size of this bucket of tokens is defined by a parameter called burst. It is quite important to configure a proper value for this parameter: if it is too low, the bandwidth limit will never be achieved, and the real bandwidth limit will always be lower than the wanted one; if the burst is too high, then too many packets will have tokens available, and the real bandwidth limit can be higher than expected.

As an example of how the Linux bridge agent applies such a bandwidth limit rule on the host: for a rule configured in Neutron with a command like the one on this slide, the Linux bridge agent will execute two tc commands. The first command enables the ingress qdisc, and the second configures a filter that matches all protocols and polices them with the rate and the burst given in the Neutron API. The third parameter here is the MTU: the maximum size of packet which can be handled by this filter; bigger packets will be automatically dropped.
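Roughly, assuming the same example values and a tap device named tap0, the two invocations look like this; a sketch, since the exact arguments the agent builds may differ slightly:

```
# enable the ingress qdisc on the instance's tap device
tc qdisc add dev tap0 ingress

# police everything arriving from the VM with the rate and burst from the
# Neutron API; mtu caps the packet size the filter accepts, bigger packets drop
tc filter add dev tap0 parent ffff: protocol all u32 match u32 0 0 \
    police rate 3000kbit burst 300kbit mtu 64kb drop
```

On the burst sizing the speaker warns about, a common rule of thumb for token buckets is to keep the burst at least rate / HZ (the kernel timer frequency), since the bucket is refilled once per tick; below that, the configured rate can never actually be reached.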
Finally, as I said: in Mitaka, quality of service got the egress bandwidth limit supported by all three open L2 agents from the ML2 plugin; the Open vSwitch agent, the SR-IOV agent, and now also the Linux bridge agent support the bandwidth limit rule.

At the end I want to mention two side effects, let's say, that we achieved while working on this bandwidth limiting for the Linux bridge agent. The first of them was providing support for L2 extension drivers in the Linux bridge agent. This mechanism was also introduced in Liberty, but in Liberty it was supported only by Open vSwitch and SR-IOV; since QoS is implemented as such a driver, we needed to provide support for these extensions in the Linux bridge agent as well. The second thing is that we also added support for the Linux bridge agent in the fullstack tests, because there were tests only for Open vSwitch; now there are some tests for the Linux bridge agent too.

Also new in Mitaka is role-based access control. In the Liberty release, policies were shared across all tenants; what role-based access control allows you to do in Mitaka is to grant access to a policy to a specific project.

There are also a few new things coming up in Neutron. We're going to talk about DSCP marking rules and ingress bandwidth limiting. We wanted to add DSCP marking for Comcast because we want to prioritize and mark traffic, so that certain types of traffic get more attention than others, and also for security reasons: sometimes we want to allow only certain types of traffic into various spots on the network.

To implement DSCP marking we modified the Neutron client and also the API. This is the Neutron client creating a policy, then creating a marking rule and assigning it to that policy, and then assigning the policy to a port. And this is roughly what happens behind the scenes: a user, through the client or through the API, requests that a DSCP mark be added; the OVS driver sends an RPC message to the compute node; and having received the RPC message, the OVS agent adds the mark to the port.

This is a view of how DSCP sits alongside the bandwidth limit rule: they both attach to a policy, a policy can have multiple rules, and the policy can also be associated with multiple ports.
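A hedged sketch of those two workflows, using the client syntax that merged around this time; the names, IDs, and the DSCP value 26 are just examples:

```
# DSCP marking: create a policy, add a marking rule, attach it to a port
neutron qos-policy-create dscp-policy
neutron qos-dscp-marking-rule-create --dscp-mark 26 dscp-policy
neutron port-update $PORT_ID --qos-policy dscp-policy
```

And the RBAC grant described above, sharing a policy with one specific project instead of with all tenants:

```
neutron rbac-create --type qos-policy --action access_as_shared \
    --target-tenant $PROJECT_ID dscp-policy
```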
Okay, so another feature that is being worked on, for Newton actually, is ingress bandwidth limiting. In Liberty we introduced the bandwidth limit rules, but they were all egress, because it seemed to be the most common case. But in other cases you can have applications that behave as consumers from the network, so you may want to limit the amount of bandwidth that they consume from the network. The proposed implementation will probably look like this; we have to sort out a few details. SR-IOV, for example, as far as we know, is not able to limit on ingress, so we will have to see how to do that. Another option is to introduce another type of rule, but from the usability point of view I don't like that; we'll see.

These are other RFEs that are currently proposed for QoS. We have minimum bandwidth support, and I will explain that in more detail, but it's like two parts; and we also have requests for traffic classification, rate limiting, VLAN priority marking, and some more stuff. I will go into more detail in the next slides, but let me start with the minimum bandwidth guarantee.

The idea behind the minimum bandwidth guarantee is for when you have competing data flows in the same hypervisor. Let's look at this starting point: the left bar is one port for an instance on a hypervisor that has a 10G interface, and the right bar down here is another interface from another instance, and this other one is not pushing packets. If you have no policy and both start pushing packets at the same moment, then, depending on the protocol, they will eventually converge to something like this. If you express a minimum bandwidth guarantee on port A, and you say "I want this port to have a minimum bandwidth of eight gigabits," then when both are pushing packets at the same rate, port A is going to be prioritized.

So this is what happens in the hypervisor. But if you don't involve the scheduler in this equation, you could end up having ports with minimum bandwidth guarantees whose total sum is above the link bandwidth, so you can oversubscribe your hypervisor. This is an example with three hypervisors and four instances: one has an eight gigabit minimum, another one has a seven gigabit minimum, and another one has a three gigabit minimum. So if we start scheduling those, we need to coordinate with the Nova scheduler to make sure that, if we want a strict bandwidth guarantee, we don't oversubscribe any hypervisor. The first one could go to the first node, where there is enough bandwidth. If we schedule the next one, the one with the seven gigabit minimum, it can go to any of the other nodes. And if we schedule the third one, which had a requirement of three gigabits, it could go to this node, but it could also go to this one, because there is still enough bandwidth, or it could have landed on this one. Finally, I will show the case where there is an instance with a port with no requested guarantee: this instance could go on any of them, because it really doesn't matter, as long as it doesn't have a minimum bandwidth request.

One limitation of this, from our point of view, is that once the instances are already scheduled to a hypervisor, we won't be able to modify the policy, at least for an initial version. Because if the scheduling details and the amount of resources are controlled by Nova, we won't know whether modifying our policy, to raise the bandwidth guarantee, is going to oversubscribe a node. So we should at least prevent adding more minimum bandwidth to a policy.

So, yes, this is what I was saying: we are working with the Nova community, and with the work they are doing in the scheduler on the generic resource pools, to introduce a new resource that will be tracked by the scheduler and that Neutron will be reporting. So this is the proposal about minimum bandwidth guarantees.
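For reference, the rule type this proposal grew into (it landed in Newton, initially egress-only and enforced by a subset of backends) is created along these lines; treat this as a sketch rather than the exact syntax discussed on the slides:

```
# ask for a guaranteed 8 Gbit/s egress on whatever port carries this policy
neutron qos-policy-create guaranteed-bw
neutron qos-minimum-bandwidth-rule-create --min-kbps 8000000 --direction egress guaranteed-bw
neutron port-update $PORT_ID --qos-policy guaranteed-bw
```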
We have another proposal, about traffic classification. This one will probably take more time because it has more moving parts, but the idea is that eventually we could create rules in a policy that are attached to a specific traffic classifier. So you could put your control traffic on a higher priority; in this example it is the SSH traffic. Then, if your instance is overloaded by requests of other types, or by other types of traffic, you will still be able to control it. This is how it will look at the higher level. We have to sort out the details of how to do this specific part, because it is something that is shared across different projects, and we don't want to implement the same thing in different ways; it's better to do it just once.

From the feedback that I have been getting during the summit, and we are open to more feedback, I have this proposal that I want to discuss upstream: setting a default policy for a tenant. So when you create your tenant, you set this as its default policy, and any resource that you create is going to be attached to that policy by default. The tenant, of course, will be able to change that policy if it has access to other policies besides that one.

There is another proposal, to tag the priority on the VLAN packets. So instead of doing it at L3, like the DSCP work does, we could do it at level two. That could be useful in your data center, for example, to prioritize traffic at the tenant level if you are using VLAN tagging for tenant networks, but initially the idea is to use it for provider networks.

There is another proposal that is more blurry yet, but we are looking at it: to have some way to limit the total external bandwidth that a tenant is using. You can do that now if you go to your router port and set policies on the router ports, but maybe it could be interesting to have some higher-level abstraction for that, so you don't have to handle the low-level details.
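The router-port workaround just mentioned could look like this today, reusing the plain bandwidth limit rule; the gateway port ID is assumed to be known, and this caps all traffic through that one port rather than offering a real per-tenant abstraction:

```
# cap a tenant's external traffic by policing its router's gateway port
neutron qos-policy-create external-cap
neutron qos-bandwidth-limit-rule-create --max-kbps 100000 external-cap
neutron port-update $ROUTER_GATEWAY_PORT_ID --qos-policy external-cap
```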
And finally there is another one we are looking at: using ECN, explicit congestion notification, to do things in a softer way than the policing we are doing now; maybe detecting congestion situations and, depending on how you set the rules, notifying the other side that you are congested and that it has to slow down. But that is still quite blurry.

And that's it. There is another presentation, about the DSCP policing work that colleagues from Comcast did the other day, and you have a link to it in this presentation. Thank you very much for coming, and if you have any questions, speak up; we are really glad to hear them.

Thank you for the presentation. Just a question of clarification: you mentioned in the first set of slides rate limiting between the guests, and then later, in these last sets, you were talking about policing and dropping. Is your intention to rate limit, meaning a kind of hard fixed cap where you start to throw packets away once there is congestion, or are you talking more about shaping, queuing up the excess and waiting until there is a little space?

We have different options; that is something that has also been mentioned, and we will look at it. Currently we are doing policing: we are just dropping packets, as it is the simplest way to do it. And especially we have some technical limitations to do the other thing, because you can only do queuing on bridge egress, but in this case it is bridge ingress. Eventually we will get there; I think the kernel now has support for that, but we have to keep putting the pieces together, so eventually we will be able to queue instead of police. And the ECN thing uses a protocol that works at the IP level, with TCP, to notify the other side to slow down.

Okay, second question. You mentioned the complexities of setting these minimum bandwidth limits and their relationship to scheduling. In other resource management models, for other types of resources, there is the idea of bin packing: as your VMs are being deployed, they arrive in a certain sequence, and when, say, the second one shows up, you have to deal with what is available at that particular point in time. But then there is an optimization cycle that says: hey, if we could actually reorganize, we could get a better density, a better collection.

Yeah, that's controlled by the Nova scheduler at this point; you have those two settings, to spread or to pack.

Do you know if there is work on that problem? Is there a proposal or any kind of work being done?

It will work together with the Nova scheduler, so I think that's a setting you currently have in the Nova scheduler. I don't know the details, because I don't work too much with that, but I know you can set it for packing, because maybe you want to optimize your data center, empty a set of nodes that you are not using, and power them down.

So this proposal will kind of just naturally inherit that capability?

Yeah, it will go together with all of that. I don't know if we will have something like spreading bandwidth in particular, but okay.

Yeah, thank you very much. Thank you.

So the rate limiting, or the bandwidth guarantee: is that only on the hypervisor switch, or do you actually try to do something with the infrastructure? Because your destination could be outside your OpenStack cloud, for instance, or even within the cloud. So how do you actually make sure that the entire bandwidth is guaranteed from an end-to-end perspective?

Yeah, I expected that question. We are taking baby steps; I think the first step is to do it at the hypervisor level. At this point in time Neutron doesn't have any knowledge about the architecture of your network; it doesn't know the topology of your network, so it is not able to take any decisions based on that. This is the first baby step; when this works, we can look further and see how to handle that. I guess that eventually we could do that with the same framework they are creating in Nova for the generic resource pools: you could maybe specify resource pools for external traffic, or traffic on the switch, and make sure that when an instance is scheduled with a specific amount of minimum traffic, that traffic is counted against the top-of-rack switch and the other dependent switches, or the external connectivity. We'll have to look at that.

Well, the other proposal I had in mind: these are APIs, and it's a service plug-in architecture, right? Why could you not write plugins for the physical infrastructure against this minimum bandwidth guarantee, so that the physical infrastructure that the plug-in can configure does that for you as well?

Right, yeah. Eventually you could do that in your plug-in, if your plug-in is aware of your topology. But it would also have to integrate with the Nova scheduling to make sure that that happens.

Okay, so thanks.

Yeah, in fact, when we compared them, we had these problems with the direction of the traffic: it was implemented the wrong way at first.
Later we tried to implement it properly, let's say, with this policing, and we had some problems with a difference between the same rule configured for and applied by the Open vSwitch agent versus the Linux bridge agent. Finally we found that there is a bug with setting the burst value in Open vSwitch, and that was the reason for the difference in the real bandwidth limit. But this rule is in fact exactly the same; it is made exactly the same way as Open vSwitch does it.

In fact, I don't know exactly. Yeah, it's a good question; I didn't notice it before, so maybe we have to look at why we have this. Or maybe, when you set that rule in the filters, you have to just give it some priority, so that if there are other packets that match other filters, they are taken in order. We only have one filter, so I think the priority really doesn't matter; we just have to put any value. I'm not sure, but I think it is the priority in which the filters are going to be evaluated. We will look at it, just in case.

Yeah, with the current model, once we have the VLAN priority, you will be able to define a policy and set both. Do you mean with the VLAN, or with...? Yeah, so maybe we have to document that. Yeah, thank you.

I have two questions about the scheduler. Do you have any blueprint or spec already in Nova, or are you going to write it from scratch?

Yeah, there are. If you go to this link... okay, I cannot show it here, but if you go to that link you will find the minimum bandwidth guarantee proposal for Neutron, and from there, there is a link to the Nova one. Look for something called generic resource pools; that's the name for it. It's made not only for this; it's made for other use cases, like storage and routed networks.

So my second question was: do we have to wait for this generic resource pool discussion to settle in Nova?

I think the discussion is quite settled; we just have to wait for the work. And I think we can work on the first part, which is configuring everything at the hypervisor level. That's what we call the best-effort minimum bandwidth guarantee: if you have room within the NIC, your guarantees are going to work. So we have two steps there.

Thank you very much for coming.