Hi everyone. We have a little noise here, so good afternoon; I hope you're not too tired after lunch. My name is [inaudible]. I work for Huawei, out of the Tel Aviv research center, and I'm here to talk about WAN Traffic Control as a Service for Neutron.

First, a little bit of problem description. I know this subject is a little misunderstood and there are a lot of ideas about what it actually means. What we see today is more and more hybrid clouds, where a hybrid cloud is basically any cloud that also uses external services; even if your organization is just using Gmail or some other software-as-a-service, it's already considered a hybrid cloud. And what we see is an increase in east-west communication across the WAN links, which usually results in more cost, together with a basic lack of tenant-level traffic shaping. What you usually have today is traffic shaping on the WAN link itself, aggregated over all of the tenants; there is no transparency into the virtual networks of each tenant inside that WAN link. That means chatty or greedy applications will starve the rest of your applications on the WAN link, and it's impossible to prioritize between the various applications and virtual networks. In other words, tenant-level SLAs are basically not possible.

So what is and isn't supported in Neutron? What we have today in Neutron is port-based quality of service. That basically means we can do rate limitation on a single VM port, but not on all of a VM's ports if it has multiple ports. What is not supported is network-level quality of service: say you want to put a rate limit on a virtual network, for example limiting all the virtual machines that belong to a specific virtual network, let's say vNet 1, to five megabits total.
That's not possible today. Tenant-level, or project-level, quality of service or rate limitation is not possible either: if you want to limit all the networks that belong to a specific project, together, to let's say one megabit, that's also not an option today.

If you look at this graphic, on the upper side you can see the tenants. The tenants, the projects, can see the virtual networks they create, and they can also see the external router that is provided to them by the admin. The admin, on the other side, can define the router; they see the WAN link, the connection to the WAN, but they do not see the virtual networks, and they have no ability to see or define any policies at the virtual-network level.

So why do I need traffic control in Neutron? We actually look at two kinds of users here. The first is the hybrid cloud user. I chose the hybrid cloud as the example here, though of course this fits other use cases too. For hybrid cloud users it's the ability to limit, and to enforce fairness between, services that go out over the WAN link. For example, if you decide to prioritize your disaster-recovery replication over other, less important services, say FTP, which is a greedy protocol, this makes that possible. The other user is the private cloud provider, the enterprise admin who provides cloud services for internal projects or customers and wants to enforce an SLA: for example, limiting a complete tenant, or a department if it's the IT manager, to a certain rate limit on WAN-bound traffic.

Just to look at a few use cases: there is traffic that goes out to the web, the WAN-consuming kind of traffic.
There is traffic between sites in a hybrid cloud, between the VPCs; that could be traffic between a branch and the main office, or just traffic between the main office and some remote storage, say S3 on Amazon, or something like that. All of these points are the entry and exit points of WAN links, and these are where you would potentially want to enforce quality of service and traffic control.

What are the implications for users? If there is no traffic control on the WAN gateway, there is no control over the utilization of the WAN bandwidth; everyone crashes into each other and no one gets the optimal service level. I decided to go with the hybrid cloud use case because I think it's the simplest one to understand. On the left side there is an enterprise data center, a private cloud, OpenStack of course, with two projects; a WAN link connects it to its extension in a public cloud. These, in short, are the reasons we want to create our traffic control.

What are the requirements? We want to be able to have a project-level limit. Project means tenant here; it's a Keystone project. So different projects can have different limitations. We also want to be able to have a directional limitation.
That is, traffic from a specific network, or going out to a specific destination within the project, may have a different limit. We also want group limitations, meaning we can group VMs together into a logical group, let's say a department, and configure it with its own limitation. And we want a hierarchy of limits, so that you can limit a project to a specific bandwidth and limit some group of virtual machines inside that project with a different, of course smaller, amount of bandwidth.

There are various connectivity options when you look at the setups that enterprises and private clouds have today. You have layer-2 connectivity using a layer-2 gateway or border gateway; you have plain layer-3 VPN connectivity; you have MPLS VPN, which does both L2 and L3; or you just have internet-bound traffic going to the web.

Now I'll explain our implementation of a traffic control device and service in Neutron. We didn't actually build a device; it's just software, of course, a reference implementation. But what is it, and where do we put it? In this simplified figure we see a private cloud with three virtual networks, let's assume even in different projects, although that doesn't really matter, and all these virtual networks eventually go through a WAN router. The TC is an inline device; it should be placed somewhere between the connectivity from the cloud and the WAN router.

So let's look at one scenario. We have a single WAN link.
Let's say a small branch connecting to a main office. All the virtual networks go through the same inline TC device, and we will potentially want to define some limitations here. The colors match: the top one belongs to virtual network one. The virtual networks here are implemented using VXLAN, just conceptually, so virtual network one, VNI 1, is limited to one megabit; virtual network two is limited to five megabits together with virtual network three; and virtual network three by itself is limited to two megabits. So that's a hierarchy.

In another scenario you have multiple WAN links and you want to load-balance between them. Here, virtual network one is switched towards TC device number one, which sits in front of one of the WAN routers, and virtual networks two and three go to the other. We chose to place them like this because if two networks have a shared, let's say balanced, bandwidth relationship, they need to go through the same TC device; in this implementation, at least, we don't know how to distribute the buckets across devices.

Another scenario is load balancing, where you have a load balancer that shifts the traffic between the WAN routers. In that case the inline TC device is placed before the load balancer, and the limitation is enforced there.

Now a little bit about the software components inside this project. What did we do? We created a new API extension for Neutron. It's currently in the process of being pushed into Neutron; we've actually been discussing it at this summit. And we've added our reference implementation, which uses the Linux TC capability, which I'll explain on the next slide.
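As a rough sketch, the single-link hierarchy described above (VNI 1 capped at 1 Mbit; VNI 2 and VNI 3 sharing 5 Mbit in total, with VNI 3 itself capped at 2 Mbit) could be expressed with Linux HTB along these lines. The interface name, class IDs, and the 3 Mbit guaranteed rate for VNI 2 are assumptions for illustration, not the project's actual configuration, and the commands need root and a real egress interface to run:

```shell
# Sketch only: an HTB hierarchy approximating the scenario above.
# "eth0" and the classids are illustrative assumptions.
tc qdisc add dev eth0 root handle 1: htb

# VNI 1: hard cap at 1 Mbit
tc class add dev eth0 parent 1: classid 1:10 htb rate 1mbit ceil 1mbit

# VNI 2 + VNI 3: share 5 Mbit in total under one parent class
tc class add dev eth0 parent 1: classid 1:20 htb rate 5mbit ceil 5mbit
tc class add dev eth0 parent 1:20 classid 1:21 htb rate 3mbit ceil 5mbit  # VNI 2: may borrow up to 5 Mbit
tc class add dev eth0 parent 1:20 classid 1:22 htb rate 2mbit ceil 2mbit  # VNI 3: capped at 2 Mbit
```

Filters, which the talk covers a little later, would then steer each VNI's packets into its class.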
So Basically what you do is you have the API extension placed in the neutron server the plug-in as well and we've added The API is to configure the TC All the messages are sent on the message bus received by the agent that is placed in our implementation on the same Machine where they the TC device is configured and there's a driver there that actually configures the TC device That allows us to have really open Capabilities for every vendor to implement This as they want because the API is a very flexible. So we don't care what is written inside The device needs to understand what is sent to it So they will go together the plug-in and the device will go together from the same vendor but if you are a hardware vendor and you want to set up our Create an adoption for your device into The open stack then basically that means you need to implement the plug-in and That's it a little bit about the TC device inside. So how it works. So in our reference implementation, we're using a Bridge in in this case the OBS, but we are not really using the advanced OBS capabilities just a bridge and we've placed the Curing disciplines or the model the cures that are actually implementing the the the traffic control on the Egress ports between the port in the bridge and the actual nick port So we are only handling the egress. We do not handle the ingress in our example here and the reason for that is the troll most traffic is actually TCP although it may be encapsulated inside VXLan and It's enough that you limit one size and the other side will Very quickly adjusted to lower the MTU a little about the Linux traffic control so TC has Few terms that we should probably cover if you are not familiar with it. 
TC is a user-space utility for Linux which is used to configure the Linux kernel packet scheduler. A few terms here: you have qdiscs and you have classes. Qdiscs, queueing disciplines, are where the kernel enqueues a packet when it needs to send it on an interface; each interface has its qdisc configured for it. As for classes, some qdiscs can contain classes, and these classes can contain further qdiscs. A qdisc may, for example, prioritize certain kinds of traffic by trying to dequeue them from certain classes before others. You can find more information on the Linux TC website.

A little about TC filters. A filter is used by a classful qdisc to determine which class a packet will be enqueued into. Whenever traffic arrives at a class with sub-classes, it needs to be classified; all filters attached to that class are called until one of them returns with a verdict.

So, quality of service in Linux: usually what you do is set up a queue, with a qdisc; then you set a class with the limitation; and then you select a filter, to decide which traffic this class applies to. As an example, we create a queue here on the device eth0 (sorry, mixing the languages); you can see that on the left side. Then we set the limitation of one megabit by creating a class, and then we choose which traffic to limit.
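Concretely, the queue / class / filter steps just described might look like the following on a Linux box. This is a hedged sketch rather than the slide's exact commands: the interface name eth0, the choice of HTB, and the u32 offsets for the VXLAN VNI (which assume a plain 20-byte outer IP header and no UDP options) are illustrative assumptions, and the commands need root to run:

```shell
# 1. Create a queue (root qdisc) on the interface
tc qdisc add dev eth0 root handle 1: htb

# 2. Create a class carrying the 1 Mbit limitation
tc class add dev eth0 parent 1: classid 1:1 htb rate 1mbit

# 3. Attach a filter choosing which traffic falls into that class.
#    Here: VXLAN traffic (UDP port 4789) whose 24-bit VNI is 1; the VNI
#    occupies the top three bytes of the 32-bit word at offset 32 from
#    the start of the outer IP header (20 B IP + 8 B UDP + 4 B flags).
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dport 4789 0xffff \
    match u32 0x00000100 0xffffff00 at 32 \
    flowid 1:1
```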
In this use case we're actually using the most powerful matching function, which lets us look at any offset in the packet and do a match; here we are matching the VNI field of the VXLAN header.

So, an example of a WAN traffic control device deployment. Usually what we do first is allocate or create a device, deployed inside a virtual machine, a container, or on bare metal with an image, and connect it as a pass-through device; that means we place it before the WAN gateway, or before the load balancer in that use case. Then we configure it through the APIs: we configure it to connect to the message bus, and it will register itself with Neutron.

I have some screenshots here of the APIs we've implemented. The first one is the WAN TC device API; that's the first one you will use. I'll start with create, although create is actually not necessary, because it's done automatically when you connect the TC device: it registers itself with Neutron. You have the option to list, show, and delete, of course, the regular APIs, and there's an example over there. Then we have the API for the class, again create, delete, show, list, and we have an API for the filter; with the filter you also specify the matching, which we saw in the previous slide. And just as an example, there's a single-command setup to keep it simple: you create the TC in one command, specifying the network you want to apply it to, with the minimum and maximum bandwidth in this case, and that's it.

Okay, that's it for this presentation. As I said, we're currently in the process of pushing this to OpenStack Neutron; it's under discussion, so it will be available on GitHub soon, I hope. With that, if anyone has questions, please go ahead. Okay, thank you.
Oh, you have a question?

[Audience] What kind of attributes can you pass in OpenStack that you can normally give TC? I came in a little late, maybe you covered this, but I noticed you were using the hierarchical token bucket filter. What if you were trying to pass through another kind of queue or something else? I'm just not seeing how you might accomplish that.

Okay, that's a good question. It depends on the implementation of the TC device itself. In our case we used the Linux TC, which has a certain known set of functionalities. We've kept the API clean, so you can write whatever you want, as long as the device can detect it and knows what to do with it. So it's pretty flexible: whatever you put there is sent on the message bus, it reaches the device, and then it's up to the device to figure out what to do with it. Okay. Thank you, everyone.