Hello. Welcome, everyone. Today we are going to demonstrate multi-node DVR SNAT HA. Before we get started, we'd like to gauge the audience so we can decide how deep to drill down. Can you raise your hand if you're already experienced with installing DevStack and stacking up OpenStack in your environment? OK, maybe 50% of the audience. I'll assume the remaining 50% are beginners or intermediate, people who have had some slight experience. So, we have already sent out instructions to download all the images from the website. All the OVA images we asked you to download work with VirtualBox, so if you don't have VirtualBox, please go ahead and download it. Once VirtualBox is installed on your system, you can import the OVA images into it and start deploying those VMs, OK? So before we go any farther, let us introduce ourselves. My name is Adolfo. We all work for HP Enterprise. That gentleman at the top is Swami, and in the back you have Hardik. If you have any questions, you can raise your hand; Hardik, I'm sorry, Swami and I will be walking around, and we can come to you. Before we proceed with the lab, we want to give an introduction to DVR and the multi-node setup and how it works, so you understand why we insist on a multi-node setup for this demonstration. Otherwise you cannot really see the effect of DVR, its scheduling capabilities, and how the traffic flows. So we want to make sure you understand the concept first. 
And then, while you are all downloading the images, we'll go through the introduction first, then start working on the demo, and Hardik will take over from there with instructions on how to do the exercises. So, as he introduced, that is Adolfo, my name is Swami, and here is Hardik. Our agenda for today, as mentioned, is an introduction; then we will talk about the configurations; then we will show you the namespaces, what they should consist of; then the traffic flow; and then the troubleshooting aspects. All of these will be covered in the lab as well. Initially Hardik will go over some slides and documentation to give you a brief idea of what you are expected to see in your lab session, and then we can take Q&A. We have a hard stop at 10:30 today because we have a follow-on session, a presentation on DVR SNAT HA in the convention center in ballroom A. If you are interested in knowing more of the theory, you can attend that session later. We need to run from here to there, so again, hard stop at 10:30. If you have any follow-on questions, feel free to ping any of the three of us, or if you find us in the convention center, just call us. So before we drill down deep into the lab session, I want to give an intro on how DVR works. What I'm showing here is a four-node setup, but since we don't have enough resources in our laptops, we restricted the lab to a three-node setup, just to give you a hint, a taste, of what it would contain. Basically, DVR is a feature implemented in Neutron: you can have either a centralized router or a distributed virtual router. Once you configure a distributed virtual router, you can run the L3 agent in two different modes: dvr mode and dvr_snat mode. 
The dvr_snat mode combines the legacy functionality with the DVR functionality. It is used in the network node scenario, where you need SNAT capability; that's when you run the DVR L3 agent as a dvr_snat agent. If you don't need SNAT capability, and you just need floating IPs and compute resources, you can start the L3 agent in plain dvr mode. So these two are compute nodes; on them you start the L3 agent in dvr mode. And these two are network nodes, one active and one standby; on them you start the agent in dvr_snat mode. The only difference, as I mentioned, is that in dvr_snat mode the L3 agent is capable of implementing the SNAT functionality in addition to its regular functionality. You can also use a dvr_snat node to host compute, but in standard production deployments you normally don't mix network and compute; you separate them. That's why we want a three-node lab setup where two nodes are dvr_snat and one node is compute. The two dvr_snat nodes are there to demonstrate how high availability for SNAT works. The reason we introduced HA for SNAT is that previously SNAT ran on a centralized network node, which was a single point of failure: if you use SNAT heavily and that node fails, you have no availability. The legacy routers had L3 HA, and for DVR we achieve high availability for routing by distributing the routers across all the compute nodes, but SNAT remained a single point of failure without an HA feature. Now that we have introduced it, we want to demonstrate it. 
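To make the two agent modes concrete, here is a minimal sketch of the one `l3_agent.ini` setting that distinguishes them, per the description above (only this option is shown; the rest of the file is omitted):

```ini
# /etc/neutron/l3_agent.ini on the network nodes (the active/standby pair):
# SNAT-capable DVR agent
[DEFAULT]
agent_mode = dvr_snat
```

```ini
# /etc/neutron/l3_agent.ini on the compute nodes:
# floating IPs and east-west routing only, no SNAT
[DEFAULT]
agent_mode = dvr
```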
So does anyone have any questions on this high-level architecture? What we are going to show you now is how to bring up the first node, the second node, and a compute node, and how to bring up VMs on that compute node. One of the VMs we can leave SNAT-capable, meaning it uses default SNAT: all of that VM's external traffic will flow from here, go through the overlay tunnel to the network node, use the SNAT namespace, and flow out there. And if you create another VM here and assign it a floating IP, the traffic for that VM can flow directly from the floating IP, through this bridge, to the outside. The advantage of DVR in this case is that your floating IP traffic need not go through the network node every time you have a floating IP configured. In the older, centralized router model, all the floating IP traffic actually hit the network node, and that's congestion you're creating on the network node. Now, since we have distributed it, the floating IP traffic flows directly from the compute node to the outside world, and the outside traffic can reach the compute node directly through this network as well. The other advantage is the east-west traffic. For example, in the older legacy model, take these two VMs here, on the red network and the orange network: even though they are on the same node and connected to the same bridge, because they are on two different subnets, the traffic had to go all the way to the network node, through the router namespace there in legacy mode, and come all the way back to reach the other VM. 
So now, because of DVR east-west, we can route traffic directly through the local router, since we have a copy of the router on every compute node. The traffic from this VM comes into the router, the router routes it immediately, and it arrives at the other VM. That's the concept of DVR. You are going to see this hands-on, and we can drill down deep into the namespaces: the floating IP namespace, the SNAT namespace, and the qrouter namespaces. Yes? Yeah, this one is not a true network node; for demonstration purposes I drew it as a network node, but it is a controller with network capability, and as I said, this is the dvr_snat node. You can also enable compute on it, but no, I don't want to make that the norm. It's not just a pure network node, because then you might ask, okay, where is the controller node? So it's basically a combined controller and network node. You're right. Okay, I'll hand it over to Hardik, and he can walk you through the lab setup; Swami and Adolfo can help you if you have any questions. So, the configurations. Before we go to the demo and the hands-on lab, we also wanted to go over the configurations required, and how to configure, for the DVR SNAT HA functionality. So I'll ask Adolfo to go over the configurations first, before we start the lab and hands-on exercise. Okay, so right up front: there are actually no new configuration options introduced for DVR SNAT HA. It uses already existing configuration options that you find in your /etc/neutron/neutron.conf and the ml2_conf.ini file. All you have to do is set the correct values for them and the feature gets activated. Prior to Mitaka, you could enable HA, and you could enable distributed true, which is DVR, but if you tried to create a router with those two flags both true, you would get an error saying you can't. With Mitaka, you can. 
So you can create a router with the distributed flag equal to true and the HA flag equal to true. One of the things you also have to do is tell it how many agents you want participating in your redundancy, or high availability, group. You will see those in the configuration files. And you also have to set the configuration default for the router type; in other words, when you don't specify the value of the distributed flag or the HA flag, it has to select something, and that depends on your configuration in the neutron.conf file. Depending on how you want your system to work, you touch a couple of those configuration options to set your default. Another thing, too: since we're using DevStack, I assume some of you are familiar with the local.conf file. That's where you configure DevStack so that it does all the correct configuration of neutron.conf and ml2_conf.ini. When you bring up the images, just take a look at the local.conf file; you will see the options that need to be set. We tried to strip it down so that the one in your image only has the options pertinent to what you're doing today. So hopefully, even if you forget everything when you walk out of this room, you can look at that local.conf file and it will give you an idea of how to set something up in your lab. Again, that's just for DevStack. And this is how you set your default router: in /etc/neutron/neutron.conf you have two options, l3_ha and router_distributed, and that table shows you what the actual default router type will be depending on how you set them. Your options are true or false for each of them, and the column on the right-hand side shows the resulting default router. In this table, CVR means centralized virtual router, what would be called your legacy router. 
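The truth table on the slide can be sketched as a small function. This is illustrative only, not Neutron source code, and the router-type labels ("DVR+HA", "CVR", etc.) are my shorthand for the table described above:

```python
# Illustrative sketch: the default router type implied by the
# router_distributed and l3_ha settings in neutron.conf, per the table
# discussed above. Labels are shorthand, not Neutron API values.
def default_router_type(router_distributed: bool, l3_ha: bool) -> str:
    if router_distributed and l3_ha:
        return "DVR+HA"   # distributed router with HA SNAT (Mitaka+)
    if router_distributed:
        return "DVR"      # distributed, single SNAT node
    if l3_ha:
        return "CVR+HA"   # centralized legacy router with L3 HA
    return "CVR"          # plain centralized (legacy) router

# Print the whole table, mirroring the slide
for dist in (False, True):
    for ha in (False, True):
        print(f"router_distributed={dist} l3_ha={ha} -> "
              f"{default_router_type(dist, ha)}")
```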
So here's an example of the controller node configuration. This goes to your question: in the diagram Swami had, one of those Neutron nodes is actually acting as a controller as well. What that means is you have to set these particular variables in the neutron.conf file. As you can see there, the top two are router_distributed and l3_ha; those are the flags I just mentioned on the previous slide. The other things, l3_ha_net_cidr and the max and min agents per router, are exactly the same ones you would use for legacy L3 HA, and they give the HA machinery the options it needs. One of them is the CIDR for the HA network, which you will understand once we go into the lab. The max tells it how many agents, or nodes, you want to use in the group, and the minimum tells it the least you can have. Obviously two is the least; if you set it to one, you don't really have any HA. Then, in the ml2_conf.ini for the network node, all you have to do is set enable_distributed_routing to true, along with the l3_ha setting I showed before. And then, still on the network node, in l3_agent.ini, these are some of the configurations you have to do. This is the HA part. The one that's important here is the one at the bottom, the agent mode: you have to put in dvr_snat. DVR has two options for the L3 agent: dvr_snat, which runs on what you would call the network node, and dvr, which is what you use on a compute node. On a network node, you obviously have to use dvr_snat. The compute node is quite simple, actually: you just enable distributed routing in the ml2_conf.ini file, and here's the agent mode again; in this case it's a compute node, so we use dvr. And I'll hand it over now to Hardik. Thank you. 
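Pulling the options just described together, here is a minimal sketch of the controller/network node configuration. The option names are the real Neutron ones named above; the CIDR and counts are illustrative values, not necessarily the ones on the slide:

```ini
# /etc/neutron/neutron.conf on the controller/network nodes
[DEFAULT]
router_distributed = True
l3_ha = True
l3_ha_net_cidr = 169.254.192.0/18   # CIDR for the auto-created HA network
max_l3_agents_per_router = 2        # agents in the HA group
min_l3_agents_per_router = 2        # fewer than 2 means no real HA

# ml2_conf.ini, [agent] section of the OVS agent config, on every node
[agent]
enable_distributed_routing = True
```

The compute node differs only in `agent_mode = dvr` in its `l3_agent.ini`, as described above.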
So I think before we start, we just want to see how we are doing with the lab setup on each of your laptops. Has anyone imported all the images and brought up the VMs? Anyone? Okay, a few. How about stacking? A few? Okay. Yeah, so before we go to the slides, let me present my environment here, which is pretty much the same. As you can see, I'm also using the same exact images with VirtualBox, where I have two controllers and one compute node. I also wanted to show a little bit about the networking that I set up. It's pretty much there in the guide, but when you set up the network, I want everyone to use adapter two for the VirtualBox host-only network. This could vary based on your environment; I'm using vboxnet3 and vboxnet4, but the numbers could differ depending on when you created them. And adapter three is another vboxnet; if you go to the VirtualBox preferences, under host-only networks, and click on vboxnet3, you see I have specified a specific IP address. The lab is designed so that the images have hard-coded IP addresses for all three nodes, just to make sure you can reach all the VMs from your laptop. So this IP is configured on your host machine, and then 132.2 is the first controller, 132.3 is the second controller, and .4 is the compute node. On the second adapter, we just configure this IP address to do some exercises with floating IPs, trying to reach this IP. If you go into a VM and see that your second interface doesn't have an IP address, that's expected, okay? And now I'll go to the machine. So this is the controller; let me go back. Basically I'm using the IPs; I think the documentation already mentions how to access each node. 
So I'm just going into the controller; the username and password are stack / stack. Under the home directory there's a devstack folder, which already has the local.conf that Adolfo was mentioning. For this we took the local.conf that the general OpenStack upstream CI runs, added some of the services, and trimmed it down a little to make sure it works properly with DVR. Some of the settings worth mentioning: you see Q_DVR_MODE, which is set to dvr_snat; that is for the controller, or network, node. The other thing I wanted to mention is that this is also a way you can configure things: you see that we are configuring neutron.conf with exactly the same settings Adolfo was describing. You turn l3_ha to True and you specify the minimum and maximum number of agents per router. In this setup we only have two controller/network nodes, so we just set both values to two. So once you go into that devstack directory, you just have to run stack.sh. I will not run it, because I did it before coming here to have my setup ready. But if you run it, the first controller will take maybe ten minutes, because it's already set to offline mode; all the code is already downloaded, so you don't have to worry about internet connectivity. The second controller and the compute node will be very quick, less than a minute per node, because they run only a few services. Any questions so far on setting up the nodes with DevStack on your own laptop? If not, okay, sure. Yeah, oh, it should be OFFLINE=True, but I'm not sure; we'll double-check. But if you see, I'm on controller two. Oh, okay, okay. 
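The DevStack settings just mentioned can be sketched roughly as the following `local.conf` fragment. This is an illustrative reconstruction from the description above, not the lab's actual file; the post-config section names are standard DevStack syntax:

```ini
# Illustrative local.conf fragment for a dvr_snat controller/network node
[[local|localrc]]
OFFLINE=True                 # use already-downloaded code, no internet
RECLONE=no                   # do not re-clone the repos
Q_DVR_MODE=dvr_snat          # use "dvr" on the compute node

# Settings DevStack writes into /etc/neutron/neutron.conf
[[post-config|/etc/neutron/neutron.conf]]
[DEFAULT]
l3_ha = True
max_l3_agents_per_router = 2
min_l3_agents_per_router = 2
```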
Okay, yeah, so it's good to know: if you hit that issue, check these two flags. OFFLINE=True says you don't download any code: whatever code you already have for each OpenStack component, you just use that to build the DevStack environment. And RECLONE=no is similar: you don't clone the repos fresh. In case you want to see it, everything is under /opt/stack; you have all the repositories for the different components running in the DevStack environment. But one more thing I would recommend: if you already stacked with OFFLINE=False and RECLONE=yes, I would recommend you delete that VM, re-import the image, change the local.conf, and run stack.sh again. The problem that could happen is that you end up running different code on the two controllers: on one controller you have OFFLINE=False and RECLONE=yes, which means when you stack it, it pulls the current master code from upstream, while the other controller is running code that is a few days old. So if you run into that issue, just delete that VM, re-import the image, go to the local.conf on controller two, sorry? That's fine, but if you set those settings properly with OFFLINE=True, DevStack will not go to the internet and try to fetch the repos. Yeah, otherwise you'll have mismatched code running on the two controllers. So just go to the controller, change those two flags as mentioned here, OFFLINE=True and RECLONE=no, and re-stack, so that all the nodes have the same code. So, I can go over the slides now, or if you want to take some more time to set up, any preference? Yeah, so maybe I'll go over the slides, and once everybody is stacked up, I can also run the lab here and you can follow along. Sounds good? 
Okay, okay, so I'm not sure whether everyone can see it in the back, but here I just wanted to mention the two L3 agent modes required to run DVR SNAT HA, and DVR in general. There are two modes, dvr_snat and dvr, as mentioned by Adolfo. Generally, when you run neutron agent-list, you should see a number of L3 agents. To run the HA environment, you should have at least two of your controllers running the agent in dvr_snat mode. And there are a few commands available to verify which mode the agent is running in on each node. First you find the L3 agents by running neutron agent-list and grepping for "L3 agent". Once you have an L3 agent, you run neutron agent-show with that particular agent's ID, which gives you the mode, whether it's dvr_snat or dvr, and also shows which host it is running on. Now, as Adolfo mentioned, from Mitaka onward, if you look at this router creation option, you can set both distributed and ha to true, and at the end you will have a distributed SNAT HA router running in your environment. Before Mitaka, you could still type this command, but when you executed it, it would complain that the combination is not supported. So the two screenshots show that, based on your default configuration: if you have already set the defaults for DVR and HA, then you really don't have to pass those flags. You just create a router, and any router will be DVR and HA enabled. The flags are available just for flexibility, if you want to create CVR, DVR, HA, and non-HA routers in your environment. Once you create the router, you can see the HA status, which basically tells you which L3 agent is hosting the active router and which agent is hosting the standby. 
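The CLI sequence just described can be sketched as follows, using the neutron client of that era. The router name is made up, and `<l3-agent-uuid>` is a placeholder for an ID taken from the agent-list output:

```shell
# Find the L3 agents and check each one's mode (dvr_snat vs dvr) and host
neutron agent-list | grep "L3 agent"
neutron agent-show <l3-agent-uuid> | grep -E "agent_mode|host"

# Mitaka+: create a router that is both distributed and HA
neutron router-create demo-router --distributed True --ha True

# See which agent hosts the active SNAT and which the standby
neutron l3-agent-list-hosting-router demo-router
```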
So sometimes it makes troubleshooting easier to know where exactly your SNAT namespace is active, where all of your external traffic is flowing. In this example, it says the active is running on controller two and the standby is on controller one. The other thing to note: in previous releases, when you created a DVR router, executing the command was just a simple database entry; it didn't do anything else, it just created the DVR entry in the back end. But now with DVR SNAT HA, just creating the router will create the namespaces. In this example, the same namespaces, the qrouter and SNAT namespaces, are created on each controller. That's a little different from what you are used to with the previous DVR routers: as soon as you execute the command, the namespaces are there, on the appropriate number of controllers, again based on your configuration. The namespaces exist, but at this point they only have a loopback address; there are no other interfaces attached. And since this is HA, it will also create an HA network. If you remember, a few slides back Adolfo mentioned a specific CIDR for the HA network, something like a 169.254 network. Based on that configuration it gets created; and first of all, that HA network is an OpenStack tenant network that you do not have to create. It is created automatically, by Neutron itself, when you create an HA router. So as an example, let's stop here and go back. Now I am on that DevStack node. First, there are no networks, and I'm sure there are no routers either. Now I'll go ahead and just create a router. I won't specify any flags, because these configs are already set up for distributed as well as HA. And it created the router with distributed equal to true, and it says ha equal to true. 
So now, if you run those previous commands again, you see the router is there with both the distributed and ha flags true. If you do neutron net-list, Neutron has already created this HA network based on the CIDR you specified in your conf files. The creation and deletion of this HA network is done by Neutron itself; you just have to specify which subnet, or CIDR, to use for the HA traffic, which is used just for detecting which instance is active and which is passive, or standby, okay? And yeah, we can also check the L3 agents hosting it. Now we can see the status: there is an HA state there. With plain DVR you don't see that active/passive state. In this example we can see controller two is hosting the active router, the active SNAT router, and controller one is standby. I wanted to show one more thing. As I mentioned, when you create a router, we haven't attached any interfaces to it yet, but if you go and check the namespaces, there are two namespaces, the qrouter namespace and the SNAT namespace, on controller one. Now I switch over to controller two, and they got created there too. So all the namespaces are created as soon as you create the router. It's an exact copy, except some IPs are different; maybe we'll get to that a little later. Okay, so far any questions? Okay, we'll go back to the slides. Yeah, actually, that's good to know: DVR routers need admin rights, unless you change your policy.json files to override those settings. That's a good point, actually. When you source that credentials file, I think the instructions also say to use admin/admin. Okay, going back to the slides. As I was saying, after you create the router, it already has namespaces. So this is the namespace view. 
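The namespace check just demonstrated looks roughly like this on each controller. `<router-uuid>` is a placeholder; the comment lines sketch the kind of names you should see, per the description above:

```shell
# List the namespaces created as soon as the HA router exists
ip netns list
#   qrouter-<router-uuid>   local DVR router copy (loopback only, for now)
#   snat-<router-uuid>      SNAT namespace (loopback + HA interface)

# Inspect the interfaces inside the SNAT namespace
sudo ip netns exec snat-<router-uuid> ip addr
```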
In this one, I'm going into the qrouter namespace, and in the second, into the SNAT namespace. If you look, the qrouter namespace only has a loopback address, a loopback interface. But the SNAT namespace has the loopback plus an HA interface. What Neutron does is create a Neutron port and bind that port to each SNAT namespace, so the HA interface already has an IP address assigned; we'll look into it in detail. But here on controller two there is one difference. If you noticed, the qrouter namespace is pretty much the same, only a loopback interface, but in the SNAT namespace, on the HA interface, there is one more IP address, 169.254.0.1. That is basically a virtual IP address, the VIP, used for the HA to work; it is just classic keepalived-style HA. Whichever agent's SNAT namespace is acting as the active one holds that virtual IP. If you fail over to a different agent, you will see this IP move over to the other namespace. We have an exercise on that, just to watch the IP moving from active to standby and standby to active. This IP is also used for monitoring, to check that the health of the namespace, or the agent hosting the router, is fine. So we can take a look again. In this one it's, I think, the same. Let me ask: should we wait a few minutes so you can all get stacked up and running, or keep going? Keep going? Okay. See, it only has a loopback. Again, the qrouter namespace is just a copy everywhere, so it's the same thing on controller two. We'll check how the SNAT namespace looks. Since controller one is hosting the standby, its HA interface only has one IP address, which is just for the HA keepalive communication. The one running as active has one more IP, the VIP. And if you look at these IP addresses, 
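The active/standby difference just described looks roughly like this. This is a sketch of expected output, not a capture from the lab; the addresses are illustrative values consistent with the CIDRs mentioned above, and `<router-uuid>` is a placeholder:

```shell
# Standby controller: the HA interface carries only its Neutron port address
sudo ip netns exec snat-<router-uuid> ip addr show
#   ha-xxxxxxxx-xx: inet 169.254.192.2/18 ...

# Active controller: the same interface also holds the virtual IP
sudo ip netns exec snat-<router-uuid> ip addr show
#   ha-xxxxxxxx-xx: inet 169.254.192.1/18 ...
#                   inet 169.254.0.1/24 ...   <-- the VIP that fails over
```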
this one ends in 192.2, and here it's 192.1. If you do neutron port-list, you should see those IPs, but not the VIP: the IP configured on the interface when it was created is a real Neutron port, while the VIP is not. In fact, in this lab you can also try exploring creating a non-HA router; it should still work. Just make sure you set ha to false when you create it, and then look at the namespaces: you will see the difference between HA and non-HA in DVR. And in case, we can create one. Now when I create this one, oh no, I set ha to true; I wanted to create a DVR router, but non-HA. So here we can see the difference. Oh, I created it with the same name, which makes it a little awkward to track down, but that's fine for now. The first one I created with both flags true, and the second with distributed true and ha false. Remember, when I created the HA one, the namespaces got created immediately. But if you look now, there is no extra namespace; it is still the same. So that's the difference to remember: you will suddenly see namespaces when you create an HA router, which you are not used to when you create a plain DVR router. That's the one difference to keep in mind. What else? Okay, so this one is, oh, I should. So now, this is something I assume most of you are already familiar with, because this is just the simple part: you create tenant networks, you create an external network, and then you attach the appropriate interfaces to the router. These are just the syntax. The main point is, I'm not sure, can you see it in the back? A little difficult. Anyway, the same screenshots are available in the hands-on lab guide, in case you want to take a look. Yeah, I can do that. Okay, let's do that. So I don't have any networks; I'll just create them quickly. I'll create the first tenant network and then a second tenant network. 
So now we'll have three networks: the two tenant networks, N1 and N2, plus the HA network that got created. I'll create a subnet for each. So now we have basically two tenant networks; again, this is just standard Neutron stuff. Now I'll create the external network. So I created an external network, and then I'll create a subnet, which is a little challenging to type, but I'll try. For this command, I want everyone to use the same subnet, even in your own environment; it's already there in the document, so that you can at least explore the connectivity from your VM to the external world. The external world, in this case, is just the host laptop or computer. So please make sure everybody uses the same subnet, at least for the external network; use any networks you like for your tenant networks. Follow the instructions, and that will help you understand. I hope I'm not, oh, good. Okay, so now we have the two tenant networks and the external network, plus the HA network itself. Remember, we haven't added any of these networks to the router yet. So if you do neutron router-port-list, you know what, I should remove that router I created, because it has the same name. Now we'll add the two tenant networks to the router, or before that, I'll do neutron router-port-list for that particular router. As we see, there are only two ports, which we already know are bound to the SNAT namespace on each controller. Now I'll go ahead and add the internal tenant networks I created, one and then the other. At this point we should check the namespaces again. Now we see that, other than the loopback, we have two more interfaces, a qr- interface for each internal subnet, or network, that we just created. As I mentioned again, the qrouter namespace is always a copy everywhere, so if I go from controller one to controller two, it's just the same. 
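The network and router plumbing just walked through can be sketched with the following commands. The names are made up, the tenant CIDRs are illustrative, and the external subnet from the lab guide is left as a placeholder (use the one in your document):

```shell
# Tenant networks and subnets (names and CIDRs are illustrative)
neutron net-create N1
neutron subnet-create N1 10.1.0.0/24 --name N1-subnet
neutron net-create N2
neutron subnet-create N2 10.2.0.0/24 --name N2-subnet

# External network; take the subnet CIDR from the lab guide
neutron net-create ext-net --router:external True
neutron subnet-create ext-net <lab-guide-cidr> --disable-dhcp

# Attach the tenant subnets and the external gateway to the router
neutron router-interface-add demo-router N1-subnet
neutron router-interface-add demo-router N2-subnet
neutron router-gateway-set demo-router ext-net

# Ports now bound to the router (HA ports, qr- ports, qg- port)
neutron router-port-list demo-router
```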
And another point to note: it uses the same IP address in each namespace, and even the MAC addresses are the same; the interface names, everything is the same. It's just a copy of the router running everywhere, wherever it's needed, I would say. And then we'll check the SNAT namespace. Hmm, looks like we have a problem here. Hardik, just some information. A quick note on that router creation: we normally create routers only on demand on the nodes, because if you create routers simultaneously on all nodes, there's a lot of control plane traffic that flows from the Neutron server to the agents every time there is a router update, or every time there is a periodic sync from the agent. So we always do it on demand. When a VM comes up, or a port that we call a DVR serviceable port, we call the load balancer port, the compute port, and the DHCP port DVR serviceable ports, when one of those ports comes up on a particular node, then we go ahead and create the router on that node on demand. We don't create routers ahead of time. Just for your information. Okay, so I just made some modifications. Here, if you see, I already added the external gateway. So now our router has an interface connected to the external network, and it also has interfaces connected to both of those internal networks. Now let's go back and check how the namespaces look. So qr-, no changes. Yeah, no changes. In the SNAT namespace, the initial two we already discussed, and three extra ports got created. Remember, sg- ports are always associated with the internal tenant networks. Oh, okay, sorry. So the sg- interfaces inside the SNAT namespaces correlate to the tenant networks, and the qg- port is the one connected to the external world, the external network. 
So in this scenario, if you look, all of those interfaces are DOWN and they don't have IP addresses. Now if we check the other namespace on controller two, we can see all the sg- and qg- ports do have IP addresses — which tells us that controller two is hosting the active DVR SNAT router for now. That's one point. And again, remember when we executed `neutron l3-agent-list-hosting-router`, we saw the active and standby status, right? Whichever node hosts the active instance, only that namespace holds the IP addresses. Again, the names are all the same, the interface names are the same, it uses the same MAC addresses — but in this case the interfaces are just down, with no IP addresses, because it's the standby. Sorry? Yes, it's the same. It has the same MAC address. Yeah. That's true. And again, in the qrouter namespace it will be the same, as Swami mentioned — the same copy. What we do to get rid of this duplicate-MAC problem is make sure the underlying switches never see the same MAC address. So when east-west traffic goes out through the tunnel endpoint on br-tun, we actually modify the source MAC. We have a local MAC for every node — a unique MAC that we generate for every node in your network, and each node's is unique. So we swap in that MAC and send the packet out, so you're never exposing your gateway MAC to the local switches outside. Shall we continue? OK. So this slide just shows that when you create VMs in your network, whenever a VM gets created, a qrouter namespace is created on that particular compute node, right? So if you have a compute node running without any VMs — even though you have created routers and attached all the interfaces — you should not see a qrouter namespace running on that compute node, because there is nobody there to serve.
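The active/standby check described above can be sketched like this. The router name and UUID are hypothetical placeholders, and only the namespace-name derivation runs outside an OpenStack node:

```shell
# Which L3 agent hosts the active SNAT router? (router name is made up)
if command -v neutron >/dev/null 2>&1; then
    neutron l3-agent-list-hosting-router demo-router  # shows active/standby
fi

# On the active controller the snat namespace carries the sg-/qg- IPs;
# on the standby the same interfaces exist but are DOWN with no address.
ROUTER_ID="1e4b2f7c-9d35-4a68-b2c1-3f5a6d7e8f90"      # hypothetical UUID
SNAT_NS="snat-${ROUTER_ID}"
echo "${SNAT_NS}"
# ip netns exec "${SNAT_NS}" ip addr show | grep -E 'sg-|qg-'
```

Comparing that output on both controllers tells you immediately which one is active without consulting the API.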
There are no VMs on that particular compute node, so you should not see the qrouter namespace there; it only gets created when there is a need. This example just shows that when you create a VM, the qrouter namespace gets created on that compute node. And again, the compute node's router is a copy: if you look at the qrouter namespace interfaces, they have the same IPs and MACs there too. OK, so next I wanted to go over some of the traffic flows — how a packet actually flows in DVR — because DVR has different paths, and when you're troubleshooting it's good to know exactly where to look for information. The first use case is a fairly simple one: east-west traffic, VM to VM, with both VMs running on the same compute node. The two VMs are on different subnets, because we're talking about DVR here, so we're talking about routing. In this case, both VMs are on the same compute node — simple, right? The first screenshot shows me going into one VM and pinging the other. All the traffic should go through the qrouter namespace running on that compute node, and the traffic should never leave the compute node in this case. So the only place you have to troubleshoot is that one compute node: look at the namespaces, capture the packets, and watch them coming in. The same qrouter instance handles both the request and the response. Again, there are other complexities around the Linux bridge and so on, but we can take those up if we have more time afterwards. On this slide, I wanted to show how Neutron ports correlate with Open vSwitch ports — because when you create any Neutron port, it has a UUID, right?
So how do you correlate that UUID when you are troubleshooting OVS, OpenFlow, or anything on the Open vSwitch side? First, in the first screenshot, I did a `neutron port-list` and grepped for the VMs' IPs, and found those two UUIDs. On the left-hand side I'm running `ovs-vsctl show`, just to see which ports are on the integration bridge. The rule is this: take the first 11 characters of the UUID, including the dash, and that string is appended to the qvo prefix. That's how you correlate. If you look at the left-hand screenshot, it says qvo plus the initial characters of the ID. So there you know that this port with tag 4 is associated with my VM one. If you want to troubleshoot, the qvo and related ports are also visible on your host, so if you just do `ip a` and grep for those port names, you'll see them. Maybe I can do it real quick after this. And you can even capture packets on them. So if traffic isn't reaching the router, you can look at those ports and figure out whether the packets are even getting there, or whether the Linux bridge is applying some security rule, anything like that. Then in the second screenshot, I ran `ovs-ofctl dump-ports-desc` with the bridge name. There we see exactly the same port, but with one more number in front of it: port number 9. Those numbers are the Open vSwitch port numbers. So if you look into the flows, you will see them doing operations on those particular port numbers. In this case, if you see port 9 in some of the OVS flows — I'll show you later on — that's the port connected to the VM itself. So let me boot a VM, I'll show you real quick.
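The naming rule described above is easy to verify with plain shell: the OVS-side device name is the literal prefix qvo plus the leading 11 characters of the Neutron port UUID (Linux caps interface names at roughly 15 characters, hence the truncation). The port UUID below is made up:

```shell
# Derive the qvo device name from a (made-up) neutron port UUID.
PORT_ID="471ab063-a2f1-4c5b-9d0e-112233445566"
QVO_PORT="qvo$(printf '%s' "${PORT_ID}" | cut -c1-11)"
echo "${QVO_PORT}"   # -> qvo471ab063-a2
# On a real compute node you could then run, for example:
#   ip addr show "${QVO_PORT}"
#   tcpdump -ni "${QVO_PORT}"
```

The same 11-character rule applies to the related qvb and tap devices, just with different prefixes.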
OK, so one thing we can see as the VM boots up: if we go to the compute node, we now have that qrouter namespace I was talking about. Then we can run a couple of commands. On the integration bridge — and I suspect this is the first port — the qvo port tagged 1 is basically our VM's port on this particular compute node. And maybe we can quickly look at the ID starting with 471. So yeah, here: it takes the first few characters of that number, appends them to the qvo prefix, and creates the port in OVS. So now if you look for that qvo name, you should find the port. And now if you run tcpdump on it — I don't know whether we'll see traffic, but I'm just saying you can. Yes, we'll see some traffic if we're sending some here. OK, so far so good. Shall we move on? Yeah, OK. So how are we doing with the DevStack? All good? Yes, no? Yes, good. I don't know whether we'll get time for you to do all the exercises yourselves — oh, we have 25 minutes left, and I still have a lot of slides to cover on the flow rules. So I think we should at least finish the slides, and then you can exercise by yourself if time permits; feel free to ask questions. So far so good, right? You've seen the ports and you can capture on them. This was before any routing. Really quick, I can boot one more VM in case you want to do some captures. So now we have two VMs, and similarly you'll see one more qvo port here, which is on a different network, so it has its own internal VLAN tag. I think we'll go on to the next slides. But I would say that for VMs running on the same compute node, there are just a few places to look: at most, if you're looking at the router, you go to that compute node and capture in the qrouter namespace, or you look at those OVS ports and capture there to see the VM traffic.
Because that traffic never leaves your compute node — at least for east-west in DVR. OK, this one I'll go over quickly. These are basically the OVS flows — not specifically all the DVR flows, but some of the general flows showing how traffic moves from one bridge to the other and how they communicate, with some screenshots. You can execute these commands once you have your DevStack running, and you'll be able to see all those flows. So, on the integration bridge, take the simple use case from earlier, where both VMs are on the same compute node. The first rule says: if traffic is coming from br-tun — meaning the VMs are on different compute nodes and the traffic arrived through the tunnel — then we do something; but let's set that aside for now. The other simple rules — the second, third, and fourth — look for the ports I mentioned earlier and, whether it's ARP traffic or anything else, resubmit it to a different table. That table then checks whether the traffic is coming from that particular VM port, what kind of traffic it is, whether it's ARP, and validates that it's coming from the VM's own MAC address, and only then does the NORMAL processing. I've put all of this on the slides and in the document in case you want to follow through. As for the local MAC handling, it's a little involved — I can explain it at the next stage — but remember I mentioned we have a local MAC assigned to every compute node. When a packet arrives carrying a local MAC, it first hits br-tun; the flow that receives it strips off the local MAC and then hands the packet to br-int, where the VMs are already attached, and the packet gets delivered to the VM.
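The flow tables being walked through here can be dumped on your own nodes. A minimal sketch — br-int and br-tun are the DevStack default bridge names, and the commands are guarded so the script is a no-op on a machine without Open vSwitch:

```shell
# Dump the OpenFlow tables discussed above for both default bridges.
BRIDGES="br-int br-tun"
for br in ${BRIDGES}; do
    if command -v ovs-ofctl >/dev/null 2>&1; then
        echo "== flows on ${br}"
        ovs-ofctl dump-flows "${br}"
    fi
done
echo "${BRIDGES}"
```

Reading the dump alongside the slides — table numbers, resubmit actions, and the NORMAL action — makes the per-table walkthrough much easier to follow.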
So the one design point we have with DVR is that we don't route the packet on every node, OK? Routing happens only on the node where the packet originates. If you are sending traffic from compute node one to compute node two, compute node one's router is the one routing the traffic; compute node two just receives the packet. So if you want to do some kind of stateful firewalling on the routed packets, you cannot get statefulness, because we are not routing on both sides. It's a kind of asymmetric routing: we route on one node on the way out, and on the other node when the packet comes back. When the receiving node gets the packet, there's no more routing there; it just drops into br-int and the packet flows through. To address this statefulness issue — initially, Firewall as a Service used to be applied to the ports inside the router namespace. Because of the statefulness problem, what they are doing now is redesigning Firewall as a Service to apply the firewall rules on the VM ports themselves. So before the packet even reaches br-int from the VM port, the firewall rules have all been applied at the VM port — whether you use the OVS-based firewall or the iptables-based firewall. Yeah, but in the case of — yes. No, no, the restriction is that the firewall used to be associated with the router; they don't want it associated with the router anymore, and they want to move to a port-based firewall. Once it's port-based, what we recommend is: if you are using DVR, just don't apply firewall rules on the DVR router ports, because you will not get a stateful firewall in that case. Yes, yeah.
Yeah, so if you see the first rule: if traffic is coming from a different node and matches one of the local MACs — the local MAC is unique per node — it goes to table 1. And in table 1, it strips the VLAN ID and, like I said, rewrites the source MAC: it removes the local MAC, puts back the router's MAC (the VM's gateway MAC), and sends the packet to the VM port. So the local MAC is basically only in play when traffic is leaving a node via br-tun toward another node: the source MAC is rewritten to the local MAC on the way out, and on arrival at the other node's br-tun the local MAC is removed and replaced with the router's MAC. So these are the flows. I would recommend taking a look; if you follow along based on these flows, it will be easy to understand. There is also a YouTube recording of a presentation we did in Paris about DVR, which clearly explains the OVS flow tables — which flows we added, which tables already existed, and how a packet traverses them. If you have some time, make sure you go through that video. Yeah, so as Swami mentioned, I'm just summarizing the flows that you should see in your environment. These are not all DVR-specific, but I'm trying to give you an overview and summary of the flows inside OVS. This is a good utility for seeing how OVS is learning MACs. If you look at ports 7, 8 and 6, 9 there, two of them are VM ports and two are router ports, which carry a VLAN. So it's good for showing the MAC-to-port learning done by OVS itself: whenever an OVS flow goes to NORMAL processing, it consults this FDB table to find the MAC-to-port binding. So this is a good command to run in your environment. I have one thing to add.
Because we are using the local MAC, and we are not exposing the gateway MAC outside of the node, we also prevent ARPs from going out of the node. Yes. We don't want ARP caches out there to be polluted with the same gateway MAC coming from different nodes. So we actually block any ARPs for the gateway from leaving the node. The way DVR handles this is: we suppress the ARP, but the control plane takes care of populating the ARP database in the router namespace. Whenever a DVR serviceable port comes up, we know which node it lands on, and we know the MAC of the port. So the control plane goes and writes the ARP entry into each of the router namespaces. As soon as the port lands on the node, the ARP table is populated by the control plane. We never send ARPs outside — we suppress them, just for your information. So if you have issues pinging a port and you're not getting a response, one thing you can go and check is your router namespace: see whether the ARP table is populated for that particular MAC. What? As soon as the VM dies, the ARP entry is cleaned up. We have a binding between the L2 agent and the L3 agent: whenever the L2 agent removes or deletes a port, a notification goes back to the neutron server, and the neutron server goes and cleans up those ARP entries. So they are not left there forever. Yeah, I think that's how it works right now. We haven't seen any scale issues with respect to the ARP entries; so far we have tested up to 8,000 VMs on the control plane without any issues. OK, so to continue, this is also one of the good commands for understanding the flows, so I recommend checking it out.
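A quick way to check the point just made — that the control plane, not ARP, populates the router's neighbor table — is to look for the PERMANENT entries in each qrouter namespace. A sketch; it simply iterates over whatever qrouter namespaces exist on the node:

```shell
# List the neighbor entries neutron's control plane pre-populates in
# each qrouter namespace; entries written by the L3 agent show up as
# PERMANENT rather than dynamically learned (REACHABLE/STALE).
FOUND_NS=0
for ns in $(ip netns 2>/dev/null | awk '/^qrouter-/{print $1}'); do
    FOUND_NS=$((FOUND_NS + 1))
    echo "== ${ns}"
    ip netns exec "${ns}" ip neigh show nud permanent
done
echo "qrouter namespaces inspected: ${FOUND_NS}"
```

If a VM is unreachable and its MAC is missing from this output, that points at the control-plane ARP population rather than the data path.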
So basically, you give it the bridge name and the port — the OVS port number — and then you can specify a lot of different fields; that's why I pointed you at the man page. You can specify the source MAC, destination MAC, VLAN, IP, protocol, all of those things. The idea behind it — let's take a quick example. You have two VMs, on the same subnet or on different subnets. You run the command with their actual MAC addresses, and it tells you exactly which flows the packet traverses through the whole chain: whether the packet is being dropped by some OVS rule, whether it goes to NORMAL processing, or whether it goes out through a different bridge. It gives you the whole chain. And I have put some examples here. In the first example, I set the source to the VM's MAC and the destination to the router's interface MAC. You can see the output saying the packet enters on the in_port I provided — the VM port — goes to table 25, where it matches the VM as the source, and then does the NORMAL processing. It says "forwarding to learned ports" — learned meaning learned by OVS on the integration bridge itself. If OVS doesn't have those entries yet and you execute the same command, it will hit different flows and tell you it flooded the packet out in order to learn the MAC within the bridge. In the second example I'm going from the router interface to the other VM, and that one likewise goes to NORMAL processing via the learning table.
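The command being described here is `ovs-appctl ofproto/trace`. A sketch: the in_port number and the MAC addresses below are made-up placeholders you would replace with the values found via `ovs-ofctl dump-ports-desc` and `neutron port-list`:

```shell
# Trace the path a frame would take through br-int without sending any
# real traffic; OVS prints every flow rule the frame matches, in order.
FLOW='in_port=9,dl_src=fa:16:3e:11:22:33,dl_dst=fa:16:3e:44:55:66'
echo "${FLOW}"
if command -v ovs-appctl >/dev/null 2>&1; then
    ovs-appctl ofproto/trace br-int "${FLOW}"
fi
```

Because the trace is computed against the installed tables rather than live traffic, it is safe to run repeatedly while narrowing down which rule drops or redirects a packet.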
So I would highly recommend checking out this command if you are seriously troubleshooting OVS flow rules, because in a larger environment you may have a lot of flows and it can be difficult to figure out which ones are being hit. With this command, and the IPs and MACs of the VMs you are troubleshooting, you can pinpoint the exact flows being traversed. I think we only have nine minutes, so I'll go a little quicker now. The scenario I wanted to show here is again east-west, but with the VMs running on two different compute nodes. In this screenshot — as Swami explained earlier — routing only happens on the source node. So if you look at the source compute node's qrouter namespace, you should only see the requests hitting it. If you're troubleshooting and wondering why you only see requests in the qrouter namespace, this is good to remember: routing happens on the source node. If you capture packets in the source node's qrouter, you will only see the requests; go to the other compute node, capture there, and you will see the replies. Because the request originated on compute node one, you see all the requests in the qrouter running on compute one; the reply comes back from compute node two, so you see all the replies in the qrouter namespace on compute node two. Just keep in mind it's a little different from the previous case. Again, I've written down some useful information to look for here; we don't have much time, but I can go over it quickly. The basic idea is simple: from br-int, the packet goes to the qrouter namespace for routing, then on to br-tun, and in br-tun the source MAC is rewritten to the node's local MAC before the packet crosses the tunnel.
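One way to confirm the asymmetric pattern described above is to capture with echo-request and echo-reply filters on the two nodes. The tcpdump invocations are shown as comments, since they only make sense inside the qrouter namespaces on the compute nodes; the pcap filter strings themselves are standard:

```shell
# pcap filters that split a ping into its two halves:
REQ_FILTER='icmp[icmptype] = icmp-echo'        # requests only
REP_FILTER='icmp[icmptype] = icmp-echoreply'   # replies only
echo "${REQ_FILTER}"
echo "${REP_FILTER}"
# On the source compute node you should see only requests:
#   ip netns exec qrouter-<router-id> tcpdump -ni any "${REQ_FILTER}"
# On the destination compute node, only replies:
#   ip netns exec qrouter-<router-id> tcpdump -ni any "${REP_FILTER}"
```

Seeing requests only on one node and replies only on the other is the expected DVR behavior, not a fault.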
On the other side of the tunnel, once the packet is received — I should speak up here — based on the local MAC, it is sent to the right integration bridge, and on the integration bridge the local MAC is removed, the router's interface MAC address is put back, and the packet is sent to the VM port. So that's how it goes: from br-int to br-tun, across to the other br-tun, and then through br-int to the VM port. This also covers whether the packet is unicast or broadcast/multicast: it goes to a different table, which strips the VLAN ID — because those VLAN tags you see on the OVS integration bridge are local to each node, so you may have a different VLAN ID on each side. That's why those flows strip out the local VLAN and send the packet out the appropriate tunnel based on L2 population. L2 population is required for DVR to work, so that OVS doesn't broadcast on every tunnel. So these are the flows; again, everything is in the guide, and I hope you find it helpful. And this is what I just explained: if the packet comes in from the VXLAN side, you match the tunnel ID, set the local VLAN, and send it to br-int; on the integration bridge you remove the local MAC and deliver to the VM port. This is the same thing — remove the local MAC and deliver to the VM port. I think we have only five minutes — a quick time check, yes, only five minutes. So I have a few slides left; I'll just quickly go over them. Again, everything is in the lab instructions; I hope you find them helpful. Oh, there's a question. OK, so this is about how a VM reaches the external world. This is also important in DVR. If you look at the screenshot, the request — routing will again happen on the source node — so now we are going from the VM to the external world.
So you will see the request in the qrouter namespace on the compute node. Now, there are IP rules set up in the qrouter namespace to steer the traffic, because the qrouter namespace doesn't have an interface on the external network — somehow it has to hand the traffic from the qrouter namespace to the snat namespace. So there are ip rules; if you follow the commands shown, you'll find those rules and the routing table they point at. That's how the qrouter namespace forwards the traffic: from the VM it goes to the qrouter namespace, and based on that rule, it goes to the snat namespace. And the snat namespace has the interface to the external world, so it can route the traffic out. If you look at the second screenshot, captured in the snat namespace, you can see the packets going out and coming back — but the replies go back directly to the VM itself. They don't go back through the router, because the snat namespace also has an interface on that particular internal network. Next I just wanted to show the HA behavior: it's basically active/passive, and if you fail over, you'll see the traffic start moving through the other snat namespace. I'll take just one minute — sorry, we don't have much time left — but this is for north-south traffic. North-south is when you assign floating IPs to your VMs. When you create a floating IP and associate it with a particular VM, a FIP namespace is created on the compute node. There is one per compute node, and I think one per external network too. So you'll see an fpr- port there. What happens is — and the next slide shows it — the qrouter namespace running on the same compute node has an rfp- port. The fpr- and rfp- ports are the two ends of a veth pair linking the FIP namespace and the qrouter namespace on that particular compute node.
And that's the link they use to send traffic in both directions: from the internal side out through the FIP namespace, and from outside back in to the VM. These ports are good to check when you are troubleshooting the FIP namespace or floating IP issues: make sure the ports and the link are there, because otherwise you have a FIP namespace connected to the external world and a qrouter connected to the internal world, and you need that link to join the two. I think that's it. And these are the documents — I think you already have them on the USB. If you haven't read them, please copy them and run through them yourself when you get the time. Feel free to send me an email; I think you all have my email, right? If not, I sent one out, so hopefully you have a contact for me. OK, thank you. Yeah, any questions? Maybe we can still take a few minutes, that's fine. The question is: when will the namespaces be created — will a new SNAT namespace be created when you create a new DVR router? Yes. When you create a DVR HA router — when you just execute the neutron router create — both namespaces will be created, on however many controllers, based on your configuration. Both FIP and SNAT? SNAT and qrouter. OK, so when will the FIP namespace be created? The FIP namespace is only created when you create a floating IP and then associate it with a VM; at that moment the FIP namespace is created with all the appropriate ports. And the FIP namespace is only deleted once you disassociate all the floating IPs and — from Mitaka, or even a little earlier — also clear the gateway on your router. Then, and only then, does the FIP namespace go away; otherwise it just removes and adds ports. The next question is: does Mitaka support multiple floating IP pools, or can we have only one floating IP block — one public network? External networks.
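Both of the plumbing details above — the ip rules that steer qrouter traffic toward the snat namespace, and the rfp-/fpr- veth pair toward the FIP namespace — can be inspected with a few commands. A sketch: the loop runs over whatever qrouter namespaces exist on the node, and the external-network UUID used for the FIP namespace name is a made-up placeholder:

```shell
# Inspect the source-routing rules that send default traffic from the
# qrouter namespace toward the snat namespace, and look for the rfp-
# veth end that links the qrouter to the FIP namespace. DVR derives
# the per-subnet route-table number from the subnet's gateway IP, so
# the table IDs in the output will be large integers.
for ns in $(ip netns 2>/dev/null | awk '/^qrouter-/{print $1}'); do
    echo "== ${ns}"
    ip netns exec "${ns}" ip rule list
    ip netns exec "${ns}" ip addr show | grep -F 'rfp-'
done

# The FIP namespace itself is named after the external network UUID
# (hypothetical UUID below); its fpr- device is the other veth end.
EXT_NET_ID="9a8b7c6d-5e4f-4a3b-9c1d-0e9f8a7b6c5d"
FIP_NS="fip-${EXT_NET_ID}"
echo "${FIP_NS}"
# ip netns exec "${FIP_NS}" ip addr show | grep -F 'fpr-'
```

If floating IP traffic dies on one compute node only, a missing rfp-/fpr- pair or a missing ip rule in that node's qrouter namespace is the first thing to rule out.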
OK, so we can create — floating IPs are associated with external networks. So if you have multiple external networks, you will see multiple FIP namespaces: two external networks means two FIP namespaces on a compute node. Yes — for every external network, you will see one FIP namespace on that compute node, because the FIP namespace is shared between the tenants on a compute node. So for every external network, one FIP namespace is created and shared between the tenants. So I think that's it — thanks, folks, for attending this session. And if you want to hear more about the SNAT HA, we have a follow-on session at 11 o'clock in ballroom A at the convention center. Do check it out.