Hello, everyone. My name is Numan Siddique. I work for Red Hat. Today we are going to talk about the IPv6 impact on Neutron L3 high availability.

Let's look at the agenda. Initially we'll give a brief overview of what an HA router is and how it is supported in the IPv4 world, and later on we'll go into more detail on IPv6 and how it is supported in Neutron L3.

So let me briefly explain what an HA router is. Normally an HA router is used in a case where you have three or four network nodes with L3 agents running on them, and you want your router to be scheduled on multiple L3 agents with one of them active. If one network node goes down, another will come up and take over the L3 functionality. So the HA router is implemented in active-passive mode. It can be enabled using a configuration option in neutron.conf, or an administrator can create an HA router using a particular flag when creating a Neutron router. As I said, it is spawned on multiple L3 agents, and you can define configuration options like minimum and maximum L3 agents, which means that if you have a minimum of two and a maximum of four, an HA router will be scheduled on at least two and at most four network nodes. HA is implemented using keepalived, a popular daemon used in HA setups, and keepalived uses the VRRP protocol.

So let's look in a bit more detail at what happens when you create an HA router. Basically, as I said earlier, it gets scheduled on multiple L3 agents, and an HA network is created for the tenant. An HA network is actually created for each tenant, and it is used purely for HA purposes. After the router is scheduled, a router namespace is created on each of those L3 agents. That router namespace will have various qr ports in it, and it will also have an HA port. Once the router namespace is created, a keepalived configuration file is generated on each of the L3 agents and the keepalived process is started. Then, finally, a master router is elected and becomes active.

This diagram shows the case where an HA router is scheduled on two L3 agents, L3 agent one and L3 agent two. At this point L3 agent one has become the master, and all the qr interfaces there have the proper IP addresses so that the VM traffic can be routed through it.

This slide shows what a keepalived configuration file looks like; I'll show a small sketch of one in a moment. The first section has the VRRP parameters: the virtual router ID, which is unique for each HA router, along with other parameters like the priority and the advertisement interval. The second section has the virtual IP address: whichever HA router instance becomes master will have this IP address on its HA port, and the instances in backup mode will not. Sections three and four contain the IP addresses and the virtual routes that are configured by the master router. When keepalived becomes master, it configures these IP addresses in the router namespace on the qr ports, and the routes as well.

This section shows how the HA traffic is exchanged between the various L3 agents. As I said earlier, an HA network is created for each tenant, so this traffic stays on that particular subnet.
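As promised, here is a minimal sketch of what such a generated keepalived configuration might look like; the interface names, addresses, and values are illustrative rather than exactly what Neutron generates:

    vrrp_instance VR_1 {
        state BACKUP
        interface ha-7a3c2e10            # section 1: VRRP parameters
        virtual_router_id 1              # unique per HA router
        priority 50
        nopreempt
        advert_int 2
        virtual_ipaddress {              # section 2: primary virtual IP, held only by the master
            169.254.0.1/24 dev ha-7a3c2e10
        }
        virtual_ipaddress_excluded {     # section 3: router addresses, configured only on the master
            10.0.0.1/24 dev qr-4f21b8aa
            172.24.4.3/24 dev qg-9b10c7de
        }
        virtual_routes {                 # section 4: routes, configured only on the master
            0.0.0.0/0 via 172.24.4.1 dev qg-9b10c7de
        }
    }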
And the reason an HA network is created is to isolate the keepalived traffic, the VRRP traffic, so that it doesn't use a tenant network. The master router will have this IP address. So next, my colleague will take over.

Now let us look into the VRRP protocol and its support for IPv6. The current version of keepalived that we are using supports VRRP v2, which can only carry IPv4 addresses as part of the VRRP payload. What this really means is that today we create a separate HA network and an HA subnet, which is basically an IPv4 subnet derived out of the link-local address space.

So let's have a closer look at VRRP v2 and VRRP v3. What is VRRP v3? It is a new version of the VRRP protocol that was added primarily to address IPv6 use cases. It supports including IPv6 addresses as part of the VRRP payload, and it also gives some guidelines on how to use IPv6 in your HA use cases. If you take a close look at the VRRP headers of VRRP v2 and VRRP v3, there are three main differences you can identify. I have used different color coding to make the differences easy to spot. Now let's look at each difference and see whether we really need a new version of VRRP to move to IPv6 functionality.

I would like to first talk about the IPv4 and IPv6 addresses, represented in blue. One of the main things added in VRRP v3 is the provision to include an IPv6 address as your primary virtual IP, which is section 2 if you take a close look at the keepalived configuration in the previous slides. So by going to VRRP v3, we could create the HA subnet out of IPv6 address space. Whether that is actually required, we will discuss in the next few slides.

The second difference is the advertisement interval. What is the advertisement interval? The master HA router periodically sends out VRRP packets onto the network, and the backup routers monitor for these packets. When the backup routers realize that the packets are no longer being received, they conclude that the master is no longer active, and after three consecutive intervals without packets, one of the participating backup routers restarts the election and takes over the role of master. So the advertisement interval plays a role there. With VRRP version 2, the current implementation, we only have a provision to specify the advertisement interval in seconds, the default being one second. With VRRP v3 we would be able to configure the advertisement interval in centiseconds, which means you can configure a value like 100 milliseconds.

So now let us see whether this is really important, although it is not directly related to the IPv6 use cases. Even in the current situation we are not using the one second supported by VRRP v2; we use a value of two seconds in the keepalived configuration. That means the master HA router sends out the VRRP multicast packets at an interval of two seconds. So when we move to VRRP v3, the sub-second granularity is not going to immediately benefit us, unless we find some use case where we want the HA routers to converge as quickly as possible.
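To put rough numbers on that: per the VRRP specifications, a backup declares the master dead after Master_Down_Interval = 3 × Advertisement_Interval + Skew_Time, where Skew_Time = (256 − Priority) / 256 seconds in VRRP v2. With our advert_int of 2 seconds and, say, a priority of 50, that is about 3 × 2 + 0.8 ≈ 6.8 seconds before a backup starts taking over; with VRRP v3 and a 100 millisecond interval, the same arithmetic gives roughly 0.38 seconds.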
But at the same time there is an associated drawback, because the moment you decrease the advertisement interval you will see a lot more packets being sent out on the HA network, which can have a direct impact on your CPU usage.

The third difference is authentication. In VRRP v2 we have a provision to specify authentication for the VRRP packets. In VRRP version 3 they have removed this authentication from the payload, and that field is marked as reserved. The reason they did this is that operational experience showed that the protocol does not in any way help protect the master HA router from a malicious node. Let me explain this a little. When you use authentication and a malicious node is added to your network, the malicious node will not be able to force a legitimate master HA router to fall back to backup. But at the same time, the protocol as such does not stop the malicious node from taking over the role of master. So you end up in a situation where one legitimate node is acting as master and, simultaneously, a malicious node is acting as master in parallel. And when you have such a situation, the problems you normally see are mostly related to IP conflicts and ARP collisions, and those are beyond the VRRP protocol. For that same reason, they removed the authentication from VRRP version 3.

This is a slightly modified version of the keepalived configuration, where I have used some IPv6 addresses so that I can explain it better. Section one and section two are mostly related to the VRRP protocol, where you can configure the virtual router ID, the priority, the advertisement interval, and so on. Section two is the most important one: it carries the primary IP, which is transmitted as part of the VRRP payload. What about the other sections, three and four? Section three is the virtual_ipaddress_excluded section, and section four holds the virtual routes. As Numan mentioned earlier, these sections are keepalived specific; they are not related to VRRP. But keepalived allows us to specify addresses that belong to your internal and gateway interface ports, that is, the qr and qg ports. So you can specify an IP address like 2001:... that is part of your gateway interface. What keepalived does today is that, even though it uses VRRP version 2 for its HA functionality, it allows us to put IPv6 addresses as well as IPv6 routes in these two sections. And once keepalived transitions to master on any of the L3 agents, it not only configures the primary IP on the HA port, it also configures these IP addresses and routes on the respective ports.

Along with that, another important thing we require for HA functionality is that, when a backup router transitions to master, we want the neighbouring switches to update their port information, so that packets destined for a particular MAC address are sent to the appropriate port. In the IPv4 world this is done using gratuitous ARPs, which I am representing as GARP. The equivalent in the IPv6 world is sending out unsolicited neighbour advertisements. Thankfully, keepalived not only helps us in the IPv4 case, it also helps for IPv6 by sending out these unsolicited neighbour advertisements on the respective ports.
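For illustration, here is a rough sketch of such an IPv6-flavored configuration; the addresses and interface names are made up, but the section layout follows what we just discussed:

    vrrp_instance VR_1 {
        interface ha-7a3c2e10
        virtual_router_id 1              # sections 1 and 2 are unchanged: the primary
        priority 50                      # VIP is still an IPv4 address with VRRP v2
        advert_int 2
        virtual_ipaddress {
            169.254.0.1/24 dev ha-7a3c2e10
        }
        virtual_ipaddress_excluded {     # section 3: IPv6 addresses for the qr/qg ports,
            2001:db8:1234::1/64 dev qg-9b10c7de
        }                                # configured by keepalived only on the master
        virtual_routes {                 # section 4: IPv6 routes, also master-only
            ::/0 via fe80::1 dev qg-9b10c7de
        }
    }

On transition to master, keepalived configures these excluded addresses and routes on the respective ports and sends the unsolicited neighbour advertisements for them; on falling back to backup, it removes them again.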
So the summary is that, even though we are using version 2 of VRRP, people who are familiar with VRRP may think there is a new version, VRRP v3, and that we have to wait for VRRP v3 to get IPv6 functionality. Thankfully, because of the way we are using keepalived, by creating a separate HA network and a separate HA subnet, we need not really wait for VRRP v3. It is all possible because keepalived has the necessary support.

In the next few slides I will be talking about a couple of important changes, or additions, that got merged during the Kilo time frame and that are also related to the HA router.

The first thing I want to talk about is something called the process monitor. What was the problem statement? In Juno, the L3 agent spawns a couple of external processes; for this case, let us talk about the radvd and keepalived processes. If these external processes were terminated or died for some reason, the Neutron L3 agent was not aware that they had terminated. And it was not only the Neutron process: administrators were often also unaware of the incident, would only realize it at a later point, and there was no easy way to restart these processes. So it was something of a known issue, and thanks to Miguel, who addressed this by introducing something called the process monitor. What this does is that the Neutron agents register with the process monitor the external processes that are going to be started in a particular namespace. The process monitor then constantly monitors these processes, and if it sees that any of them has terminated for some reason, it restarts the process and also logs a message saying that such-and-such process, which was supposed to be running, was terminated and has been restarted.

The second major addition to the HA functionality is called the keepalived state change monitor. But before talking about the keepalived state change monitor, I will give a brief background on the problem statement and how we addressed it. First of all, we said that the L3 agent spawns an external process called keepalived. keepalived internally uses the VRRP protocol, and it is the keepalived process that knows which router instance is the master. But we also want this information to be propagated back to the Neutron L3 agent, so that it can spawn other processes while the router is in the master state. In the Juno time frame we used to depend on something called notification scripts. If you look closely at section number 5, we have notify_master, notify_backup, and notify_fault; I'll show what that section looks like in a moment. What this means is that keepalived executes the script given in notify_master, notify_backup, or notify_fault whenever it transitions to that particular state. Whatever you want to do while in the master state, you add those instructions to the corresponding bash script. The problem with these bash scripts is that they can sometimes be executed out of order. When keepalived is transitioning from backup to master, because we use a similar configuration on all the agents, it sometimes happens that a backup router transitions to master and immediately falls back to backup because some other node takes over the role of master.
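For reference, the notify hooks of the Juno-era configuration (section 5) looked roughly like this; the script paths here are illustrative:

    notify_master "/var/lib/neutron/ha_confs/<router-id>/notify_master.sh"
    notify_backup "/var/lib/neutron/ha_confs/<router-id>/notify_backup.sh"
    notify_fault  "/var/lib/neutron/ha_confs/<router-id>/notify_fault.sh"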
So during this interval it is quite possible that these scripts get executed out of order, leaving you with an inconsistent state and race conditions.

What we have done in Kilo is that we are now using the keepalived state change monitor process. It is simply a Python process, and the principle on which it works is this: we know that we have a primary IP, that this primary IP is owned only by the master HA router, and that this IP address is configured on the HA port. So the keepalived state change monitor process runs ip monitor on the HA port to watch for IP address additions and deletions. If the IP address is added to the HA port, it treats that as an indication that the router is in the master state, and if the address is deleted, it treats the router as now being in the backup state. With this principle, we use the keepalived state change monitor to get notifications.

Once the keepalived state change monitor knows that an address has been added, as in this particular slide, we are in the master state. This information is written to a UNIX domain socket, which is later read by the Neutron L3 agent. As you can see, in the master state we currently spawn two processes, the radvd process and the metadata proxy; I will come back to the other use cases around radvd in the next slides. In the master state the Neutron L3 agent not only spawns these processes, it also informs the Neutron server that this particular L3 agent is now hosting the master HA router. This is very important for us, because when you have multiple L3 agents, each hosting an instance of an HA router, we otherwise would not know which of these agents has the master router, which we might want to know, for example, for debugging purposes.

This is a slightly modified version of the same diagram, showing the backup transition. It is the same scenario: the backup state is written to the UNIX domain socket, which is read by the L3 agent. The previously spawned processes are then terminated, and the information is passed on to the Neutron server, which is done over RPC.

Another important addition in Kilo is the ability to know which of the network nodes is hosting the active router. If you now execute neutron l3-agent-list-hosting-router, there is a new column called HA state, which displays the current state of the router on that particular L3 agent. In this diagram you can see that network node 2 has the active router, and thanks to the folks who have been working on this.

Now I would like to talk about a couple of IPv6-related use cases. In the IPv4 world, for all the internal and external ports, the qr and qg ports, the IP addresses are added by Neutron. But with IPv6 enabled on the platform, when you create a port, a link-local address is automatically configured by the operating system. The problem statement is that, since we use the same port information on all the HA router instances, the MAC addresses are exactly the same for the corresponding ports on all of them. Take the example of the qr ports, the internal ports, or the qg ports: the MAC addresses associated with these ports are exactly the same, so when the Linux kernel configures an IPv6 link-local address on these ports, the same IP address gets configured on all of them.
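To make that concrete: the kernel derives the link-local address from the MAC using EUI-64, so if the shared MAC is, say, fa:16:3e:12:34:56, every node hosting that router computes the same fe80::f816:3eff:fe12:3456 (flip the universal/local bit of the first octet and insert ff:fe in the middle).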
Now this creates a problem, because you have a situation where the same IP address is configured on multiple hosts. If any of the backup routers transmits any packets, then when those packets are received by the layer 2 switches, the switches update their forwarding tables, saying that this MAC address, which was originally behind some port x, has now moved to port y. That is what normally happens with gratuitous ARPs in the legitimate case, when a router transitions from backup to master. But here, even though the router is in the backup state, the forwarding database gets updated, and this disrupts the traffic generated by your tenant VMs.

The solution we have taken is to treat the link-local address just like the addresses that are assigned by Neutron. We now add the link-local address to the virtual_ipaddress_excluded section of the keepalived configuration and let keepalived handle the addition as well as the deletion of this address. With this, we make sure that the link-local address is present only on the master HA router.

Another thing related to IPv6 is radvd. As you all know, in Neutron we use the radvd process to send out router advertisements for various use cases like SLAAC, DHCPv6 stateful, DHCPv6 stateless, and so on. In the Juno time frame, for an HA router, we used to spawn the radvd daemon irrespective of the state of the router; that means the moment the router was created, we spawned radvd on all of the HA router instances. One problem we have seen with the radvd implementation is that, now that we remove the link-local address from the qr interfaces, the radvd daemon notices this, goes into some kind of error state, and never recovers, even when the router becomes master and the link-local address is configured back on the port. The only way to recover in such situations is to restart radvd, which is normally difficult because it needs manual intervention. So starting from Kilo, we spawn radvd only on the master HA router and terminate it once the router moves to the backup state. This is all possible because we now have a reliable mechanism to identify the master HA router.

In the next few slides we will talk about some of the IPv6 HA use cases. There are a couple of IPv6 blueprints that got merged during the Kilo time frame. I will only focus on the use cases that are relevant for HA, and only briefly; if you need more information, you can have a look at the blueprints, which give additional detail. The first one I would like to talk about is the IPv6 router that provides external connectivity. What is this blueprint? This blueprint allows us to create an external network without an external subnet, and I am talking about this from an IPv6 point of view. In the normal case, when you have a tenant network, you also create an external network, and you create an external subnet that maps to your physical network. In the IPv4 world, we actually use the IP address configured on your external ports for source NAT as well as for any floating IP requirements. In the IPv6 world we do not use source NAT, and at the same time we do not support floating IPs.
So we do not really have any use case for having an IP on the qg interface, the gateway interface. And since we know that a link-local address is added by the operating system on the qg interface, that is good enough for the router to talk to the next hop. The problem is: how do you know what the next hop is? There are two use cases for that; I am calling them scenarios here.

Let us talk about scenario number one. In this scenario you create an external network without any subnet, but you as an administrator know the link-local address of your upstream gateway. So you configure this link-local address as ipv6_gateway in your L3 agent configuration. When you do this, the HA router configures it as the default route in the master HA router instance. This is all taken care of by reading this information from the configuration file and putting it into the keepalived virtual routes section, which is section number four in the keepalived configuration we saw earlier.

The second use case associated with the same blueprint is that you have an external network with no subnet, and at the same time you have not configured ipv6_gateway in the agent configuration. So how do you know the next hop? In this particular use case, we assume there is an upstream router that periodically sends out router advertisements. So we make sure the gateway interface is configured to receive these router advertisements, and the default route as well as the IPv6 address are configured from them. I'd like to thank the blueprint authors Robert Bolli and Abhishek Subramaniam, who originally implemented this feature.

One more thing: can we take the questions at the end? We are almost done. One final important addition we have seen is dual-stack support for gateway ports. In Juno we could only configure either an IPv4 subnet or an IPv6 subnet for your external network, but starting from Kilo we are able to configure IPv4 as well as IPv6, which is very important for dual-stack support. I'd like to thank Dane LeBlanc and Andrew Burke, who worked on this particular blueprint.

There is still some pending work in HA. Today we cannot upgrade a legacy router to an HA router. We cannot manually schedule the master HA router onto a particular agent. l2pop cannot coexist with HA routers, which means you have to disable l2pop when you want HA functionality. There is no HA support for DVR SNAT. Connection tracking support is not there yet, which is an important thing. And finally, external gateway monitoring using keepalived track scripts is not there. But I would like to say that there are a lot of patches already floating around and under review that would add support for the missing pieces, so hopefully during the Liberty cycle we should see these features getting merged.

The final takeaway from this session is that we need not wait for VRRP v3 to have IPv6 functionality, because of the specific way keepalived works and the way we are using it. That doesn't mean you can use any version of keepalived: as long as you are using keepalived version 1.2.10 or above, we are good. And from our experience, try to use the same version of keepalived on all your agents in order to have consistent behavior.
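Before we go to questions, here is a rough sketch of the two external-gateway scenarios in configuration form; ipv6_gateway is the option we discussed, while the names and addresses are illustrative:

    # l3_agent.ini -- scenario one: you know the upstream gateway's link-local address
    [DEFAULT]
    ipv6_gateway = fe80::f816:3eff:fe2a:1b0c

    # Scenario two: no ipv6_gateway configured; create the external network with
    # no IPv6 subnet and let the gateway port learn its default route from RAs
    neutron net-create ext-net --router:external=True
    neutron router-gateway-set router1 ext-net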
With that, I am actually done. I'm happy to take any questions.

Okay, so the reason I stood up when I did was because my question was on that particular slide. You were talking about learning the fact that a router exists from the reception of an RA. Right. Which is great; I'm glad you're doing that. In an IPv6 network it's not at all unusual to have multiple routers advertising different subnets. Right. And what we'd like to see in that case is that the router advertising a particular prefix is the router you want the packets to go to when the source address uses that prefix. Correct. You used the singular when you were talking about that slide. Yes. Are you implementing the plural?

Right. So what we do today is we configure the gateway interface to receive router advertisements, but we do not restrict that to a specific router. So if you have an environment with routers advertising multiple prefixes, you would normally see that the interface gets multiple prefixes, multiple IPv6 addresses. Okay. Now, when you have such a situation, you would also have multiple routes configured. So there's something called source-based... Well, configured or learned. Sorry, configured or learned by the operating system. So when an RA is received, as you probably know, most of the distros now support configuring the IPv6 address automatically. Okay, so it's learned. Yeah. Okay, thank you.

I have a general question. So we just use VRRP. What happens if the VRRP channel is broken? Is there any proposal to use some sort of externally written cookie or something like that?

Okay, that's a good question. Normally, when the VRRP channel is broken, that means the HA network is broken, and you end up in a situation where both routers become master, because of the way the VRRP protocol works: the backup router no longer sees the packets coming from the master, even though the master is active, because of the broken HA network, and both of them take over. But considering that we create the HA network in the same way as the tenant networks, if the HA network is broken, your tenant network itself is broken too. And when your tenant network is broken, you won't even have layer 2 connectivity for your traffic to be sent out to the external network. But coming back to your question: we do not currently have any mechanism to identify that situation. You'll have to figure out that there's a problem and take care of it.

Why would the tenant network and the VRRP network be the same? Shouldn't they be different? The tenant network is different. What I'm saying is that it uses the same segmentation technology: say, for example, you're using VLAN based, or tunnel based like GRE or VXLAN; it would use the same segmentation. It's just like any other network. You mean the tunnel is the same? Yes. Okay. It's a different network; you will not see the VRRP packets on your regular tenant networks. But it uses the same layer 2 technology that you are using.

Yes. Yes. Yes, you can run it. Okay. The biggest advantage, or the one advantage I was mentioning, is that when you move to VRRP v3 you have a provision to use IPv6 link-local addressing. I'll just show this so that it makes sense. I hope you're talking about this one. So here we are sending out the VRRP traffic on an IPv4 multicast address.
If you move to VRRP version 3, you have support for using the IPv6 link-local multicast address, which is ff02::12. So if you're interested in sending the VRRP traffic only on IPv6 multicast addresses, then you will have to wait for VRRP v3.

So on this slide, particularly, notice that you're using IPv4 as your signaling path for your IPv6 HA. Right. And depending on the physical network hardware, they are not necessarily carried by the same path or the same hardware. Layer 2 should be, but it's possible that they could be carried differently. Is there any reason to link IPv4 and IPv6 failover so they happen in lockstep? Or, if you're doing IPv6 failover, shouldn't the signaling happen over IPv6?

Okay, I'm not very sure about your question, but if you're saying IPv6 failover: here we are not using IPv6 addresses in the VRRP payload. So when you say failover, can you please clarify? You're using IPv4 multicast VRRP packets to signal an IPv6 failover? No, we are using IPv4 multicast packets, and I'll just go back. Okay. And the VRRP packets contain the IPv4 addresses themselves. Okay. But what keepalived allows us to do is configure... Bring up additional interfaces. Additional IPs on the interfaces, at the same time. So we are using the protocol for v4, and once we converge on the master HA router... Yeah, keepalived has support to configure IPv6 addresses, and these are beyond the VRRP protocol; that is something supported by keepalived. Yeah, but my point was that you're using v4 as your signaling mechanism to move the v6 address. Right. Between active routers. Right.

So let me pose the same question in different words. Okay. Deutsche Telekom TeraStream is turning off IPv4. KDDI and JPNE are turning off IPv4, or carrying v4 as an overlay. Reliance in India is turning off v4. There is no v4. I can go down a list of data center operators and networks, enterprise and service provider, that are turning off v4. Okay. Imagine a world in which there's no v4. Tell me how your thing works.

Okay. The way it works is that you will not see the VRRP v4 packets on your physical network, because, say, for example, you are using tunneling like GRE or VXLAN: they will be encapsulated and sent out on a separate HA network, so they will not really be sent out on the physical network. There is no v4. But at the same time... There is no v4. I guess you have to go to v3 then. Yeah. I'm serious. Yeah, it's in the process of being turned down. What I'm really trying to say here is that these packets are used on an HA network that is internal to OpenStack, internal to Neutron; they are not... There is no v4. Okay, if you switch this off as well and there's no v4 at all, then yes, we have to move to VRRP v3. Okay. But it's good to know that there are people who want to work on v6 alone. It will be fun.

Okay, that's a good question. The current version of keepalived does not support VRRP v3. There are some patches floating around that add support for the VRRP v3 headers, but they're not upstream yet. So we'll have to wait for that. Okay. Thank you, everyone.