Hi guys, today we'll be talking about how we leveraged automation to make routed networks work at scale at eBay. Following are the problems we were facing when Netforce was not in place. Consider the case of subnet onboarding: the CRE team always has to watch a graph of how many IPs are available in a VPC, and then they file a network ticket. The network folks give a subnet, and then it is added to Neutron, which was painful and lengthy. It took much of our time and was not productive. The same was the case with VRF bare metal onboarding for Ironic: you have to file a network ticket to place the bare metal into the respective VRF.

So how do we handle that? We came up with Netforce to address these use cases: to serve on-demand subnet provisioning for OpenStack bare metals and VMs as well as Kubernetes pods, and also VRF bare metal support for a VPC. Subnet recycling can be handled at some point, which we have not included yet, and we also support a CLI so that a network engineer can make changes.

Netforce is written in the same way as Neutron plugins. It uses Keystone for authentication, and we have leveraged some features from NAPALM, an open-source library started at Spotify, to configure changes on the devices -- thanks to the Spotify team for that. We currently have driver support for Cisco NX-OS, Juniper Junos, and Arista EOS. We don't use the vendor APIs; we directly use the CLI underneath. It's deployed in production, running on Kubernetes as a global service, with a CLI for all the operations.

This is the base view of a BGP-enabled VRF, which I want to highlight since we'll be targeting it for the bare metal use case. This is what a bubble looks like in a VRF: it has a set of distribution switches and ToRs, and underneath them the bare metals and the hypervisors.

This is what the Netforce data model looks like. There's a data center object, there's a VLAN, there's a VPC, there are devices, there are bubbles, and I've highlighted the definitions in the slides so you can walk through them.

Routed networks. Our routed networks are pretty much the same as the upstream feature; however, we have some tweaks because the upstream patches were not yet merged when we went live, so I'll just highlight the differences. One of them is physical networks, which is an object in our own Netforce and Neutron models; it's only an attribute upstream. There is also no ToR object modeling upstream. We don't allow access-to-access VLAN flipping because it's super risky, and the other data objects I have highlighted are pretty much needed when you play around with routed networks in Neutron.

This is what on-demand subnet onboarding looks like. The consumers are Kubernetes, the network engineers, and Neutron itself. Kubernetes these days is a heavy consumer of our IP space because pods need more IPs than VMs do. We have a cloud IPAM, which is a subset of the global IPAM. Kubernetes calls cloud IPAM to get a block, then uses Neutron to push the subnet, and Netforce actually pushes the subnet to the switch. If it's BGP it will be advertised directly, and it's fully functional and ready to use for the VM.

This is what the create subnet payload looks like -- you basically POST it, but for a routed network you have to specify which VLAN interface you want to push the subnet onto. After you push it, it will be available in Neutron.
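To make that flow concrete, here is a minimal sketch of what such a create-subnet request could look like against a routed network. This is an illustration only: the endpoint, token handling, and UUID values are placeholders, and the exact Netforce/Neutron payload used at eBay may differ.

```python
# Hypothetical sketch of the on-demand subnet onboarding call described above.
# Endpoint, token, and IDs are placeholders; the real payload may differ.
import requests

NEUTRON_URL = "https://neutron.example.com:9696/v2.0"   # placeholder endpoint
TOKEN = "<keystone-token>"                               # obtained via Keystone auth

payload = {
    "subnet": {
        "network_id": "3b9c1d2e-0000-0000-0000-000000000000",  # routed (multi-segment) network
        "segment_id": "9f8e7d6c-0000-0000-0000-000000000000",  # segment tied to the target ToR/VLAN
        "ip_version": 4,
        "cidr": "10.10.20.0/26",                                # block handed out by cloud IPAM
        "enable_dhcp": True,
    }
}

resp = requests.post(
    f"{NEUTRON_URL}/subnets",
    json=payload,
    headers={"X-Auth-Token": TOKEN},
)
resp.raise_for_status()
# Once created, the subnet shows up in `neutron subnet-list`, and Netforce
# pushes it to the switch as a secondary subnet on the VLAN interface.
print(resp.json()["subnet"]["id"])
```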
If you do a Neutron subnet list, the subnet is directly available and ready to be used by our VMs, bare metals, or Kubernetes pods. This is what gets added to the switch as a secondary subnet; right now we support secondary subnets at go-live, and we don't yet have the feature to support primary subnets.

Create subnet sounds pretty simple, but there are a bunch of validations, and that's where we spent most of our effort for go-live. They are basically pre-validations, the push, and post-validations. In the pre-validations you have to make sure the subnet is not overlapping and the routes are not overlapping across AZs, across DCs, and across bubbles. You always have to make sure the DNS entries work out and that no one has reserved the block somewhere without using it. If that's all successful, you go and push it on the device. After you push it on the device, if it's BGP, you have to make sure the routes resolve all the way down to the ToR you added it on, and also make sure on the distribution switches that the routes resolve and the route aggregates are configured properly. If any of the validations fail, we roll back the whole chain. And obviously, one thing I forgot to add: every change on a device has to have a change request tracked. So that's how it works, and there are a bunch of other validations too.

Now for Ironic. Andrew, who manages most of our Ironic stuff, unfortunately is not here, but I'll walk you through the requirements we pinned to routed networks in Neutron, which Ironic also leveraged to flip a bare metal into its respective VRF. This diagram gives you the complete idea. For Ironic we leverage the update host-segment binding of routed networks. A quick example: Ironic provisions the bare metal in a native VLAN. After provisioning it through DHCP, it gives the bare metal a static IP for the corresponding VRF; that IP comes from a Neutron create port on the routed network. After the IP is assigned, it's time to flip the bare metal into the corresponding VRF, which means flipping the switch port to access mode. For that we leverage the update host-segment binding in Neutron: Ironic can directly get the segment details, update the host-segment binding, and call Neutron; Neutron will internally call Netforce to do the port VLAN operations, and Netforce pushes the changes to the switch. And since Netforce is designed so that not only cloud services but also the underlying network infrastructure team can use it -- it's not tied to cloud devices only -- it scales across all our network devices.

This is how the payload looks for Ironic. It just calls update bindings on a port, since it's an update, and it wants to flip the VLAN mode to access. But this is just one part: Ironic just calls this API, while underneath Netforce does many checks. One is that you always check the MAC address of the bare metal against the MAC address learned on the physical port, to make sure they're in sync. Also, when you're flipping the VLAN for a particular bare metal, you have to check the traffic on the port as a safety check.
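As a rough illustration of what Netforce does underneath that single Ironic call, here is a minimal sketch of the VLAN-flip flow with the MAC and traffic safety checks. The helper names (get_mac_on_port, get_port_traffic_kbps, set_access_vlan, open_change_request) are assumptions for illustration, not the actual Netforce API.

```python
# Hypothetical sketch of the port-flip flow described above.
# Helper and field names are illustrative assumptions, not the real Netforce API.

TRAFFIC_THRESHOLD_KBPS = 60  # safety threshold mentioned in the talk

def flip_port_to_access(device, port, target_vlan, server_mac):
    """Flip a ToR port from the provisioning VLAN into the tenant VLAN/VRF."""
    # 1. MAC safety check: the MAC learned on the switch port must match
    #    the bare metal's NIC MAC that Ironic reports.
    learned_mac = device.get_mac_on_port(port)                      # assumed helper
    if learned_mac.lower() != server_mac.lower():
        raise RuntimeError(f"MAC mismatch on {port}: {learned_mac} != {server_mac}")

    # 2. Traffic safety check: refuse to flip a port that is actively in use.
    if device.get_port_traffic_kbps(port) > TRAFFIC_THRESHOLD_KBPS:  # assumed helper
        raise RuntimeError(f"Port {port} is carrying traffic; aborting VLAN flip")

    # 3. Track a change request with the current and rollback state
    #    before touching the device.
    ticket = device.open_change_request(port, target_vlan)           # assumed helper

    # 4. Push the change: switchport mode access + access VLAN on the ToR.
    device.set_access_vlan(port, target_vlan)                        # assumed helper
    ticket.stamp_success()
```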
It's not one fixed Kbps value -- every company has its own standards for networking. Right now I guess we are using 60 Kbps: before you flip, you make sure there is no traffic and no one is running something on the port. If the operation fails, you can check from the logs that it failed due to traffic, and you can reach out to the customer -- hey, who is using this? -- things like that.

The CLI is pretty cool; we just have some base support for it. You can list devices, and this is what it will do. Even Ironic effectively does an update port -- that's what is happening: it calls update port, you give the VLANs, and you give check-MAC. So this is what the device configuration looks like right now: the port is in access mode. The current port status is not the database status; we always get it from the device details, because the network engineers want to be sure there is no conflict -- there's always a chance someone logged into the switch and changed the configuration, so you always fetch the current state. And if you look, it got updated to trunk, and there is a change ticket that we track which stamps the current as well as the rollback status in case something breaks. This is what got pushed to the device, and the actual bare metal is now placed into the respective VRF.

Right now we provision subnets on demand for OpenStack VMs and bare metals using a background script with an IP threshold set to 64: if the count of available IPs drops below it, it triggers the subnet creation workflow and makes a new subnet available for bare metals and VMs (a minimal sketch of that loop follows at the end). Kubernetes works the same way: it calls IPAM, gets the block, and pushes it. Subnet recycling we are planning to add later on; we have not addressed it yet.

Just to conclude what Netforce can do, since I had a limited time slot: it's pretty much an infrastructure automation service, and we can add more and more features to it. This was the base case for why we wrote it -- to support our cloud use cases -- but it can always do other stuff, like BGP and so on, if you want to add new features. Thank you everybody, and I will be available here for any questions. I guess I don't have time right now, but you can always stop by. I can take one question. [Audience question] No, we have some plans to open source it; it's not open source yet. Okay, thanks everybody.
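For reference, here is a minimal sketch of the kind of threshold-driven provisioning loop mentioned above. The cloud_ipam and neutron client objects and their methods are assumptions for illustration, not the actual eBay background script.

```python
# Hypothetical sketch of the background on-demand subnet provisioning loop.
# Client objects and helper names (cloud_ipam, count_free_ips, allocate_block)
# are illustrative assumptions.
import time

IP_THRESHOLD = 64         # trigger a new subnet when free IPs drop below this
POLL_INTERVAL_SECONDS = 300

def provisioning_loop(cloud_ipam, neutron, network_id, segment_id):
    while True:
        free_ips = cloud_ipam.count_free_ips(network_id)               # assumed helper
        if free_ips < IP_THRESHOLD:
            # Ask IPAM for a fresh block, then push it as a routed-network subnet;
            # Netforce (behind Neutron) configures it on the switch.
            cidr = cloud_ipam.allocate_block(network_id, prefixlen=26)  # assumed helper
            neutron.create_subnet({"subnet": {
                "network_id": network_id,
                "segment_id": segment_id,
                "ip_version": 4,
                "cidr": cidr,
            }})
        time.sleep(POLL_INTERVAL_SECONDS)
```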