Hello, let's get started. In this session we're going to discuss bringing provider networks into OpenStack using L2 gateways. I am Sukdev Kapoor from Arista Networks, and along with me we have co-presenters Armando, Maruti, and Selvakumar from HP, and Isaku from Intel. As an agenda, we're going to talk about the use case that led us to implement this L2 gateway, share our proposed solution with you, and then get into a little deep dive as to how this team came about, how we came up with the charter, and what we did. Then we'll step into the details of the architecture, the configuration, and the deployment workflow: how would you deploy this L2 gateway? We'll cover the development and testing we have done to make it work, and touch on the roadmap, which is where you have an opportunity to tell us what you would like to see going forward to make it work for your deployment scenarios. We'll summarize, and then of course cover the Q&A.

Okay, so essentially the use case is this: you have existing data center deployments where critical services run on physical servers, bare-metal hosts, storage clusters, and whatnot, and you also have an OpenStack-based virtualized environment. Things work beautifully within the virtualized environment, but there are cases where you need to connect to the bare-metal servers or the clusters, to access the file systems or to use the existing services in the data center, right? So how do you do that? That's the real crux of this L2 gateway: essentially it allows you to take a virtual network, which is managed by OpenStack, and connect it with your existing physical networks, or legacy networks.
So that's our use case. In this case we've used VXLAN as the virtual network and connected it with VLAN-based legacy networks. Here is our proposed solution for the gateway. In this initial implementation we have used a hardware device, a physical switch, but the way the API is constructed, it's really at an abstract level: you could also build a virtual gateway. In the initial implementation we have used VXLAN-to-VLAN bridging; in the future we will cover more, and we'll come back to that as part of our roadmap. But that's essentially the use case: you have a bunch of VMs running in a virtual world, and they need to access the file systems or the services running on the bare-metal servers. So you can create these L2 gateways on any given physical device, create the connections, and achieve the connectivity.

Okay, so like I said, the L2 gateway as an entity is pretty broad. Essentially the goal is to take two different networks, two different L2 broadcast domains, and connect them together. Those could be VPN-based networks, VLAN-based networks, MPLS, whatnot, and on the other side you have VXLAN-based networks, VLAN-based networks, and so on. Several attempts have been made in the past; everybody recognizes that this functionality is needed. But when you're trying to solve a bigger problem, you end up somewhere in the middle, or sometimes you don't reach the destination.
So in this case what we did was decide to divide and conquer. At the Paris summit, where some of you were probably involved in these discussions, Maruti presented this L2 gateway idea during the lightning talks. A bunch of vendors got together, we all went into one room, we spent a few hours, and we decided that this is something we want. So it was not just one person's vision; multiple vendors wanted this functionality. We sat down in a room for a couple of hours and started forming an abstract idea of what we wanted to accomplish, and we decided we would take one simple use case and build an end-to-end, extensible API for it. We looked at an ML2-based approach, and we looked at a service plugin, and we chose to build it as a service plugin API because that gives us a lot more flexibility. We also defined a pluggable, agent-based architecture, where the agents can run on multiple compute nodes; it can be distributed, so we can deal with the scale issues. And to kickstart the project, like I said, we wanted to divide and conquer: bring the big scope of the problem down to a very small, focused, narrow problem. We said, okay, we're going to take VXLAN-based networks, and we're going to do software-VTEP-to-hardware-VTEP integration. To do that, we decided to use the OVSDB hardware_vtep schema. So that's our use case; that's the first implementation we decided on. Then we went on and got into the implementation, and we're going to share with you what we have. So now I'm going to hand over to Armando.
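As a rough illustration of the hardware_vtep schema just mentioned, here is a sketch in Python of the kinds of rows involved in one VXLAN-to-VLAN binding. The table and column names follow the hardware_vtep schema, but the values and data structures are illustrative examples, not real OVSDB wire data:

```python
# Illustrative sketch of the OVSDB hardware_vtep tables involved in a
# VXLAN-to-VLAN binding. Names and values are examples, not wire format.

# A logical switch corresponds to a Neutron network; tunnel_key is the VNI.
logical_switch = {"name": "net-blue", "tunnel_key": 1001}

# The physical switch acting as the hardware VTEP, with its tunnel endpoint IP.
physical_switch = {"name": "tor-switch-1", "tunnel_ips": ["192.0.2.10"]}

# A physical port's vlan_bindings map a VLAN ID on that port to a logical
# switch: here VLAN 5 on port "eth1" is bridged to the VNI-1001 network.
physical_port = {"name": "eth1", "vlan_bindings": {5: logical_switch["name"]}}

# MACs learned on the physical side (advertised to the virtual side), and
# MACs pushed from the virtual side (Neutron ports) with their VTEP locators.
ucast_macs_local = [
    {"MAC": "aa:bb:cc:dd:ee:01", "logical_switch": "net-blue",
     "locator": "192.0.2.10"},    # bare-metal host behind the ToR switch
]
ucast_macs_remote = [
    {"MAC": "fa:16:3e:00:00:01", "logical_switch": "net-blue",
     "locator": "203.0.113.21"},  # VM port; locator is the compute node VTEP
]

# From this state the switch can answer: which VNI does VLAN 5 bridge to?
vlan_to_vni = {vlan: logical_switch["tunnel_key"]
               for vlan in physical_port["vlan_bindings"]}
print(vlan_to_vni)  # {5: 1001}
```

The point of the sketch is the shape of the data: one logical-switch row per Neutron network, one VLAN binding per physical port, and per-MAC rows whose locators are VTEP IPs on either side.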
He will walk you through the architecture.

Thanks for the great introduction. I like these pictures, especially at this time of day, so that you can all form a mental picture of this, and maybe at the end of the session we can go through a few questions and answers as to why we went down this path. I can try to give you a few hints right now. As you can see, we adopted a model where the API itself came about via a service plugin. As some of you may be aware, the Neutron architecture is done in a way that it can be plugged into at several points in the abstraction layer: we have core plugins, and we have service plugins for providing higher-level services. Some may wonder why we went down the path of choosing a service plugin rather than implementing this as a core extension, for instance via the ML2 extension mechanism. The reason was simple: we all know how fast-paced this project is, and how contentious collaborative code contribution can be, and we realized that if we wanted to experiment, iterate fast, and possibly fail fast, we needed an architecture and a contribution model that would let us develop and iterate on a solution.
So again, we went with the approach of building this L2 gateway extension as a service plugin, loosely coupled with the rest of the Neutron core, and that's primarily the main reason we went down that path. As you can see in this picture, we also have, on the right side, an L2 gateway agent that is effectively in charge of managing the gateway instances, the ones in charge of bridging the physical layer with the logical layer. This has been an option for our initial implementation; it doesn't necessarily have to be the only viable model. In our specific case, we went down this path also because, in deployment models where you have multiple Neutron instances behind a load balancer for fault tolerance and availability, we wanted the ability to interact with the gateway instances in a way that didn't assume symmetry in how those nodes were deployed. This will become clearer later in the presentation, because in order to integrate with the physical world, you obviously need to dispatch certain events to the physical world, for example to let it know that certain virtual machines have popped up on the logical layer, and you need to consume certain events coming from the physical world, in case some bare-metal node pops up on the network. In order to coordinate the physical with the logical, you want to make sure there is a symmetric publish-subscribe mechanism in play. Are you guys still with me?
Probably not. All right, so this is another way to represent the architecture we went for; here we try to capture the deployment aspects of what we did. We came up with solutions to support deployments where you could have multiple gateway agents managing gateway instances, again for availability and fault tolerance, and we came up with solutions to automatically fail over in case of agent failures, and so on. As you can see, it looks pretty straightforward: the agents integrate with the rest of the server via the message bus, and the integration between the gateway agent nodes and the compute nodes where the VMs live also happens through coordination between the agents and the server, via the message bus. The communication between the gateway agents and the gateway instances themselves happens over the OVSDB protocol, and this bus here is a bi-directional bus.
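The bi-directional channel described here is the OVSDB management protocol, which is JSON-RPC based. As a sketch (following the general shape of an RFC 7047 "monitor" request, with table and column names from the hardware_vtep schema; this is an illustration, not what networking-l2gw literally sends), the agent's subscription to MACs learned on the physical side would look something like:

```python
import json

# Sketch of an OVSDB JSON-RPC "monitor" request (RFC 7047 style) that an
# L2 gateway agent could send to subscribe to MACs learned on the physical
# side. Table and column names follow the hardware_vtep schema.
monitor_request = {
    "method": "monitor",
    "id": 1,
    "params": [
        "hardware_vtep",           # database to monitor
        None,                      # monitor identifier chosen by the client
        {
            "Ucast_Macs_Local": {  # rows the switch writes for learned MACs
                "columns": ["MAC", "ipaddr", "logical_switch", "locator"],
                "select": {"initial": True, "insert": True,
                           "delete": True, "modify": True},
            }
        },
    ],
}

wire = json.dumps(monitor_request)
print("Ucast_Macs_Local" in wire)  # True
```

After this subscription, the OVSDB server pushes table updates to the agent whenever a bare-metal MAC appears or disappears, which is the "consume events from the physical world" half of the publish-subscribe symmetry.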
So gateway agent nodes establish a connection with the OVSDB instance that manages the gateway, and they are also subscribed to events that come from the OVSDB server, again to understand what's going on in the physical world. These next slides effectively summarize what I've described here; I'll let them stand for a few seconds, and now I'm going to hand over to Maruti, who's going to walk us through some of the workflows that explain how this works in practice in a little more detail.

Thanks, Armando. Let me take you through the typical use cases a user may try with the L2 gateway. Let's take the first use case, where you are starting everything from scratch: you are planning to have the services on the bare-metal servers used by workloads running on your virtual machines. So what do you do? The user first creates the network, and since he knows the hardware L2 gateway has to be deployed, he deploys it and creates an abstraction of that hardware L2 gateway as a logical gateway. At this point you have just the network and the logical abstraction of the hardware L2 gateway, so what you do is create a connection between this network and the logical gateway. Once the connection is created, it establishes the VNI-to-VLAN binding. Now it's time to deploy the real workloads in the form of virtual machines, so the user creates the virtual machines, and that establishes the connectivity between the workloads and the services on the bare-metal servers.

In the second use case, the user typically has everything running in place.
He has the workloads, virtual machines, and a network that has been in use for a couple of years, and now he wants to allow those workloads running on the virtual machines to use the services on the bare-metal servers. So what does he do? He has the virtual machines on this network; he brings in the physical L2 gateway, creates the abstraction of this L2 gateway as a logical gateway, and then creates the binding between the two: he creates the connection of the network to this logical gateway, and that's all. He has the connectivity between the workloads and the services running on the bare-metal servers.

So let me take you through the workflow of how this is all done. As you can see in the picture, and as Armando already explained, the L2 gateway agent, the orange box, acts as an OVSDB client to the OVSDB server, which runs either on the physical L2 gateway itself or outside of it. The L2 gateway agent can write into the OVSDB server, and it also gets notifications from the OVSDB server if there are any changes in the schema tables.

In the use case I have shown in green, the tenant VM, the workload, is already present, and he wants to allow this virtual machine to access the services on the bare-metal host, as we can see here. Now he executes the CLI, which specifies the gateway name (the name of the logical gateway, that is, the abstraction of the hardware L2 gateway), the network name (the network to which this virtual machine belongs), and the default segmentation ID, which is nothing but the VLAN ID on the physical network to which the bare-metal host belongs. Once this command is given, the L2 gateway service plugin knows the network name, so from the network name it can get from the Neutron database the VNI, that is, the VXLAN ID of the network. It knows all the ports that belong to this network, so it builds a list of them, and it also knows the VLAN ID specified in the command as the segmentation ID, so it knows the mapping of this VXLAN ID to the VLAN ID. It builds the list of MAC addresses and IP addresses of the Neutron ports that belong to that network, and this list is then sent to the L2 gateway agent over the RabbitMQ message bus. The L2 gateway agent in turn inserts this information into the OVSDB tables. Also note that from the ports we also know the data-path VTEP IPs of the compute nodes and network nodes, so this information is written to the OVSDB tables as well, and the OVSDB server on the device then configures the physical hardware L2 gateway to create the VXLAN tunnel to the compute node or network node, based on the data-path IP that was sent by the L2 gateway agent. So this is the VXLAN tunnel originated from the physical L2 gateway toward the compute node or network node.

Now let me go to the other part: how we create the VXLAN tunnel from the compute node or network node to the physical L2 gateway. As we can see in the diagram, and as I said on the previous slide, we have already written the VNI-to-VLAN binding into the OVSDB server table, so the switch, the L2 gateway, already knows that this network is bound to this physical VLAN. Now, when a bare-metal host connected to that VLAN is detected by the L2 gateway, the L2 gateway's responsibility is to advertise its MAC and IP address to the other side, the virtual side.
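A minimal sketch of the forward path just described: the service plugin collects the network's VNI and its ports' MAC, IP, and host-VTEP details, and the agent turns that into Ucast_Macs_Remote rows for the OVSDB server on the hardware gateway. The function names and structures here are illustrative, not the actual networking-l2gw internals:

```python
# Sketch of the forward path: plugin gathers network + port details,
# agent writes them as Ucast_Macs_Remote rows. Structures are
# illustrative, not the actual networking-l2gw internals.

def build_agent_payload(network, ports, segmentation_id):
    """What the plugin would send over the message bus to the agent."""
    return {
        "logical_switch": {"name": network["name"],
                           "tunnel_key": network["vni"]},  # VXLAN VNI
        "physical_vlan": segmentation_id,                  # VLAN on the switch
        "remote_macs": [
            {"mac": p["mac"], "ip": p["ip"],
             "locator": p["host_vtep_ip"]}                 # compute-node VTEP
            for p in ports
        ],
    }

def to_ucast_macs_remote(payload):
    """What the agent would insert into the Ucast_Macs_Remote table."""
    ls = payload["logical_switch"]["name"]
    return [{"MAC": m["mac"], "ipaddr": m["ip"],
             "logical_switch": ls, "locator": m["locator"]}
            for m in payload["remote_macs"]]

network = {"name": "net-blue", "vni": 1001}
ports = [{"mac": "fa:16:3e:00:00:01", "ip": "10.1.1.4",
          "host_vtep_ip": "203.0.113.21"}]

rows = to_ucast_macs_remote(build_agent_payload(network, ports, 5))
print(rows[0]["locator"])  # 203.0.113.21
```

The locator column is what lets the hardware gateway build a VXLAN tunnel toward each compute or network node hosting a port on the connected network.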
So what it does is write that information into the OVSDB tables: the IP address and MAC address of the bare-metal host, and its own VTEP IP, the L2 gateway's hardware VTEP IP. All this information is written into the OVSDB server, and as we already discussed, the L2 gateway agent acts as an OVSDB client to the OVSDB server, so it gets the notification. From that notification we know the VTEP IP of the gateway, and the MAC address and IP address of the bare-metal host. The L2 gateway agent in turn sends a reverse RabbitMQ RPC message to the L2 gateway service plugin, and the service plugin writes this information into the Neutron DB. And as we already know, the L2 agent has an L2 population mechanism by which we can take the MACs and create the tunnels, so we have leveraged the same thing: the L2 gateway service plugin sends L2 population RPCs to the L2 agent, and the L2 agent creates the VXLAN tunnel from the compute node or network node to the physical L2 gateway. Because we already know the VTEP IP of the L2 gateway, we could leverage these L2 population RPCs. So I will hand over to Selva to continue.

Thanks, Maruti. I am Selvakumar, from HP. I am going to talk about the L2 gateway development and testing, and the future roadmap we are going to work on. The entire L2 gateway code is implemented in the StackForge repo, available on GitHub at stackforge/networking-l2gw. We have exposed the public API in the form of a service plugin, and we have implemented two extension APIs, as Armando already mentioned: one is the L2 gateway extension API, for representing the logical gateway, and the other is the L2 gateway connection, to connect that logical gateway to L2 networks. We also have a third-party CI, the HP Networking CI, that does continuous integration for us.
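Returning briefly to the reverse path Maruti just walked through: it leverages Neutron's L2 population mechanism. Below is a sketch of the kind of FDB payload the service plugin would fan out to the compute-node agents when a bare-metal MAC shows up; the payload shape is my approximation, not the exact l2pop RPC format:

```python
# Sketch of the reverse path: when the gateway reports a bare-metal MAC via
# OVSDB, the service plugin fans out an L2-population-style FDB update so
# compute/network nodes build a VXLAN tunnel to the hardware VTEP.
# The payload shape is illustrative, not the exact l2pop RPC format.

def bare_metal_fdb_entry(network_id, vni, gateway_vtep_ip, mac, ip):
    return {
        network_id: {
            "network_type": "vxlan",
            "segment_id": vni,
            "ports": {
                # key: remote VTEP IP -> list of (MAC, IP) entries behind it
                gateway_vtep_ip: [(mac, ip)],
            },
        }
    }

fdb = bare_metal_fdb_entry(
    network_id="NET_UUID",          # placeholder for the Neutron network UUID
    vni=1001,
    gateway_vtep_ip="192.0.2.10",   # the hardware VTEP on the ToR switch
    mac="aa:bb:cc:dd:ee:01",
    ip="10.1.1.100",
)
print(list(fdb["NET_UUID"]["ports"]))  # ['192.0.2.10']
```

Because the entry carries the hardware VTEP's IP as the remote endpoint, the existing L2 agents can treat the switch like just another peer VTEP and set up the tunnel with no new mechanism.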
It is based on a single-node DevStack setup, so whenever you check code into the StackForge repo, a build gets generated automatically, and it also runs Tempest API testing and some integration testing for the L2 gateway. We also have a standalone CLI as part of that StackForge repo, which we plan to integrate with the Neutron CLI; after that integration, once you download the python-neutronclient package, you can create the L2 gateway and L2 gateway connection just like core Neutron network, subnet, or port creation. And in the Kilo release we have transitioned this L2 gateway project into Neutron, meaning Neutron embraces our L2 gateway, so whenever an official Neutron release happens, our L2 gateway code is released with it. You can also download the tarball at this link. There is a DevStack setup available, and a README, the usage.rst doc, so you can download the code and try it out. We are doing active development: there is a bi-weekly IRC meeting every other Monday, where we discuss the current defects on the L2 gateway, and we are also tracking defects on Launchpad, at launchpad.net/networking-l2gw; there are a few defects there that we are working on.

Coming to the testing, we have tested this L2 gateway solution on HP 5930 switches as well as Arista 7000-series switches, and we have a recorded demo available on YouTube. Both switches run a native OVSDB server that implements the hardware VTEP schema. The Kilo-based L2 gateway code has been successfully tested, and we have achieved the end-to-end functionality. So, what's next?
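For reference, the two extension resources described above map to REST payloads roughly like the following. The field names are my approximation from reading the networking-l2gw API, so check the project's documentation for the authoritative schema:

```python
# Sketch of the two networking-l2gw REST resources described above.
# Field names are approximate; check the project's API docs for the
# authoritative schema.

# 1) An L2 gateway: the logical abstraction of a physical device and the
#    interfaces on which bridging is allowed.
l2_gateway = {
    "l2_gateway": {
        "name": "gw1",
        "devices": [
            {"device_name": "tor-switch-1",
             "interfaces": [{"name": "eth1"}]},
        ],
    }
}

# 2) An L2 gateway connection: binds a Neutron network to that logical
#    gateway, with the VLAN on the physical side as the segmentation ID.
l2_gateway_connection = {
    "l2_gateway_connection": {
        "l2_gateway_id": "<uuid of gw1>",
        "network_id": "<uuid of the Neutron network>",
        "segmentation_id": 5,
    }
}

print(l2_gateway_connection["l2_gateway_connection"]["segmentation_id"])  # 5
```

Creating the second resource is the step that, per the workflow above, establishes the VNI-to-VLAN binding on the hardware gateway.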
These are the items on the future roadmap. Currently on the overlay side, the Neutron side, we have supported only the VXLAN type, so we want to add support for other tunneling schemes like NVGRE or MPLS. On the physical side, the legacy side, we have supported only the VLAN type. The current implementation is hardware based, so a software-based gateway remains to be done, where the entire hardware functionality is brought onto an x86 physical server, either on a compute node or on some other service node. The current implementation is also based only on the centralized router; DVR is not yet supported, due to technical debt. Please also tell us what you would like improved, so we know where to take this next. Now I am going to hand it to Sukdev, who will summarize the overall L2 gateway project.

So the key is the last bullet here, right? Like I said at the beginning, we took a very simple use case; we wanted to concentrate on it and get it done. And it's a really commendable effort by the team here: in one release cycle, from beginning to end, the whole thing was implemented. Last week we released it and put it on the Python Package Index; like Selva said, you can pull it down and actually start playing with it. The API is there, the whole implementation; you can use it through DevStack or through your regular Python package. So tell us: what do you need? What should be the next features you would like to see? We are here to help; that doesn't necessarily mean I will build them, but yes. Or come join us, actually, because this is a joint effort; it's not one person's charter, really, right?
So let me summarize what we're saying. The L2 gateway really helps in bridging VXLAN- and VLAN-based networks, in this case VLAN being the physical networks and VXLAN being the virtual networks. The L2 gateway is part of the Kilo release; the northbound APIs to create and manage the logical gateways and connections, and so forth, are available. When you create a connection, it creates a VXLAN tunnel, so essentially what we're doing is taking a software VTEP and a hardware VTEP and creating a tunnel between the two, to achieve connectivity between two distinct networks. And like Selva said, in the initial cut we wanted to divide and conquer and really reduce the workload, so we're using only the hardware VTEP schema. But again, the way the API is developed, you can swap out the backend: if you don't like OVSDB, plug in something else to make it work for your environment; it will work. We have tested it on both HP and Arista ToR switches, and it's actually being demoed: if you stop by the Arista booth or the HP booth, you can see it in action, or, with the links we provided, you can pull it down and play with it.

Before we dive into the Q&A, one key point I wanted to stress is that obviously the backbone of this is OVSDB and the hardware VTEP schema. We did this work using hardware switches because our employers were kind enough to give us the hardware, but nothing really stops us from getting this to run on virtual machines or white-box switches.
The gap that needs to be filled is really small, and this is where we would really like the community's interest and help, to close this gap, because ultimately we want to minimize the barrier of entry to using this. Essentially we have done the heavy lifting; the API is there. So come in, start using it, come and join us, and participate. Every other Monday the team meets on IRC, in #openstack-meeting-4. Come and join us, share your feedback, share your use cases, participate, and take it to the next level. How to install: you can do it through DevStack, or you can pull it in directly from the Python Package Index. There is a wiki with elaborate details that you can go check out, and there's a Git repository for you to pull the code and play with.

With that, we're going to open it up for questions; if you don't mind, either go to the mic or we'll try to relay the question.

[Audience] It's very interesting, because I did a very similar thing for a different application. When you connect the external network, you need to have an internal network identical to it, correct? Because that's how the tunnel is created: from the same network on a different host.

[Presenter] Yes. When you are connecting the two networks, there's a tenant network, and then there's a physical network. Like he showed in the CLI, you specify the VLAN ID, and internally, when you connect the network, whatever the VNI mapping is, it will take that from Neutron, so you don't have to specify it.

[Audience] On the virtual side, don't you have to create a network, like a proxy network? Are you tunneling from some arbitrary network to the physical network? It has to be between one network and another network.

[Presenter] No, it's your regular Neutron network.
You create a Neutron network, you're launching VMs on it, say you call it the blue network. You have a couple of VMs hanging off it, and you say: take this blue network, create me a gateway, and connect this blue network to VLAN number five. That's it; that's all you need to specify. Everything else is done automatically for you.

[Presenter] If I understood your question correctly, and obviously we can take this offline if I haven't: the networks need to be known to your deployment. On one end you have a Neutron network ID; on the other end you need, in this specific instance, a segmentation ID, a VLAN ID, that is accessible from your switch.

[Audience] In theory, though, logically, on the virtual side you have one network, let's say 10.1.1.0, and on the other side you have, say, 172.16.something. So you're actually bridging these two.

[Presenter] Well, we're bridging them at the layer-2 level, so we're presenting them as acting as a single broadcast domain.

[Audience] Unless both are on the same segment, it has to go through an L3 hop, right?

[Presenter] Right. Yes, there is no routing involved; the address space must be the same.

[Presenter] We have tested this: we created a router on the network node and tested it, and it works.

[Audience] You need a router; that's what I'm trying to say.

[Presenter] Yeah, so when you're going between two different address spaces.
Yes, of course. But what we're saying is: you're extending your layer-2 domain, so you have two cases. When you're going across different IP spaces, you're going to need a router in between; but if you're extending the same IP space from your legacy network into your virtual networks, then you don't. In this case here, the bare metal is going to have an IP on the same subnet that the VMs have. The use case you described is feasible; it's not quite what we built here, but it still works. If this bare-metal host happens to be on a different subnet and you have a router lying somewhere, then you can still route all the way through to another network.

[Audience] That should be the default scheme, I think, because the external network is always going to be on a different subnet. You can't assume that.

[Presenter] I'd say let's take this offline.

[Audience] The L2 gateway agent runs on each hypervisor, is that correct?

[Presenter] Not necessarily. You may if you want, or you can run them on standalone special-purpose nodes, which could be, for instance, the network nodes. You can run a single one, or more than one for high availability and fault tolerance, but they do not need to run on the compute nodes. In fact, you can run the agent on the controller itself, on a compute node, or on a network node, because it doesn't use any bridges or tunnels underneath; it just speaks the OVSDB protocol to the OVSDB server.

[Presenter] Sure. If I understood your question correctly, you're asking whether we do any validation or policy enforcement as to whether a specific user can access a specific set of VLANs. That's an interesting point.
I don't think we have looked into that in this iteration, but it has come up in discussion, and we'll definitely be looking at it, because it sounds like a genuine thing you would want to do. Right now, for safety, we have allowed only the admin operator to create the connection; in the next iteration we'll look into that. So again, it's not the general tenant that can create the gateway connection; it's the admin that needs to cooperate with you.

[Audience] I have a question. How do you envision extending your architecture and your solution where your physical side is a layer-2 VPN, so something like MPLS L2VPN or BGP EVPN, while your logical side could be, say, VLAN or any other type of Neutron network?

[Presenter] There was one vendor who was essentially trying to do exactly that: taking VPN-based networks and bringing them into the virtual network. The way they were looking at it was to bring the VPN network into their device, bring out a VLAN on the other side, and take that VLAN and connect it using this L2 gateway.

[Audience] I guess that happened to be us, actually.

[Presenter] Actually, the vendor I am thinking about is a different one, so now we have two vendors looking for that solution. Without digging too far into the details, I personally think that from a logical standpoint, as long as your gateway has two legs, one leg on the virtual side and one leg on the physical side, whichever those may be, and it's capable of doing the bridging, then we should be good. We just need to bring in the driver that knows how to do the bridging between whatever you have on the physical side and the virtual side. So obviously, if you're interested in this type of use case, we'd love to work with you.
Sure, we'd love to see you do the work! But yeah, I think that from a logical standpoint, the API allows for that type of mapping.

[Audience] Do you see the segmentation ID perhaps being overridden to address that?

[Presenter] Well, I haven't seen a specific customer request for that yet; obviously this is the first time this solution has come into the limelight. We'll see going forward whether this type of requirement pops up, and we'd love to work with other folks and take this toward catching a broader set of requirements. Unfortunately, I don't have enough exposure to customers to understand at this point in time whether this is going to be a likely requirement, but it sounds like this is something you've been thinking about for quite some time.

[Presenter] That's fine, Muhammad; this is the IRC channel, and June 8th is our next meeting. Please come join us, let's discuss. I think we definitely intersect at different points, and Maruti alluded to the MPLS side of things, so I'm interested in knowing what more can be done there. Thank you.

[Audience] Another quick question. Do you guys have any recommendation for how to handle BUM traffic in this solution?

[Presenter] So right now, what happens is that when the virtual machine MACs are written to the OVSDB server, the OVSDB server itself sends a notification that the MACs have been written, so we know the MACs are already there, and we create a tunnel. As for BUM traffic: right now the first packet sent from a virtual machine gets broadcast over all the tunnels, but because we get the response with the bare-metal MACs, the second packet will not get broadcast.
I think you're talking about ARP, right?

[Audience] I'm seeing other kinds of multicast traffic, where you cannot do proper unique MAC learning. How do you handle that? You typically need a service node or something.

[Presenter] Do you mind if we take this one offline?

[Audience] Can you go back a slide? In the data model, I'm looking at it, and it has interfaces for the gateways. Are those interfaces on the gateways enumerated when the gateway registers with OVSDB? Is it expecting to see an enumeration of interfaces from the OVSDB VTEP schema, and is that what populates the data in the Neutron database? In the schema it talks about L2 gateway interfaces; I can ask this offline. I was just wondering if the L2 gateway is expected to register interfaces with the OVSDB VTEP schema, and that's how these are populated.

[Presenter] What happens is, when your L2 gateway is discovered, it's the responsibility of the gateway side to write the information into OVSDB: basically the physical interface and the physical switch. This physical switch and physical interface get replicated into the Neutron database, and then, when you create your logical gateway, we try to validate whether they really exist. So every time you create a gateway, and then create a connection between the gateway and the network, that's where that data gets validated.

I think we are about out of time, so we need to hand over to the next session, but let's walk outside and continue the discussion if you're interested. Thanks again.

[Audience] I have a use case which is very similar, so I'd probably like to take the discussion offline.

[Presenter] I need to hand over now. Thank you.