Hello. So today we're going to talk about how to deliver OpenStack NFV service chaining at scale, with the integration of the networking-sfc project and the networking-ovn project. My name is Cathy Zhang. I'm a principal engineer at Huawei, and I'm also the project team lead of the OpenStack service function chaining project. Hi, I'm John McDowall. I'm the SDN architect for Palo Alto Networks. So Cathy, take it away. Okay. So first I'm going to go through very quickly what service chaining is. By service chaining, what we mean is that through a centralized management and control platform, different tenants' flows can be automatically provisioned to go through different sequences of service functions. A service function can run on a virtual machine, in a container, or on a physical device. This slide shows the service chain logical model and API; some of you probably already saw this slide in my last session. The service chain API consists of two parts: one part is the flow classifier, and the other part is the port chain. The flow classifier specifies the classification rules used to select a flow, which will then go through the chain. The port chain consists of an ordered sequence of service functions that a selected flow will traverse. In this example, the port chain consists of a firewall service function, then an IPS service function, and then a video optimizer service function. Each service function can have multiple instances, which are grouped together. In this example, the firewall has two service instances, the IPS has three instances, and the video optimizer has two instances. Each instance is represented by a pair of Neutron ports, and these port pairs together form a port pair group. So the port chain actually consists of an ordered sequence of port pair groups.
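To make that logical model concrete, here is a rough Python sketch of the structure just described: a port chain as an ordered sequence of port pair groups, each group holding the port pairs of one service function's instances. The class and field names are illustrative only, not the actual networking-sfc code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PortPair:
    """One service function instance: an ingress and an egress Neutron port."""
    ingress_port: str
    egress_port: str

@dataclass
class PortPairGroup:
    """Instances of the same service function, grouped for load distribution."""
    name: str
    port_pairs: List[PortPair] = field(default_factory=list)

@dataclass
class FlowClassifier:
    """Classification rules selecting which flows enter the chain."""
    protocol: str = ""
    source_ip_prefix: str = ""
    destination_ip_prefix: str = ""

@dataclass
class PortChain:
    """An ordered sequence of port pair groups, plus flow classifiers."""
    name: str
    port_pair_groups: List[PortPairGroup] = field(default_factory=list)
    flow_classifiers: List[FlowClassifier] = field(default_factory=list)

# The example from the slide: firewall (2 instances), IPS (3), video optimizer (2).
chain = PortChain(
    name="pc1",
    port_pair_groups=[
        PortPairGroup("firewall", [PortPair("fw1-in", "fw1-out"),
                                   PortPair("fw2-in", "fw2-out")]),
        PortPairGroup("ips", [PortPair(f"ips{i}-in", f"ips{i}-out") for i in (1, 2, 3)]),
        PortPairGroup("video-opt", [PortPair("vo1-in", "vo1-out"),
                                    PortPair("vo2-in", "vo2-out")]),
    ],
    flow_classifiers=[FlowClassifier(protocol="tcp", destination_ip_prefix="10.0.0.0/24")],
)
```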
The grouping is for load distribution purposes. This slide shows the OpenStack service chain architecture. At the top is the Neutron server, and at the bottom are one or more compute nodes. On the Neutron server we have the OpenStack service chain API, and at the back end, on the southbound side, different service chain drivers can plug in to realize the service chain functionality. Currently we have already implemented the native path, the OVS service chain driver (the purple block), which talks to the OVS service chain agent and then programs the OVS switch to properly steer the flow through the service functions. Today we are going to concentrate on the OVN path: the OVN SFC driver, the northbound and southbound DBs, and how the flows get programmed properly. You can also plug in different SDN controllers. We have already implemented the ONOS controller path to realize the service function chain functionality defined by the user through the OpenStack service chain API, and we are currently working on the ODL path as well. Basically, any vendor-specific controller can also plug in. This slide shows, at a very high level, how to integrate networking-sfc with OVN. On the right side is the architecture diagram. The left side of the diagram shows how the Neutron API, through the ML2 plug-in, maps to the networking-ovn driver, which maps the API data structures to the OVN northbound DB and then creates the flows in the OVN southbound DB. Service function chaining follows a similar architecture: you have the OpenStack Neutron service chain API, the service function chain OVN driver maps those API constructs to service function chain constructs in the OVN northbound DB, and then the logical flows are programmed into the southbound OVN DB. On the left side, it shows the mapping.
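The pluggable-backend idea described here can be sketched as a small driver interface that each backend (OVS agent, OVN, ONOS, ODL) implements. This is a minimal illustration, assuming simplified method names; the real networking-sfc driver API passes context objects rather than plain values.

```python
from abc import ABC, abstractmethod

class SfcDriver(ABC):
    """Illustrative backend driver interface for the service chain plugin."""

    @abstractmethod
    def create_port_chain(self, chain): ...

    @abstractmethod
    def delete_port_chain(self, chain): ...

    @abstractmethod
    def create_flow_classifier(self, fc): ...

class OvnSfcDriver(SfcDriver):
    """Maps API constructs into the OVN northbound DB (stubbed as a list here)."""

    def __init__(self):
        self.northbound = []  # stand-in for the OVN northbound database

    def create_port_chain(self, chain):
        self.northbound.append(("Port_Chain", chain))

    def delete_port_chain(self, chain):
        self.northbound.remove(("Port_Chain", chain))

    def create_flow_classifier(self, fc):
        # flow classifiers map onto OVN ACLs, per the mapping in the talk
        self.northbound.append(("ACL", fc))

driver = OvnSfcDriver()
driver.create_port_chain("pc1")
driver.create_flow_classifier("fc1")
```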
So for the core Neutron resources, the mapping is: a Neutron network maps to an OVN logical switch, a Neutron port maps to an OVN logical port, a Neutron router maps to an OVN logical router, and a Neutron security group maps to an OVN ACL. Similarly, for service function chaining, at the high level, as I showed in the previous diagram, the logical port chain maps to an OVN port chain, the logical port pair group maps to an OVN port pair group, the logical port pair maps to an OVN port pair, and the flow classifier maps to an OVN ACL. Then OVN northd maps these OVN constructs in the OVN northbound DB to the flow rules in the southbound DB. Now, a quick update on the project status since the Austin Summit. We have had two official releases, the Liberty release and the Mitaka release, and you can go to the package link to download a release and try it out. For the Mitaka release we added new features. We added a chain ID to the port chain parameters; this lets an upper-layer orchestrator coordinate operations between the VNF manager and the service function chain manager. We also added a weight to the service function parameters for smarter load distribution, and port pair group parameters, such as whether the port pair group's service function is a layer 2 or a layer 3 service function. We have also added the ability to dynamically insert or remove a service function to or from a chain in real time. And of course we added other things to satisfy the new stadium requirements, such as the API reference, admin and user guides, and tempest and functional tests. For the Newton release, which we plan to do in December this year, we are going to add symmetric chain support, and exact-match-based flow rule creation and deletion.
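The Neutron-to-OVN mapping just listed can be written down directly as a lookup table. The core Neutron names correspond to real OVN northbound tables; the SFC table names are illustrative, since those tables were added in the prototype schema rather than upstream OVN.

```python
# Core Neutron resources map onto existing OVN northbound constructs.
NEUTRON_TO_OVN = {
    "network":        "Logical_Switch",
    "port":           "Logical_Switch_Port",
    "router":         "Logical_Router",
    "security_group": "ACL",
}

# SFC resources map onto tables added in the prototype northbound schema
# (names illustrative); the flow classifier reuses the ACL construct.
SFC_TO_OVN = {
    "port_chain":      "Port_Chain",
    "port_pair_group": "Port_Pair_Group",
    "port_pair":       "Port_Pair",
    "flow_classifier": "ACL",
}

def ovn_construct(resource: str) -> str:
    """Return the OVN northbound construct for a Neutron or SFC resource."""
    return {**NEUTRON_TO_OVN, **SFC_TO_OVN}[resource]
```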
And of course, to satisfy the new stadium requirements, we will move the API to the neutron-lib repo, so that the Neutron API and the stadium project APIs are all in the same repository. And then a few other pieces of work. So that's it for the chaining update. So I want to talk about how to scale service functions, and I want to do it from a security perspective to give us some real-world use cases. I'm going to take it from a simple service function to a much more complex model where we're actually deploying potentially hundreds of VMs and hundreds of service functions. How does that scale operationally? Because we can talk a lot about just bringing things up and down, but as an operational exercise, how does security keep up with your workloads so you're actually secure? So first of all, let's define the problem. Today, a lot of security breaches are east-west. This is where attackers come into your network and then move laterally through your organization to figure out how to exfiltrate data. This is done by criminal enterprises, and I use the word enterprise very specifically, because they are doing it with a profit motive. These are not just script kiddies; they're actually doing it to make money, illegally, but to make money. They want to steal credit cards; they want to steal industrial secrets. So they can take their time. They pay their staff and they take risks, and they get a reward at the end of it, hopefully for them and hopefully not for us. The first thing they do is gather intelligence; just think of it as marketing (no offense to anyone from marketing here) to figure out how to penetrate your network. Today, your network is probably only as secure as your most gullible employee. Can you convince somebody to stick a USB drive in, click on a web link, click on an email? At that point, they can enter your network and leverage an exploit. We have sensors, and we work with other security companies.
We collect data from a lot of our customers and we crunch it to see what's going on in their networks and the larger internet. We see about 30,000 new exploits a day. These are not all brand-new exploits; a lot are variations on a theme. But security is a big business, and the attack surface is huge. So in this whole phase where people are coming in and trying to achieve unauthorized access to your network, they attack your most vulnerable points. They don't have to get to your critical data yet; they want to move laterally through your network. Once they're in, they can establish a command-and-control center. Stuxnet, the virus that attacked the Iranian centrifuges, was really sophisticated. It actually downloaded new DLLs depending on where it was and what it was attacking, so it dynamically morphed itself to attack different parts of the network. Once the attacker has command and control, they can start pulling your data out. This may take three or four months. In a lot of the big retailer attacks we saw in the last two to three years, the attackers were in the network for that length of time. So this is scary, but it's also an opportunity, because each of these points is an opportunity to block the attack. You can insert security functions into your network to block and detect the attack. This is where service function chaining starts becoming really interesting, because you can now dynamically insert functions into your network to protect yourself and stop attacks at any point in this chain. That's the first part. The second part is a little more complex, because now you have two organizations: the security guys, who protect everything, and DevOps, who deploy quickly and fast. The CSO says, I don't want to go to the board and tell them we have a breach. The CIO wants to go to the board and say, we're using the cloud and we're deploying new apps in 15 minutes. The CSO is going, nope. So how do you solve this dichotomy?
Well, with service function chaining, we can actually insert functions dynamically. I'll talk a little more about this as we go through the demos and walk through how this actually happens. So let me step back a little and explain how we implemented this, taking the work Cathy and her team did and building on it. We integrated this with OVN. We chose OVN because OVN scales; there have been a couple of talks this week about OVN scaling, and hopefully you've seen some of them. It's also very clean in terms of its logical description of the network. For service functions, we don't really care where they live; we want to know what parts of your network they can protect, we want to be able to move them around, and that whole dynamism is really important. If any of you were in Austin, you saw the talk from Intel where they deployed 10,000 VMs and 2,000 containers in two minutes. So think of the DevOps guys doing that, then going to the security guys and saying, oh, by the way, I've deployed 12,000 workloads and I need them secured in two more minutes. So what have we done? This work was actually really easy, because Cathy's team built this driver model for networking-sfc. We can leverage the SFC APIs, which are common across all plugins. We built a single interface into the plugin model, which was essentially a few hundred lines of code to implement. We built a flow classifier plugin too, because the model allows us to have different flow classifiers for different environments, which is really important. Then we talk to networking-ovn. Neutron is the source of truth, so given a port pair or a port chain, I can go and check that the ports actually exist and that the logical switches actually exist. I can do validation, because the Neutron database is the source of truth. Networking-ovn provides the API link into OVN.
So we can leverage that to talk directly to the OVN northbound database. Those two pieces were really easy, and great credit to Cathy and her team and the networking-ovn team. As for OVN, we picked OVN because we wanted a bit of abstraction from OVS. I personally really don't want to go in and program and mess with OVS directly, because it's hard; OVN makes it easy. So luckily, with OVN we only had to make a few changes. First, in the northbound schema, we added four more tables: a port chain table, a port pair group table, a port pair table, and a flow classifier table, very much like the API model Cathy described. So we reflected that straight into OVN. Then, in OVN northd, we added a new stage in the ingress pipeline for port chains. I'll talk about the rules we added for that: we added two ingress rules and two egress rules. Any flows that come in, we can chain them and direct them into a port pair or a chain of port pairs. This is all very simple, and at the end of this presentation there are links to the GitHub repositories for these changes. So, the northd extensions. If you look at northd, there's a set of tables for the logical switch, going from port security to ACLs to DHCP, et cetera. What we did was add a new table. This is work in progress; it's actually at table 13 today, but we're moving it up to be close to the ACLs, with some rules that say, if this traffic comes in, chain these things together. The reason we're moving it towards the ACLs is that the first time we did it, we put the flow classification in the port chain table. Now we're moving it into the ACL table, which gives us some real power, because then we can leverage all the ACL capabilities of OVS and OVN. If they add more ACL features, we take advantage of them.
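The ingress and egress chaining rules just mentioned can be simulated with a toy priority-matcher: the highest-priority matching rule wins, traffic already through the VNF is delivered to the app, and a lower-priority catch-all steers fresh traffic into the chain. The priorities (150 and 100) follow the slide in the next section; the match fields and port names are illustrative.

```python
def make_ingress_rules(app_port: str, vnf_egress: str, chain_entry: str):
    """Build the two ingress rules for one protected app port."""
    return [
        # Highest priority: traffic that already passed through the VNF
        # is delivered straight to the app.
        {"priority": 150,
         "match": lambda pkt: pkt["in_port"] == vnf_egress and pkt["dst"] == app_port,
         "action": ("output", app_port)},
        # Catch-all: anything else destined for the app enters the chain.
        {"priority": 100,
         "match": lambda pkt: pkt["dst"] == app_port,
         "action": ("output", chain_entry)},
    ]

def apply_rules(rules, pkt):
    """Process rules highest priority first; unmatched packets are dropped."""
    for rule in sorted(rules, key=lambda r: -r["priority"]):
        if rule["match"](pkt):
            return rule["action"]
    return ("drop",)

rules = make_ingress_rules("app1", "vnf-out", "vnf-in")
# Fresh traffic from app2 toward app1 is steered into the VNF chain first...
assert apply_rules(rules, {"in_port": "app2", "dst": "app1"}) == ("output", "vnf-in")
# ...and traffic emerging from the VNF is delivered to app1.
assert apply_rules(rules, {"in_port": "vnf-out", "dst": "app1"}) == ("output", "app1")
```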
And it goes back to the architecture Cathy was talking about earlier, in that we have a flow classifier plug-in in the SFC model that matches OVN, and then we're home free. So this translates into this complicated diagram. The red is the ingress rules, and the blueish color is the egress rules. If you look at the first rule, the 150, they're prioritized, so the higher priority gets processed first. On the ingress path, when a packet is going from app two to app one, the first rule says: if it's coming from the VNF and it's going to app one, deliver it to app one. That's the highest priority. The lowest priority is the catch-all, saying: if it's coming in and going to app one, stick it in the port chain. If you have more VNFs in the chain, it basically just chains the things together. In the other direction, if it comes out of app one going to a destination, the 100 rule applies: if it's coming out of this app, from the app port, stick it into the VNF chain, and so on all the way through. At the end, it says deliver it to the final destination. So, very simple. Moving on, let's do some demos. The demos are recorded; I didn't want to play chicken with the Wi-Fi back to Santa Clara. We'll do two demos and build on them. The first one is very simple, just showing inserting a VNF into a port chain and going through the model. Okay, so we have two applications we want to secure. You see up there a port pair, which we can configure using the port chain API, and then we're going to manage that VNF through a VNF manager, off the network. Looking at the chain table, there's nothing in it; it's an empty table. Now I run my orchestrator, which is a Python script, and it does add port chain, port pair group, flow classifier. If I look at the table now, I see I've inserted the rules into the southbound database. So let's go and try to ping between the two applications.
Oh, nothing's happening. Ah, that's because the VNF has a block rule. That shows the VNF is actually working; it's doing its job, providing some security between these two applications. So you're controlling flows between two applications on the same network. Now ping is working, just by changing the rule from deny to allow. You can see here that these are the session flows, and you'll see it refresh as traffic flows. And this all goes through the VNF manager. The other thing you want to do is, well, my mother always told me to clean up after myself. So we should clean up here and run the reverse script, and pull out all the port chains, port pair groups, and flow classifiers. Because if I want to do this at scale, I want to insert and remove service functions rapidly and dynamically; I don't want a rule explosion. So I cleaned up after myself. That was a really simple demo, just to set the stage. We showed traffic being steered through a VNF using SFC, with the standard APIs. This is coming in through standard OpenStack Neutron APIs, pushing all the way down through networking-sfc, through networking-ovn, through OVN. So you're using standard OpenStack APIs; you don't actually see any of the underlying plumbing. We added and removed rules, and you'll notice we didn't have to specify the location of the VNF, where it was in the network, or which compute node it was on. OVN takes care of that: you create a logical model, you plumb it in, and the southbound database and the controller connect the ports together. The VNF here is working as a bump in the wire. This is important for scaling, because there's no networking configuration in the VNF. That means the VNF can move around, and you don't have to change VLANs, routing tables, et cetera.
So you can scale, and we'll talk a little more about this when we get to load balancing. You can imagine load balancing across a bunch of VNFs; if they don't have any networking configuration, it's really easy, you can just spray packets across the interfaces. And just to reinforce: I know we're using our own VNF here, but we made no changes to the VNF to support this. One of our key design goals was to not require custom VNF code to support service function chaining. We want to be able to deploy things off the shelf, and we want to encourage other vendors to do the same, because we think there's an advantage in having a common infrastructure down at the plumbing level that just works for everybody. So let's get a little more complex, going back to the security and DevOps guys. What you saw from that demo is that I can go in and configure security manually. But going back to that Intel demo, where I have 12,000 apps coming up in two minutes: going into a VNF manager and trying to configure 12,000 apps in less than two minutes is not possible. So how are we going to solve this? What we want to do is move the security piece into the configuration and design part of the application, and then let DevOps just orchestrate application deployment. Then they can deploy 12,000 apps, with security, without having to involve the security team. The way we've defined it is that we want to create policy tags, a cluster of tags. They describe the application (compute, database, web), where it is (staging or production), maybe geography (Europe versus US), depending on the criteria in your organization for constructing a security policy. And you construct those policies independently of the instances deployed. The security team says: this is the policy I want for my organization; this is what is approved in my organization. Those tags are the lingua franca between the security team and the DevOps team.
You can see in this orchestration the two red boxes, the two things I've added since the previous demo. The previous demo just went through creating the application, creating the port pairs, and adding the port chains through SFC. Now we've added two things: tagging, and taking the tags and instance information and pushing them to the VNF manager. So this is orchestrated; you really need an orchestrator here. If you go down to the show floor, there's a demo in the Intel booth of the Intel Open Security Controller; we're actually implementing some of this in a real controller environment. This demo is all just Python scripts I've written, but it's all using the Neutron APIs, which is really important. So let's look at those two things. First, tags are metadata. This is just using Nova tagging. I've gone and tagged all the VMs I'm using with simple tags: compute-production, web-production, web-staging, compute-staging. You can imagine a much more complex taxonomy, but let's keep it simple for now. The other thing is an API into the VNF manager that takes the instance of the app I've just created and the tags attached to it, and pushes them to the VNF manager. This is all done at runtime; this is the part of the orchestration you can automate. So at this point you can conceivably deploy 12,000 applications with security without having to get the security team involved. This allows you to do real-world deployments of VNFs using service function chaining and actually scale, and by scale I mean scale operationally, so your teams are actually productive. So, another demo, with a slightly more complex network this time. I could have made it a lot more complex, but the screen resolution would have been a problem. This is already set up. What we're going to do now is multiple networks and multiple VNFs, with different policies.
If I look at the tables now, I see a whole bunch of different rules; I have different rules for different networks now. There are two logical switches here, and each one, my staging and production networks, has its own set of rules. Once I have that, here's the metadata. This is the same metadata that's in Nova; it's now in the VNF manager. This is where the bridge happens between the two worlds. All that's really required is to agree on the metadata tags: the security team and the DevOps team agree that this is how we want to talk to each other. Then I can create complex policies, and the policies are driven by metadata, with AND and OR rules. I can have rules that include application types, so I can narrow down what's allowed in various environments. When we push down the changes, you'll see IP addresses start to appear in the VNF. As the instances get created, I push the IP addresses with the tags down to the VNF, so the whole process is automated. Going back to the API again: you see these are the IP addresses of those network elements. This is all done without the security guys being involved; this is all a DevOps script or a DevOps-orchestrated environment. And notice the deny there: if DevOps doesn't put any tags on, the default is deny. So the security team has this blanket protection. Now we're running iperf between the two nodes, and that works, because the rule I created says that's all those two things can do: the web tier can talk to the compute tier over iperf, and nothing else. You can see the session running there, and you'll see that they can't ping. In production I want to turn off ping, because, going back to that earlier point about attackers coming in and looking at your network, they can do ping sweeps and go, oh, here are nodes we want to attack. So you want to lock down your environment, and this is just one way of doing it.
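The tag-driven policy described above can be sketched as a small evaluator: each rule matches on the source and destination instances' tags plus the application protocol, and anything unmatched, including untagged instances, falls through to deny. A minimal sketch, with illustrative tag and protocol names.

```python
def allowed(policy, src_tags, dst_tags, app):
    """Return True if some policy rule permits this src->dst application flow."""
    for rule in policy:
        if (rule["src"].issubset(src_tags)
                and rule["dst"].issubset(dst_tags)
                and rule["app"] == app):
            return True
    return False  # default deny: untagged or unmatched traffic is blocked

policy = [
    # web-production may talk to compute-production, but only over iperf
    {"src": {"web", "production"}, "dst": {"compute", "production"}, "app": "iperf"},
]

# iperf between the tagged tiers is allowed; ping is not; untagged is denied.
assert allowed(policy, {"web", "production"}, {"compute", "production"}, "iperf")
assert not allowed(policy, {"web", "production"}, {"compute", "production"}, "ping")
assert not allowed(policy, set(), {"compute", "production"}, "iperf")
```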
The rules you want are completely independent of the system; you can make your own rules. So hopefully that shows that we've taken a simple example and made it a lot more complex. Even though that network is really small, hopefully you can imagine that, building on the previous demo, we can now take service function chaining, which before we could deploy everywhere but only with all the same rules, which is kind of a challenge, and instead deploy VNFs with service function chaining and apply policy with them. And the policy is really defined through OpenStack metadata. We used Nova tags; I could also have used Neutron tags, so you can have as complex a tagging scheme as you want. And we added a notification down to the VNF manager. So there's the potential to scale this to thousands of apps. Hopefully we've shown that you can take this, give your DevOps team a set of scripts or an orchestrator, and deploy VNFs everywhere. It does require SDN orchestration; that's a key missing piece here, so as I said, we'll be working with various people, and you all have your own orchestrators, so it's something to think about. So, to wrap up, we still have some ongoing work, and there are probably a bunch of people we're working with in here. We've talked about the ACLs and flow classifiers in OVN; we're working on that piece, and, I don't know if Flavio's here, but hopefully in a couple of weeks we'll have that done. We want to integrate load balancing, because OVN now has load balancing in it, which is a really powerful concept. You can imagine OVN doing the load balancing for us across port pairs: all you need to do is put the port pairs into OVN, and OVN takes care of load balancing. And remember, it's a logical layer, not a physical layer, which is really important, because all this is done using OVN rather than having to modify OVS. Additional networking is really a question for the audience.
I mean, we're very much focused on bump-in-the-wire; are there use cases for L2 or L3 VNFs? At scale that becomes very complex, so we're here to listen. Container integration: we have this working with Docker, so we can show service chaining with Docker containers, and we can probably do others. And the next step is to start pushing this code upstream. We're working with the OVS and OVN teams and Cathy's team, and hopefully by the next OpenStack Summit we'll have something more to report. So, conclusions. I think we've really demonstrated how service function chaining enables deployment of VNFs at scale. Without service function chaining, and making it a standard part of the infrastructure, this whole thing wouldn't work. If we had to do this uniquely for every single SDN controller or every single approach, it would get really complex; complex for you and complex for DevOps. So scale is really enabled, and enhanced, by having standard service function chaining. The metadata-driven policy then allows us to scale numerically and operationally. The ability to move rules in and out of OVN and have them pushed into OVS lets people deploy and undeploy workloads seamlessly. There's no requirement for networking configuration in the VNFs. Doing this bump-in-the-wire really helps scalability, because I don't have to go mess with the VNF and make L2 or L3 changes to set a next hop, or maintain a bunch of static routing tables, or even worse, SNAT and DNAT, which I personally hate. And even though we're from Palo Alto Networks, we really stressed not making any changes in the VNF to support this, so this is open to any VNF. Flavio wrote a really simple test VNF, just a few hundred lines of code, that took packets in and sent packets out. It shows that a simple, non-commercial VNF can work equally well, and we can use that for testing.
So hopefully we're showing that we're solving real-world business problems. This is something that we and our customers face; it's not an academic exercise. People want and need to solve this problem, and using networking-ovn and networking-sfc, I think we've solved it. So, thank you; any questions? This slide is a bunch of pointers to more information on SFC and OVN, some blogs, and the repos for the code, which you can download and give us feedback on. Any questions? There's a microphone. Actually, I was just thinking about the use cases. There are a lot of brownfields out there with all these appliances that even you guys used to make, right? Yeah. And I assume there's a use case where you actually want to utilize all that metal that you have, and I assume people will ask for it. Yeah, whether it's a hardware box or a software box, as Cathy mentioned earlier, service function chaining doesn't really care; it's really just a compute resource there. The whole idea of having software just means you can move things around more easily than with hardware. But if you're actually leaving the virtual world, it can be tough to do without L2 or L3. Yeah, it's the same issue. Anybody else? Not really a question, but a comment you made me think of, with the number of VMs and containers and so forth we're deploying. You often see on hand soap, kills 99.9% of germs, which is great, until you apply it to something with 10 billion germs on it; terrific, there are only 10 million germs left. Same thing here. We deploy tens of thousands of VMs, and even if 99.9% of them were developed by premier developers and 99.9% of those 10,000 VMs are secure, you've still got 10 vulnerable VMs. That's good, right? But we need the...
With service function chaining, even for the ones that are missed, we can put additional controls on top of them, so even if the people responsible for them slip up once in a while, just one out of a thousand, at least it gives us an extra layer of protection. I think I need to rephrase my earlier statement: it's not only your most gullible employees, but your most careless developers, which I think is what you're referring to. Any other questions? No? Thank you all. Thank you.