Let's get started. Hi, hello, welcome to the "Fully Open Source Smart OpenStack Cloud, Now and Beyond" session. My name is Ash Bhalgat. I'm Senior Director for Cloud Marketing at Mellanox Technologies. What I do for Mellanox is develop our cloud market and work with the partner ecosystem, the open source ecosystem, and solution marketing to bring exciting solutions to the cloud market. Telco happens to be one of those markets as well, because telco is becoming the telco cloud now. I also want to introduce my colleagues here: Franck Baudin, who is an NFV product manager — a lot of people might know him — he's from Red Hat; and Mark Iskra, who is a technical marketing expert from Nuage Networks by Nokia.

So we have an exciting topic today, and we're going to go through a lot of material. Let me introduce the agenda a little bit. What we all know is that telco networks have gone through a major shift, a transformation, changing from the traditional networks built for service providers to cloud-based networks. Today's talk is about how that evolution happened, what the components of that evolution are, how it all comes together in open source, what has happened already, and what is going to happen next. We're going to break this session into two parts. One is what's happening now with the telco cloud, and why there is a need for offloads — and by offloads we mean hardware offloads. What are SmartNICs? There's a lot of buzz around SmartNICs. So what are SmartNICs, and how do the offloads work on them? What are the open source elements that are enabling these offloads? Because we want to make sure we have networks that are agile, programmable, and flexible, open source is very important, and we are going to share a lot of information on that. Franck is going to touch on all of that. And then, how does this all get deployed? There's a live demo, but we have recorded it in the interest of time, and Mark is going to walk through that demo. That's basically the recap of what's happening now in the telco cloud. Then we're going to go into the "beyond," where we'll talk about the next set of things that are needed in order to make all of this deployable.

So without any further ado, let's get into the agenda. Let me take a moment to talk about what's happening in the telco cloud. As we know, since about 1980, almost 40 years back, networks have been built in a monolithic manner. What monolithic means is that we have had proprietary, purpose-built appliances for anything and everything, from layer 1 to layer 7. Anything that's needed — a router, a switch, a load balancer, a firewall, a CDN, whatever you need — there was an appliance, and a vendor building it. That's how networks have been built for years. That's called the hardware-defined-everything world: hardware is king, hardware is how things are defined, and everything is put into the hardware. There's no virtualization, no disaggregation of any kind, no flexibility. It's a very monolithic model. What we realized around 2007 is that with that model it is extremely difficult to be agile, to deploy new features faster, to recognize revenue faster. And so around 2007 came something called the software-defined-everything world.
What that meant is: let's move all of the intelligence from the hardware into software and commoditize the hardware. So we can use standard high-volume servers, standard high-volume storage nodes, and commodity switches. That's all great. Then put all of the brains — the programming — into software: software-defined networking, network functions virtualization, all of those new ideas and concepts came up. That's the software-defined-everything world. However, if you look at it, we started on commoditized servers with a bare-metal kind of infrastructure, with a lot of cores available. But as we add virtualization, software-defined security, software-defined storage, software-defined networking — everything software-defined and virtualized — a lot of those cores get taken up. The green cores are what is left available, and the other cores are all being chewed up for packet processing, for protocols, for overlay networking, all sorts of things. Essentially what happens is that you have very few cores left for running your applications, your VNFs, your containerized workloads. And that's not really scalable. What you achieved in going from bare metal to the software-defined world, which is in the middle, is that you sacrificed performance and efficiency, and you basically defeated the purpose of going software-defined to begin with.

So how do you gain back this efficiency? With something called SmartNICs, on the right-hand side. SmartNICs look like regular NICs, but they have smart offload capabilities. What that means is that all of this packet processing for networking, for overlays, for storage, for virtualization is pushed into purpose-built silicon — ASICs, FPGAs, CPUs, even GPUs — that can do this more efficiently than a general-purpose processor. So you have a SmartNIC at the bottom doing all the heavy lifting, and at the top you have the cores all freed up. That's really what the SmartNIC revolution is in a nutshell.

So let's recap what happened. This is a pendulum. We started in a hardware-defined-everything world, which is pretty much a bare-metal kind of scenario: no virtualization, purpose-built everything. We went to the software-defined world, where everything becomes extremely flexible, but you sacrifice the performance and efficiency of the infrastructure. Then the light bulb went on, and we said, OK, we need something in the middle — let's find an equilibrium in the pendulum — a hardware-accelerated, software-defined world. What happens in that story is that you have the SmartNICs, the blue components, doing the heavy lifting for packet processing, for networking, for storage, for virtualization, helping you get back the efficiency of your software-defined world. So you get the best of both sides: a flexible infrastructure, but also high performance and high efficiency. One of the key components here is Open vSwitch. As you know, Open vSwitch is a very popular virtual switch in the OpenStack world. It is part of the hypervisor.
And I think this is an example of how virtualization, disaggregation, and programmability can really impact performance: although OVS is really massively deployed, it has issues, and we all know that. In terms of packet performance, you get less than 1 million packets per second even if you dedicate two to four cores. It consumes a lot of CPU. Even with 12 cores, you can't get more than about 30 gig of throughput — that's a third of a 100 gig link, if that's what you're using; you cannot even realize line rate. And you have poor latency, and because of the latency issues, you have a poor user experience. So you really cannot rely on a virtual switch that is dropping packets, and yet this is a key component of a virtualized infrastructure. That's what we are trying to address: how do we accelerate this whole thing? So today's talk is about how we accelerate a key component like OVS, Open vSwitch — it could also apply to other virtual switches and virtual routers, but let's take OVS, because that's what the OpenStack community is using really heavily.

So how do we do all this? I'm assuming folks are familiar with OVS, so I'm not going to go into all the details of what OVS does, how it works, how it programs the flows, how it manages and classifies the flows, et cetera. But essentially, what we are doing is offloading the flows from OVS into the embedded switch — the eSwitch — of the NIC. That is the way to achieve a hardware-accelerated, SDN-controlled, faster data plane. So in an OVS offload world, we keep Open vSwitch as a standard SDN-controlled virtual switch, and we offload the OVS data plane to the NIC's embedded eSwitch, using SR-IOV for the data path. So the data path is really fast, but it's also programmable, because you're offloading the OVS rules into the eSwitch of the NIC. That's really the beauty of it.

If you look at it from a packet flow standpoint: normally, without the hardware offload, every first packet comes in on the slow path — the orange one — and goes all the way up to ovs-vswitchd when there is no rule available. You program the rule from ovs-vswitchd into the kernel OVS module, and then you're always on the slow path, switching in the kernel. But with the hardware offload of the OVS rules, the rules that are in the OVS kernel module also get programmed into the eSwitch, all the way at the bottom, as the diagram on the left shows. The flow offload happens there, and then all the subsequent packets are hardware switched. So it's really a fast path, an express path — the green route. You don't have to go to the kernel or to user space anymore. All the packets are quickly classified, the match and action happens, and you're switching the packets right away. Here is a quick example of how the pipeline works — an OpenFlow-style example. You can see how you take the flows, classify them based on classification criteria — you might match on 5-tuples, for example — and then put actions on them. This is something that's programmable in the eSwitch today; a rough sketch of what such a rule looks like follows below.
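To make that concrete, here is a minimal, hedged sketch of a TC flower rule of the kind that gets offloaded to the eSwitch — a 5-tuple match plus a simple action. The device name and addresses are purely illustrative, not taken from the demo setup:

    # Match a 5-tuple (protocol, source/destination IP, destination port)
    # on a representor's ingress and drop it in hardware
    # (skip_sw asks for a hardware-only rule).
    tc filter add dev eth2 ingress protocol ip prio 1 flower \
        ip_proto tcp src_ip 10.0.0.1 dst_ip 10.0.0.2 dst_port 80 \
        skip_sw action drop

In the offloaded OVS case you would not type rules like this by hand — OVS programs them through the same TC interface — but it shows the match-plus-action model the eSwitch implements.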
So the eSwitch lets you classify on 5-tuples, for example, and then take actions: drop the packets, forward the packets, rewrite headers in the packet. You can do mirroring, overlay encap and decap, telemetry, incrementing and decrementing counters — all kinds of things are possible. And that's really important for applications such as NAT, load balancing, ACLs, firewalls, DDoS protection — all of that can take advantage of it. It's all offloaded into the eSwitch; it doesn't have to go on the software path, it can be offloaded onto the hardware path.

One of the questions that comes up is: what about DPDK? We all know what DPDK is — the Data Plane Development Kit — and it is a technology that does help improve performance. However, it's a software acceleration technology. What that means is that it consumes your CPU for the PMDs: you lock up cores, and you are still using the CPU to do the work, polling the drivers at certain intervals. So it gives you a performance improvement, though not as much as you can get with a hardware offload, and you still consume CPUs. Mellanox has really great performance for DPDK — using DPDK in the application, in your VNFs, in your containerized workloads, is great. However, using DPDK for OVS itself is not the best solution; you should really go to the hardware offload solution. The Mellanox brand name for the OVS hardware offload is ASAP² — Accelerated Switching and Packet Processing. Essentially, it's eight to ten times better in performance than OVS-DPDK, and all of that with zero CPU utilization. So the cores we talked about before get freed up when we have the hardware offload; we don't use any CPUs, unlike with DPDK.

Here's a quick comparison of OVS-DPDK performance versus OVS offload performance. With about two cores dedicated to running OVS over DPDK, you get about 7.6 million packets per second. And you can use essentially zero CPU while doing 66 million packets per second with the hardware offload. So that's almost an order of magnitude — a little less; 8 to 10x is what I said, and your mileage will vary — but it's a huge difference in performance. And this is for small packets — we're talking 64-byte packets — which is very good performance for networks that need high bandwidth. And none of this is done in isolation. Although I've talked about Mellanox here and there, this architecture for OVS hardware offload is an open architecture. We of course have an eSwitch on the NIC, but it's all enabled in open source and in the open community — we'll talk about that more in a second — so it's not a vendor-locked solution. That's what all of the telcos and cloud customers want, and this is exactly what we're doing. So with that, I want to hand off to Franck to talk about how this architecture comes together, what the different components are that we touch, and how we bring it all together as a solution.

Hi everyone. Yep, is it working? Yeah. So having a great technology is very important.
And then we need to have it integrated end to end, to have an end-to-end solution with open source components. What I'm going to show is that people who know SR-IOV, or people who know OVS, will feel at home, because from an OpenStack and system perspective, OVS hardware offload looks like a mix: you take the best of both worlds in order to get your OpenStack deployment for NFV. One thing that is really cool about this design is that because the first packet comes to OVS, OVS makes the choice to offload or not offload. And when the flow is offloaded, it is also in the kernel datapath. This is something I'm going to show: if a feature is not available — a feature that is in OVS but not in TC flower — you have a fallback in the kernel. So you are full-featured from day one. The features that are accelerated are accelerated; the features that are not accelerated go through the kernel. All of this has been available since OpenStack Queens — and we are post-Rocky now — so this is something that is already there. The principle of this architecture in OpenStack also works with other virtual switches.

The key elements in this picture are around OVS. OVS is programmed by Neutron via OpenDaylight. And OVS does not see the virtual functions that are exposed to the VM. In the VM, what you see is a VF driver, like SR-IOV. What OVS sees is a representor port for each and every VF — a virtual device, a netdev in the kernel. So from an OVS perspective, it looks like OVS; from the VM perspective, it looks like SR-IOV; and you have all the OVS features.

To get there in Queens, development started years before, because the kernel had to be extended with the driver plus TC flower. Then OVS had to be extended in order to support TC flower. QEMU, this time, was not touched a lot. Then libvirt: you have a new kind of interface, so you need to add this new kind of interface in libvirt. Once you have key elements that work well, you need to integrate them into a distribution — let's say CentOS. Once you have a Linux distribution, you need the proper SDN that knows about OVS, but it also needs to know the little details about OVS hardware offload (not OVS-DPDK — OVS hardware offload). Then you need to modify OpenStack, mainly Neutron and Nova; I'm going to show that on the next slide. Once you have OpenStack, you need to install it, so you have to modify your OpenStack installer — here that is TripleO, the OpenStack installer. And once you have all of these components, you need to have them in a distribution, and once you have a distribution, you need to test it end to end, and you can do that with OPNFV. Just to say that this is the accomplishment of outstanding upstream work across plenty of communities. This is not just OVS, not just the kernel, not just OpenStack — it's all of these communities together that provide this feature. Just for OpenStack: Nova and Neutron — those are the links — and also others: networking-odl, puppet-neutron for the installation, everything that is required to implement this feature. And again, this is only the OpenStack part.

Now, let's talk about a typical NFV deployment on top of hardware offload. On the left, a quick DPDK 101. DPDK is the fastest way we know today, in software, to process packets. What it does is run an active loop at 100%: taking packets, processing packets, sending packets. With OVS hardware offload, you don't have DPDK anymore on the host.
But in the guest, the VNFs are implemented on top of DPDK. So you have active loops — here I've represented four active loops, so four CPUs running full steam, getting packets from a VF, processing them (carrier-grade NAT, whatever), and sending them back out another VF. So in order to deploy that properly, you want to make sure that you have a proper end-to-end deployment. I like to make a comparison between NFV and cuisine: in cuisine you have ingredients — those are the components. NFV is not a component; it is a recipe. You need to put everything together and configure it properly in order to have an end-to-end solution that stands, and that is the tricky part. In a typical NFV deployment, you dedicate the first core of each NUMA socket to the operating system and the host, to run the OpenStack services. Here is a real example of a TripleO deployment for this kind of CPU — these are real numbers. The yellow CPU is the CPU that will run OpenStack services, SSH, logs, and take care of interrupts for the disk. So you need one CPU to run OpenStack services, and all of the rest is for the VMs. And typically you will run a flavor with isolated CPUs, so that your vCPUs are not preempted.

So this is how you deploy it with TripleO — assuming that you all know TripleO in and out; there is a lot of good material. In TripleO, you have your first deployment file where you specify what components you want to deploy. It looks like SR-IOV: if you take an SR-IOV deployment, you just need to enable the OpenDaylight ML2 driver and add, after the VF parameter, switchdev — just two little things — and with that you're good to go. The compute.yaml describes how the OVS bridges are laid out on your compute node: I've got one bond on eth0, eth1, et cetera. That file: zero change; you take your OVS deployment layout as is. And in your deployment command, you just add: please enable OVS hardware offload. So when we talk about transparent offload, it really is almost transparent from a configuration perspective.

Now that you have deployed your OpenStack with TripleO and your SmartNIC, how do you boot a VM? You create a port exactly like you would create an SR-IOV port, but you need to add this extra little parameter: a binding profile with the switchdev capability. And you boot your VM, which is accelerated, transparently. Under the hood, if you SSH into the host, what you see is: the first two lines are the base PFs, the SR-IOV devices. Then you have two VFs created in this example, so two virtual functions. They look like regular interfaces, except that if you look at the eth2 capabilities, you have hw-tc-offload: on. This makes the whole difference. And if you look at one of the VFs, it looks like any other VF — again, people who are used to SR-IOV should feel at home. Once you have booted your VM, Nova has generated the libvirt XML, and this is exactly, exactly the SR-IOV XML — zero difference. In OVSDB, OVS does not see the SR-IOV devices, right? But OVSDB has, for each and every VF, what we call a representor interface, and the driver is mlx5 rep — rep for representor. So this is the trick. And now, assume that we have a flow which has been offloaded. How does it work? OVS chooses to offload the flow, so it pushes the flow to the kernel and also offloads it to the NIC; the two commands sketched below are how you check that.
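As a minimal, hedged sketch — the datapath and representor device names here are illustrative, not from the demo setup — the check looks roughly like this:

    # Dump the kernel datapath flows; with -m, offloaded flows are marked
    # (you should see something like "offloaded:yes, dp:tc" on the flow).
    ovs-dpctl dump-flows -m

    # Dump the TC flower rules that OVS pushed down; on an offload-capable
    # representor these are the rules that landed in the NIC's eSwitch.
    tc filter show dev eth2 ingress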
So the first dump is a regular ovs-dpctl dump-flows with the -m option, which is very important, and you see offloaded: yes, dp: tc. And then it's tc filter show to dump the TC flower rules — so this is one more framework people have to learn. You dump the flows that are offloaded, and you have a one-to-one mapping. If you want to know more, just look at the documentation from Queens onward; you have all of the details. And with that: demo time.

Thanks, Franck. So I have prepared two videos today that demonstrate the capabilities of the switchdev functionality. Let me just switch gears here and bring that up. Bear with me one second. I forgot — let me just set some context here. Let me introduce, first off, the Nuage Networks SDN product, called Nuage Networks Virtualized Cloud Services. This is a three-tiered SDN solution, very typical of many of the other SDN solutions available in the marketplace and in open source. We have OpenStack integration through an ML2 plugin. That ML2 plugin contains drivers that are both from open source and also written by Nuage, which provide the advanced functionality that Nuage delivers in addition to the open source capabilities. The control plane is actually part of the code that's used in our service routers, virtualized and enabled to run in a virtual machine. It can be instantiated in as many instances as you want, and you can use BGP to exchange routes between different controllers, so you have a very scalable architecture in this design. And then each controller talks to OVS instances. The OVS that we're going to use in today's demo also contains all the TC flower interfaces that have been upstreamed into OVS 2.9, 2.10, and probably even 2.11 at this point. Integrated, this provides the complete capability to run OpenStack with software-defined networking and take advantage of the extremely high performance you can achieve with the switchdev, or switch-on-NIC, capabilities. And I'd also like to call out the hard work that's been done by a variety of people in the community to enable all this to happen. What you're seeing today is really the culmination of a lot of hard work by companies like Mellanox, Red Hat, and so on, who have put all this together and made it easily consumable for anyone who wants to put together a working solution.

So I have two different videos. As I mentioned, the first one will show the OpenStack integration — we'll basically follow the life of a VM that's being set up to take advantage of the switch-on-NIC capability. The second one will take a look at remote mirroring, just to show you some of the richness of functionality being enabled in this context. So now, give me a second to change gears here. So this is the OpenStack integration. We're having a sync problem here. Yeah? My heart's beating again. All right, so this is the life of a VM controlled by OpenStack. I think this is open screen, isn't it? I mean, full screen? Better. OK. All right, so this moves fairly quickly, but it follows all the steps that Franck has just showed us. And it's done in the context of a real system, so you'll see real behaviors, not just stuff that might have been written out by hand. And you'll see it in a slightly different context: Franck was showing OpenDaylight; here we're using the Nuage SDN controller instead. But all these things exhibit the same behavior, and they're relatively simple and easy to do. So with that, let's roll it.
The first thing I do is run lspci to find the Mellanox NIC and identify both its vendor ID and its PCI bus, slot, and function. I'm looking here at both the PF and the VF. Now I'm looking — with malice aforethought — at enp130s0f0, which is the netdev for the first PF on that ConnectX-5 NIC, and verifying that it has the right bus, slot, and function. And again, here I just printed out the vendor and product IDs. We're going to edit nova.conf here to whitelist that particular set of VFs for use in switchdev mode — you can see nova.conf has been edited to contain the right vendor ID — and you have to restart the Nova services after that. Now we have to prepare each one of the compute nodes, and there are essentially five steps I go through when I deploy these (recapped in the sketch after this paragraph). First, you create the VFs. Then you unbind the Linux netdevs for each one of the VFs. Then you enable switchdev mode on the NIC itself — that's 82:00, if you remember, from the previous slide. Then we turn on the hardware offloading capabilities, and then restart Open vSwitch.

Now we're going to jump to OpenStack and actually create a network. I'm calling this the Berlin net, just to show you how this works. After we create this, we'll attach a vPort — a virtual port — that has the switchdev capabilities, and I'll take you through that. Here you can see the integration created by the ML2 plugin from Nuage; this is actually included with Red Hat's OSP 13. You have networks that are managed either by OpenStack or by the VSD, but either way, the ML2 plugin maintains consistency between the two environments. So with that, I'm going to go ahead and create a vPort that has the netdev — sorry, the switchdev — capabilities. Here you see it's a port create, and the capabilities are switchdev. This is exactly what Franck showed us earlier, and that binding is essential. I have a script here that does the actual function. I'll use the Berlin net that we just created in OpenStack, and I'll give the port the name FB1. And there we're successful — so far, so good.

Now let's launch an instance and attach that vPort to it. I'm going to call that VM 11. Oh, wait a minute, I lied. Anyway, we're going to go ahead and create a VM and attach that vPort. The important step is coming up here in just a moment, because you'll see where we're actually binding that particular vPort we created. This is just setting the flavor. We're going to skip attaching a subnet, because we're actually attaching a vPort. And there you see it: it's FB1. And then we'll launch it, and once it's launched, you'll see it show up — there it is, VM 11. Now let's take a look at that. There was an existing VM 22; that's a virtio VM. Both of these are running on the same hypervisor in this case, and I did that to underscore the fact that you can have both switchdev VMs and virtio VMs running in the same environment. What we need to do here is verify that the XML for the VM actually has the correct bus, slot, and function for the VF that was assigned to it — and sure enough, it's there. So now I'm going to log into each one of these VMs — one is virtio, the other is using switchdev — and generate some iperf traffic between them. Then we'll just take a look at the flows and you'll see how they actually show up, using the same dump-xml — sorry, dump-flows — command that Franck showed us. We'll just get the traffic started here. I've also taken a look at the IP addresses.
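For reference, here is a hedged shell recap of the steps narrated in this demo. The PF netdev enp130s0f0 and PCI address 82:00.0 come from the demo; the VF count, VF PCI addresses, network name, flavor, and image are illustrative assumptions:

    # 1. Create the VFs on the PF
    echo 2 > /sys/class/net/enp130s0f0/device/sriov_numvfs
    # 2. Unbind the VF netdevs so the eSwitch mode can be changed
    echo 0000:82:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
    echo 0000:82:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind
    # 3. Put the NIC's embedded switch into switchdev mode
    devlink dev eswitch set pci/0000:82:00.0 mode switchdev
    # 4. Turn on TC hardware offload on the PF and enable OVS hardware offload
    ethtool -K enp130s0f0 hw-tc-offload on
    ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
    # 5. Restart Open vSwitch so the setting takes effect
    systemctl restart openvswitch

    # Later in the demo: a port carrying the switchdev capability, then a VM on it
    openstack port create --network berlin-net --vnic-type direct \
        --binding-profile '{"capabilities": ["switchdev"]}' FB1
    openstack server create --flavor m1.small --image centos7 \
        --nic port-id=$(openstack port show FB1 -f value -c id) vm11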
The addresses I just looked at are virtual IP addresses on a virtual L3 network, so this is all being done through the orchestration of the Nuage SDN controller: the assignment of IP addresses, the ACLs, and the ability to communicate. All right, so now our traffic is flowing, and we do a dump-flows with the -m option. Here you only see the offloaded flows — the ones using hardware offload — and you can see there's only a single flow with the hardware offloading. That's because the responding flow from the other VM is not using the hardware acceleration. So with that, you see end-to-end functionality.

Now I wanted to bring you through the performance material in a hurry, so that we don't spend all of our time just talking about performance. This is a slide we actually shared at the OpenStack Summit in Vancouver. It shows you can achieve well over 50 million packets per second for small packets in a single direction using this capability — so that's over 100 million packets per second bidirectionally, which is enormously fast, and probably about an order of magnitude faster than what you might achieve with a DPDK-type deployment. Then we took a look at more and more functionality, because that's really where it's at: you need to harness all that capability. So here we do service chaining, and the flows I'm showing you are actually dumped from the NIC itself. This is a little different from using the ovs-dpctl command that Franck showed us, but the flows here are offloaded, and you know they're offloaded because they're actually coming out of the NIC itself. Then the next test that we did was to take a look at the Virtualized Service Router. That's a Nokia VNF which is actually the core of a lot of our other commercial VNF products — our mobile gateway, I believe, is based on it — and it's based on VxWorks. So I was a little apprehensive at first about whether or not this would actually work, because with host passthrough you have to have the right device driver to talk to the underlying device — you're actually attaching the PCI device directly to the VM. But sure enough, it all worked. We set it up to do simple forwarding, which is really an underutilization of a service router, but it's the most basic function you could get from that kind of environment, and fortunately it all works. So this is really good, and it looks very promising in terms of being able to put this into production-type work. There's still a huge amount of testing going on to validate all this and work out all the bugs and little issues that come up — it's a long path to get from a technology preview to something robust enough to deploy in real life.

So let me show you one other piece of functionality where a lot of work has been done to enable it, and which I think is quite interesting: remote mirroring. In this case, I'm showing remote mirroring to the underlay. See if we can get back to the start here, and roll it. OK, so here's a setup where we have two hypervisors plus a third system that's acting as the destination for an underlay mirror. On the left, you see the remote destination for the mirror. VM 11 is sending traffic to a remote VM, VM 22, and the ingress and egress of VM 11 will, based on a match criterion, also be forwarded to that remote destination on the underlay. So here are the two VMs — you can see that each has a vPort, VM 11 and VM 22. Now I create an ingress match criterion.
And this one is very liberal — it's basically matching everything. Normally you would only want to pick out a particular vPort, or maybe a class of vPorts, to forward, but here we don't need to worry about that. Then for the action, once we match, do an underlay mirror and pick out the destination, which I called lab one. So I'm going to bind that for the ingress and then apply that rule, which creates a new ACL. Then we do the same thing on the egress: we edit this rule, use the same match criteria, and select an action of mirroring, again to lab one. Now that that's all in place, we can generate some traffic and actually see what's going on. I've logged in here to VM 11 and VM 22. I've got IP addresses — these are virtual IP addresses connected through a VXLAN tunnel — and there's also mirroring happening through a GRE tunnel to the remote destination. Somebody has a slow timer. OK, so now we have traffic flowing. Let's take a look at the flows that are actually active; we'll just use the ovs-dpctl dump-flows command and search here particularly for our GRE tunnels. So you can see some very complex flows being created here. The bottom one is a decap: the VXLAN packet comes in, it's decapped, and then it's sent both to the remote destination through a GRE tunnel and up to the VM itself. The preceding one accepts packets from VM 11 and sends them out on a tunnel. So that's pretty much the end of our demo. And with that, I think I'll turn it over to Ash, who will talk about futures.

So we'll go through one more quick section here — we have four or five minutes to finish up the future features coming for the OVS hardware offload. Next slide — perfect. So this is all work in progress. Mark already showed the demo of the port mirroring, but there will be other features as well, so we'll just touch on what those features are. I cannot commit to the timelines yet, because there's an upstream process: Mellanox upstreams all of this into the communities — the Linux kernel, the OVS community, et cetera — where there's a review process, and then it lands in certain kernel or OVS versions, and then it gets integrated into all the distributions — the Linux distributions and the OpenStack distributions. So let's just talk about the features quickly. The demo that we just saw — I just want to recap what Mark showed. Essentially, we can do mirroring as local mirroring or as remote mirroring. So let's talk local mirroring first. In this case, there's traffic going on between VM A and VM E, and VF 0 — the virtual function attached to VM A — is the probed interface. The traffic from VF 0 is being mirrored and passed to VF 3, which is connected to VM D; that's the mirror port. This is local mirroring because both VM A and VM D are on the same host, but the traffic is being mirrored to another port. You can classify the traffic you mirror based on certain tuples, or you can just say: mirror everything coming into VF 0 — ingress or egress — and put it into VF 3. In TC terms, that kind of mirror action looks roughly like the hedged sketch below.
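As a minimal illustration only — the representor netdev names eth0_0 and eth0_3 standing in for VF 0 and VF 3 are assumptions, not names from the demo:

    # Mirror everything arriving on VF 0's representor to VF 3's representor,
    # while still forwarding the original packet (mirred "mirror" keeps the packet).
    tc filter add dev eth0_0 ingress protocol all prio 1 flower skip_sw \
        action mirred egress mirror dev eth0_3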
In remote mirroring, you're basically doing the same thing, except that the mirrored traffic is not going to a VM on the local host; it goes to a remote host. That could be over VXLAN — basically an overlay network, possibly across data centers — or with different tunneling, like GRE; Mark just talked about GRE tunneling. We have already implemented all of this; the proof of concept that you just saw is going through the upstream process. And same thing: you can mirror all of the traffic to the remote host, or you can mirror certain parts of your traffic based on how you classify the flows.

High availability is another important feature. As you all know, SR-IOV already supports VF LAG; however, for OVS hardware offload, we need to make sure that VF LAG is also supported. You could implement the bonding in the VMs — have two VFs bonded and let each VM decide how to use the bonded VFs — but that's not scalable: each VM would have to have that support in its software or driver. So instead, you bond the virtual functions in the NIC itself and create a LAG, and that LAG is exposed as a single VF interface to the VM, with all of the VF LAG management happening in the NIC. What we can support in that case is different modes: active-passive, where you have two VFs but only one active at a time, with a single port's bandwidth; active-active, where you get two ports' bandwidth; or LACP to auto-negotiate the modes. All of that is work in progress, going through the upstream review process right now.

QoS is another important feature: the ability to control the bandwidth at a VF level, again offloaded through the TC API — the TC flower API that Franck talked about. For bandwidth limiting, you have two options. One is a rate limit per VF, which caps the bandwidth at a maximum. The other is a bandwidth guarantee per VF, which allocates a minimum bandwidth per VF. Both the max and the min options are things we are pushing into the community. You can also do DSCP marking, where you force the DSCP per VF, and you can do that in two ways: either take the inner DSCP from the inner packet and copy it to the outer, overlay packet, or set the overlay packet's DSCP independently of the inner packet. Those two modes are also in the process of being implemented.

Connection tracking is another very important feature, for stateful tracking of flows and taking actions based on a specific connection state. Today we support stateless ACLs in the product — the eSwitch will do stateless ACLs — and we now also have support for connection tracking in the eSwitch; a small sketch of the kind of OVS conntrack rules involved is below.
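For context — and as a hedged sketch only, with the bridge name br-int being illustrative — connection tracking in OVS is expressed with ct() actions and ct_state matches; rules of roughly this shape are what a TC/eSwitch offload of conntrack has to handle:

    # Send untracked IP traffic through the connection tracker
    ovs-ofctl add-flow br-int "table=0, priority=10, ip, ct_state=-trk, actions=ct(table=1)"
    # Commit new connections, then forward normally
    ovs-ofctl add-flow br-int "table=1, priority=10, ip, ct_state=+trk+new, actions=ct(commit),NORMAL"
    # Established connections are forwarded directly
    ovs-ofctl add-flow br-int "table=1, priority=10, ip, ct_state=+trk+est, actions=NORMAL"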
However, that support has to be upstreamed as well, with all the features needed for TC to offload the connection tracking. The way it works in the current scenario is that all of the connection-tracking state management is done by OVS in the kernel, and it is pushed via TC flower into the eSwitch so that the state is maintained in the hardware. So we are not building the state in hardware, but we are maintaining it there — you can see the CT state block in the eSwitch that maintains it. Any packet that comes in once the state information is available will be fast-switched. That's something we have implemented already, and it's going through the reviews right now. ConnectX-5 is the product where we support state tracking based on 5-tuples; we can also track TCP flags — SYN and ACK — and we can age out flows based on the state information. All of that is supported today; again, it will all come into the mainstream upstream and then get integrated into the distributions.

We also have a product called BlueField, which is a SmartNIC. For us, a SmartNIC is not only the ASIC version but also a version that has an ARM processor next to the ASIC — that's the BlueField product — and it can do all of the connection tracking in software, because you can put the whole of OVS onto its ARM processor. So that's something we can do as well, and it's also work in progress: the TCP handshake and the entire connection-tracking state machine would be put onto the ARM cores in the SmartNIC called BlueField.

Last but not least, we talked about one of the key VNFs out there — Nokia's VSR, the Virtualized Service Router — but there are also other commercial VNFs that are starting to take advantage of the faster data path. F5 is another vendor we're working with; F5 has the BIG-IP portfolio of VNFs — firewalls, load balancers — and one of them is CGNAT, the carrier-grade NAT VNF. Using the hardware offload, we are giving a faster data path to this VNF and realizing about 70 gigabits per second of throughput on the F5 CGNAT VNF, all without consuming any host CPU for the packet processing. Of course you will still use DPDK inside the VNF to process those packets for the NAT function, but you don't have to burn any CPU in OVS to push packets to the CGNAT VNF. And this is just one of the VNFs in the F5 portfolio; the other VNFs can all take advantage of what we have today.

So with that, I want to wrap up and tell you that the smart cloud — with the SmartNICs and the open source work we are doing — is all coming together. There are customers who are looking at it, testing it right now, and then looking toward deployments. With that, I want to thank you all for your time and for listening to us. Thank you. Thank you.