Okay, can everybody hear us okay? Excellent, right, let's get started. Welcome to Thursday, the last day of the conference, and thank you for coming to our presentation about SmartNIC DPUs. We're going to talk a bit about the enablement we have done in OVN and OpenStack in the Yoga cycle. Let's start with some introductions. I'm Frode, I'm an engineer in the OVN engineering team. And I'm James, and I work alongside Frode in our OpenStack engineering team.

Okay, so let's start by talking about OVN. What is OVN? OVN stands for Open Virtual Network. It provides a logical network abstraction for virtual machine and container environments. It's an open source project with multiple contributors, and as such there is no single vendor behind it. It can be used with multiple cloud management systems, such as OpenStack, Kubernetes and LXD.

So why is OVN important in the context of a DPU? The core functionality of OVN is to take higher-layer network abstractions from the CMS and translate them into lower-layer information that can be programmed directly into the data path. This allows for acceleration of everything from layer three routing, ACLs and NAT through to load balancing. The previous iteration of OpenStack networking, ML2/OVS, is bound to the software data path and is not offloadable. Today, 100-gig networking is normal in the data center, people are already deploying 400-gig, and the network industry is steadily marching towards terabit Ethernet. In that world, doing a pure software data path for networking just wouldn't work: you would burn all your CPU cores on the network.

Okay, thanks Frode. So let's talk briefly about what we mean by a SmartNIC DPU. This is a network interface card with silicon that allows us to accelerate the data path in hardware, but it also has additional cores and memory to run a separate operating system directly on the card, alongside the host operating system that's running within the server itself. It plugs into the server via a PCI Express slot, and we can really think of it as an additional server inside the server, sitting right in front of the network. As such, we're able to control all of the networking to and from the server from a second control point within the infrastructure.

Not every SmartNIC has DPU capability. For what we term a classic SmartNIC, we have to push some of the control plane for the networking onto the host operating system, rather than having a dedicated operating system to run it on. So in this example, we're running OVN and OVS alongside the workloads on the host operating system, and they're eating the same cores and memory as all the instances that are running alongside them. So for an OpenStack hypervisor, we have the potential for contention of those resources between the control plane and the workloads, in the form of the instances that are actually running on the cloud. OVS and OVN can still program all of the data path directly into the silicon in this case, so we get hardware-accelerated interfaces to present to instances. That's done using virtual functions, in the same way as an SR-IOV virtual function is presented, but here it's a virtual function that is fully steered by the software-defined networking.
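As a quick reference aside (not part of the talk itself): what "programming the data path into the silicon" looks like on the OVS side can be sketched with a few standard commands on a switchdev-capable NIC. This is a minimal sketch; the PCI address is a placeholder, and the exact steps vary by vendor.

    # Put the card's physical function into switchdev mode so the
    # virtual function representor ports become visible to Linux.
    # 0000:03:00.0 is a placeholder PCI address for the NIC.
    sudo devlink dev eswitch set pci/0000:03:00.0 mode switchdev

    # Enable hardware offload in Open vSwitch (takes effect on restart).
    sudo ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
    sudo systemctl restart openvswitch-switch

    # List the representor ports that OVS/OVN attach to the bridge.
    sudo devlink port show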
The host operating system is in charge of the resilience of the uplink ports on this type of card, and it uses standard Linux bonding on top of the underlying network interface drivers to achieve that.

When we introduce the DPU concept, we obviously have additional cores and memory that can be used to offload some of the tasks we previously had to run on the host operating system. So we can push all of the network control functions down onto the operating system that's running on the card: we deal with uplink resilience via the bonding support in the Linux kernel running on the card, and a single physical function is then presented to the host for its networking. We again use OVN and OVS to program the SDN side of what we're trying to achieve, and again virtual functions are presented up and plugged into instances using PCI passthrough. The key thing here is that our workload and our control plane are completely separated, so there's no potential for contention, and we can make much better guarantees about the latency of any upcalls from the silicon to OVN and OVS, in terms of how quickly they are programmed back down into the hardware. I'm now going to hand you back over to Frode, who's going to talk about how we achieve this trick and the coordination we needed to do in OpenStack and OVN.

Thanks, James. What you see on this slide is an overview of the implementation. On the left-hand side you see the data path components: the DPU itself, where we run the ovn-controller and Open vSwitch, and on the host side, nova-compute, which manages the virtual machines. On the right-hand side you see the control plane components: the Neutron API, the Nova API, and of course the central OVN databases. The implementation work was done in Nova and libvirt to support identifying the DPU, that is, figuring out which DPU is connected to which host. We do that by looking up PCIe VPD information, which Nova then provides to Neutron when binding the port. Neutron in turn uses this information to find out which OVN chassis the port should be bound to, and then passes on information about which virtual function to use, et cetera, to OVN. The ovn-controller then uses this information to look up the representor port on the DPU and plugs that representor port into Open vSwitch, and from there networking works as normal. This is in contrast with how it used to work, when everything was on the host and all of the components were running on the same machine.

All right, but instead of looking at more schematics, we've prepared a demonstration where you can see how this works with the OpenStack CLI in action. So let's look at that instead, if the demo gods are willing. Yeah, I've pre-recorded it, but we'll talk through what's here.

Right, so what we're going to show here is a Juju-deployed Charmed OpenStack Yoga. It's deployed on metal using MAAS, and for the purpose of this demonstration we only have two machines. The first machine there, named Amontons, is a physical machine with the DPU in it, and the only application deployed on it is nova-compute, which we have called "untrusted". That's just to illustrate the security use case for this: since we're not running anything else on that machine, if somebody escapes the hypervisor they can't really get to anything, because all of the control plane is handled on a different machine.
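Before going further with the demo, a quick reference note on the PCIe VPD lookup mentioned a moment ago. You can inspect that same data yourself with lspci; this is a minimal sketch, and the PCI address below is a placeholder for your card's physical function. The card serial number in this block is the piece of information that, per the design described above, is passed along in the port binding so Neutron can find the matching OVN chassis on the DPU.

    # Show the Vital Product Data capability of the NIC/DPU physical function.
    # 0000:3b:00.0 is a placeholder address; find the real one with
    # `lspci | grep -i ethernet` (or your vendor name).
    sudo lspci -vv -s 0000:3b:00.0 | grep -A6 'Vital Product Data'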
The second machine there is for regular instances. We've essentially deployed the entire OpenStack control plane in LXD containers on it. Of course you wouldn't do it like that in production, but this is just to have a demonstration with two machines. The manually provisioned machine you see there is the DPU itself. At this point in time MAAS, our bare metal provisioning system, does not support automatic provisioning of DPUs, so until that is implemented we can use the manual provider like this. The MAAS team is working on an MVP for this, so we will hopefully have support for that soon.

Let's first create a regular instance, just to show how the scheduling works with the different types of computes in there. This is just the basic server create command you would use. You can see it's already created, and if we go and look at the instance itself, we can see where it ended up. It was scheduled to the machine called NodeMes, which is the machine without a DPU, so that's for regular virtual instances. To confirm that, we can add a floating IP and log into the instance as well. We should have made the video quicker, I see. And if we look at it, yeah, you see this is a different machine, and here you see a regular virtual interface, just to confirm the plumbing.

After this we can create an accelerated instance, which we'll do now. To create an accelerated instance, you first create a port with the VNIC type remote-managed, like this, and then we create an instance and refer to that port in the instance creation command. This is the same type of workflow as you would use with regular SR-IOV and also with hardware offload support; the difference is the VNIC type, of course. You can see that starting the accelerated instance takes a bit longer. This is because Nova has to allocate a virtual function and attach it to the instance, et cetera. But let's see where it ended up. Yeah, this one ended up on the machine with the DPU, so that's great. You don't have to set up special aggregates or do special things to have the instances placed where they should be; there are built-in filters in Nova that do the magic for you. And this time as well, let's confirm that it's not just smoke and mirrors. Let's log into the machine and see what the plumbing looks like on the inside. If we do an lspci here, we'll see that the networking interface is, in fact, a virtual function on the NVIDIA BlueField card.

The next thing I wanted to show is what this looks like under the hood. We've now logged into the hypervisor with the DPU, and as you can see, there is no Open vSwitch daemon, there is no OVN, there are no Neutron agents; there's only Nova, and of course the QEMU process for the instance we just created. On the flip side of that, we can log into the NVIDIA BlueField-2 DPU, which is connected to this machine through PCI. You can see this is an ARM system; it's the ARM cores that are used for the control plane functions on the DPU. And here you have Open vSwitch, you have OVN, and if you look at the log file for the ovn-controller, you can see it looked up the representor port and plugged it into Open vSwitch as part of the instance creation. And as it normally does, when the instance boots up it provides DHCP to the instance, et cetera.
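For reference, the accelerated-instance workflow from the demo boils down to two CLI calls: create a port with the remote-managed VNIC type, then boot a server against that port. This is a minimal sketch rather than the exact commands from the recording; the network, flavor, image and key names are placeholders.

    # Create a Neutron port backed by a DPU-managed virtual function.
    openstack port create --network private --vnic-type remote-managed accel-port

    # Boot an instance attached to that port; the scheduler places it on a
    # host that actually has a remote-managed-capable device.
    openstack server create --flavor m1.small --image ubuntu-22.04 \
        --key-name mykey --port accel-port accel-vm

    # As an admin, confirm which hypervisor it was scheduled to.
    openstack server show accel-vm -c OS-EXT-SRV-ATTR:host -c status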
What we can do in addition to this is go into the instance and generate some traffic, just to see what that looks like on the DPU side. We need the MAC address, because that's what shows up in the flow table, and then let's ping out to the archive. This command here dumps all the active flows in the system, and we also have a filter called offloaded, so it will only show the flows that are actually offloaded into the ConnectX-6 silicon on the DPU. And there's the actual traffic. This is available in OVN 22.03 and the Open vSwitch that ships with Yoga, and it's in the archive today, so if you can get your hands on a DPU you can do this in your lab at home. You can also look at the topology of the network agents, et cetera; you see there are no agents on the hypervisor with the DPU. And that concludes the demo.

Okay, thank you for listening. We have about one minute for questions. If you have a question, please come and use the microphone so it gets on the recording. And if we run out of time, we'll be at the booth upstairs for the next hour, so you can always come and ask us a question there if you've got something to ask.

Which vendors, or which SmartNICs, are supported initially?

That's a very good question. We used a BlueField-2 from NVIDIA in our testing, but an important part of the design is that we use standard interfaces to do it. We use devlink to do the lookup of the representor ports, and of course the interface between the DPU and the machine is PCI, with standard virtual functions, et cetera. So if a different vendor has a similar layout, with the representor ports being looked up through devlink and so on, it will also be supported. The component doing this lookup is called ovn-vif, which is a project hosted in the ovn-org organization on GitHub, and if we need to do additional things for different DPUs, we could add support for that there. So it's created in a vendor-agnostic way. Multiple vendors are doing this very similarly, and we chose to use the devlink path because we saw that other vendors had that in their kernel code. But of course you have to actually test those cards to know for sure that they work. Everyone is moving in this direction, though, so I feel pretty confident that this will work for all of the vendors.

Okay, thank you. We are out of time by over 30 seconds now, so if anyone has any more questions, we'll be upstairs, as I said, for the next hour. Thank you for listening. Thank you.
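For anyone who wants to reproduce the offloaded flow dump from the demo afterwards, the steps look roughly like the sketch below. The interface name inside the instance and the MAC address are placeholders; the ovs-appctl invocation is the standard upstream one.

    # Inside the instance: find the MAC address and generate some traffic.
    ip link show ens3          # ens3 is a placeholder interface name
    ping -c 10 archive.ubuntu.com

    # On the DPU: dump only the datapath flows offloaded to the ConnectX
    # silicon, filtering on the instance's MAC address (placeholder below).
    sudo ovs-appctl dpctl/dump-flows type=offloaded | grep -i 'fa:16:3e:xx:xx:xx'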