All right, well, welcome everybody. This is the last session of today, Wednesday, so I'm standing between you and beers, which is never a good idea. So Chris and I are going to talk to you a bit today about the Nova scheduler. We've been doing a lot of work in Nova over the past number of years, and we're going to focus on optimizing and configuring OpenStack, and how to deploy VNFs on it. We've been working on OpenStack for probably the last four years. My name is Ian Jolliffe, and I'm product architect for Wind River's Titanium Cloud product. I actually live in Ottawa, Canada, which is a short one-hour flight away from here, and I'm really happy to be talking to you today about what we've been doing with Nova. And this is Chris Friesen. I'm a senior member of the technical staff at Wind River, and I contribute upstream, mostly in the area of Nova.

All right, thanks, Chris. So just a quick blurb about who Wind River is. We've got our technology in all sorts of devices, from very small devices on the planet, but also off the planet: we have some software that runs on, or ran on, the Mars rover. And we have automotive solutions, networking solutions, and telecom cloud solutions, which is our Titanium Cloud product. We've been working in the telecom industry really since the early days of NFV, and it's amazing how far it's come so quickly. I think it's got a long way to go yet, but we've really been focused on solving problems that are critical to telecom: IoT applications, network appliances. We've done a lot of work with radio access networks and C-RAN technologies. We've certainly seen a lot of interest this week around edge technologies and virtual customer premises equipment, and we're starting to do some work on mobile edge, or multi-access edge, computing. So it's a hugely challenging and diverse array of applications with a really unique set of requirements that, when people started working on OpenStack originally, they never really envisaged; OpenStack was really focused on pure cloud-native applications. So our Titanium Cloud product is focused on private cloud for critical infrastructure.

So let's talk about what we've been doing with OpenStack and Nova. The main focus of today's talk is really all about predictable performance, and how you get that for NFV applications. The way that high-speed trains can run today is just amazing: they run on time, with extreme predictability, at very high speeds. So what we're going to cover today is the requirements of a typical NFV application. We're going to dig in a little bit to how the Nova scheduler works. It's a very capable scheduler, but also very complex, so there are lots of tuning parameters you can leverage, and they can help you get out of some pretty sticky situations. We're going to show you how to max out those performance dimensions, and we've got some benchmarks that are relatively easy to digest and should give people some interesting things to think about as we go through the talk.

So let's do a quick review. Probably most of the people in this room have a different set of requirements for NFV applications, but I thought I'd pick out some of the high-level ones. Really, we're talking about deterministic network throughput and latency. Typically, in an NFV application, you need a vSwitch that is DPDK-based; a kernel-based vSwitch probably won't get you the throughput that you're looking for.
Packet latency below 50 microseconds, and maybe even a specialized compute profile for ultra-low latency. I saw a really good talk yesterday from some folks about how to get real-time performance on Linux and how to use that in an OpenStack environment. Another key dimension is predictable access to CPU performance, and the same thing for memory: how do you get predictable access to the memory you have on the servers? The servers we're seeing in telecom applications are typically dual Xeon. You've got two NUMA nodes, you've got memory split across those NUMA nodes, there's a lot going on, and you have to understand a little bit about the hardware topology when you're trying to get the best performance out of your application. We also see a lot of apps running a very large number of vCPUs, so how do you manage that, and how do you get access to those cores in a predictable way? Luckily, most applications are running Linux in the guest, so that's at least one common baseline.

So let's talk a little bit about the performance issues and some of the solutions, and then we'll move into the benchmarking. Intel has been doing some really good work upstream, and we've contributed to some of this as well, around enhanced platform awareness. The EPA features give you better awareness of NUMA topology, so again, if you've got a dual-socket system, you can leverage that NUMA visibility. Also memory requirements for huge pages; also NIC support, so being able to align your NICs, your NUMA nodes, and your workload. Another really great technology is PCI passthrough, for hosting acceleration engines, typically for encryption, like Coleto Creek (QuickAssist) from Intel; people are also using that same technology with GPUs for some machine learning applications. And lastly, we're gonna talk a little bit about hyperthreading and how that impacts your performance. All of these things allow you to configure your application and your flavors, using extra specs, for the best performance. These are some of the tools you have in your toolbox: once you've developed your application, done the tuning, done a bit of testing, you can then predictably deploy your application on a cloud and get access to all these technologies.

Drilling down a level, PCI and network contention is a critical problem. In a dual-socket system, each PCI bus is connected to a specific NUMA node, so you really need to know which PCI buses are connected to which NUMA node, and in the scheduler you can actually leverage that to make sure you attach to the right PCI buses. One lesson we learned a long time ago is that you want to be very careful about crossing the QPI bus; Chris will show some data on that a little later. So we really want to make sure that if you have a high-performance application, that virtual machine is on the same NUMA node as the vSwitch and also the NIC. And with some of the scheduler extra specs, you can get that in a predictable way, which is just fantastic.
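To make that concrete, here is a rough sketch of how these EPA features are requested through flavor extra specs. The flavor name m1.vnf and the PCI alias qat are made-up examples, and the alias itself would have to be defined in nova.conf on the compute nodes:

    # Hypothetical flavor for a performance-sensitive VNF
    openstack flavor create m1.vnf --vcpus 4 --ram 8192 --disk 20

    # Back guest memory with 2 MB huge pages
    openstack flavor set m1.vnf --property hw:mem_page_size=2MB

    # Request one QuickAssist device through a PCI alias
    openstack flavor set m1.vnf --property "pci_passthrough:alias"="qat:1"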
Also, a lot of people are using solutions where they're not leveraging a virtual switch; they configure PCI passthrough or SR-IOV, but sometimes they forget that they've placed the workload on the opposite NUMA node, and that has an impact on performance. So if you want the best performance, even when using PCI passthrough and SR-IOV, you want to make sure your VM shows up on the correct NUMA node. And network contention is fairly obvious: it's good systems engineering practice to make sure you have the network bandwidth available to your applications, but at the end of the day, all the instances on one host still share the same host NIC. So having enough bandwidth out of your box is absolutely critical as well.

We've done a fair amount of benchmarking on different NIC types. Emulated NICs like the e1000 are slow. The next layer up is paravirtualized NICs like virtio, and that gives you a nice boost. Then PCI passthrough and SR-IOV give you direct access from the NIC into the VM. That comes at a cost, though: you need the right driver in the guest, you still have to manage interrupts, you still have to manage security, and you don't have any firewall, which you may have had on your virtual switch. So it brings a little extra burden to that virtual machine that you could have avoided with a virtual switching technology. And lastly, since you are tied to a physical device, you can't live migrate your VM from one host to another. We've been doing a lot of work upstream to try and make SR-IOV easier to consume. We're pretty happy with where we've gotten it to, but it's still more complex to configure than we would like, so we're continuing to move the yardsticks forward there, and we're looking forward to more progress in the community. And lastly, DPDK-based guests really give you the highest performance and the most flexibility; DPDK has a rich set of drivers you can use, poll mode drivers. This gives you the ability to deploy very high-performance data path applications, and DPDK is a great technology we've been working with for a long, long time. One of the downsides, though, is that since your app is running a tight loop, you're gonna draw more power. It's engineering, so there's always a trade-off.

I'm gonna talk about memory contention next. By default, memory is configured with 4K pages, and if you have very high demands on your memory, the TLB hit rate can go down considerably; that's how OpenStack is configured by default. So if you have very high memory bandwidth requirements, you probably want to configure huge pages, and Chris is gonna show some data on the different huge page sizes and their impact on performance. When you configure huge pages, you effectively turn off memory overcommit, because you're pre-allocating all the huge page memory ahead of time, and we've shown here some of the commands that you can use to set that up on OpenStack.
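A minimal sketch of that setup, reusing the hypothetical m1.vnf flavor; the huge page counts here are illustrative, not recommendations, and should be sized to your workloads:

    # Kernel boot arguments on the compute node: reserve huge pages at boot
    default_hugepagesz=1G hugepagesz=1G hugepages=8 hugepagesz=2M hugepages=2048

    # Flavor side: ask Nova for huge-page-backed guest memory
    # ("large" means any huge page size; an explicit 2MB or 1GB also works)
    openstack flavor set m1.vnf --property hw:mem_page_size=large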
Again, since NUMA is at play in a dual-socket system, your memory will be split across both NUMA nodes, so if you allocate memory that crosses a NUMA node boundary, you will take a hit from crossing the NUMA topology as well. But by the same token, if you have a guest that needs more memory than can be satisfied by one physical socket, you can spread your guest across multiple NUMA nodes in order to access memory from both sockets at the same time. The downside is that when the scheduler is looking for a host to place that instance on, it then needs a host that has room on both NUMA nodes simultaneously, so it limits the scheduler's options a little bit.

Yeah, sure. So the question was whether splitting the memory across the NUMA nodes would also result in splitting the CPUs across the NUMA nodes. You can leave it up to the Nova code to do that split for you, and it will just split it right down the middle automatically. You can also explicitly state how much memory and which CPUs you want on each guest NUMA node; Nova has the ability to let you get right down into the fine-grained detail of how you want to split it up yourself. But typically you want the CPUs doing the work on the same NUMA node as the memory they're accessing.

Yes. So the question was, can you tie a guest NUMA node to a specific host NUMA node? Right now in upstream, you cannot do an explicit mapping. You can say, I want a guest with this many NUMA nodes and this much memory on each of them, but you can't say, I want guest NUMA node one to map to host NUMA node one. That level is not there.

So that question was about a PCI device being associated with one of the NUMA nodes. If you're doing PCI passthrough, and your PCI device reports which NUMA node it's on, then by default Nova will put the instance on the same host NUMA node as the PCI device being passed through. This can actually cause problems, in that right now it's very strict about that, and if you have an application that doesn't actually need to be that strict, there's currently no way to tell it, just give me the PCI device even if it means crossing the NUMA boundary. There was a spec upstream that didn't make it into this release, but they're trying to add some of that flexibility. Thanks for the questions, guys.
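A hedged sketch of both approaches using the standard guest NUMA extra specs, again on the hypothetical m1.vnf flavor; note these control the guest topology, not which host NUMA node you land on:

    # Simple version: let Nova split the guest evenly across 2 NUMA nodes
    openstack flavor set m1.vnf --property hw:numa_nodes=2

    # Fine-grained version: pick which vCPUs and how much memory (MB)
    # go on each guest NUMA node
    openstack flavor set m1.vnf \
      --property hw:numa_nodes=2 \
      --property hw:numa_cpus.0=0,1 --property hw:numa_mem.0=4096 \
      --property hw:numa_cpus.1=2,3 --property hw:numa_mem.1=4096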
All right, so Chris is, as you can tell, our Nova and CPU expert, so he's gonna walk you through some of the CPU stuff.

All right. So by default, upstream Nova will give you 16x overcommit on CPUs. This is legacy from the very early days, for sort of typical batch jobs and web stuff. If you care about performance, this is probably not your best bet, because you're allowed to have up to 16 virtual CPUs running on the same host CPU, and if they're all trying to do something at the same time, you get something like a sixteenth of a real CPU each. So if you want better performance, you need to reduce the overcommit value. You can set this either on a per-compute-node basis or, in newish versions of OpenStack, on a per-host-aggregate basis, and reduce your overcommit to something a little more reasonable. If you really care about ultimate performance, then you want to make sure that each of your host CPUs is basically owned by a single guest CPU. This means that in the flavor extra specs or the image properties, you would set the CPU policy to dedicated. When you do this, each guest vCPU is associated with a single host CPU and gets exclusive access to that host CPU, so nothing else is allowed to run on it. This completely disables overcommit, obviously.

It also reduces the host scheduling overhead, because that virtual CPU thread is basically the only thing running on the host CPU, so the kernel scheduler on the host has less work to do. There are a couple of caveats with this. The first is that you cannot put guests with dedicated CPUs and guests with shared CPUs on the same compute node; the accounting is not there to handle that right now. So generally the way you do this is by grouping your compute nodes via host aggregates, and then in the flavor you can specify whether you want the host aggregate for shared CPUs or for dedicated CPUs. The other caveat is that right now in the upstream Nova code, if you have dedicated CPUs, live migration is not guaranteed to work properly. It will often appear to work, but you can end up with your guests actually running on the same host CPUs as other guests. There are some patches that have been in the works for a couple of years now, trying to fix this up and get the resource tracking working properly. It's a technically very tricky problem, because you basically have to recalculate all of your resources on the destination side to make sure that appropriate resources get allocated, and then you have to adjust your libvirt XML on the fly prior to doing the live migration. So it is a little bit tricky; the patches have been in review for a while, but it's taking a long time to get them in.

Yes? So the question was, if you don't specify the dedicated policy, what happens? Okay, so the question was, if you launch it without specifying anything and then manually go and pin it. If you do that, there is nothing stopping the other instances from ending up on the same host CPU. Now, if you go in afterwards and manually pin everything and then never change it around, that will basically give you the same kind of effect in the end, but it means you've got a lot of manual work to do. And the whole benefit of having cloud is that it's really easy to launch new instances and kill them; if you're going in and manually adjusting things all the time, you lose a lot of the benefit of cloud.

So it ends up the same under the hood? Essentially, yeah, because when you specify dedicated, what's happening behind the scenes is that Nova is picking a CPU and telling your hypervisor, I want to run this virtual CPU on this physical host CPU. So it's using the same underlying hypervisor mechanisms as you would use if you manually pinned it; it's just doing it automatically for you, and it's making sure nobody else is running on it at the same time. And it makes it repeatable as well: you define it in the flavor, and that flavor is defined forever. Right now, if you set it as dedicated, you can do cold migrations and resizes and evacuates, and they will be properly handled for you. It's just the live migrations that are problematic currently. And live migrations are also problematic for huge pages, so anything that results in a new NUMA topology for your guest can potentially cause live migrations to behave unreliably.
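One commonly documented way to do the aggregate-based separation just described is a "pinned" aggregate property matched from the flavor; this sketch assumes the AggregateInstanceExtraSpecsFilter is enabled, and the aggregate name, host name, and property key are conventions rather than required names:

    # Request fully dedicated host CPUs in the flavor
    openstack flavor set m1.vnf --property hw:cpu_policy=dedicated

    # Group the hosts that will run pinned guests into their own aggregate
    openstack aggregate create --property pinned=true dedicated-hosts
    openstack aggregate add host dedicated-hosts compute-0

    # Have the flavor match only those hosts
    openstack flavor set m1.vnf \
      --property aggregate_instance_extra_specs:pinned=true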
So we've talked about multiple guests contending over physical cores. The next issue is when you have multiple guests contending over separate threads of the same core. If you have hyperthreading enabled on your host, each core is exposed as more than one thread; typically now it would be two threads. Generally speaking, you do not want different guests running on hyperthread siblings of the same physical core; it's likely to cause performance problems, as we'll see a little later when I show the benchmark results. In most cases, your application performance will be maximized when you have a single guest vCPU per host physical core. To get this behavior, you set the CPU thread policy to isolate. This will reserve the other siblings of that host core and make sure they stay idle, so nobody else is allowed to use them; you basically get the entire physical core associated with one virtual CPU.

In some cases, application performance can benefit from running on hyperthread siblings, where multiple vCPUs in the guest get put onto siblings of the same host core. There are a couple of reasons this could give you better performance: those siblings share a cache, and it could improve the efficiency of the underlying pipeline. So you could have more virtual CPUs in your guest, but because they are siblings of the same host cores, they can give you more overall throughput for the same number of host cores. If you want to try this out, you can set the CPU thread policy to require, and this will tell Nova to ensure that all of the vCPUs from your guest are placed on siblings of the same host cores. It does require that your guest is set up with a number of vCPUs that is a multiple of the number of siblings per core. I suggest that if you go this route, you actually test it and make sure you get the performance increase you're hoping for.

There's a third option other than these, which is the default: prefer. Nova will try to give it to you if it can, and if it can't, it doesn't guarantee any particular kind of behavior. So I would suggest avoiding prefer; if you care about performance, you should set one or the other in order to get predictable, deterministic performance.

Yes? So, prefer is different: prefer will try to give you require, but if it can't, it will fall back to whatever it can give you, which could be all over the place. So I would recommend that you pick one or the other in order to get predictable performance.

Yeah. So what you get by leaving hyperthreading enabled is that if you have other applications that do not require the isolation, because they're not as performance sensitive, they can still run on sibling threads on that same compute node, whereas if you disable it completely, you just get that many cores, and you potentially won't be able to put as much work onto that compute node. So by allowing you to specify isolate, you get essentially the same performance as if you disabled hyperthreading at the host level, but it gives you the flexibility to pack work more densely when it's not performance sensitive.
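A quick sketch of the two deterministic thread policies, again on the hypothetical m1.vnf flavor; both require the dedicated CPU policy:

    # isolate: one vCPU per physical core; sibling threads are kept idle
    openstack flavor set m1.vnf \
      --property hw:cpu_policy=dedicated \
      --property hw:cpu_thread_policy=isolate

    # require: pack the guest's vCPUs onto sibling threads of the same cores
    # (vCPU count must be a multiple of the threads per core)
    openstack flavor set m1.vnf \
      --property hw:cpu_policy=dedicated \
      --property hw:cpu_thread_policy=require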
So the next thing I'd like to talk about is CPU cycle stealing by the host. There are a couple of reasons this can happen. Basically, this is where the guest is trying to do work, and for whatever reason the host has decided that it wants to do something on those CPUs. The first reason this might happen is a system management interrupt, which can cause significant latency spikes. This is where the BIOS on the host decides it needs to run some work on that CPU, so it basically interrupts everything that's running, including the host OS, to do whatever it thinks it needs to do at that point. You may be able to avoid this by tweaking the BIOS; in the worst case, you may actually have to select different hardware, if the hardware is particularly badly behaved. The other reason you might get CPU cycle stealing is host processes or kernel threads wanting to run on the host CPUs where the guests are running. It's possible in Nova to dedicate specific CPUs to host management and reserve the rest for the instances. There's an option in nova.conf called vcpu_pin_set, which is the set of host CPUs that you want to run the instances on. Any CPUs not listed in that set are ones you are reserving for the host itself, either for networking, for overall management functions, or for nova-compute itself. There's another set of kernel boot args that can help you isolate the standard Linux functions and keep them off the CPUs where the guests will be running: isolcpus tells the kernel scheduler not to place anything on those CPUs automatically, so you have to explicitly place work there; rcu_nocbs offloads the RCU callbacks; and nohz_full turns off the timer tick, so you don't have the regular timer tick interrupting the guests periodically. Generally you would match these so they're set to the same CPU set as you specified in vcpu_pin_set.

So this is an example of CPU cycle stealing in practice. If we look down here, and I know it's really tiny, what we're seeing is top running on the host and in the guest. We have four instances of KVM, each of which is running on the same host core, so they're all contending with each other, and we can see that each of them has 25% of that core. They've all got a CPU hog running in them, all trying to get as much time as they can, and they're each getting 25%. If we move to top in the guest, we can see it showing roughly 25% for the CPU hog in the guest, which is what we would expect. And if we move up to the top on the upper right, there's an "st" number: that's the steal time. What this is showing is that 70% of the time has been stolen from this guest by the host for other purposes. In this case, the other purposes are the other guests trying to run on the same host core. So this is time the guest was trying to use the CPU, but the host stole it from that particular guest because other things were going on.
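Pulling the host-reservation pieces together, a sketch assuming a hypothetical 36-CPU compute node where CPUs 0-3 are kept for the host and 4-35 for instances:

    # nova.conf on the compute node: instances only run on CPUs 4-35
    [DEFAULT]
    vcpu_pin_set = 4-35

    # Matching kernel boot arguments to keep host work off those CPUs
    isolcpus=4-35 rcu_nocbs=4-35 nohz_full=4-35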
So next we'll move to some of the actual practical benchmarks. This is the test topology we're using, a kind of standard compute node. We have two NUMA nodes, each of which has one processor and a certain amount of memory associated with it. We have our maintenance functions running on core zero over here on NUMA node zero, and we have our networking vSwitch running on cores one and two. We put some VMs in various positions within the two NUMA nodes depending on the test. The red lines on that previous slide show the logical network path between the two VMs.

The test itself was pretty straightforward. Each instance has two vCPUs; they are dedicated CPUs; they've got the isolate CPU thread policy we talked about; and they're running a basic CentOS 7 image. In all cases, the vSwitch is running on two host CPUs on host NUMA node zero. And it's a really simple iperf test, where one of the VMs acts as a server and the other acts as a client, and it just does a simple unidirectional traffic dump and measures how fast it can transmit data.

In the first case, we have 2 MB huge pages set, and we start out with the e1000 VIF model. The only change that we make is to switch to the virtio VIF model, and that one change gives you four times the throughput you had before. So as you can imagine, you want to avoid emulated hardware if at all possible; this is the benefit of using a paravirtualized network driver.

Next one. In this case, we start out with virtio, because we learned something from the previous test, and with 4K pages, and we just change to 2 MB pages; that's the only change being made. The result is over four times faster. We changed the page size again, from 2 MB pages up to 1 GB pages, and there was essentially no performance change. So the key takeaway here is that 2 MB is the sweet spot for your huge page size. Yeah, I wasn't expecting that result, but it was really interesting to find out.

For this test we used the learning from the previous two: we use 2 MB pages, we use virtio, and we start off with the picture you see here, where the two VMs are on the other NUMA node from the vSwitch. So all of the traffic is actually crossing the QPI bus between the NUMA nodes twice: it crosses over the bus to go to the vSwitch, and then it crosses over the bus again to get back to the destination virtual machine. If we move those VMs over onto the same NUMA node as the vSwitch, you get just about twice the throughput. So this shows you the overhead, the cost of crossing between your NUMA nodes; it's a significant overhead. The takeaway is that you want to minimize cross-NUMA traffic as much as you can. What you would generally want to do, if we could go back, yeah: if you have the vSwitch on one side, you want to pack your throughput-sensitive applications onto the same NUMA node as the vSwitch, and then if you have other management functions that don't need the same high level of throughput, you can pack them over on the other NUMA node. There is a caveat there, in that right now it is very difficult to ensure that you get placed on a vSwitch that is on the same NUMA node as the provider network you want to connect to; there's some work that could be done in Nova to improve some of that.
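For reference, a minimal version of the test described above; iperf3 is shown here, and the image name and the server VM's address 10.0.0.5 are placeholders:

    # Select the paravirtualized NIC model via an image property
    openstack image set --property hw_vif_model=virtio centos7-guest

    # In the server VM:
    iperf3 -s

    # In the client VM: 60 seconds of unidirectional TCP traffic
    iperf3 -c 10.0.0.5 -t 60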
Yes. Absolutely, yeah. Yep. So the comment was that in this case, they're both on the same compute node; if they're on different compute nodes, you still need to worry about your NIC being on the same NUMA node as the vSwitch. I think he's turning the mic on. In this particular case, the traffic you're using is simulated. Is there any ability to do a mix on the packet sizing? Does that have any effect? Because obviously when we drive throughput to the NIC with small packets, we get much different performance levels; if we go to huge, large packets, obviously we see much different things. Yeah, so this is a really simplistic case; obviously you can make it arbitrarily complicated, so I just ran the simple case for this one.

Well, I guess what I'm getting at is that there was a discussion yesterday where we were talking about performance capabilities and how we get a library. So you did some testing here, which is great, and you've got some performance metrics and you're sharing them, which is great. As we go forward, each time we do something new in OpenStack, sometimes we reinvent what we already invented before. Being able to understand, okay, can we share this out? Is there a way to, let's say, test and automate this, keep it, bring it back to a central area so people can learn from it as they start to modify new systems, right? I think a test like this would be simple enough that you could actually run it in infra. I mean, it's just iperf; it doesn't need anything fancy. If you were gonna start getting into PCI passthrough, there's a serious limitation on upstream testing involving multi-node and passthrough and anything that requires real hardware, so that is a bit of a tricky problem. But I think you're also proposing some data based on varying packet sizes, from small to larger packets. Whatever we can get as a data set, so that we don't go back and reinvent it each time we do a deployment. Right. Yeah, yeah, sure.

Okay, so are you familiar with the OPNFV Yardstick and VSPERF projects? They have several benchmarking tools and tests, and lately we augmented them for SR-IOV testing as well. So that was contributed; there may be no need to reinvent the test, and it really goes through all the packet sizes. Yardstick and VSPERF, are those running in a Pharos lab as well? No? Okay, great, I'll look into that for sure. That sounds good.

So I think we're on to the next slide. Okay. This one maybe takes a little bit of explanation. Previously we were showing the individual cores on the different compute nodes. In this case, the cores have been broken out into their hyperthread siblings, so we have two thread siblings for each core, and as we can see, the two VMs are running on siblings of the same cores. They're basically competing with each other for the resources of the physical cores. When we run the same test in this configuration, the performance drops by about 37%. That shows the cost of contention between the two VMs running on different siblings of the same core. And the reason we see this is that the back end of the core only has the one execution pipeline, so it does not have enough resources to keep both siblings running at full speed; when they're both trying to access those resources, there will be a slowdown. That's a really good point, Chris. Both threads in this case are really busy, so the benefits of hyperthreading don't come through; whereas if you'd done a test where one thread was very busy and the other was less busy, you could actually see a performance increase. Yeah, okay, great.

All right, so in summary: we talked about some of the requirements, and they're very stringent. And Chris did a great job of walking us through some of the performance benefits we saw. CPU isolation gives you up to a 16x performance improvement, just because you're getting dedicated access to those CPUs.
Using huge pages gives you a 4x performance boost; making sure you're using virtio, for example, gives you another 4x improvement; and NUMA node awareness can give you up to a 2x improvement. Our tests for today's discussion were all done independently, but when you're doing your systems engineering and your analysis, you could run into a bottleneck, or have a set of configurations that results in a bottleneck elsewhere, and then you won't see this kind of boost. So when you start combining all these factors, you have to do a bit of testing to make sure you've not inadvertently introduced a bottleneck you may not have anticipated. And for NFV adoption, I think predictable performance is one of the key things; we're working a lot in OPNFV, as well as in Nova in particular and on SR-IOV, to help move that forward. We've done a little bit of work on storage; it wasn't ready to present at this summit, but we're gonna keep working on it, and Chris and I will propose another talk for a future summit on the considerations for storage performance as well. And lastly, we have another talk tomorrow afternoon, which takes it down another level: we'll be talking about what happens under the covers in Nova when you boot an instance, and how some of these scheduler things actually work. That's at 4:10; I think that's again the very last session of the summit. I don't know how we managed to get the end-of-day talks, but hey, that's what it is. Anyway, thanks everybody for your time. Yeah, absolutely, please.

A question about the virtio virtual NIC. So, I mean, testing with virtio. Yes. If you compare virtio with some other, you know, proprietary or shared-memory NIC between a virtual machine and the host, the performance can be much better. So do you see any future in replacing the virtio model with something else which is more performant? Because it looks like that's actually the bottleneck at the moment.

Yeah, I think one of the really promising technologies is vhost-user as a backend, and that really boosts things. I don't have the data to share today, but we've done some experimental work in that area, and vhost-user backing virtio can give you a very significant boost when you're using a DPDK-based vSwitch as well. So that's the promising technology I see coming down the pipe, and it would allow us to keep virtio, because it gives you such nice performance. Yeah, I was just thinking about taking the virtio work completely out of the picture. That's an option; I don't have one off the top of my head that I could say, yeah, we could switch to that. We have done a bit of work with DPDK in the guest, where you're basically shipping memory directly up into the guest and then processing it with a poll mode driver and DPDK. It does get you a lot of performance, but having virtio gives you a lot of flexibility. So if you can get the performance you need with virtio, it's a better route than trying to go all proprietary. The usual rules apply, right? Try and get it working first, and then see where the pain points are; optimizing prematurely is maybe not the best path. Just as a data point for the virtio numbers we're getting here, we're talking about 25 to 30 gigabits per second, and that's with virtio, so it's reasonably respectable.
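For context on what a vhost-user backend looks like in practice, a minimal sketch using OVS-DPDK; the bridge and port names are made up, and this assumes an OVS build with DPDK support:

    # Create a DPDK-enabled bridge and a vhost-user port for the guest
    ovs-vsctl add-br br-dpdk -- set bridge br-dpdk datapath_type=netdev
    ovs-vsctl add-port br-dpdk vhost-user1 \
        -- set Interface vhost-user1 type=dpdkvhostuser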
Any other questions? I think we just did, yeah. So the question was about making these results available. I hadn't really thought about that. The talk itself will go up on the summit website, I believe, but if there's interest, we'd be happy to do that, yeah. Any other questions? It's getting close to, well, it's past 6 p.m., so thanks everybody for sticking it out. Appreciate it.