Welcome to the PTL webinar series. This series evolved from the sessions the PTLs held at each summit to give updates on their projects; we converted them into webinars to extend the reach of those events beyond the summit. Today, Michael Still is here. He's the Compute PTL, and he's going to update you on what may be new for Juno, as well as detail any items of note for our users and operators and what they may be looking for in Nova. I'm Margie Caller with the Foundation, joined by Allison Price. And I think that is about it. Let me put your presentation into presentation mode, Michael, and then you can take it from there.

Cool, thank you. So I'm not muted? This is working? It is working. Excellent. So thank you, everyone, for coming. I know this is a slightly unusual time, so thank you for coming along in your evening, and also thank you to the people who watch this video later on YouTube. I really appreciate people showing an interest in OpenStack in general. Next slide, please.

So this presentation is about OpenStack Compute. Effectively, at this point, OpenStack Compute is largely the Nova project, which is a hypervisor management system. Users can make API requests along the lines of "hey, boot me this virtual machine", and Nova will take that request, find an appropriate physical machine to boot the virtual machine on, and then bring the virtual machine up. It can also configure networks and attach storage volumes, and all of that kind of stuff that people would expect from a system like that.

The reason the program is called Compute is that there is a long-term intention, I think, to be wider than that. For example, containerization doesn't always fit well into a set of APIs intended for hypervisors, and so it's possible that in the future we might end up with a project that focuses more on containers, and Nova might choose to focus more on hypervisors. At the moment, both things happen in Nova, but that isn't necessarily always going to be true. Next slide, please.
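To make that description of Nova's role concrete, here is a minimal sketch of the kind of boot request being described, using python-novaclient. The credentials, auth URL, and image and flavor names are placeholders rather than anything taken from the webinar; treat it as an illustration, not a recommended client setup.

```python
# Illustrative only: a simple "boot me this virtual machine" request with
# python-novaclient. All names, credentials and the auth URL are placeholders.
from novaclient import client

nova = client.Client("2", "demo", "secret", "demo",
                     "http://controller:5000/v2.0")

image = nova.images.find(name="cirros-0.3.2")
flavor = nova.flavors.find(name="m1.tiny")

# Nova picks an appropriate compute host, brings the instance up, and can also
# wire in networking and attach volumes on request.
server = nova.servers.create(name="demo-vm", image=image, flavor=flavor)
print(server.id, server.status)
```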
So I wanted to start off by talking briefly about where we ended up in Icehouse. One of the interesting aspects of doing these webinars is that it's relatively early in the development cycle for Juno, and because this is an open-source project with a bunch of volunteers and individual companies working on the bits that are important to them, it's a little bit hard to predict exactly what will land in the Juno release. So in recognition of that, I want to start with where we actually ended up in Icehouse. Now, the other thing is that the releases are quite big and this is only a 15-minute presentation, so it's going to be a thousand-foot overview, and there are better resources available to dig into exactly what happened in specific areas if people are interested.

There were some themes for Icehouse that I think are worth mentioning. We wanted to improve our continuous integration to give a more reliable experience for operators. Continuous integration is the set of tests that we run when a developer uploads a proposed patch. These tests run before humans look at the code review. So we have this formal process where, before a piece of code ends up in Nova, we run these automated tests, humans look at it and review it and approve it, and then we run the tests again and then we merge it. The reason we run the tests again, by the way, is just in case: if humans take a couple of weeks to look at the patch, the underlying code might have moved underneath it.

Now, at the start of Icehouse, we knew we had gaps in our continuous integration coverage and we wanted to improve the testing we were doing. For example, we wanted to make sure there was at least some testing for every hypervisor driver that we were releasing. Icehouse was pretty successful at doing that. We have a lot more continuous integration coverage than we had before, and we do have continuous integration for every hypervisor driver now. There's always more work we can do here, so I think this is a continuing theme for Juno. And frankly, the testing is important because it is so critical to identifying bugs before operators find them in the field. So this is one of the ways we try to produce a good experience for operators.

There was also work towards live upgrade in Icehouse. That work is still ongoing. There are some interesting features in Icehouse, but there's more interesting stuff we can do, and there's a lot of work we need to do on the way. So we'll talk about that a little bit later. And we did some cleanup work on our APIs, although we're going to tweak exactly what that looks like in coming releases, and we'll talk about that a bit more too. Next slide, please.

So in Icehouse, we landed 65 blueprints. A blueprint is effectively a feature. We fixed 650 bugs, which is a lot of bug fixes. We had nearly 300 developers involved in the release, although I want to highlight that a smaller number of those are regular contributors: we only had 42 developers with at least 10 patches. So one of the things we need to get better at in Juno, I think, is that we have a large number of people who will drop past and fix one bug that's bothering them and then move on with their lives, and sometimes we need to pick up those reviews and run with them ourselves because the original person's moved on.

But these are just numbers. Let's talk about some of the features we saw in Icehouse. So I mentioned live upgrades before, and this is very exciting, I think. We have limited support in Icehouse for upgrading your cluster without upgrading everything at the same time. What you do is you upgrade your controller nodes first, so the things running the APIs and the scheduler and so on, specifically not the hypervisor nodes. You then set a flag when you restart those binaries to say "I'd like to be Icehouse-compatible", and then you can upgrade your compute nodes from Havana to Icehouse slowly, in a more controlled manner, because we expect that the largest proportion of the machines in your Nova cluster will be compute nodes, so they're going to be the hardest bit to upgrade. Once you've upgraded the compute nodes, you can unset the compatibility mode, and then everything is running with the Icehouse version of the APIs. This is documented more fully in the OpenStack documentation, so I just want to highlight that it exists and is an option; I don't really want to dig into the exact way you'd configure it. Next slide, please.
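Since the talk deliberately skips configuration detail, the following is only a rough sketch of the general shape of that compatibility pin, written with Python's configparser for illustration. Nova does have an [upgrade_levels] option group, but the exact value to pin to for a given upgrade pair is an assumption here; take the real procedure from the official upgrade documentation rather than from this snippet.

```python
# Sketch only: pinning compute RPC compatibility while compute nodes are still
# on the older release. The value "havana" is an assumption; consult the
# release's upgrade notes for the real setting and procedure.
import configparser

conf = configparser.ConfigParser(interpolation=None)
conf.read("/etc/nova/nova.conf")

if not conf.has_section("upgrade_levels"):
    conf.add_section("upgrade_levels")
conf.set("upgrade_levels", "compute", "havana")

with open("/etc/nova/nova.conf", "w") as fh:
    conf.write(fh)

# After all compute nodes are upgraded, the pin is removed and services are
# restarted so everything speaks the newer version again.
```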
There was also a lot of work around the API in Icehouse. At the start of Icehouse we said we should attempt to do a version three of the API that users talk to, our external-facing API in Nova. And so a lot of work was done to make the API more consistent and predictable and to fix bugs, basically. It's interesting, because as we went through that process I think we learned that big-bang API changes are really hard on users.

So whilst this API is considered experimental in Icehouse, we're actually going to change how it's presented in Juno. So I'd recommend that users not use the V3 API. It's turned off by default, but don't turn it on and port all your code to it, because that would be a bad experience later on. This work was very useful, I think, because it taught us a lot about what we wanted to do to the API.

In non-experimental changes to the API, you can now permanently remove decommissioned compute nodes, and XML support has been deprecated in the Icehouse release and will be removed in a later release. Now, that change should be transparent to most users, because most users should be using an SDK of some form to talk to the OpenStack API. If you have code that you've written that hand-produces the XML and passes it off to our API, then you're going to need to look at porting. And if that's a big problem, we'd like to hear from you, because that will affect the timeline for deprecating the XML support.

We also disabled file injection by default. File injection is one of several ways that you can customize an instance once it's booted, to have things like the correct root password. We have performance and security concerns with file injection, so we would rather that users use the metadata server or config drive. Now, we believe that the metadata server and config drive are supported by all of our guest operating systems. At the moment you can turn file injection back on if you really need it, but if you find you need to do that, again, you need to urgently file a bug with us, because we will be removing that support in the future. And we're not aware of a case where it doesn't work. (There's a short example of reading from the metadata server below.)

Hypervisor-specific flags have also been moved into configuration groups, so it's now clearer that a flag you're editing is specifically for, say, VMware. So it's slightly less confusing for operators.

There was also an experimental Docker driver released in Havana. That driver has been removed in Icehouse because we'd like it to stabilize more. The driver still exists and is still available: it's hosted on StackForge, and it's relatively trivial to use it with Nova; it's a one- or two-line configuration change. There are plans to bring the Docker driver back now that it's a bit more stable, but we haven't actually done that work yet, so I don't want to promise that it will be finalized in Juno. And the PowerVM driver was removed at IBM's request. They believed that no one was using the driver and that it wasn't ready for production use, and they've changed direction on exactly how they want people to use the Power architecture. So that driver was dropped.

So what are the themes for Juno? As I mentioned before, we want to continue to improve our continuous integration systems. There's some mechanical stuff there that's not of a lot of interest to users, I think, but we want to make it clearer in code reviews, for example, what the results of these CI tests are. We want to expand the coverage of the CI tests as well, which is definitely interesting to users. And in general, we continue to take continuous integration very seriously. There's also more work towards live upgrade, and there's a slide about that later. And the experimental V3 API is going to become a series of microversions in our V2 API, and we'll dig into that more in a second too.
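Going back to the point about preferring the metadata server or config drive over file injection: here is a minimal sketch of what reading instance metadata looks like from inside a booted guest. The 169.254.169.254 address is the standard metadata endpoint; error handling is omitted.

```python
# Runs inside a guest instance: fetch the OpenStack metadata document instead
# of relying on file injection. Minimal sketch, no error handling.
import json
import urllib.request

URL = "http://169.254.169.254/openstack/latest/meta_data.json"
with urllib.request.urlopen(URL, timeout=5) as resp:
    meta = json.load(resp)

print(meta.get("uuid"), meta.get("name"))
# With config drive, the same document is available on a small attached disk
# (typically labelled "config-2") instead of over HTTP.
```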
So the other big change in Juno is that there's a new specifications project for how we design features. Now, this is currently considered experimental in the sense that there are a few OpenStack projects playing with it, and it's not clear exactly how we'll tweak it before it becomes a thing that we do in general, but I think the experiment has been very successful so far. The new process is: when you want to add a feature, you write a formal design document about what you want to implement, and then that goes through a separate review process, very much like how our code is reviewed. Now, that's important because it's a really clear signal to operators, for example, about things we intend to work on that aren't bug fixes, and operators can comment and say, "hey, in my deployment this would need to be tweaked like this", or "this thing would be really important for me and would solve a big problem I'm experiencing". So it's a really good signal for where we think we're going, and operators are very much encouraged to participate.

We're also hoping it will speed up code reviews for actual implementations, because previously we would debate what the implementation should look like once someone had already written it and submitted it for review. By bringing that discussion closer to the design phase, we're hopefully stopping people from going off on tangents where we're ultimately going to have to say, "hey, we need this changed a little bit to work with the general direction of Nova".

We have a lot of specifications under review. We had a specification review day earlier this week, but we have over 100 specs still out there in the design phase. So I think I should also set expectations that I don't think every single proposed specification will merge. That was also true in previous releases; it was just less obvious. So if operators find a specification for something they think is particularly important, for example, they now have an opportunity to let us know that during the review process. There's also a summary of currently approved specifications on the wiki, which should be helpful to people. I'd be very interested in feedback on that, because it's new and I'm sure we're going to tweak the format. Again, an approved specification doesn't mean an implementation will land in Juno, though. It's a stronger signal of intent than a specification being proposed, but it's not a guarantee. Next slide, please.

So there's further work on live upgrades in Juno. We want to be able to support live upgrades where you don't have to do the database schema upgrade in a big outage window, but in order to do this, we need to move to an internal object model that supports versioning. Now, this work has been going on for a while; it was happening in Icehouse and it continues in Juno. It's a lot of work, and it's not work that adds visible features for users particularly, but it positions us for this very, very important feature, so I really appreciate the work that developers have been doing on it. It is, in fact, pretty cool from a developer perspective, because it cleans up the code internally a lot. For example, as a developer, I don't really need to know how the conductor is configured anymore; the objects know that for me. So it's good from a developer perspective and it positions us well for features we want to add in the future. Next slide, please.
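To illustrate the versioned object model idea, here is a deliberately simplified sketch. It is not Nova's actual object code (the real objects layer is far more involved); it just shows the core trick of each object knowing its version and being able to serialize itself for an older peer during a rolling upgrade. The field names and version history are made up for the example.

```python
# Simplified illustration of a versioned internal object.
class Instance:
    VERSION = "1.1"  # 1.0: uuid, host; 1.1: added locked_by

    def __init__(self, uuid, host, locked_by=None):
        self.uuid = uuid
        self.host = host
        self.locked_by = locked_by

    def to_primitive(self, target_version="1.1"):
        """Serialize, dropping fields an older service would not understand."""
        data = {"uuid": self.uuid, "host": self.host}
        if target_version == "1.1":
            data["locked_by"] = self.locked_by
        return {"version": target_version, "data": data}


# During a rolling upgrade, a newer service can still talk to an older one by
# asking the object to downgrade itself:
inst = Instance("some-uuid", "compute01", locked_by="admin")
print(inst.to_primitive(target_version="1.0"))
```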
Now, back to the V3 API I mentioned earlier. In Icehouse this was experimental, and then we had a big discussion at the Juno design summit, and we're going to change how we present the V3 API. One of the things we heard clearly from deployers and users is that big-bang API changes are painful, and so instead what we're going to do is take what we learned from the V3 API, pull it apart, and present it as a series of small changes to the current V2 API. We call these small changes microversions.

Now, the first of those microversions will be stricter type checking, and we're calling that microversion V2.1, for lack of a better name at the moment. So, for example, the first change we're going to make is to ensure that in the body of requests the values passed are in the correct format and make sense, that there hasn't been a typo in a parameter name, that kind of thing. So it's possible that if you have an application with a typo in a parameter name, the app is working at the moment, and you'll start getting warned about that, and then your API request will be rejected. But we think that's important, because ultimately you asked us to do something, and because of the typo we didn't do it, because we didn't recognize the parameter. And then there's a series of improvements to the API that will appear in later microversions as well. Now, clients will be able to negotiate with the API server which microversion they support, so backwards compatibility should be easier as well. Basically, the client will connect and say, "hi, I know how to speak this version", and the server will know how to gracefully degrade to talk that older version.

Ignoring the new API work, there's also better support for cross-project request IDs, and there's a fair bit of work happening with the EC2 API at the moment. For example, there's better tagging support, but there's also a group at Cloudscaling that has done a general reworking of the EC2 API, and we're hoping to see a code proposal from them in the Juno release.

There's also continuing scheduler work. This work was also happening in Icehouse, but it hasn't completed yet. The overall intention is that we'd like to be able to split the scheduler out into its own service that other OpenStack projects could use as well. So, for example, if your cluster was configured to allow this, you could say: schedule this instance on the same machine that its volumes are hosted on, so that the connection between the two is low-latency and local. Now, in order to do that, we have to rearrange a fair bit of our code, because, for example, the flow of how an instance boots currently runs through the scheduler, whereas instead the scheduler should just come back to Nova and say "I picked this node", and then Nova should use that node. So there's a bunch of refactoring that needs to happen. It's not super exciting for users, but again, it positions us for really interesting stuff we want to do in the future, so it's work worth doing.

There's also a proposal to add DB2 support as a SQL database storage engine. Now, this is obviously interesting to people who come from a DB2 shop, but it's also having some interesting side effects that I think are worthwhile. DB2 is a little bit stricter about key uniqueness than other SQL database engines, so, for example, there's a proposal that we enforce that instance UUIDs are actually unique in the database, and there are some cleanups like that happening, and I think those are generally good for our database hygiene. And so even if you're not particularly interested in DB2, this work is interesting to you, because it's going to clean up our databases more.
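As an illustration of the kind of uniqueness cleanup being described, here is a small SQLAlchemy sketch of a unique constraint on an instance UUID column. It is a simplified, hypothetical model, not Nova's real schema (which, for instance, also has to account for soft-deleted rows).

```python
# Hypothetical, simplified model showing the kind of uniqueness constraint
# stricter engines such as DB2 push us towards: an instance UUID may only
# appear once in the table.
from sqlalchemy import Column, Integer, String, UniqueConstraint, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Instance(Base):
    __tablename__ = "instances"
    __table_args__ = (UniqueConstraint("uuid", name="uniq_instances_uuid"),)

    id = Column(Integer, primary_key=True)
    uuid = Column(String(36), nullable=False)
    host = Column(String(255))

# Creating the schema against any backend will now enforce the constraint.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```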
I also want to call out two drivers specifically, because they're the most visible in the specs process. The libvirt driver is doing some really interesting work around booting LXC containers from block devices, so volumes specifically, using libvirt storage pools, which should have some interesting effects on how live migration works. And there's a large group of people who've come to us who are interested in a thing called network function virtualization, which is where people like telcos can say: instead of deploying a dedicated firewall appliance, I would like to boot an instance in OpenStack that acts as a firewall and then route the traffic through that. But they need a bunch of performance improvements before that will work at the kind of data rates they're interested in. So there's PCI SR-IOV passthrough support, which is a new way ("new" might not be technically correct) that we can support virtual hardware that's implemented at the PCI layer and pass it through to instances. And libvirt is also working on NUMA-aware scheduling, so scheduling your instance in a location on the machine that's close to the PCI device, for example, to reduce latency. So this is work that will improve performance for all instances; it's good for all users, but is specifically requested by this use case that has come up recently.

And the VMware driver has a lot of work happening as well. The VMware team has been very active recently, which is really good and I applaud it. They're doing a large refactoring of their driver code to make it more maintainable, which isn't an exciting thing for them to do, but is really good because it helps us review their patches faster, so it's a good thing for an active community member to do. They're also proposing to add support for hot-plugging network interfaces, ephemeral disks, vSAN datastores, and booting OVA-format images. So there's a bunch of functionality work happening as well. But yeah, I just wanted to specifically call out those two hypervisor drivers because they're the ones most visible in the specs process; the other hypervisors are working on features as well.
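For readers curious what requesting a passthrough device looks like in practice, here is a hedged sketch using python-novaclient to create a flavor that asks for one device via a PCI alias. The alias name "telco_nic" is made up; it would have to match a pci_alias (and a corresponding pci_passthrough_whitelist) configured on the compute hosts, which is not shown here, and the details may differ between releases.

```python
# Hypothetical sketch: a flavor whose instances request one passthrough device
# matching the made-up alias "telco_nic". Credentials and names are placeholders.
from novaclient import client

nova = client.Client("2", "admin", "secret", "admin",
                     "http://controller:5000/v2.0")

flavor = nova.flavors.create(name="m1.passthrough", ram=4096, vcpus=2, disk=40)
# One "telco_nic" PCI device per instance booted with this flavor.
flavor.set_keys({"pci_passthrough:alias": "telco_nic:1"})
```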
And so that brings us to the end. Are there any questions?

Thank you. Not a problem. I don't think so, not at this time. Although, if you don't mind, I would like to ask you about the PCI passthrough. I come from a telecom background, and so PCI, the DSS council, was always a big item in terms of what's compliant and all that. So the use case is specifically telecoms asking for that, is that what you were referring to?

Yeah, so telcos need this to reduce latency specifically, although I think it improves throughput as well. But it should be interesting to anybody who's exposing PCI devices inside the instance. So if you were doing some sort of scientific computation with GPUs, then PCI passthrough is exciting to you too. So it's a feature that's good for everyone, but telcos specifically really need this thing.

That makes sense. Okay.

I think the other thing I should say is, if people see this video on YouTube and they have questions, I'd love people to send email to the OpenStack mailing list, which is openstack@lists.openstack.org. I'm going to be honest and say that sending me personally addressed email is not fantastic, because sometimes I get very busy, but we have a very active community that's very helpful. So sending email to the mailing list, or using our question-and-answer forum on the OpenStack website, are both really good ways to get in touch and talk to us about this stuff.

Great, thank you. And also, just shortly after the end of this webinar, people who are on the call will receive information just saying thank you, and you can send questions to the Foundation as well and we can get them back to Michael too. Awesome. Great. Well, thank you, Michael, for your time. I know you're very busy and I know it's not the best time for you either, but we really appreciate your time.

And I really appreciate people showing an interest in Nova. Like, it's flattering that people use the code that we write.

Absolutely. Absolutely, I agree. Well, thank you, we do appreciate it. And with that I'll end the recording; for those on the call, this will be on YouTube in the next few days. Thanks so much.