Welcome to the PTL Overview Series. This series was started to go over the projects and the upcoming features that will be in the new release, which in this case is Kilo. We're posting these on YouTube so that you have the information right at hand and can watch at your leisure. Of course, each PTL will offer their contact information, so you can jump on IRC and ask them any questions you may have about their specific project. Today we have Michael Still, the Nova PTL, on the line, and he'll be going over the project updates coming in the new release, Kilo. He'll also be available afterwards for any questions you may have. So without further ado, I'll hand it over to Michael. Thank you very much.

So first off, thank you for coming along and watching this YouTube video. Hopefully it's useful to you. My name is Michael Still. I'm the Compute PTL, which means I'm the main cat herder for OpenStack Compute. Next slide, please. OpenStack Compute currently consists of one project, Nova. For those of you who are new to OpenStack, Nova is the virtual machine management system. So if you make an API request to OpenStack for a virtual machine, Nova is the system that actually builds that virtual machine and gives you access to it. To do that it might need to orchestrate other systems. For example, if you ask for an interesting network configuration, we might go off to Neutron to set that up; or if you ask for a persistent block device, we would ask Cinder to do that work for us. So Nova is interesting because it gives you access to compute resources, but also because it ties into a lot of other OpenStack projects. In the majority of OpenStack deployments, Nova is part of the deployment, because most people want to use compute. Next slide, please. So how did the Kilo Summit go?
So the OpenStack release process is based around a six-month cycle, and the start of that cycle is a physical design summit in a different city each time. Kilo's design summit was in Paris, and Paris felt really, really productive to me. It was the second summit where I was the PTL, and whilst Atlanta was a really good summit, I felt like Paris was us hitting our stride. We have an increasingly clear vision of the things we need to do to Nova to meet users' needs as best as possible, and we understand that there are big architectural improvements we need to make. But more importantly, people signed on to do that work and we have a plan. So it just felt like we were meshing well: we know where we're going, we know what we need to do, we now just need to execute on it. Next slide, please.

Now, one of the things we introduced in the Juno cycle, which was the previous release, was the specification process. This started off as something that Nova and a small number of other projects did, but it's now kind of OpenStack-wide. Each OpenStack project implements it separately, so the exact form it takes differs across projects, but we're working on syncing that up at the moment. A specification is a formal design document which defines what is being implemented. It's reviewed separately from the code, and it's reviewed before the code. So the idea in Juno was that if you wanted to do anything that wasn't a bug fix, you would propose a specification. We'd review it and we'd say, ooh, we love this feature, but you need to think about tweaking it like this so it works well with the hypervisor, or whatever. Then that would get approved, and then the code would be worked on. That was a really good process, I feel, but we'll talk a little more about that later. The other thing is that operators are encouraged to participate in that review process.
So we have a few operators at the moment who will come along and say, hey, this is a cool feature, but you need to think about these things that will hurt me. I'd like to encourage more of that. Operator feedback is very important to us, and finding out early that we're going to need to tweak things is way less expensive than finding out later. Now, there are a few places you can look for information about specifications, and all of these URLs will be put in the YouTube talk description, so don't frantically write them down from these slides. First off is a website that has approved specifications rendered to HTML; that's the first URL on the slide. But if you want to provide feedback before we approve something, then you'll need to use our review system, which is the second URL. That gives you an opportunity to comment on specific lines or leave a more general comment, and be part of the interactive development process for reviews. Next slide, please.

So how did specifications go in Juno? Well, first off, there's a summary of the specifications that were implemented on my personal blog. This summary was written before the release was made, but it should be basically correct; it was the basis for the release notes. That's the other advantage of specifications: now we can say we've released a feature, and here's exactly how we intend it to work. So that's really cool and hopefully useful for operators and deployers. We did learn some lessons about specifications in Juno. Next slide, please. Specifically, specifications did slow us down. Now, some of that was intended, right? We needed to be more deliberate about the features we were writing so that we got them right the first time, so we needed to slow down, pause, and work out what we were doing. And I think because of that we're seeing higher quality features now.
We're definitely rewriting features less than we used to. But we did also learn that that's not something we always have to do, so we've made some tweaks to the specification process for Kilo. We now have the concept of fast-track approvals. A fast-track approval happens when you had a Juno specification but, for whatever reason, it didn't manage to get into Juno. We have a very lightweight process where a single reviewer can look at the specification and say, yes, this still makes sense to us, go. That's been cool because it meant we didn't have to pause the development cycle at the start of Kilo to get a lot of paperwork done. The stuff that was previously approved for Juno got re-approved very quickly, and people could start working on it straight away.

We also have a trivial blueprint process. As I said, in Juno absolutely everything had to have a specification. In Kilo, that's not so true. If it's something really simple, like, hey, I want to add a flag that changes the location of this temporary directory, that wouldn't require a specification. What you'd do is write up a blueprint of a few sentences on Launchpad and bring it to one of our weekly Nova meetings. At that point, you get to advocate for it being trivial. We either agree or we don't, and if we do agree, we just approve it on the spot. So that's unblocked a lot of the smaller changes as well.

We also have the concept of backlog blueprints. This is an idea we stole from Keystone, and I had a couple of conversations with large deployers at the Paris Summit and they seemed to really like it. The idea is that if you're a deployer but you don't employ a lot of software engineers, you can say, hey, I've got this problem in OpenStack, or I need this feature. You can use a truncated form of our specification template. So you might write a user story and say these are the use cases.
You might describe the bugs you're currently seeing or the problems you're having, that sort of thing. Then you can stop and send that off for review. It gives Nova core and the user a chance to talk about the feature, what it would look like, its relative priority, and things like that. And then we can approve it into a thing called the backlog. Now, the intention there is that new developers coming along looking for work in OpenStack have a well-defined set of things they could work on. We've already agreed on how it would work, what the user interface would look like, and that sort of thing. They can just pick it up, complete the rest of the template, and then implement it. I think there are some other side effects that will be interesting if they happen. For example, if I were a deployer and I wanted to go out to a contract company to implement the feature, once I've got an approved backlog blueprint, I have a very defined scope of work. So I could go and get a couple of companies to quote on it, pick whichever one I thought was best, and go from there, without ending up with two possibly very different features if I got quotes from two different companies.

The other thing we did is we started defining project priorities, which I think is a really interesting move, but we'll discuss that on the next slide, please. So the idea with priorities is to address the architectural problems we see in Nova. Generally, it's intended for wide-reaching problems or changes that touch the entire code base, things like: we need to be better at live upgrades, or we need better scalability support. I think it's unlikely that you'd ever see a single hypervisor driver listed as a priority. Whilst we love all of our hypervisor drivers, they're not project-wide, so they're not what priorities are intended to fix. To give you some examples, the next slide has a list of the priorities for Kilo.
So first off, we have cells v2. Cells is our scalability story: if you want to take Nova from, say, 500 hypervisor nodes up to 50,000, what you do is break your Nova deployment up into a series of sub-Novas, which form a tree and talk to a parent Nova, and these sub-Novas are called cells. Rackspace, CERN, and NeCTAR in Australia have all deployed cells; it is in use in production. The problem is it's not feature-complete, so there are users who would like to use cells but can't, because they need support for a feature that doesn't exist in cells. We've decided that's the thing we need to address, so there's active work at the moment on a version 2 of cells that is feature-complete. It's going to tweak the exact way cells works, but I won't try to describe that because we don't have time at the moment; you can read the specs if you're interested. But it's cool. I don't know if this work will be finished in Kilo, but I think we'll make a good start on it, and that's important because we are seeing a lot of large deployments of Nova now.

We also have the continued object transition. Now, this is internal, not particularly sexy work, but it's very, very important. This is the transition of all of our data structures from something that was tied very closely to the SQL schema to something that is more custom to our internals. For example, this allows us to decouple a schema upgrade from a code upgrade, so you could upgrade your Nova services without doing the schema migration. The idea is that these objects handle updating the data to the new format on the fly when they read it from the database. But to do that we need to touch basically everywhere that reads from or writes to the database, so that work's been going for a couple of releases now. It's continuing, and I think we're getting close to the end, and it will be very exciting when it lands. The scheduler is clearly an area that people are interested in developing.
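The "upgrade on the fly" idea behind the object transition can be sketched in miniature. Everything below is invented for illustration: the class, field names, and version numbers are not Nova's actual oslo.versionedobjects code, just the shape of the technique.

```python
# Hypothetical sketch: an object that backfills fields added in a newer
# schema when it reads an old-format row, so callers never see old data.

class InstanceRecord:
    """Wraps a raw DB row and upgrades it to the current format on read."""

    CURRENT_VERSION = 2

    def __init__(self, row):
        data = dict(row)
        # Row written by an older release: synthesise the field that the
        # newer schema added, so code above this layer sees one format.
        if data.get("version", 1) < self.CURRENT_VERSION:
            data["locked_by"] = None  # pretend this column arrived in v2
            data["version"] = self.CURRENT_VERSION
        self.data = data

    @property
    def locked_by(self):
        return self.data["locked_by"]


# An old-format row and a new-format row now behave identically:
old = InstanceRecord({"uuid": "abc", "version": 1})
new = InstanceRecord({"uuid": "def", "version": 2, "locked_by": "admin"})
print(old.locked_by, old.data["version"])  # None 2
print(new.locked_by)                       # admin
```

The payoff is exactly what Michael describes: new code can run against a not-yet-migrated database, because the object layer absorbs the format difference.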
We have a lot of people who come along and say, hey, I need to implement a feature a bit like this in the scheduler, and the scheduler just isn't flexible enough at the moment. So we've got a good team now, they're working on that, and they've got a good plan. The ultimate plan is to pull the scheduler out so it's a separate OpenStack service, a project, I guess. But we're not there yet. The first step is to pay down the technical debt on the scheduler, because it wasn't intended to be like this, and make it more flexible so we can implement these cool features that people propose.

The v2.1 API is what the v3 API became, if you've been around OpenStack for a while. It's an effort to strongly version the API that's presented to users. Your client, when it connects to OpenStack, will say, hey, I understand this version, and the server will degrade its responses to that version. For deployments that don't want to upgrade client libraries every month, it means that once they start using the v2.1 API they can say, with a reasonable amount of certainty, that this client maybe only needs to be updated every year or every couple of years, depending on the support cycle for your desktop platform.

We're also working on functional testing. We have always believed in unit testing and integration testing; we haven't been good at functional testing. But there's a team that's excited about fixing that problem, and they're working on it now. Any form of testing is always welcome in OpenStack. We believe strongly in testing everything as much as humanly possible. The nova-network to Neutron migration is more about the Neutron team than the Nova team, but we have this priority here so that we can unblock them whenever we need to.
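The version negotiation Michael describes for the v2.1 API, where the server degrades its response to whatever version the client asked for, can be sketched like this. The field-to-version mapping is invented for illustration; the real API negotiates the version via a request header, not a function argument.

```python
# Hypothetical sketch of "server degrades its response to the client's
# requested version". The mapping below is made up for illustration.

FIELD_ADDED_IN = {
    "id": 1,
    "name": 1,
    "locked": 2,   # pretend this field appeared in version 2
    "tags": 3,     # ...and this one in version 3
}

def degrade(response, client_version):
    """Strip any fields the client's requested version doesn't know about."""
    return {k: v for k, v in response.items()
            if FIELD_ADDED_IN.get(k, 1) <= client_version}

full = {"id": "abc", "name": "vm1", "locked": False, "tags": ["web"]}
print(degrade(full, 1))          # {'id': 'abc', 'name': 'vm1'}
print(degrade(full, 3) == full)  # True
```

An old client asking for version 1 keeps working against a newer server, which is the whole point of the strong versioning effort.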
So if the Neutron team comes to us and says, hey, we need to make this tweak to Nova to make the migration work well, what we're saying here is that we will take those seriously and we will prioritize getting those things into the tree. I'd really like to see us have a really solid nova-network to Neutron migration story in Kilo, and I think we're close to that now. We have a plan; we just need to implement it.

No-downtime DB upgrades are similar to the object transition work I talked about earlier, but focused on the SQL schema. In many database engines, you can add a column to a table without rewriting the table. At the moment, whenever we change the schema, we do the adding of columns and the deleting of columns all at the same time, and that means you are pretty much forced to take a database outage, which will result in an API outage, for example. So what we're doing here is attempting to separate the additive changes, which you can make live, from the contraction of the database, which removes the bits we don't use anymore. The idea is that you'll be able to move to newer schemas in a live environment without affecting users, and then later on you can take a planned outage to delete all the columns that are no longer used. You might bundle up the deletes from more than one migration and do them all at once.

Bugs are obviously important. Every time a user reaches out and tells us they're having a problem, we need to take that seriously. We have had a continued effort to focus on bugs and close them; we actually closed a fair few in Juno. There are always more bugs to close, so this is more recognizing that this is a continuing effort we need to not forget. And then finally, continuous integration. We've always believed in continuous integration; we've always believed in testing, like I mentioned before. There are some aspects of Nova that aren't well continuously integrated at the moment.
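The expand/contract split behind the no-downtime DB upgrade work can be illustrated with a toy example. Here sqlite3 stands in for a real production database, and the table and column names are invented; the point is only that the additive step is safe to run live while old code keeps working.

```python
# Toy illustration of expand/contract schema migrations (invented schema).
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE instances (uuid TEXT, host TEXT)")
db.execute("INSERT INTO instances VALUES ('abc', 'compute1')")

# Expand phase: adding a column is additive, and in many engines doesn't
# rewrite the table, so services running the old code keep working.
db.execute("ALTER TABLE instances ADD COLUMN locked_by TEXT")

# An old-style query is unaffected; new code can start using the column.
row = db.execute("SELECT uuid, host FROM instances").fetchone()
print(row)  # ('abc', 'compute1')

# Contract phase (dropping the columns nothing uses anymore) is the
# destructive step you would batch up and run later, during a planned
# outage, possibly covering several releases' worth of migrations at once.
```

The expand step can run against a live system; only the contract step needs the scheduled outage Michael mentions.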
I'm thinking of things like some storage drivers; some plugins don't have the level of testing that we'd like to see. So again, we'll continue to work on expanding our continuous integration coverage so that we hand our users the best possible quality of software. Next slide, please.

Now, there have been approximately 40 specifications already approved for Kilo. What I've done on this slide is pull out some of the things that aren't priorities but that I think look interesting. This isn't an exhaustive list; if you want to see that, I'd recommend going to the URL I mentioned earlier. The first one, again, isn't a super exciting new feature, but it's something I really want deployers to know about. The Nova code has always assumed that the unique IDs we give to instances are in fact unique, but we haven't enforced that at the database layer, so it is theoretically possible that there are databases out there where that's not true. What we want to do is turn on that enforcement in the SQL database, specifically because cells v2, when we add it, requires this. We've gone to a bunch of larger deployers and asked, do you have any duplicate instance UUIDs? And no one has reported one. But I'm highlighting this because I want deployers to check. So we're going to provide a tool that will scan your database and let you know if you have problems, and then a migration in Kilo will add this enforcement. It's just something to be aware of. I would be surprised if anybody has a problem; if they do, I'd very much like them to reach out and let us know so we can fix it early.

There are also a bunch of non-v2.1 API improvements. These are things that aren't tied up in the versioning project but are still interesting things we're doing to the API. We're going to improve the semantics around instance locking.
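The duplicate-UUID scan mentioned a moment ago boils down to a grouping query. This is a toy illustration using sqlite3 and an invented table, not the actual tool Nova ships; it just shows the kind of check a deployer could run before the uniqueness constraint is enforced.

```python
# Toy illustration of scanning for duplicate instance UUIDs.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE instances (uuid TEXT)")
db.executemany("INSERT INTO instances VALUES (?)",
               [("abc",), ("def",), ("abc",)])

# Any UUID appearing more than once would block a unique constraint.
dupes = db.execute(
    "SELECT uuid, COUNT(*) FROM instances "
    "GROUP BY uuid HAVING COUNT(*) > 1").fetchall()
print(dupes)  # [('abc', 2)]
```

An empty result here is what you would hope to see; any rows returned are the ones that would need fixing before the Kilo migration adds the constraint.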
At the moment, if you want to find out if an instance is locked, you do that by locking it, which is a little bit strange. There's also better result pagination and tagging for instances. There seems to be renewed interest in working on our EC2-compatible API. So there's always a lot of API work happening, but especially in Kilo. Hyper-V and libvirt are looking at adding support for SMBFS as a block volume, which will mean you can mount Samba or Windows file shares as volumes inside your Nova instances; that makes a lot of sense in the Windows world especially. Ironic is going to add config drive support, which will continue to bring Ironic up to the level of compatibility of all the other hypervisor drivers, which is good. And there's obviously continued work on network function virtualization, which is a big deal for our telco users. This work is mostly happening inside the libvirt driver, but it's things like NUMA support, CPU pinning, continued work on PCI devices, offloading, that sort of thing. I expect to see a fair bit of work done on that, and we'll see a fair few features land. And VMware is doing a bunch of interesting work around supporting OVA, which is an image format, ephemeral disks, vSAN, that sort of thing, which will be of interest to VMware users. Next slide, please.

As I said, there are a lot of other specifications already approved for Kilo, but there are even more out there for review at the moment. The URL on this slide, which again will be in the description, is a summary of the specifications that are currently being proposed or have been approved, and I try to write an update to it every month as well. I haven't actually counted, but I think it's in the order of 140 proposed changes to Kilo at the moment.
So if you're interested in a specific hypervisor driver or, for example, the things we're talking about doing to the scheduler, this blog post will help you find out what we're currently proposing. Next slide, please. So anyway, thank you very much for coming along and watching this video; I hope it was useful to you. As always, I'm happy to chat. You can email me, I'm on Twitter, and I'm on Freenode IRC in the OpenStack channels. I do live in Australia, so sometimes there won't be a perfect time zone overlap, so email might be a better way of contacting me, but whatever works for you. I am also just one person, so if you ever have a question and you feel like you want to ask it in a more general manner, please do email our users mailing list, openstack@lists.openstack.org, because there are lots of people much smarter than me who can also help you out. But anyway, thanks again for coming along and watching this video, and that's all I've got.

Awesome. Thank you, Michael. And like I mentioned earlier, his IRC contact information will be in the description of the YouTube video, as well as the links he included in the presentation and a link to his slides on SlideShare. So of course, feel free to reach out to Michael or to us at OpenStack if you do have any questions. Thanks.