All right, good afternoon, everyone. Welcome to the final session of the day, which is a storage session. We had several storage sessions earlier today, so this is a good way for us to zoom in and see what we've done in this cycle. By the way, my name is Sean; for those of you who don't know me, I'm a principal project manager in OpenStack, focusing on storage across the board. With me today I have John Bernard, who's one of our senior software engineers, focused on Cinder. And I have to apologize on behalf of Flavio: he's actually in the room next door. Flavio is also the Zaqar PTL, so he's running a design workshop in parallel, along with another Glance design session. He'll be joining us at the end, I hope. If we look at today's talk, the title is very overwhelming, right? Enterprise-Ready OpenStack Storage. And the first question you need to ask yourself is, well, what does enterprise-ready even mean? But also: why a road? Are we there yet, and if not, why not? I want to take you on a journey for the next 40 minutes through some of the stations we pass on the way to enterprise-ready, including high availability, volume management, business continuity (which is a very fancy word for backups), disaster recovery (for those of you who missed my session yesterday on "where is my volume?"), and deployment and rolling upgrades, which is a big area of pain today in OpenStack. And we're going to finish with where we are on security. We have all of these concerns whenever we talk about storage, let alone when we need to deploy storage in production on OpenStack. So the theme we're after is: what do we need to do in order to be production-grade ready in enterprise environments? That's the session for the next 40 minutes, and without further ado, let's kick the tires.

So we'll start with high availability. When we talk about high availability, we first need to define what it means in the context of storage. In a nutshell: all services that power the OpenStack APIs should be always on and able to respond, during failures but also under massive stress. So when we talk about high availability, it's also in the context of scale. We need to provide protection against both hardware and software single points of failure. With that in mind, let's take a second to see where we are today, starting with Cinder, the block storage service for OpenStack. There are cases today where a volume is left in an unrecoverable state, where it isn't even possible to delete the volume with manual intervention by the cloud admin. For example, if a Cinder volume node dies, for whatever reason, during a volume create request, that volume will be left unresolved. So this is an area of concern. As for where we are today: you can run the volume service in an active-active arrangement, but as you saw in the talk earlier, we don't think the current implementation of active-active is actually safe. Yes, it's true. Active-passive, I think, works fine. For active-active, at the very least there are some classic race conditions in the volume API service, where the volume state is queried from the database and some local state is created that is not shared with the other Cinder volume nodes,
and then later the database is updated to reflect the new state. This opens the door for Cinder volume nodes to race against each other in an active-active configuration (there's a small sketch of this pattern at the end of this section). So to be clear, an active-active Cinder is certainly not safe. I think some shops even run it in this configuration, but if you push hard enough on it, it will break, I'm sure of it. It's something we're working on now; in fact, a lot of visibility has come onto it in recent weeks and months, and people are coming together to try to offer a solution. There are a lot of pieces involved in getting Cinder to a place where it can run safely in active-active mode, and there are some blueprints posted and even some design sessions tomorrow to cover these things. So we're moving there, and we're aware of the shortcomings, but at the moment, this is the current state. Thank you, John.

So as you see, I wanted to start today's discussion with where the problems are. And this is a big problem: if we want to deploy OpenStack in production with block storage, we need to fix these things. Some of the progress we've made in the last cycle, in Kilo: iSCSI multipathing. Nova compute supports multipathing for the iSCSI volume data path. However, some backends only respond to discovery with a single portal address, even if secondary portals are available. Work was done during this cycle to enable Cinder drivers to return multiple iSCSI paths' information, to overcome this problem. However, this is only one step, on the block side; there's still enablement work to be done on the Nova side. As you'll see as we move on, storage is not an island in OpenStack. It has to integrate, it has dependencies, and there's a lot of cross-project integration needed to enable even a single feature; iSCSI multipathing is one example.

Horizon. The reason we bring it up here, in the context of high availability: we now have a front door, what we call the Migrate Instance button, which gives the administrator a simple way to prepare hosts for maintenance or upgrades. It's very useful for upgrade scenarios, and also for testing and performing manual disaster recovery. So we now have a push button to automate this process. But are we there yet? Well, for live migration, certain configurations do work, but there's still some work to be done. There's an existing bug in Nova with the way the block device mapping is maintained that can cause problems if the LUN changes; we identified it earlier in a design session, and I think we should be able to knock it out pretty quickly. Further, multi-attach in Cinder should also help with that: driver backends that can participate in multi-attach should benefit, making live migration less of a dice roll and more of something you can rely on.

All right, with that, let's talk about concurrent resource mutation and how we're going to tackle it with some new initiatives in Cinder. The road to active-active, that's what we call it. There's a new initiative called the Cinder state enforcer. By the way, we're including links in each one of the topics we cover today, so you'll be able to download the slides at the end and go to the relevant specs to get the full picture of what's going on.
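To make the shape of that race concrete, here's a minimal, self-contained Python sketch. This is an illustration of the check-then-act pattern and the conditional-update alternative, not Cinder's actual code; the table and status names are invented for the example.

```python
import sqlite3

# Illustration only -- not Cinder code. In an active-active deployment, two
# cinder-volume nodes would effectively run delete_volume_racy() concurrently.

def delete_volume_racy(db, volume_id):
    # Step 1: query the volume state from the database.
    (status,) = db.execute(
        "SELECT status FROM volumes WHERE id = ?", (volume_id,)).fetchone()
    # Step 2: decide based on a local, possibly stale copy. Another node can
    # transition the same row between this check and the UPDATE below.
    if status != 'available':
        raise RuntimeError('volume %s is busy' % volume_id)
    # Step 3: write back, silently clobbering the other node's transition.
    db.execute("UPDATE volumes SET status = 'deleting' WHERE id = ?",
               (volume_id,))

def delete_volume_atomic(db, volume_id):
    # The check and the transition happen in one conditional UPDATE, so only
    # one node can win the 'available' -> 'deleting' change.
    cur = db.execute(
        "UPDATE volumes SET status = 'deleting' "
        "WHERE id = ? AND status = 'available'", (volume_id,))
    if cur.rowcount == 0:
        raise RuntimeError('volume %s is busy' % volume_id)

db = sqlite3.connect(':memory:')
db.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, status TEXT)")
db.execute("INSERT INTO volumes VALUES ('vol-1', 'available')")
delete_volume_atomic(db, 'vol-1')  # a second caller now gets 'volume is busy'
```

Moving the allowed state transition into the write itself is the general direction the enforcer work described next is headed.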
A few words on the state enforcer: this is longstanding work on improving Cinder volume state management and reliability. It allows us to improve failure tolerance, and that's the key to mitigating the concurrent resource access problems in Cinder. Work was done in the last cycle to refactor the concept of a lock, so we can have a set of allowed and disallowed transitions using the new enforcer model. We're basically trying to get rid of those unsafe mutations in Cinder.

More on the road to active-active; this is actually a link to a very active etherpad on the topic, covering the progress so far and the things we still have to untangle. For example, the local file locks in cinder-volume, and the need to enhance state reporting to Nova based on the volume's actual state. As I mentioned earlier, there's a very tight relationship between compute and storage, and this is exactly it: if we have an unresolved state in Cinder, what do we want Nova to do about it? Right now Nova doesn't even know that we have a problem, let alone how it could act on it. So that's one area. Another is DB access in drivers: today we have a lot of direct access by Cinder backend drivers to the Cinder database. This needs to be minimized, or eliminated entirely. Why? Because this database is what maintains consistency, and if some external driver is writing to it directly, we're done. So that's in the context of consistency; we're trying to improve that. Cinder objects should address this as well; I think we have a slide on that coming up. Also, Nova shouldn't be inspecting the internal state of Cinder volumes to decide what actions to take; rather, it should properly delegate, the detach for example.

Other work being done in this area on the road to active-active is TaskFlow for managing create-volume tasks; there's an active spec for it. State management improvements can get a step closer by leveraging what we call TaskFlow. This one covers just the create-volume tasks; if we're able to nail it down and it's proven, then we can expand it to the rest of the states. So that's another way we can untangle this problem.

Moving on to volume management. I think you just mentioned multi-attach, right? Do you want to talk about it? Yeah; the ability to attach a volume to multiple hosts at the same time is a feature that was being worked on in Kilo, and it's landing, but there's still some work to be done to expose it in Nova and in the Cinder client. So it's moving forward, but it certainly hasn't crossed the finish line yet. This is the fine print, right? If you read the label: it's landing in Kilo, but you cannot use it yet; we need Nova. Just to explain, the main use case is clusters: we want to run a cluster at the hypervisor level or the application level, and we need to map the same volume to two hosts. So obviously compute has to play a role here, and we've only done the Cinder implementation side.

With that, let's talk about volume migration; that's another key area. Volume migration has been here for a while, as has volume retype, but what's the confusion around them? Thanks, Sean, I'm glad you asked. For migration and volume retype, Cinder provides these abstractions, and under the hood it tries to carry them out as intelligently as it can. It makes some choices and gradually falls back to things that should work if the most intelligent choice fails. For migration, it hands the operation down to the driver to see if the driver can do a migration that benefits from backend references or copy-on-write or something like that. If that fails, if the driver says "no, I can't support that operation," then Cinder falls back to generic migration, which attaches both volumes to the host and execs dd to move the data. So what you get when you ask for a migration varies depending on your configuration and on what exactly you're doing. Retype works the same way: it will attempt the retype within the driver, and if that fails and a migration is required, it calls into the migration routine. So there are a lot of paths, and when things fail, the reason is not always obvious; sometimes an operation succeeds or fails depending on the parameters you send to the command. So if you are doing volume migrations, please take a closer look at how to use them, because the behavior you may want is not the default. For example, you can use volume retype to kick off a migration, but you have to ask for that explicitly.
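To make those paths concrete, here's roughly how the two operations look from the Kilo-era python-cinderclient. Treat the exact flags as an approximation and check `cinder help migrate` and `cinder help retype` on your own deployment:

```console
# Explicit migration to a destination backend. Cinder tries the driver's
# optimized path first, then falls back to the generic attach-and-dd copy.
$ cinder migrate <volume-id> <host>@<backend>#<pool>

# Retype alone only changes the type where the driver can do it in place:
$ cinder retype <volume-id> <new-volume-type>

# To allow a retype to trigger a migration, you must opt in explicitly:
$ cinder retype --migration-policy on-demand <volume-id> <new-volume-type>
```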
With that, let's talk about what's coming up in Liberty. In Liberty, I'm working on a feature that will allow drivers that don't support local attachment via iSCSI or Fibre Channel or something similar to participate in volume migration. I mentioned earlier that if all else fails, migration falls back to attaching the volumes locally and then using dd, which obviously requires a file path to the block device. For something like RBD, there is no path to a block device in that case. How many of you in the room use Ceph? Raise your hands. All right, so this is an interesting one for you. Just checking that you're still with us. All right, thank you.

Moving on: business continuity. As I said, that's a very fancy word for backups, so let's talk about where we are with backups. Progress this release: we were able to land incremental backups, and the backup API was extended to support snapshot-based backups, where the volume can remain online and in use during the operation. The target can be either Swift or NFS. The enhancements also include performing a backup from a snapshot. This is a typical use case: you don't want to run your backups against the original volume, because there's a performance cost to that; it's better practice to do it from a snapshot. A new Cinder CLI option was added so you can actually use this. With Swift as the target, the Swift API is used to calculate the deltas between snapshots. And when a differential backup needs to be restored, the restore process first restores a full backup, similar to what we did with tapes in the old days: a full first, then the incrementals follow.

Other improvements in backup; this is a very strong theme in the Kilo release of Cinder, we're closing a lot of gaps. One of those gaps: until now, we only had Swift as a backup target. Now we have NFS as a target, and POSIX file systems as a target, which is very important. Backup support for encrypted volumes: what do we do with an encrypted volume? There was no support for encrypted volumes in backup; now there is. So as you see, we're climbing the ladder toward more robust backup. However, there are still gaps, and we're going to talk about them when it comes to scaling the service.
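As a rough sketch of how these Kilo additions surface in the CLI (flag names per the Kilo-era client; verify with `cinder help backup-create` on your release):

```console
# Full backup of a detached volume:
$ cinder backup-create --name nightly-full <volume-id>

# Incremental backup: stores only the deltas since the most recent backup:
$ cinder backup-create --name nightly-incr --incremental <volume-id>

# Backup of an in-use volume, taken via a temporary snapshot under the hood:
$ cinder backup-create --force <volume-id>

# Restore replays the full backup first, then the increments, tape-style:
$ cinder backup-restore <backup-id>
```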
This is Nova progress, but I'm mentioning it here in the context of block. Why? Because backups are a cross-project topic, as you can see. This is why we group these topics together: as you've already figured out, this isn't work that happens just in Cinder or in Glance or in Swift; it has to happen across the board to enable a feature. Nova, on KVM specifically, is leveraging the QEMU guest agent, which supports quiescing of the file system out of the box. Why is that important? Because if you're running a backup, you want the backup to be consistent. So that's the first stage that was done, and it's very useful when you take a backup, specifically if you run upgrades or maintenance; you'd better make use of it. In the future, we'll be able to leverage capabilities that are already available today in the QEMU guest agent on KVM but still need to be exposed in OpenStack, such as hooks that let you quiesce a guest application, putting a MySQL database into hot backup mode, for example, so you get full application consistency for your backup. On Windows, you can use VSS through the guest agent, et cetera. So there's still room for improvement, but this is a major step forward for backup consistency: not just crash consistency, but file system consistency and, later, application consistency.

Swift: one of the biggest features to land in this release is erasure coding. Erasure coding is implemented, by the way, as one of the storage policies, and as you remember, storage policies were a big feature that landed a release ago; now we're able to build on that. In a nutshell, erasure coding allows you to reduce the storage costs associated with massive amounts of data, such as backups. It's of course very useful when performing volume backup to Swift, which is a typical scenario in OpenStack; the data can also be compressed, and it suits data that is written once and read rarely. So I think that's a very nice feature. One red label, though: this is a beta feature. Don't use it in production yet, you may lose your data; and since we're talking about backup and restore, that can really work against you. However, if you want to test it, that will help us get it mature enough to use. So that's where we are.

More backup improvements coming in Liberty. Cinder backup scaling: as I mentioned, we've come a long way with Cinder backups, but we still have a scale problem. Currently, the Cinder backup service and the volume drivers are coupled and have to run together on a single node; we're trying to decouple them so we can scale the service out, and also address the performance problems we've been seeing. There are design sessions dedicated to this topic. The last one on the backup side is Swift fast-POST, where a POST to an object triggers a container update, to guarantee metadata consistency in the container.

So we had a busy release for business continuity; now let's talk about disaster recovery. In Cinder, we introduced consistency groups a few releases ago, and several enhancements landed in this release: the ability to add and remove volumes from an existing consistency group, and the ability to create a consistency group from an existing group snapshot. We'll come back to consistency groups later in the context of replication as well.
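Pulling those pieces together, the consistency-group workflow looks roughly like this with the Kilo-era client; the two commands marked as new are the enhancements just mentioned, and the exact flags may vary by client version:

```console
# Create a group bound to a volume type, and create a volume inside it:
$ cinder consisgroup-create --name appgroup <volume-type>
$ cinder create --consisgroup-id <group-id> --volume-type <volume-type> 10

# New in Kilo: add or remove existing volumes:
$ cinder consisgroup-update <group-id> --add-volumes <volume-uuid>

# Snapshot all members at the same point in time:
$ cinder cgsnapshot-create <group-id>

# New in Kilo: create a new group from that group snapshot:
$ cinder consisgroup-create-from-src --cgsnapshot <cgsnapshot-id> --name appgroup2
```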
What else have we done in Liberty? Looking ahead (we're here at the Liberty design sessions), one of the things we're going to introduce in Liberty is Cinder snapshot import. Today, using the OpenStack snapshot mechanism, I can create new images, which is very convenient when upgrading base images or taking a published image and customizing it for local use. But one use case we haven't covered is external use, and this is why we need this feature: it allows us to import volume snapshots from one Cinder to another. Think about all the different use cases that come along with that. It also allows importing non-OpenStack snapshots: think about snapshots already residing on your storage backend. Now you can bring those snapshots in, in a similar way to what we do when importing existing volumes. So those are the two themes added here: the ability to import snapshots from another deployment, and from an existing backend.

More, as I promised: replication. Replication has been around for a while; we've been pushing on it since Icehouse. The current API state is that several vendor backends have implementations, but it's not a unified API, meaning we still have a way to go to properly expose, for example, whether we're doing synchronous or asynchronous replication. Not all drivers are written the same way. This is why we're seeing slow adoption of the feature among vendors; there's still a way to go, and we have a design session to tackle exactly this topic. Specifically, I want to talk about two things we missed in version one of volume replication in Cinder. One of them is replication between Cinders: until now, we're limited to basic replication within a single deployment. The other: I mentioned consistency groups; we added volume replication, we have consistency groups, but we haven't connected the two. We need to align that work and its design with the relevant volume replication spec. So that's also work ahead of us.

Looking at deployment and rolling upgrades. This is another topic we hear about from the enterprise: it's not just high availability, it's not just business continuity; we also need to solve problems around deployment. Here's work we've done in Glance. You want to cover this one? Sure; you've been talking for a while. Yeah, you're very good at it. So Glance has seen a lot of cool features this cycle; I'll cover just two of them. Introspection: the ability to query metadata about an image without downloading the entire image to do so. This is great: it lets you know whether a compressed image is going to expand within the amount of space you have remaining, without trying it first and watching it fail. And image conversion: the ability to convert an image during the import process. For instance, if you're using something like RBD for both Glance and Cinder, and the images are in raw format, you can take advantage of copy-on-write. This lets you ensure all of your images are ready for that, if that's your configuration, which is really nice: you don't have to re-upload images, or keep copies of them in different formats, and so on and so forth. It really simplifies deployment. Yeah, cool.
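To see what that conversion buys you, this is the by-hand equivalent of what the import task can now do server-side; standard qemu-img plus the Kilo-era Glance CLI, with invented file names:

```console
# Ceph RBD can clone raw images copy-on-write; a qcow2 image would have to be
# flattened first. Inspect, convert, then upload as raw:
$ qemu-img info ubuntu.qcow2
$ qemu-img convert -f qcow2 -O raw ubuntu.qcow2 ubuntu.raw
$ glance image-create --name ubuntu-raw --disk-format raw \
      --container-format bare --file ubuntu.raw
```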
All right, with that, let's take a closer look at the next topic, deployment and rolling upgrades. Let's start with a small utility, but a very handy one; I've worked with a lot of customers for whom this is an area of pain. A very long-lived OpenStack installation will carry around database rows for years and years. A good operational story is having a way to purge deleted rows, possibly on a schedule, like a cron job, or as needed before upgrades or prior to maintenance. The new utility lets you clean up rows already marked for deletion: you specify a minimum age, calculated as a time delta in days, given on the command line. So that's a very nice utility for admins, and the link is there.
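For reference, the utility is a cinder-manage subcommand that takes the age in days. A sketch of typical use (take a database backup before running it against production):

```console
# Purge rows that were soft-deleted more than 30 days ago:
$ cinder-manage db purge 30

# Or wire it into cron, e.g. weekly on Sunday at 02:00:
# 0 2 * * 0  cinder-manage db purge 30
```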
Next, implementing force detach, to allow self-cleanup of a stuck volume. Remember stuck volumes? We talked about them about ten minutes ago: we have all these unresolved states, a volume stuck in attaching or detaching, and there was no safe way to clean up the storage involved on the backend. You could use the Cinder CLI to reset the state, but that isn't a complete answer, because it only changes the Cinder database; it may leave the volume exported to the compute host, and may leave entries behind in Nova. This feature is there to resolve exactly that. And as I said, the work is done in Cinder, and in this case as well, there's work to be done in Nova to finish it.

With that, we want to talk about one of the biggest features coming up in Cinder, one that's going to really, really simplify rolling upgrades: Cinder objects, which support upgrades by using versioned objects. These objects are isolated from the database schema and contain the required information for communicating about an operation, and they can be sent over RPC. The work actually already started in Kilo; it's a big piece of work, and we have design sessions specifically on this topic, so look at your schedule if you want to attend. And of course, compute is already there with Nova objects; we're trying to do the same thing in Cinder.

More on deployment and rolling upgrades: Cinder storage policies. As you know, if I'm a storage vendor and I have unique capabilities in my backend, I can write a driver and use extra specs to create a volume type that matches those characteristics. However, we've ended up in a state where a lot of vendors expose very nice features, but with no consistent way for the cloud admin to actually use them. So we need to improve the visibility of storage policies to the cloud admin, whether through the CLI or Horizon. Think about capabilities like quality of service or replication factor: one vendor may offer gold and bronze tiers, another may offer IO-bound limits. We need to come up with a more standard API and be able to expose it all the way up to the user. So this is more of an admin-facing feature.

And with that, we come to the last topic, which is security. One notable feature that was added is private volume types: a new ability to define private volumes based on volume type, which can now be restricted with this new flag. Private volume types serve special needs, where most users should not be able to select that volume type. Think about bringing up new high-performance volumes that I only want to expose to a specific development or production group, and not to any other users, because they cost me a fortune just to set up and connect to my cloud. So I need a way to restrict those volume types to particular users; private volume types allow that, and of course I can control it by adding or removing a project from the type's access list.
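A short sketch of that flow with the Kilo-era client; the type name and project ID are invented, and the exact flags are worth confirming with `cinder help type-create`:

```console
# Create a volume type nobody can see or use by default:
$ cinder type-create --is-public false fast-ssd

# Grant it to just the project that should be using it:
$ cinder type-access-add --volume-type fast-ssd --project-id <dev-project-id>
$ cinder type-access-list --volume-type fast-ssd

# Revoke it when the project no longer needs it:
$ cinder type-access-remove --volume-type fast-ssd --project-id <dev-project-id>
```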
More security in Liberty. Glance image signing and encryption; this is actually one of the sessions going on next door right now. Today there's no way to guarantee that the image you asked Glance for is the image you actually got in Nova, because we have no way to sign images. This feature has been discussed in the past, and it finally looks like it's going to land in Liberty: image signing and encryption, with Barbican as the key manager. For those of you who haven't looked at it, Barbican is a newer service that provides centralized key management in OpenStack, and we're trying to leverage it, as you can see here, in the context of images. And of course the goal is to provide image integrity.

Horizon volume encryption. Volume encryption itself was done in previous cycles; now Horizon is following. Support for volume encryption in Horizon is almost there: some of the work is already done, as I said, but it will continue in this cycle to finish it.

Let's talk about object storage and where we are with Swift in terms of security features. We have two new features coming. Encryption at rest: currently, objects are typically stored in disk files on a standard POSIX file system. This provides the option for Swift operators to have objects stored in encrypted form. Think about the life cycle: when disks reach end of life, they can be discarded without being properly wiped, leaving information that someone can access. We'd much rather have it encrypted. And if you're using object storage in OpenStack, which you are, because you're storing your backups in it, you want this feature.

Moving on, another important feature is Swift composite tokens and service accounts. It's reusable, because it was implemented in Keystone recently and Swift is already making use of it; we might as well utilize it in Cinder and elsewhere in the future. Composite tokens allow other OpenStack services to store data in Swift on behalf of a client, such that neither the client nor the service can update the data without both parties being involved: you need both keys to the safe in order to open it. An example: a user asks Nova to save a snapshot of a VM; Nova passes the request to Glance; Glance writes the image to a Swift container as a set of objects. Afterwards, the user cannot modify the snapshot without also having a valid token from the service, and likewise the service cannot update the data without a valid token from the user. As I said, if you want more information on any of these, just use the links we attached.

And with that, let's zoom in on some final thoughts. As we saw, we came a long way in this release toward enterprise-grade storage in OpenStack. But as you also saw, we're climbing a ladder with a lot of steps, all the way from where we are in high availability of the services, and how much work we still have to do just on active-active, for example. We saw the progress we made in providing backups with the Cinder backup service, including the incremental backups added in the last cycle, the NFS and POSIX targets, all the great improvements; but we still have to work on the scale of the service, because otherwise we run into performance problems. We saw a lot of improvements around volume management; you saw how many things happened in volume migration, for example, simplifying the work there, as well as rolling upgrades and all those aspects of deployment. So as you can see, the topics we touched on today range all the way from simple utilities that make admins' lives better, up to full disaster recovery capability.

With all of this moving forward, the picture is one of steady progress: we're advancing in each one of these areas, and that's the net result. I think that if Juno was pretty much OpenStack enterprise-ready v1, Kilo is definitely already v2, and looking at Liberty, we're really, really getting there. There's still a lot of work to be done in each of these areas, as you see, and this is why we're here at the design sessions. Any final thoughts from you? I think you covered it. I think we're moving all of the pieces in the right direction, and it'll be a busy summer, but hopefully we'll land some of these. Because you're not taking a vacation. Yeah, yeah.

All right, and with that, we want to open the mic for questions on anything you just saw. I'll leave this up as well, so you can download the slides and get more information. Any questions? Please use the mic there, yeah. I was just wondering whether you support PXE boot for images in Glance? Basically, we want to manage our own images on the appliance, and we want to do a network boot from a standard PXE server, so what we would store in Glance would just be an iPXE client that bootstraps the real image, which lives somewhere else. That should work. Yeah, so that works today, right? Yeah; I haven't tried it firsthand, but those pieces should fit together the way you would expect. One other question: when we store images today, they're not encrypted in Glance, correct? That's correct. Okay, so that's something that's coming up next? Correct. Sorry, I came in late. All right, thanks. Sure, more questions?

Hi, when you talk about QoS, are you planning to provide commands to control flow or any of the properties on the array side? So, quality of service has actually come a long way in Cinder, and today it's pretty much up to the backend to report its capabilities, as I said, using extra specs. One of the things you see us trying to do is standardize and simplify that. So please attend the design sessions to have an impact, if you want more things included there. But I think there's already a great variety of storage drivers exposing rich quality-of-service functionality, and if we need more, that's exactly why we have these sessions, so you can have an impact. Thank you. And just one more question: what about remote replication, anything you're planning? That's a good topic. As I said, volume replication has come a long way. We actually gave a talk yesterday about disaster recovery in OpenStack and the things you can do today out of the box.
But again, it's still a work in progress when you're talking about replication between sites. As you saw from my presentation, one of the things we're trying to address in v2 of replication is the ability to replicate between Cinders. There's a lot you can do today, though. I would not recommend a stretched cluster topology, because of the latency, but there's plenty you can do with different backends, like Ceph with RBD. Thank you.

Can you expand a little bit on what is missing for volume encryption in Horizon? My understanding is that there are two types of encryption, front end, which would be handled by Nova, and back end, which would be, I guess, handled by Cinder. Do you support both? What exactly is the state? Volume encryption, is it exposed in Horizon? You may know this better than I do. Yeah, exposing it is exactly what we're working on right now. If I understood the question correctly; can you maybe repeat it in a different way? You mentioned there were some issues, that volume encryption wasn't fully finished, so I wanted to know what those issues are; and depending on your answer, whether both front-end and back-end encryption are supported, basically what the state of volume encryption is. So, Cinder encryption is already there today. What's missing is a way for us to expose it, and to your question, that's the work taking place right now. It started in the last cycle, it's been ongoing, and hopefully we'll be able to finish it. This is exactly what we're tackling right now. Okay, thanks. Thank you. Last question? If not, I want to thank you all for surviving the day through to the last session, and I want to thank John. Thanks, man. All right, thank you.