time to start. I want to talk a little bit about Key Per I/O and some of the impact that's going to have on the I/O stack. I want to start with a refresher of the encryption that exists today and the way that it works. Basically, you've got a self-encrypting drive that's taking data in, encrypting the data at line speed, and saving it out on the media in encrypted form. So if you lose your device, or somebody steals it, then there's encrypted data. They can't really just get at it very easily. But there are still some issues with that. You can see that all the encryption information is actually stored in the drive itself. The drive is the thing that manages the keys. The drive generates the keys, it stores them, and it keeps track of all of that, so that if you lose the drive, you have lost not only your data, you've lost the keys with it. So if somebody knows how to get at those keys, then they could still get at your data. So there are some ideas about how we can do that better and get those keys out of the drive.

The other thing is that there's not a lot of subdividability within the device in terms of the keys. Most of the time, it's the whole device that's encrypted with the same key. If you happen to set up a couple of partitions, say you've got three partitions set up on your drive, you can have different keys for each of the three partitions, but it's divided up by LBA space. So you have to know ahead of time how much space is going to be used by each of the keys. It's not very dynamic. You can't really change those partition sizes on the fly and change the keys on the fly. So this is the way self-encrypting drives work today.

I have a comment. Android phones use a different model. Encryption keys in Android phones are stored outside the storage medium in a separate chip. I'm sorry, I couldn't hear. Android phones follow a different model.
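The static, range-based model described above can be sketched roughly like this. This is a minimal illustration, not any vendor's implementation; the partition table, key IDs, and function names are all hypothetical.

```python
# Hypothetical sketch of today's range-based self-encrypting-drive model:
# the key is chosen by pre-allocated LBA range (partition), not by who
# wrote the data. Ranges must be sized up front and can't change on the fly.

# Pre-allocated partitions: (start_lba, end_lba, key_id), fixed ahead of time.
PARTITION_KEYS = [
    (0, 99_999, "key_a"),         # partition 1
    (100_000, 199_999, "key_b"),  # partition 2
    (200_000, 299_999, "key_c"),  # partition 3
]

def key_for_lba(lba: int) -> str:
    """Return the key used to encrypt a given LBA under the static model."""
    for start, end, key_id in PARTITION_KEYS:
        if start <= lba <= end:
            return key_id
    raise ValueError(f"LBA {lba} is outside any pre-allocated range")
```

The limitation the talk calls out is visible here: to re-key or resize a partition you would have to rewrite the table and the data, because the key choice is baked into the LBA layout.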
In Android phones, encryption keys are stored in a separate chip outside the storage medium. He said that Android phones store the keys on a different medium, not the storage medium. For this technology, the keys are still stored in the device. They're persistently stored in the device somewhere to be able to decrypt the data. The keys are not external to the device. The keys never leave the device. They don't come in from the outside and they can't be retrieved from the device through the APIs. You'd have to figure out some way to break in to get the key. But the point is the key is in the device. It is co-resident with the data somewhere in that physical device.

So the goal with Key Per I/O is to be able to have an external key manager, which can keep track of the keys. Keys can then be injected into the device and used to encrypt certain pieces of data. You can have a large number of keys that get managed externally, and they can be inserted as they're needed and removed when they're not needed. They're not persistently stored anywhere in the device. So in this case we've got three different examples. There are objects A, B, and C, which each have their own streams of data going out to the device. And you can see that when the data gets stored on the device, it gets stored based on the source it came from. The encryption key that's used to encrypt and protect that data is based on where the data came from, not on an LBA range that was preallocated. So each of those objects is going to have its data individually encrypted. At that point it's much easier to delete that data, because all you have to do is delete the key, and all of a sudden that data becomes unretrievable. It also allows a much more dynamic use of the capacity of the device, as each of those three objects, A, B, and C, uses different amounts of storage.
They will just dynamically increase however much capacity they use, and their data will get encrypted with their key as it uses that space. So basically, what it looks like in the device is that when the keys are injected, they're associated with a tag. They get a value that references the key. Keys are huge. When they're stored in KMIP databases you've got 512 bytes or more of key data. Sometimes there are multiple keys that are used in differing ways through differing algorithms to protect the data. So the keys themselves are not something you're going to send with every single I/O. The injection process associates the key with a particular tag, and then each read command and each write command will reference that tag, so that the device will know how to go find the key and use it to encrypt the data for a write or to decrypt the data for a read. That ends up requiring some kind of change through the read/write I/O path to be able to specify that tag. You can see in this case we've got a write going out with a reference to the orange key tag. Key number three goes out to namespace one. When the read goes out to namespace one, it's using the green tag, which is key tag zero, but that's referencing key number two inside that namespace. You can see each I/O has the ability to specify its own tag to get to its own particular key. Then there's an example of a multi-tenant use case where we've got four tenants storing and retrieving their data out in a fabric, an array controller where all the data is stored on a bunch of different devices, and each tenant's data is individually encrypted with their personal encryption key.

So I'm going to look at... Fred, in that diagram, does each tenant still know the key encryption key? Is it the same for each of them? Well, that's going to depend on the layering which must be present on each tenant. It depends on the key manager, but that presumably depends on the tenant.
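The inject-then-reference flow described above can be sketched as follows. This is an illustrative model only, not the TCG/NVMe wire protocol: the class, method names, and the XOR stand-in for XTS-AES-256 are all assumptions made for the sketch.

```python
# Illustrative model of Key Per I/O on the device side: keys are injected
# into volatile storage and bound to a 16-bit tag; each read/write then
# carries the tag, never the key itself. XOR stands in for the real
# line-speed cipher (XTS-AES-256 per the talk).

class KeyPerIODevice:
    def __init__(self):
        self._keys = {}    # key tag -> key material; volatile, never persisted
        self._media = {}   # LBA -> (key tag used, ciphertext)

    def inject_key(self, tag: int, key: bytes) -> None:
        if not 0 <= tag <= 0xFFFF:            # key tag space is 16 bits
            raise ValueError("key tag must fit in 16 bits")
        self._keys[tag] = key

    def remove_key(self, tag: int) -> None:
        del self._keys[tag]                    # data under this tag is now unreadable

    def _xor(self, key: bytes, data: bytes) -> bytes:
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def write(self, lba: int, data: bytes, tag: int) -> None:
        self._media[lba] = (tag, self._xor(self._keys[tag], data))

    def read(self, lba: int, tag: int) -> bytes:
        _stored_tag, ciphertext = self._media[lba]
        return self._xor(self._keys[tag], ciphertext)  # KeyError if tag not loaded
```

Note how crypto-erase falls out of the model: once `remove_key` drops the tag's key, reads that reference it simply fail, even though the ciphertext still sits on the media.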
It depends on what layer the key management is happening at and who knows about the key, and that's what I was going to look at: get off the API a little bit and start looking at some of the use cases. So here's a case where we have three tenants. They're talking out through an NVMe fabric to an NVMe array controller, and that array controller would like to be able to independently protect the data for those clients. So that array controller has a built-in key manager with a KMIP database, and it sends keys to its back-end devices to encrypt and protect the data from each of those clients. So if a system admin comes along and says, I want to delete tenant number one and I want to make sure their data has been securely erased, I can go out and delete the green key from the KMIP database within the NVMe controller in that fabric device. Now that green data cannot be retrieved again off of any of the multiple SSDs that may be in that RAID set, and I've been able to do that without having to overwrite the data, without having to do any kind of scrambling and using up the wear life of my SSD. Meanwhile the data for tenants two and three, the yellow data and the red data, is still intact. And it doesn't matter which LBAs it's in or how much data has been used or not used; I didn't have to pre-allocate or decide any of that ahead of time. In this case, those tenants know absolutely nothing about the keys or the key tags. This is strictly a storage feature, one use case. If I elevate that to where I have a set of virtual machines running in a hypervisor, I now have a case where the hypervisor has its own KMIP database keeping track of these keys. It extracts the keys out of the KMIP database and injects them into the devices, in this case a single namespace.
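The crypto-erase scenario just described can be sketched like this. All names here are illustrative (the tenant IDs, the toy RAID-set layout), and the point is only the mechanism: deleting one key leaves every LBA untouched but makes one tenant's data unrecoverable.

```python
# Hedged sketch of multi-tenant crypto-erase: the array controller's KMIP
# database holds one media key per tenant. Erasing a tenant means dropping
# the key - no overwrite, no scrambling, no SSD wear consumed.

kmip_db = {"tenant1": b"green-key", "tenant2": b"yellow-key", "tenant3": b"red-key"}

# Ciphertext stays on the SSDs regardless; only key possession decides
# recoverability. Toy layout: tenant1 owns every third LBA.
raid_set = {lba: ("tenant1" if lba % 3 == 0 else "tenant2", b"<ciphertext>")
            for lba in range(9)}

def secure_erase_tenant(tenant: str) -> None:
    """Crypto-erase: drop the key. The LBAs themselves are untouched."""
    kmip_db.pop(tenant)

def recoverable_lbas(tenant: str) -> list:
    """LBAs whose data can still be decrypted for this tenant."""
    if tenant not in kmip_db:
        return []
    return [lba for lba, (owner, _) in raid_set.items() if owner == tenant]
```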
So when VM one sends its data to the hypervisor, the hypervisor attaches the green key tag, the key tag associated with the green encryption key, and sends the data out to the namespace to be encrypted with that particular key. The write commands from the other VM, VM two, that's the yellow data, go down to the hypervisor, the hypervisor attaches the key tag associated with the yellow encryption key and sends that on to the device. You can see that the hypervisor is managing it all, and these virtual machines know nothing about the keys or the key tags. Over on the right-hand side in that next picture, we now have some applications that have become key-tag aware, encryption aware, where those applications have within their system access to a KMIP database. So I have multiple applications running within a particular VM, and maybe I have three different colors of data within each VM, where now that VM is aware of the encryption key that was sent to the device and aware of the key tag that was associated with it. And in that particular example, the hypervisor is totally unaware. The information is just getting passed down through. The reference to the key tag is part of the SQE for the read command and the write command. So that's going to depend on what the interfaces are between the virtual machines and the hypervisors, what the interfaces are into and out of that hypervisor, and how that key tag information can get passed along.

And in your diagram on the right, how does each of the three separate KMIP databases get the key wrapping key, the KEK? That's going to require some coordination. Each of those VMs thinks it's a machine, and they can go into their KMIP database, get their key out of the database, and pick a tag to associate with it. But because they're all going to the same device, there has to be some coordination in there. So they have to use the same wrapping key?
Well, they have to make sure they don't overlap key tags, for example. Right, I understand that. I'm concerned about the security of the wrapping key, because we have media encryption keys and we have key encryption keys to make sure we have secure transmission of keys between the hosts and the device. So the algorithms to do that could be set up differently. It would be possible to do them differently, but the easiest way, as you say, would be to do it the same. The hypervisor could set it up once, and the VMs could just be the users of it. It's going to depend on what those interfaces are between the VM and the... If you have the hypervisor set up the key wrapping key, the KEK, that means that anybody getting into the hypervisor can intercept the key, and then they can intercept your media key on the way down. Right, and that's why this is a complex topic. Where do you put the KMIP database management? Who do you allow to have access to that information? What do the APIs look like as you're passing the data through? What is the layering? And that's the next example: if you've got applications in a VM that are trying to use it through a hypervisor that is aware of it, onto a storage controller which is doing its own encryption, then there are some complications there. So that's where I want to make the group aware that this is well along in the committees, and that there are different kinds of use cases which can use this. Some of the easiest ones, like this one, are totally transparent to all of the applications and to the host. It just exists. The one on the left here is strictly within the hypervisor, and the VMs are not aware of it. So you've got a trusted hypervisor in this case with a common KMIP database for all of its VMs. The common security methodology is usually to use an ephemeral key for the wrapping key.
So you use one wrapping key per transaction that goes down to the disk. Once it's been used, you throw it away and create a new one. It sounds like you're using more long-lived wrapping keys, which then have a theft problem. They can be. They don't have to be. The protocol allows either. Or I should say the protocol at this point; it's not done. This is not a done deal. This is still a work in progress. The keys are not persistently stored external to the KMIP database. So if the device falls off the back of a truck, then yeah, the data is still there, but the keys are gone. You lose power, you lose the keys. When the host powers back up again, that device has to get the keys re-injected, because there are no keys there. And yes, you can have a new set of key encryption keys every time you have to send new media encryption keys to the device. Okay, that's what I was probing on, because that's the security best practice. We're still looking at the different ways to do that. There are pre-shared keys. There's secure manufacturing where you have some of the public/private keys set up ahead of time. There are certificate algorithms. All of those are part of the discussions happening within TCG, who is working on this protocol. So it's the security people that are trying to figure out what the best ways are to do that. And it is still being worked on. It is still a work in progress. So this is sort of an FYI: we have to think about the APIs and how this key tag for each I/O, each read and write command, is going to get tagged down through the I/O stack.

We have something very similar to this already in FC, because we're tagging traffic on the FC fabric with a VM ID. And so we have a cgroup attribute that was added, which is effectively a UUID for the VM, that we upcall on the I/Os to obtain and then translate to a tag. This sure has a lot of the same aspects of that as a possible way to do this. How? I mean, so for the...
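The ephemeral-wrapping-key idea from the exchange above can be sketched as follows. This is only a shape, not the TCG protocol: XOR stands in for a real key-wrap algorithm (something like AES key wrap), and the function names are made up for the sketch.

```python
# Sketch of a per-transaction ephemeral KEK: a fresh wrapping key is
# generated for each key injection, used once to protect the media
# encryption key (MEK) in transit, then discarded, so there is no
# long-lived wrapping key to steal. XOR is a stand-in for real key wrap.

import secrets

def wrap(kek: bytes, key: bytes) -> bytes:
    """Toy one-time-pad wrap/unwrap (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(kek, key))

def inject_with_ephemeral_kek(mek: bytes):
    kek = secrets.token_bytes(len(mek))   # one-shot wrapping key per transaction
    wrapped = wrap(kek, mek)              # this is what crosses the wire
    unwrapped = wrap(kek, wrapped)        # device side unwraps on arrival
    # kek goes out of scope here; nothing long-lived remains to be stolen
    return wrapped, unwrapped
```

A long-lived KEK would just hoist `kek` out of the function, which is exactly the theft concern raised in the question.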
Mic's not working. What if VM one gets a hint of VM two's tag? It does a plus one and guesses it, and then suddenly I can randomly try to read from the namespace. Is there any protection against that? That would have to be protected at the VM layer, at the hypervisor layer. The hypervisor is going to have to be the one that allocates the key tags to each of the VMs, and it can't allow key tag references coming from non-approved VMs. The security layering, as you look at different use cases for where this could possibly be used, has different sets of implications for who's going to have to do the checking and authorizing and all of that.

For enclave technology we definitely want an untrusted hypervisor. So a hypervisor can manage the tags, as long as it doesn't know what they mean, but it can't manage the keys if it's not trusted. In the model on the left, the hypervisor is managing the keys, and in the model on the right, the hypervisor is not managing the keys. The keys are being managed in each of the VMs. How about making a list of the kinds of threats this model wants to protect against?

The answer my colleagues came up with was to store the encryption keys in a separate chip, such that they are not in DRAM and not in the storage device. I didn't get all of that. It was very quiet. So I'm on the Android team, and the answer my colleagues, who are security experts, came up with (I'm not a security expert) is to store the encryption keys in a separate chip, such that they are protected from any software that runs on the host CPU and then tries to steal the keys. Yeah, that doesn't work in this case, because an Android phone is a single-user device. This is a multi-user, multi-tenant device. You would want one Titan chip per VM.

So some are moving towards having direct access to the namespace, and if we have to pass these through, the hypervisor will get bogged down looking at each I/O.
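The hypervisor-side check discussed above, allocating tags to VMs and rejecting references from VMs that don't own them, can be sketched like this. The class and method names are hypothetical; the point is that a "plus one" guess from a neighboring VM is filtered before it ever reaches the device.

```python
# Illustrative sketch of hypervisor tag brokering: the hypervisor is the
# sole allocator of key tags and vets every I/O's tag reference against
# the VM that issued it, so one VM cannot guess its way into another's tag.

class TagBroker:
    def __init__(self):
        self._next_tag = 0
        self._owner = {}   # key tag -> VM id that was granted it

    def allocate(self, vm_id: str) -> int:
        """Hand out the next free key tag and record its owner."""
        tag = self._next_tag
        self._next_tag += 1
        self._owner[tag] = vm_id
        return tag

    def authorize_io(self, vm_id: str, tag: int) -> bool:
        """Permit the I/O only if the issuing VM owns the referenced tag."""
        return self._owner.get(tag) == vm_id
```

As the discussion notes, this broker never needs the keys themselves, which is why an untrusted hypervisor can manage tags but not keys.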
So if we just open up the gates and say, here's the drive and you can access it: are there any thoughts about it being authenticated when we create the keys? Like let's say I have a VM and I pass in the full NVMe device, and we create the keys, we provide the key, and then all the I/Os that go through are kind of pre-tagged with that key. Then we would do the security checks at that point, and the I/Os can just flow. But in NVMe, any queue can send any command to any namespace, right? So the authentication thing wouldn't work, or would necessarily need to break the current NVMe specs.

So what's the major advantage to having this capability? Is it that if you had a bunch of short-lived containers or something, this provides you a fast way to ensure that whatever data they happen to leave behind, nobody else is going to be able to read when you kill the instance? Or is it that you're really trying to protect against somebody stealing the actual back-end device? And how does this interact with a fabric environment? If you wanted to encrypt the data in flight over the wire, you presumably would be encrypting it twice. This is for protection of data at rest. If you want to protect data in flight, then you're going to use different technology. You're going to use IPsec, you're going to use TLS, you're going to use something different for protecting data in flight. That's not this. This is sending data in the clear over the wire and making sure that when it's stored, it's encrypted at its at-rest location.

The other thing that a host that is key aware is going to have to deal with is the size of the key cache. Think of some of the most extreme applications for this: every user in the world gets their own key for every one of their applications.
They have their own private key for bank number one, a separate private key for bank number two, a third private key for Facebook, a fourth key for Twitter, and they decide, oh, I want to be forgotten from Facebook: erase my Facebook encryption key, and boom, all their data is gone. But that means an awful lot of keys, and you're never going to fit all of those in any one device at any point in time. So the host that's doing that interaction is going to have to be aware of which keys are in the device and which keys are in the KMIP database, be able to figure out which keys are being used the most frequently to keep those loaded in the cache, and then take out the least frequently used and inject a new key if someone needs to do I/O with a particular key that hasn't been referenced in a while. So there's a whole management aspect of keeping track of where the keys are and which ones are being used at any given point in time.

This is just the beginning. How far will it go? Will it ever get to the point where we'll all have a couple of hundred or a thousand of our own private keys, for every person all around the world? Probably not, but those are some of the extreme cases that people are thinking we may eventually want to get to at some point, maybe not with this technology, but at least those ideas of being able to secure your data. Sounds like a startup idea. Thousands of keys you need to manage, and your startup can build the tool that makes it easy to manage those thousand keys. Well, it sounds like a storage manufacturer came up with the idea of storing all of those thousands of keys, and maybe a CPU manufacturer came up with the idea of doing all the compute associated with the encryption and decryption, and there are lots of conspiracy theories you could invent.

Is there anything in this standardization that's talking about the actual encryption algorithm, or is it just the keys? You said something about the standardization.
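The key-cache management the speaker describes, keeping hot keys loaded in the device and evicting the least recently used, is essentially an LRU cache on the host side. A minimal sketch, with hypothetical names and a plain callable standing in for the KMIP lookup:

```python
# Illustrative host-side key cache: the device can only hold a bounded set
# of injected keys, so the host evicts the least recently used key and
# re-fetches from the KMIP database (then re-injects) on a miss.

from collections import OrderedDict

class DeviceKeyCache:
    def __init__(self, capacity: int, kmip_lookup):
        self.capacity = capacity
        self.kmip_lookup = kmip_lookup        # key id -> key material
        self._loaded = OrderedDict()          # keys currently in the device, LRU order

    def key_for_io(self, key_id: str) -> bytes:
        if key_id in self._loaded:
            self._loaded.move_to_end(key_id)  # hit: mark as most recently used
            return self._loaded[key_id]
        if len(self._loaded) >= self.capacity:
            self._loaded.popitem(last=False)  # evict LRU key from the device
        key = self.kmip_lookup(key_id)        # miss: fetch and inject on demand
        self._loaded[key_id] = key
        return key
```

This is the "paging" answer given later for a 10-million-key KMIP database behind a 16-bit tag space: only the working set lives in the device at any moment.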
Is any of this talking about the actual encryption algorithm, or is it just the keys? Right, I mean, does it specify what you're going to use to actually... We've got three different groups involved. We have OASIS involved because they own KMIP; we've got TCG involved because they own the security send/receive protocol where this information is exchanged; and we've got NVMe involved because of the SQE, the place where we have to get the key tags inserted. I think the earliest slide had AES XTS, which is a standard encryption algorithm. The algorithm that we've picked so far is XTS-AES-256, but there is no requirement that that be the one that is used forever and ever. The protocols that TCG has designed include the ability to have different algorithms, so better algorithms can be adopted as they come out. What's the news? The quantum-computing-secure algorithms are going to eventually be coming along, and elliptic curve came out at one point as a new style of encryption algorithm, so the capability to use new algorithms is all embedded in the protocol, because those groups are well aware that algorithms don't live forever.

Anything else? Did he say how big the key tag space was? The key tag space is a 16-bit value, so up to 65K at once. Now if you've got a KMIP database with 10 million keys, then you just page them. That seems big enough. That was the hope. It was also that we didn't have many more bits left than that in the read/write SQE. There's a list. Here we go. It's not just read and write: compare, copy, verify, read, write, write zeroes, and append. Any command that does I/O has to have the ability to carry a key tag.

I have a question. You said the space is 64K, and I think someone else asked this, and if there's a good answer I didn't hear it, but what prevents me from just trying keys and reading other tenants' data? If it's that small a space, what stops me from doing that?
Is there any sort of extra authentication, or is it literally just a tag being passed down? The authentication happens during the connection process when you're injecting the keys, so that assumes the host will allow you to do that. If you have an insecure host that will allow you to send tags you shouldn't be sending, so that you can attempt to access that data, then yeah, that tag would go through and that encryption key would be used to decrypt the data; it's the host that has to ensure you're allowed to use that tag. This is for protecting data at rest after it falls off the back of the truck. It's not for protecting against an insecure host that's allowing unauthorized access.

So in the one diagram you've got, with the VMs passing their own keys into the storage array, how do you prevent the key index overlap? Ask the hypervisor vendor. This is not substantially different. I mean, if you go back to the case where we had the self-encrypting drive, the drives that exist in systems today, the drives that are in probably all of our laptops, are encrypting data, and we've never done anything with any keys, because that's not the use case for that form of encryption. What this does is allow you to have more keys. It takes the existing algorithm that is per-LBA based and extends it so that instead of having to be pre-allocated by LBA range, it is now per I/O, so that each LBA could in theory have a unique encryption key associated with it. That's the intent of this project: to create that type of security. It's not a be-all end-all for all of the other security problems that are still out there. All right then, we're getting pretty close to lunch.