All right, so good morning. It would be very helpful if those of you standing would take your seats. My name is Gilad Ben-Yossef, and today I'm going to talk to you about how to use encryption keys with Linux in a way that makes it exceedingly difficult for someone to steal them while still allowing you to use the keys.
But first, let me introduce myself. I'm a principal software engineer working for ARM, in the security IP division. I'm the Linux kernel maintainer for the ARM CryptoCell device driver, and I also dabble with security-related changes to the kernel. I've been working on Linux specifically, and in open source, for a long time. This picture over there is from this same conference the last time it was in France, in Grenoble, 10 years ago. And some of you, those with exceedingly long memories, may remember me as one of the co-authors of Building Embedded Linux Systems, the second edition, now a long-defunct book.
So let's start by describing the Linux crypto API. Linux has a cryptography subsystem as part of the kernel, which offers cryptographic services (encryption, decryption, verification, and so on) both to the kernel itself and to services within the kernel that need encryption. It's also available to user space via a user-space-facing API. It generally looks something like this. We have one of the clients of this subsystem. It could be something that handles data in motion, like IPsec networking. It could be something that handles encryption and security of data at rest, like dm-crypt. Or it could be one of the user space programs via the crypto user space API. These entities address the crypto subsystem inside the kernel and ask for some security or encryption service: encrypt this, decrypt this, and so on and so forth. And inside the Linux crypto subsystem, there are various of what we call transformation providers. Think about some code entity that implements, say, AES encryption. There could be more than one.
We can have a pure software implementation, just written in C. We can have yet another software implementation, but written in handcrafted assembly, maybe using some specialized instructions which are specific to a certain CPU. Or we can have a module, a software module, which provides the same services by basically going out to some other hardware, which is not part of the CPU itself. And we can have more than one of these for a specific cryptographic cipher, say, AES. In the crypto subsystem lingo, we call these transformations because, well, they transform data. You can actually have transformations for stuff which is not purely encryption, like compression, for example, but we're not going to discuss this here. So we have various transformation providers, as I mentioned: for example, a generic software one written purely in C, some transformation provider using specialized instructions, and possibly dedicated hardware.
How do we use these services from inside the kernel? I'm not going over the user space API this time. So here is a very simple usage example. We start by allocating a sort of handle to one of the transformation providers. We specify exactly which kind of cipher we're interested in and in which mode. In this specific example, we want AES operating in XTS mode. Once we get the handle, we continue by setting the key. This is the key we're going to use for all the operations, be it encryption or decryption, with this specific handle. And we follow up by allocating one or more requests, which you can sort of think about as sessions that allow us to actually operate on the data, to provide data for encryption or decryption. By the way, the reason for this relationship between one transformation handle and many requests is something like IPsec sessions or different IPsec packets. They all use the same key, but they are different sessions from the standpoint of encryption.
And then, of course, after we request that operation to be done, encryption or decryption, we have the option, if we want, to receive a notification, a callback, when the operation is done. We can wait for it to be done. And we end by releasing all the resources that we needed. So nothing very out of the ordinary or fancy.
Now you might ask yourself: if I have more than one transformation provider providing the same service, say three of them providing AES in XTS mode, how does the kernel choose which one, and how do I even know which ones are available? The answer is that when each of these transformation providers registers itself with the kernel, it also registers a number, a priority. And the kernel simply picks the one with the highest priority as the one that will fulfill the specific request, assuming, of course, that it matches the cipher and some other attributes. That specific transformation provider can either fulfill the request on its own or delegate handling of the request, usually only some of the requests, to another transformation provider with a lesser priority. That is helpful when we have a very efficient transformation provider that isn't necessarily able to handle all the corner cases. It will handle what it can and pass those it can't to, for example, the generic software provider.
And we have a file, /proc/crypto, that simply lists all the available transformation providers. For each of them, it states the name, which is actually the name of the cipher and the mode we're operating in, which means there is typically more than one of these in the system. There would be multiple entries that state "I am XTS over AES". And it states what is called in the proc file the driver, which is a driver-specific name that distinguishes this provider from all the other transformation providers providing the same service.
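The selection rule just described can be sketched in a few lines of Python. This is only an illustration, not how the kernel implements it: the sample text imitates the general layout of /proc/crypto, but the entries and priority numbers in it are made up for the example, and `parse_proc_crypto` and `pick_provider` are hypothetical helper names.

```python
import re

# Illustrative sample in the style of /proc/crypto. The entries and the
# priority values are invented for this example, not taken from a real system.
SAMPLE_PROC_CRYPTO = """\
name         : xts(aes)
driver       : xts-aes-aesni
module       : aesni_intel
priority     : 401
type         : skcipher

name         : xts(aes)
driver       : xts(ecb(aes-generic))
module       : kernel
priority     : 100
type         : skcipher

name         : xts(paes)
driver       : xts-paes-ccree
module       : ccree
priority     : 3000
type         : skcipher
"""


def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per transformation provider."""
    providers = []
    # Entries are separated by blank lines; each line is "key : value".
    for block in re.split(r"\n\s*\n", text.strip()):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            providers.append(entry)
    return providers


def pick_provider(providers, name):
    """Return the highest-priority provider registered under a generic name."""
    candidates = [p for p in providers if p.get("name") == name]
    if not candidates:
        return None
    return max(candidates, key=lambda p: int(p["priority"]))


providers = parse_proc_crypto(SAMPLE_PROC_CRYPTO)
best = pick_provider(providers, "xts(aes)")
print(best["driver"], best["priority"])
```

On a real system the same functions could be fed the contents of `/proc/crypto` instead of the sample string. The sketch ignores the "other attributes" the kernel also matches on, so it only approximates which provider would actually be chosen.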
So we have the ability to either request a cipher and a mode with a generic name, say AES in XTS mode, or request a specific driver, for example xts-aes-aesni, if you're on a system where that is registered. We'll see later why this is important for our talk. And of course, you can also see the priority, and from that know which transformation provider will probably handle your request based on the specific cipher and attributes that you ask for. So far, so good. Very simple, right?
Wait, let's back up a little. I've shown you this line before: crypto_skcipher_setkey, which says, OK, here's a pointer to the buffer in memory with the key that I want to use for the encryption or decryption, whatever the operation. And this, of course, implies that the key simply sits in regular system memory like anything else. At this point in the presentation, if you're not panicking, it means you are not paying attention. Because thanks to the attack of the cute logo clones, the possibility that some entity ran some code on your device that was able to gain access to memory illicitly and copy that key is something which is very real. It has happened.
So the question raised by this is: can we do something about it? Can we allow cryptographic operations to be done using keys which are in the system but are not as exposed as simply being in RAM? Hardware protected keys are basically the answer to that. And the idea is really very simple. It says: look, one possible transformation provider is really a device driver for some hardware that hopefully can perform the operation we want, AES in XTS mode, for example. And hopefully it does that at a sufficient level of performance, not necessarily, by the way, faster or better than the pure software implementation, especially not the one that uses special instructions, but good enough.
And since it's a different hardware entity from the CPU core itself, and it may have access to storage which is different from the system memory, from the RAM, we will simply put the encryption keys inside this dedicated hardware, in some sort of dedicated key store which is part of the hardware. When we want to use these keys, we will ask the hardware to do the operation on our behalf. We'll give it the data. We'll say, please use this and that key. We'll see exactly how in a second. And then we'll get the hardware to perform the encryption or decryption without having the actual encryption keys sitting in memory all this time.
Now, I will dive deeper into the details in a second, but I want to make a little diversion and speak a bit about the history of the development of this feature in Linux. This is not a new idea. You can think of something like TPM chips, which are kind of the same idea in a different context. Maybe some of you have heard of the Google Titan chip. It's not something new. However, when I, as the maintainer of the ARM CryptoCell driver, wanted to add support to the driver for this concept of hardware protected keys, I didn't see anywhere in Linux a way to support this via the Linux crypto API, which is what stuff like dm-crypt, for example, uses, and which is what we were targeting. So I let my manager know that, well, I have to think about it, and it's probably going to take a while to come up with a way, convince the community, and find an API. And I started to devise something in my head about how to do this with the Linux keys subsystem. That was the general idea.
And then I went to a conference. Actually, this conference, though not this specific one: Embedded Linux Conference US, which was held in Los Angeles a couple of years ago. And just like this one, it was held in conjunction with the Linux Summit.
And while there, I happened to find a presentation entitled "Using Secure Keys for Disk Encryption" from this guy, whose name, I apologize, I'm probably not going to pronounce right, from IBM. And it turns out that IBM supports this notion of hardware protected keys in their mainframes. Moreover, these guys did the work of finding a good way, better than what I envisioned, by the way, of adding this support inside the Linux kernel. And they did the work. It was already upstream. It had actually been there for a couple of years, something like that.
At this point, you have probably realized that I was in a dilemma: what should I do? For a moment, I did consider letting my manager know that I'm working on implementing this and going to the beach for six months, because he wasn't at the conference, so he'd never know. Unfortunately, I'm a really, really bad liar. And quite frankly, I was really nervous that he'd notice the tan. So I came clean and told him that, yes, somebody already did the work. So I know how to do this. I just need to implement the same thing that these guys did. And I proceeded to do just that.
After adding the code and sending the commits, it turned out that the story does not end there. Because when Herbert Xu and David Miller, who are the two cryptographic API maintainers of the Linux kernel, saw in the commit description of my code that, yes, I've implemented this hardware protected keys thing just like the IBM guys, using this API, they were very surprised, because it turns out they were not aware that the API supports this. You might wonder at this point how this came to happen. It turns out that the IBM guys added the support for this in a really clever way, but they added it inside a platform driver, which does not sit in the same directory as the normal cryptographic drivers. So Herbert Xu and David Miller were not aware of this, which is a surprising, although rather sad, turn of events.
But it seemed they did do a good job. Herbert was happy with the general notion. He was just surprised. So he was okay with leaving it as it is and acked my patches. And so Linux gained a second implementation of hardware protected keys, this time not for the IBM mainframe, but for embedded devices, any of the several SoCs that use ARM CryptoCell. As a side note, I found it really interesting. This is the kind of thing I really like about working on Linux specifically: you need to do something for this relatively small embedded device, and you find out some IBM guy has already added something for the mainframe, and it's the same operating system, and you just use it. It doesn't normally happen with other software projects. It's kind of nice.
Okay, so let's delve a little into how to use this thing. Let me reiterate: how does this work? We have our, say, ciphertext that we want to decrypt. Of course, it can also work the other way around. And we have the key. Normally we would give the ciphertext and the key and ask for the plaintext. The way it works with protected keys is that we have our ciphertext, but instead of giving the key, which we don't have access to and don't want to have access to, we give some sort of a tag to the implementation of the transformation provider supporting this. That implementation, in some sort of secure domain, typically in separate hardware, though it doesn't have to be, takes the tag and the ciphertext, and through some magic understands from the tag what the right key is to use for the decryption in this case, or encryption, or what have you. It does the operation and returns the data after decryption without ever divulging the key or even letting it sit in system memory. That makes the key, at least in theory, much harder to steal, because even if you magically have access to system memory, the key is simply not there.
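To make the flow concrete, here is a toy Python model of it, assuming the tag is simply a slot number. Everything in it is hypothetical: `ToySecureDomain` stands in for the separate hardware, and the "cipher" is a throwaway XOR keystream built from SHA-256 rather than AES, chosen only so the sketch runs with the standard library. It has no security value.

```python
import hashlib


def _toy_keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 over key+counter. A stand-in for AES, not a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


class ToySecureDomain:
    """Hypothetical model of the secure domain: keys are provisioned here and never returned."""

    def __init__(self, provisioned_keys):
        # slot number -> key bytes; in real hardware this would be a dedicated key store
        self._slots = dict(provisioned_keys)

    def transform(self, tag: int, data: bytes) -> bytes:
        """Encrypt or decrypt data with the key the tag refers to (XOR is its own inverse)."""
        key = self._slots[tag]  # the key is looked up inside the domain only
        stream = _toy_keystream(key, len(data))
        return bytes(a ^ b for a, b in zip(data, stream))


# The caller only ever handles the tag (here: slot 1), never the key itself.
domain = ToySecureDomain({1: b"provisioned-key-one", 2: b"provisioned-key-two"})
ciphertext = domain.transform(1, b"attack at dawn")
plaintext = domain.transform(1, ciphertext)
print(plaintext)
```

The point of the model is the shape of the interface: the caller's memory holds data and a tag, while the key material exists only behind `transform`.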
Now, nothing in this description says what this tag is, and if you think about it, there are at least two things it can be. One is something like an index: use key number three, please, which somehow the hardware in question knows how to associate with a certain key. The other thing the tag can be is the actual key itself, but encrypted, possibly with some other attributes that let the hardware know which key to use to decrypt the real key that is used for encryption or decryption. Both options are completely valid, and in fact, the two implementations of this idea that we have in the Linux kernel right now, the IBM one and the CryptoCell one, each use a different approach. So this is something to keep in mind: we have these two possibilities. And basically we don't really care, because the tag that we provide is some sort of an opaque object that sits in place of the real key, and in itself it is meaningless to an attacker from a security perspective, hopefully.
Okay, before continuing with the details, I must stress that of course this is not a silver bullet. Things can go wrong and they will go wrong. Usually when interrupts are disabled, but that is a different talk. The actual security of this scheme greatly depends on the security of what is called the secure domain. If we implement it in a secure domain, be it hardware or software, which is vulnerable to some other form of attack, some other side channel and so on and so forth, we can of course still leak the key. There's no magic here. So we're hoping that the secure domain is indeed more secure than the rest of the system. This is not necessarily the case, but it can be. And of course, if somebody attacked us and gained access to the system, well, even if it's very hard or hopefully even impossible for him to get the key, he might not need to get the key. Because he does have the ability to encrypt and decrypt.
Still, it is interesting to note that this means the attacker gains the ability to encrypt and decrypt, using the key, only material which is present in the system at that time. It does not give the attacker the ability to encrypt or decrypt future versions of the data that may be transmitted or obtained later. And it does not let the attacker decrypt or encrypt older versions of the data that might have been dumped before, for which he is missing the key. So it does extend the measure of security that encryption provides. It's not meaningless, but it's not a silver bullet.
And of course, key provisioning and management is always a problem. This doesn't make it easier; it probably makes it harder. How do you get the key material inside the device? Who does it? How is it secured? There can be many variations here. As I said before, one of the schemes available is to actually provide the key inside the tag, but encrypted; that's a little bit easier. Another can be, for example, to put it, during manufacturing of the device, in some safe EEPROM memory which is not accessible from outside, and so on. It is, however, a good component in a defense-in-depth strategy. And this is something which is very important to keep in mind. This is not a silver bullet, but it can allow us to make safer systems.
Okay, so a few more details about how the interface works. Here I must warn you that some of the details, obviously, are specific to ARM CryptoCell. First, the idea is that we simply use the name of the cipher, say AES, SM4, what have you, and we add the prefix "p", for protected, to denote that this is a protected hardware key implementation of this cipher. So instead of using aes, we will ask for paes, or psm4. And because, as I mentioned before, the tag value is really implementation-specific, most normal operations, at least at this time, will use the driver-specific name to refer to the transformation.
So instead of asking for paes in XTS mode, we will ask specifically for xts-paes-ccree; ccree is the moniker for CryptoCell. Because I want to use the hardware protected keys, this requires that I provide a tag which is in a specific format. This is something to keep in mind. It's a little ugly, but it is what it is for now. And then instead of a normal key, we provide the tag in whatever format is appropriate for the implementation that we use. For example, for ARM CryptoCell, it's basically a small blob that says which index in the internal storage holds the key (we have an option for two keys) and what the real key size is. Because as part of the API, when we provide the key, we need to specify the key size, but because of the mechanics of the thing, we're actually passing the size of the blob. So part of the tag becomes the real size of the key in the slot. And again, this is of course CryptoCell specific.
Now, in the end, this is all nice, but we really want to use it with something, right? So I'm showing here a small example of how to make use of this feature if you have it on your system. By now, you should realize that you can tell if you have it on your system by parsing /proc/crypto and seeing if you have paes, for example, or psm4 enabled. This is an example of how to use dm-crypt to encrypt or protect a storage device with hardware keys. You can do it with cryptsetup, but it involves a few more details, so I chose this way because it's a little clearer. Basically this is the normal invocation, and there are two differences. One of them is that in the place right here where we specify the cipher and mode to use, we use the format of dmsetup that lets us specify a crypto API name, because dmsetup supports two different ways to specify the cipher and mode.
One is a legacy one, which only knows a specific list of ciphers. And the other one says: just give me the crypto API string. And then instead of saying XTS over aes, we say XTS over paes. So that's one difference. The other is that when we provide a key, instead of giving a real key, we give one of the blobs in the format I've shown you before, assuming you're using CryptoCell. In this case it specifies: okay, this is the key size and I want to use slot number one.
This means that when dm-crypt does its magic and comes up with a block from storage that it wants to encrypt or decrypt, it will call the crypto API. In this case, because of the priority (in other cases, I could use the driver-specific name), it will use the paes cipher in XTS mode instead of aes. Instead of giving the real key, it will give this blob, and what the driver, the transformation provider, will do is hand over to the hardware the buffer to encrypt or decrypt. But instead of giving the key, it will use the information in the blob to say: oh, and by the way, use the key in slot number one, without really knowing what this key is at all. And of course, something needs to happen, and that is implementation specific, in order for that slot to actually hold something meaningful. For example, in my specific example, it means that the manufacturer has burned the keys during manufacturing, with the right tools, into the OTP memory, which is part of the hardware. But of course, this is completely implementation specific.
So we get the general idea of how this works. I would like to say that there are two things I want you to get out of this talk. First, obviously not enough people know about this. I mean, if the maintainer of the crypto subsystem didn't know about this feature, I guess nobody does, right? So here it is, me letting you know: hey, this feature exists.
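The dm-crypt setup described above can be sketched as follows. This is a hedged illustration, not the exact invocation from the slides: the blob layout in `make_slot_tag` (a 16-bit little-endian key length followed by two 8-bit slot numbers) is an assumption made for the example, and the function names, the sector count, and the device path are all hypothetical. The table line follows the general dm-crypt layout of start sector, length, `crypt`, cipher, hex key, IV offset, device, and device offset, with the `capi:` prefix selecting the crypto API naming format.

```python
import struct


def make_slot_tag(real_key_len: int, slot1: int, slot2: int) -> bytes:
    """Pack a slot-style tag blob.

    Assumed layout for illustration only: 16-bit little-endian key length,
    then two 8-bit slot numbers (XTS can use two keys).
    """
    return struct.pack("<HBB", real_key_len, slot1, slot2)


def dm_crypt_table(sectors: int, device: str, cipher_spec: str,
                   key_bytes: bytes, iv_offset: int = 0, dev_offset: int = 0) -> str:
    """Build a dm-crypt table line; the 'key' field carries the hex-encoded tag blob."""
    return (f"0 {sectors} crypt {cipher_spec} {key_bytes.hex()} "
            f"{iv_offset} {device} {dev_offset}")


# Hypothetical values: key size 32 and slot number 1 for both keys.
tag = make_slot_tag(32, 1, 1)
table = dm_crypt_table(2048, "/dev/mmcblk0p3", "capi:xts(paes)-plain64", tag)
print(table)
```

The resulting line is what one would hand to `dmsetup create` as the table for the mapped device. Note that the only differences from an ordinary setup are the `capi:xts(paes)-plain64` cipher string and the fact that the key field holds a short tag rather than key material.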
If the system that you're using happens to have hardware supporting this, and it makes sense, then by all means consider using it. The other thing is that CryptoCell (actually I know this, but I can't name names) is not the only hardware that has the ability to support this notion of protected keys. And of course, it doesn't have to be dedicated hardware. One can think about, I don't know, secure enclaves, TrustZone, what have you; anything goes, really. I would like to encourage you to make use of this API and offer this ability to your customers, because that will help us make it more mainstream and enable the system builders of the world to build better systems, more protected systems, which is sadly very important in this day and age.
I'd like to conclude, before getting to questions, with a few thoughts I had about what's next. It's not really something which I necessarily plan to work on, or where I have a specific idea of what should be done, but something I would like to share with you in the interest of discussion. First, is this paes thing really the right way to represent this? Originally I thought that something like a template, something like "protected hardware keys", that specifies the cipher and the key format would be slightly better, especially if we get into a situation, which I think is more interesting, where we have more implementations that share key formats. Because if your hardware, for example, uses this notion of a key slot and key length, which is pretty sane and normal, then why would you want to invent some other key format? And if so, then all of these implementations can share the key format, but they can't share the key format with the implementations that use the encrypted key notion. So specifying which one it is and sharing the format is, I think, a good idea, and right now this is not really possible with the way the naming works. So this is one thought.
Another idea is that, well, this feature kind of sounds like something which is already in the kernel, and that is the kernel keyrings. They have protected keys; they can use and store keys which are locked or sealed or bound to the TPM and so on. So this sort of feels like something that needs to be more integrated, both with the kernel keyring itself and with the TPM notion in general. I mean, yes, the TPM is a different kind of animal. It's used in different contexts, but the general notion is the same. It is kind of sad that right now we have two, maybe three, depending on how you look at it, different ways to do something which is more or less the same thing.
That's it. This is what I have to say. We do have some time for questions, but just before that, if it's okay with you, I'd like to take a picture to prove to my boss that I'm actually doing stuff when I'm here, and not at the beach or at a restaurant, since this is Lyon. If you don't want to be seen in the picture, just duck. Cheese. Okay, that was all from me. Now it's your turn: questions, remarks, recipes, whatever.
Hello. That was easy. So you said that there is a possibility to have encrypted keys as tags. How would those tags be generated? Because it's probably not via this API.
Yes, it means that you need to have some other way to encrypt the keys in the beginning. And that way is specific to the device. So for example, the IBM implementation, which I don't know all that much about, has some other utility, some other way to say, okay, here is the key and I want it encrypted. And my understanding is that this does not necessarily have to happen on the actual machine itself. You can have some sort of other way to do it. It means that the entire management of this thing and the key distribution are left out of scope here, because they're different per implementation. All right. No one else?
I have another one?
Sure.
So basically any user space process could use this API to use the same keys. As far as I understand it, because in the normal crypto API I just pass my key in, I could, as a normal user, use the same key that root used to set up my disk encryption.
That's a good question. As far as I remember, there is no permission check on the users of the user space crypto API. So if you enable it, and you don't have to, right, it's an optional feature, then I think you're right. Yes. It needs to be said, though, that there's some assumption in building this. The assumption, which is not the normal way we think about this, is that anything which is contained within the kernel and the memory it touches is sort of in the same security domain. And although we, as operating system programmers and users, think about the user space/kernel border as something very important, and it is important, the assumption behind this work is that it's also porous. And as we see with Spectre and so on, that assumption is sometimes true. Yes, is somebody there?
Do you see any performance impact using the hardware key in cases like IPsec?
Well, it's very much dependent on which hardware you are on, the properties of the system and so on. It can be faster; it can be slower but have better power characteristics; it can be the same. Just the fact that you're actually using hardware doesn't in itself mean anything. It doesn't mean it's faster, it doesn't mean it's slower. And performance is a multifaceted thing. It might be that you get slightly less performance, but for much less power consumption. So there's really no way to say anything without relating to a specific system and use case. Okay, my time is up. So there's...
I may have missed the first part of the session, but it seems that to fetch a key or store a key you have to provide a tag or handle to a key, right?
Yeah, so is there any way to protect or check this tag or handle in the secure domain or in CryptoCell? Because this handle or tag is usually just a number value, so maybe someone can just walk through the...
So that's a good question. First, the answer is very much implementation dependent. It's not really part of the API. It depends on the capabilities of the hardware. Specifically for CryptoCell, with the latest version of this IP, the answer is yes, you can check a lot of properties relating to the key, and do things like allow encryption but not decryption, or allow encryption but just for a certain number of bytes, and so on. But this is really outside the scope of this talk because it's specific to CryptoCell. So the answer is yes, but it's specific to the particular IP, that specific implementation of this API.
All right, I think our time is up. So thank you very much. And if somebody still wants to ask me questions, I'll be just outside. Thank you.