Hi, everybody. This talk is about using enclave technologies to help secure secrets in a distributed world. So if you're in the wrong room, please stay. This is going to be a better talk than whatever else you wanted to do.

My name is Luis Miguel Wapaya. I'm a principal technologist at Wind River. My specialty is security. I've been a hacker for 44 years. I hacked my first bank when I was 14 years old. I started as a phone phreaker in the late 70s, then a game cracker on the Apple II in the early 80s, and I progressed from there. I've basically spent a lifetime doing security-related things.

So we're going to set the stage for the talk. We're going to talk about StarlingX, but it's more than just StarlingX. It's open source in general. If you look at all the different offerings across the board, you end up having issues that are very common. StarlingX is often used to deploy very large-scale distributed systems, and these systems are often geographically dispersed. We're talking national level and sometimes international level.

The other problem, which I wouldn't say is new but is definitely becoming much more prevalent, is the fact that the endpoints that run software are physically insecure. For many years, people designed software and systems under the assumption that where they live and run is physically secure, like in a data center, but this is no longer true. If you look at 5G, for example, or telco in general, you will find servers that are effectively in a steel cabinet at the bottom of an antenna, and this one is literally in the middle of a farmer's field. So physical security of these devices is basically non-existent. There's either a padlock that locks the cabinet or there's a lock, and you can pick them. You can easily open them. Some people say, oh yeah, but they're alarmed. Well, if I'm in the middle of a farmer's field and the alarm goes off, I have a good 20 minutes before anything shows up, so I can do lots of bad things in 20
minutes.

So that causes a bunch of problems. The first one is that StarlingX, and open source in general, uses components that handle very sensitive information. The most predominant one you'll find is obviously TLS keys. You look at Kubernetes, you look at Vault, you look at anything with an API endpoint: there's a TLS key behind that, right? These are a few of the things that we run in StarlingX. Not all of them, because there are just too many and I couldn't be bothered to find all the icons, so I got lazy at some point. But they all have secrets in some form or another.

And so you end up having too many instances where sensitive information is effectively a file on disk, and I mean a plain text file on disk. You're going to end up with a PEM file that has a private key inside of it on disk, or a token that's saved on disk, or credentials like a username/password file on disk. And that's, you know, pretty bad.

One of the examples I like to give is disk encryption. Anybody here ever used dm-crypt with LUKS? Yes? Okay. So one of the things that you need is a passphrase. If you're on a system that's unattended and has to be able to reboot automatically, the challenge with that is: where's your passphrase? It's going to be a file on disk in plain text. Well, that's kind of crappy, because now I can just unlock your encrypted disk, because you gave me the information that I need. By the way, as a side note, disk encryption is only good against physical attacks. It's not good against insider attacks or code injection attacks.

Other components implement storage encryption, for example Vault. Anybody here use Vault? Right. To unlock your vault, what do you need? Yeah, you need some form of credentials.
Well, if you're on an unattended system, guess where those credentials are? They're in plain text in a file somewhere, right? And so you end up having roots of trust that are in plain text. Even though you might have a layer of systems where it's like, oh, you know, my key is encrypted inside of a file. Yeah, it's encrypted by what? Oh, it's encrypted by the system over here. And what's the root of trust for that? Well, that's plain text on disk.

That ends up being a pretty big problem, especially in very large-scale ecosystems where you have tens of thousands of nodes everywhere. If you're able to compromise one node and steal a bit of information on it, you can now use that as a trampoline to start attacking other nodes in the system. So we end up in a situation where sensitive information is either in the clear on disk as is, or it's protected by a series of mechanisms that eventually end up being plain text on disk. It's basically turtles all the way down.

So there's more. Too many deployments, and we're going to talk about TLS keys a little bit here, fail to use TLS best practices. I don't know if everybody's aware of what they are, but one of them is that you should never use a long-lived TLS key directly. If your key is like, this thing lasts one year, and you use it directly, you're doing it wrong. TLS keys, optimally speaking, should be ephemeral in nature. When your service pops up, it should generate a key pair. It should sign the certificate for it.
It should use that, and it should last for a very small period of time. Good, well-secured services out there will actually have TLS keys that last 10 minutes, maybe 30 minutes, and then they just rotate continuously. The reason you do this is that you have to work on the premise that your environment will get compromised at some point. And when someone does compromise these services, the private keys are in memory in a piece of software: Kubernetes, Vault, whatever. The private key that is behind that certificate is in memory. So if I can compromise your environment and read memory, I can steal your key. Right? So if your key lasts 10 minutes, and my attack takes five minutes to do, and then using that key to attack something else takes me another five minutes, by the time I get there the key has expired. So you want to reduce the window of usability when it comes to people stealing your TLS keys.

The other thing that people do is self-signed TLS keys. How many people here have made services that use self-signed TLS keys? Raise your hands. Don't be afraid. It should be most of you, by the way. Some of you. Yeah, there you go. Self-signed TLS keys are absolutely useless unless you use certificate pinning. But the problem with certificate pinning is that nobody actually does it securely or correctly. So yeah, you do certificate pinning, but the way you distribute your certificates is completely insecure, and if I'm a man in the middle, boy, am I going to have fun with that, right? There are no revocation capabilities with certificate pinning. And also, it's always long-lived keys. You can't rotate keys every 30 minutes if you use certificate pinning, unless you have the most amazing certificate pinning system on the planet. Right?

Most TLS keys in open source end up being files on disk. Right? So if anybody compromises your environment and is able to just read a file, they can just steal your key, and that's not a good spot to be in if you have long-lived keys, right?
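
The 10-minute example from the talk can be put into numbers. What follows is a minimal sketch in Python, not anything shown in the talk: the function name `usable_window_s` and its parameters are hypothetical, and it assumes the best case for the attacker, where the attack starts the moment the key is issued.

```python
# Toy arithmetic only: how long a stolen TLS key stays usable.
# All names here are hypothetical and exist only for illustration.

TEN_MINUTES = 10 * 60
ONE_YEAR = 365 * 24 * 60 * 60


def usable_window_s(lifetime_s, age_at_attack_start_s, steal_s, stage_s):
    """Seconds of validity left once the attacker is ready to use the key.

    lifetime_s            -- how long the key/certificate is valid
    age_at_attack_start_s -- how old the key already is when the attack begins
    steal_s               -- time needed to compromise the host and read memory
    stage_s               -- time needed to set up the follow-on attack
    """
    elapsed = age_at_attack_start_s + steal_s + stage_s
    return max(0, lifetime_s - elapsed)


# Best case for the attacker: the attack starts on a freshly issued key.
short_lived = usable_window_s(TEN_MINUTES, 0, 5 * 60, 5 * 60)
long_lived = usable_window_s(ONE_YEAR, 0, 5 * 60, 5 * 60)

print(short_lived)  # 0: the 10-minute key has expired by the time it is usable
print(long_lived)   # almost the whole year is still available to the attacker
```

The same five plus five minutes of attacker effort leaves nothing to work with against the short-lived key and nearly a full year against the long-lived one, which is the whole argument for frequent rotation.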
And as I said before, many of them are just plain text, or they're protected by roots of trust that themselves are plain text. So people say, well, I use Vault. And how is Vault protected? Just follow this, go over here. It's like, oh. Turtles all the way down.

So this is very low assurance, particularly for endpoints that are physically insecure. There are servers out there, the servers that are in these telecommunication towers. They're like Dell servers, regular off-the-shelf stuff, and they have these handy-dandy swappable disks. Right? So one of the attacks I've seen is: the person breaks into the cabinet. The alarm goes off. They pop out the disk. They put it in a disk duplicator. Two minutes later, they pop it back in, close the cabinet, put the padlock on, and bugger off. So somebody shows up at the tower going, well, the alarm went off, and there are no signs of tampering anywhere, and they go, oh, it's a false positive. So now they're completely unaware that sensitive information grew legs.

And detecting that a TLS key has been stolen is near impossible. That's the thing that's insidious: after somebody steals a private key for one of your TLS certificates, how do you know? You can't really detect it very easily, right?

They're also very susceptible to insider threats. 75 percent of data leakage in the industry today is because of someone you're paying. Right? Either they're idiots, which happens often, or they're malicious. They got paid, and that also happens. Believe me, it actually really does happen. And obviously they're very susceptible to host compromise. There's a whole bunch of other sensitive information that falls basically under the same problem umbrella.

So... oh, okay, that didn't click quite well. What are we trying to solve?
So what we're trying to solve is this: we want to use confidential computing techniques, more specifically hardware-mediated enclaves, which means your root of trust is rooted in a hardware device, not software, in order to increase resistance. Now, I say increase resistance because if there's anyone out there who thinks that hardware-mediated enclaves are foolproof, boy, I have some really bad news for you. It's all about increasing resistance to a level where you feel relatively safe, right?

So we want to protect against physical attacks. If somebody shows up, pops out a disk, dupes it, and runs away, you're good, right? Insider attacks: if somebody's an idiot or if somebody's maliciously trying to steal your stuff, we want to protect against that. And code injection attacks. They still happen. They're less frequent, so I'm very happy about that. By the way, congratulations, everyone. But they still do happen, and they're very hard to detect, by the way. So the goal is to significantly increase the assurance behind the root of trust guarding secrets within an unattended system that runs in a physically insecure location. That's what we're trying to achieve.

So how do we do it? Well, for the purposes of this study, and I've actually been doing this for a very long time, but for this particular study, we ended up using Intel SGX as the hardware-mediated enclave. Now, I'm not an Intel SGX commercial. That's not my job. But Intel chips are very widely used in industry. If you look at the telco industry right now, a lot of the 5G towers, it's just Intel chips everywhere. Open RAN is just lots of Intel chips everywhere. All things considered, for the use case that I'm going to talk about, Intel SGX actually does provide the best assurance level possible. There are other options.
There's AMD SEV-SNP. That's actually a really cool option if you're trying to protect against threat vectors that stem from side-channel attacks. So if you have a payload in a multi-tenant environment that is confidential, and you want to protect against the hypervisor admin, or a hypervisor compromise, or another tenant's payload on the side, then AMD SEV-SNP is actually a really good solution. It's very easy to deploy compared to Intel SGX. But in our use case, when it comes to protecting secrets at rest that are the roots of trust of a chain of secrets, AMD SEV-SNP unfortunately does not give us the mechanisms that we need. In this case, what we're missing is a data sealing facility, and we're going to talk about that in a few seconds. Whoops, that's not what I wanted to do.

There's Arm TrustZone. Anybody ever use Arm TrustZone here? One or two. Oh, wow. That's shocking. Wow. Okay. Most people don't use it. It's a glorified hypervisor; it's just an isolation layer between all the payloads in an operating environment and one program. The problem with Arm TrustZone is that there are no data sealing facilities and there are no cryptographic facilities. AMD SEV-SNP, for example, encrypts the entire VM in memory. Right? So even if you do, like, a Cambridge cold boot attack, you're screwed. You can't steal what's in the VM. But with TrustZone, if you actually did that, you actually can steal what's inside the TrustZone.

Intel QuickAssist. Anybody ever heard of Intel QuickAssist? That's actually a really cool thing. Now, usually they come as PCI cards, so it's extra dollars. However, Intel is coming out with Sapphire Rapids, which is a really cool chip that has a CPU and a bunch of accelerators on the same die, and one of them is QuickAssist. So that's really, really cool stuff. I'm looking forward to it.

And then there are TPM chips. Anybody here use TPM? A couple of people use TPM. They're pretty useful, but they're not very widely used.
They're a bit onerous to manage, especially if you're trying to lock a key to a process and then you upgrade that process: you can no longer use the key. So management of the keys inside a TPM chip is a nightmare, and then imagine you have 10,000 nodes. It's a nightmare amplified. Which is why, for this particular thing, TPM takes a bit of a back seat among the things being considered. The solution proposed here could be converted to run on other technologies, but for the most part Intel SGX seems to be the hardware-mediated enclave that gives us the best potential.

So what is Intel SGX, for those that haven't heard about it? Basically, Intel SGX is a way to protect a part of a program, which is basically a .so file. It's a shared library that you load in memory, but it's loaded in encrypted memory in a way that nothing else in the system can look into it. You could be root, god, controller of the hypervisor or the kernel on that machine, and you still can't read what's in the payload, and that's both code and data.

The other thing that Intel SGX offers is that you can only load digitally signed enclaves. So if anybody tries to tamper with the enclave in order to compromise it, they effectively blow the enclave; it can't be loaded anymore. It's basically tamper-proof. The in-memory runtime is encrypted, and there's very high resistance against insiders. If you're an insider and you're trying to debug the program, you actually cannot debug an Intel SGX enclave, no matter how high your privilege level is.

Very high resistance against a compromised kernel. One of the cool things about Intel SGX is that it can't be used at the kernel level, and it can't be looked at at the kernel level. Well, not easily. Again, like I said, nothing is perfect. Intel SGX does have exploits against it, but they're very difficult to do, and the attack complexity behind them is relatively high.

Intel SGX only protects part of the executable.
We talked about that. It's a shared library. I'm just going to skip this.

The other thing that Intel SGX offers, which I think is the coolest feature and which helps solve the problem at hand, is data sealing. Does anybody know what data sealing is? Nobody. Okay. So what happens is, when your code is inside the enclave, you can use an instruction that says: seal this piece of data with a cryptographic key. You don't have to provide the key. The key is generated based on the signature of the enclave, mixed in with secret material inside the CPU that was lasered in at manufacturing time and is unique to that CPU, plus some other stuff that I'm not going to get into. It generates an AES key. And basically, if you seal data inside an enclave, the only place in the world to unseal it is inside the same enclave on the same CPU. So if somebody steals an encrypted file that's been data sealed by Intel SGX and they go somewhere else, there's absolutely nothing in the world they can do to recover that data. People go, ah, it's like in the movies, they're just going to crack the cryptographic key. I'm like, yes, in one septillion years they'll probably pull it off.

So it's actually really good, and the reason it's really good is that if you persist secrets on disk at rest, you can seal them so that they can only be loaded inside Intel SGX when the system starts up, on the same system only. Yeah, it's pinned to the hardware. You can't get around that, which is really good.

As a result, Intel SGX is actually incredibly good against physical attacks, in the sense that if somebody duplicates your hard disks and runs away with them, there's nothing in the world they can do to compromise that. It's very good against insider attacks.
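
The sealing idea described above (a key derived from the enclave's identity mixed with a per-CPU secret) can be modeled in a few lines. This is a toy model in Python under loud assumptions: it is not the Intel SGX API, the real key derivation happens in hardware and uses AES, and every name here (`derive_seal_key`, `seal`, `unseal`, `cpu_secret`, `enclave_identity`) is hypothetical. It uses only HMAC-SHA256 from the standard library as a stand-in, to show why a copied blob is useless on another CPU or in another enclave.

```python
# Toy model of data sealing. NOT real SGX and NOT production cryptography.
import hashlib
import hmac
import os


def derive_seal_key(cpu_secret: bytes, enclave_identity: bytes) -> bytes:
    # Stand-in for the hardware derivation: the key depends on BOTH inputs,
    # and the caller never chooses or sees a key of their own.
    return hmac.new(cpu_secret, b"seal|" + enclave_identity, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Simple HMAC-based keystream, used here in place of AES (an assumption).
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, b"enc" + nonce + counter.to_bytes(8, "big"),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return bytes(out[:length])


def seal(cpu_secret: bytes, enclave_identity: bytes, plaintext: bytes) -> bytes:
    key = derive_seal_key(cpu_secret, enclave_identity)
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, b"mac" + nonce + ciphertext, hashlib.sha256).digest()
    return nonce + tag + ciphertext  # this blob is what would sit on disk


def unseal(cpu_secret: bytes, enclave_identity: bytes, blob: bytes) -> bytes:
    key = derive_seal_key(cpu_secret, enclave_identity)
    nonce, tag, ciphertext = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, b"mac" + nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("unseal failed: different CPU or different enclave")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


cpu_a = b"A" * 32  # pretend per-CPU secret of the tower server
cpu_b = b"B" * 32  # pretend per-CPU secret of the attacker's machine
blob = seal(cpu_a, b"enclave-v1", b"disk passphrase")
print(unseal(cpu_a, b"enclave-v1", blob))  # same CPU, same enclave: recovered
```

In this model, duplicating the disk gives the attacker `blob` but neither `cpu_secret` nor a way to run as the signed enclave, so `unseal` on another machine raises an error instead of returning the passphrase.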
Now, if your insider is an incredibly sophisticated attacker who knows how to exploit Intel SGX and has a lot of time on their hands, they could eventually get to the point where they can glean, or derive, or deduce what a cryptographic key in memory is. Unless you use very good cryptographic libraries that have protections against timing attacks and that kind of stuff, in which case they're in trouble. And root-privileged users on the system can't bypass the protection that SGX provides. They actually can't turn it off.

So using Intel SGX and data sealing to guard roots of trust is what we're doing. We're going to resist physical attacks, resist insider attacks, and resist code injection attacks.

Let's talk about some examples. I have 11 minutes. Okay, we're going to talk about cert-manager. Who here uses cert-manager? cert-manager is actually really cool. I like it. The only problem with cert-manager is that the root of trust is plain text. But there's a way to change that. Anybody here ever developed their own local issuer for cert-manager? No? So cert-manager has a facility where, when you generate keys, you can ask cert-manager to sign them for you, and it will do so using a local issuer. So instead of you having to create a certificate signing request, walking off to DigiCert, having DigiCert sign it, and coming back and inserting the key back in, cert-manager will just do it locally on the computer using a local issuer.

What you can do is develop a local issuer that uses the Intel SGX crypto key library. That's basically a PKCS #11 cryptographic module, which means you can actually save tokens, long-lived tokens, inside the library. They're going to be data sealed on that machine, and they're only usable inside the SGX enclave, ever. So your root of trust is always encrypted, no matter what. Right? Which is really cool. And you can also put attributes.
I don't know if anybody's ever used PKCS #11 before, but you can put attributes to basically make sure that the key is non-extractable, and a whole bunch of really cool stuff. The SGX library effectively drives the PKCS #11 engine, and it all happens in encrypted memory.

The other thing is that the Intel SGX crypto key library has resistance to timing attacks and other side-channel attacks like electromagnetic emissions. There's actually really cool stuff you can do to try to own things. Now, with this kind of stuff, you're getting to the point where you actually have to steal the whole server and run away with it. Not the best idea in the world for people trying to surreptitiously compromise an enterprise.

We can use this trick to create a local sub-CA. So effectively, your cert-manager becomes a local sub-CA that can sign TLS keys that are ephemeral in nature. Kubernetes can start up with no keys and go, I need keys. Okay, let's create keys. Oh, cert-manager is my guy. Okay, cert-manager, create me some keys and sign everything, thank you very much. And it signs them, and it keeps them locally. Everything's local.
So it's not like you're using networking to do anything. And each node is in a tree of nodes. If you look at StarlingX, you have controllers and then you have worker nodes, and you can have a hierarchy of nodes that goes all over the place. All you need is this: your local node, the first time it starts up, will generate a root key. So one single key pair, and it generates one CSR. Then you have to give that CSR to the parent node, and the parent node will use its cert-manager to sign it and give it right back to you. That's the only network activity that will ever happen.

But you end up in a position where, if you follow the chain all the way up to the original controller, the original controller is the only one where you generate a CSR and have to go to DigiCert and have it signed by DigiCert. But now what happens is that everything in your ecosystem is effectively signed by a public root CA. So you don't have to distribute certificates saying, trust this. With public root CAs, every operating system on the planet already has a trusted library of these available the second you install it. They're instantly verifiable. So it makes distribution of certificates and verification of certificates quite trivial.

You end up having a cert chain that's, uh, about four levels deep. Yeah, it's better than the alternative. Right. So yeah, when you verify a cert, sometimes it can get expensive, because you have to do an asymmetric cryptographic operation for each level inside the cert chain. But with the acceleration that exists today in modern chips? One tenth of a second, maybe. Actually, if you use ECC, it's even faster, because ECC verification as opposed to RSA verification is, like, blindingly fast. So by the way, you should really use ECC keys if you use TLS, because of that, because of the clients. And not only that, but it's not the server that does the cryptography. It's the client that does the cryptography. The server just says, here's my cert, right? Does that answer your question?
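
The hierarchy described above (each node's sub-CA signed by its parent, up to a public root) can be sketched as a walk over issuer links. This is a minimal illustration in Python, not StarlingX or cert-manager code: the certificate names, the `Cert` type, and `build_chain` are all hypothetical, and the sketch only follows who issued what. A real verifier would also check the signature, validity period, and revocation status at every level.

```python
# Toy model of the certificate hierarchy: issuer links only, no real crypto.
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def build_chain(leaf: str, certs: Dict[str, Cert], trusted_roots: Set[str],
                max_depth: int = 8) -> List[str]:
    """Walk from a leaf certificate up to a root the client already trusts."""
    chain = [leaf]
    current = leaf
    while current not in trusted_roots:
        cert = certs.get(current)
        if cert is None or len(chain) > max_depth:
            raise ValueError(f"no path from {leaf!r} to a trusted root")
        current = cert.issuer
        chain.append(current)
    return chain


# Hypothetical names: an ephemeral TLS cert on a worker, the worker's local
# sub-CA, the controller's sub-CA, and a public root already in the OS store.
certs = {
    "tls:kube-apiserver": Cert("tls:kube-apiserver", "subca:worker-17"),
    "subca:worker-17": Cert("subca:worker-17", "subca:controller-0"),
    "subca:controller-0": Cert("subca:controller-0", "public-root-ca"),
}
trusted_roots = {"public-root-ca"}

chain = build_chain("tls:kube-apiserver", certs, trusted_roots)
print(chain)       # leaf, worker sub-CA, controller sub-CA, public root
print(len(chain))  # 4 levels, matching the depth mentioned in the talk
```

Because the walk ends at a root that clients already ship with, nothing has to be distributed or pinned ahead of time; a certificate whose chain does not reach such a root is simply rejected.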
Yeah. Actually, yeah. Um, cert-manager can now rotate keys all the time. Right? Now, I say this with a caveat. Kubernetes and a lot of software out there, and we're going to talk about that when I conclude: one of the things that open source software today doesn't do is key rotation very well. Kubernetes, for example, does not do it whatsoever unless you bring down your entire cluster, delete the keys, and restart your entire cluster, right? And so I think the open source community needs to start considering the concept of: I need to rotate my TLS keys as the program is running. And it's not that difficult. I mean, one second you're accepting one TLS connection. Whoop, rotate my key. The next second you accept another TLS connection. It's not complicated, right?

Because of that, there's no certificate pinning required anywhere, which is really good, because today with StarlingX there is. And we are able to use publicly available certificate revocation. So if something does happen, if somebody physically steals your server and you go, well, damn, okay, I'm going to take the key that was in there and I'm just going to add it to the revocation chain, and that's it. You automatically take advantage of automatic revocation.

So for disk encryption, we did something cute. And when I say we, I mean me, by the way. We ended up writing a program that is started the second the operating system starts up, and what that program does is load the SGX enclave that recovers the passphrase. Right? Now, at some point the passphrase does in fact get passed in clear text to the LUKS file system, and there's nothing you can do about that. But at no point in time is your passphrase a plain text file on disk, right? I'm going to have to accelerate. So we basically protect the passphrase in memory. The program is called during bootstrap.
I'm going to skip that. For HashiCorp Vault, you can use the same thing. The credentials that are used to initialize the vault, you can protect them inside Intel SGX, and you can use the same methodology for basically anything that requires a root of trust to initialize it.

But it's not good enough. Intel SGX by itself has a problem, which is that it can't verify the validity of the host program, which means anybody can run that Intel SGX enclave. So how do you prevent that from happening? Right? If I'm a local user, a malicious insider, and I want to recover the passphrase for that dm-crypt drive, I can just call the enclave and it'll just give it to me. Here, here it is. Oh, that's very nice.

And so as a result of that, we end up in a situation where... I'm going to skip that. Yeah, I'm going to go to the next one. As a result of that, we end up in a situation where we need to restrict who can load the Intel SGX library, or I should say, more precisely, what can load the Intel SGX library.
We're going to use AppArmor. One of the things you can do with AppArmor, which is really cool, is say that this program over here is allowed to read this file here, but nobody else can. So you can literally say that this .so file over here, which happens to be an Intel SGX enclave, is effectively only readable by this program over here, and nobody else in the system can read it. So if you're a malicious insider and you show up and say, I'm going to just randomly try to load this Intel SGX enclave, you can't pull it off. Only that one program can do it. It basically prevents unintended use of the .so libraries by programs that are not supposed to read them.

And kernel-level components, I'm adding that, actually can't load Intel SGX whatsoever. I said that in a previous talk when somebody said, well, if I'm kernel, I'm not subject to AppArmor. So yeah, but if you're kernel, you also can't load an Intel SGX enclave.

So what's next? What are we doing?
So there's a prototype in the works. We're going to add it to StarlingX, hopefully, in the future. We are implementing a local issuer that uses Intel SGX, or more specifically, we're adding the Intel SGX crypto key library into a custom issuer that cert-manager will use. And we're also going to create a mechanism that allows you, once you generate your root key, to automatically have it signed by the node right above it, so you can automatically create the certificate chains. We're going to... yeah, I already said that.

We're going to implement disk encryption. Now, we're not going to do full disk encryption, because there's a real performance impact with full disk encryption. What we want to do is create a bunch of small volumes that are encrypted, and then wherever you have sensitive files on the system, you just use symlinks and put the file inside the virtual drive. Then you can give each virtual volume different access rights. Or you can load them and then unload them when you no longer need them, which is actually one of the cool things that you can do with that. So we're going to add this for disk encryption.

And then future work is basically going to focus on looking at ways to implement protection for things like Vault, or things like Ceph, or things like Postgres, or whatever. It's all the same trick, right? It's an Intel... it's a very small library. It's all the same trick.
So it's easy to just reuse an open source version of, you know, the Intel SGX library that I'll publish for the disk encryption stuff, and just say, now I'm going to reuse that for Vault, or I'm going to reuse that for insert-other-open-source-component-here.

Conclusion. We need to significantly upgrade the assurance level of the different roots of trust inside our systems. You have to work on the premise that your system is physically insecure, and that it's being used by actors that sometimes can't be trusted or are incompetent in nature. As a result, the sensitive information is at risk, right?

The other thing is that we need to become better at using TLS. Today, people are just not doing TLS correctly. Long-lived keys everywhere. Self-signed certificates everywhere. Some of them don't even use certificate pinning, so I don't really know why they use that to start with. Very frequent rotation, and I do mean very frequent rotation: this is the key to success. You have to take for granted that there will be a system compromise, and you have to take for granted that somebody could steal from memory a TLS key that's used by Kubernetes or something else. But if that key is very short-lived, their window of opportunity is very small. They can't do much with it, right? No self-signed certificates.

And then the community needs to approach open source components such as Kubernetes. So... oh my god, I've been stopped. We need to start implementing better things with Kubernetes, where Kubernetes can in fact rotate keys in memory instead of having to bring down the entire cluster.

And that's the end of my talk. A couple of minutes off, but anyway, any questions? Yes? Oh, we can take it outside if you want. That's probably a good idea. Okay. That's awesome. Anybody else? Yeah, yes. Yeah. Well, there is... Oh, I'm going to talk to you after that. I'd like to have a copy of that. I'm just going to steal that. I didn't know it existed, so that's news to me.
That's great. Anything else? Yeah. Uh, with AppArmor? Oh, yeah, okay, I see what you mean. Well, you have to get a little bit sophisticated about what you're doing. You can't have a multi-tenant environment that is able to read all the files. You know, in StarlingX, usually you have sysadmin, for example, so you have a restricted number of users. So you have to do good user management. You can't just say, hey, everybody, just create your own stuff. Um, yeah. Anybody else? If you have any more questions, I'm going to get out of the way for the next guy, but I'll be outside if you want to ask me questions. And I need that link from you. Thank you very much.