presentation. Please close the door as quietly as possible. Yeah, like that. You can evaluate sessions at this link. Please tweet about the presentation and about the conference, and write blog posts about the conference. And I should promote the grand finale today at the end, at 16:30 in 105; there will be awesome things to win. And now please welcome our next presenter, Nathaniel McCallum, with something about cryptography.

Thank you very much. Can everybody hear me? All right. We're going to start off with a question: how many of you here have a laptop with an encrypted disk? Okay, that's a really good number. Now another question: how many of you are feeling really guilty that your hand wasn't raised before? Anybody? Yeah, all right, we've got some honest people, because sometimes encryption can be a bit of a pain. We're going to talk about some of the problems of key management and how we hope to solve them.

I do ask your patience. We are having some technical difficulties. My slide deck, unfortunately, is in the cloud; it is not on a USB key. Apparently the cloud provider had a massive failure and the entire data center went down. The battery backup didn't work, and all kinds of stuff. So we're going to try to hobble through this. Actually, I was able to get a live feed here from their data center, and unfortunately they're all running around like crazy right now, because this is popping up everywhere in their data center.

So we really need to ask a question: can we automate this? Usability is a major impediment to security, so this is something we should really try to solve.

Before we can answer that question, let's look at how this encryption is typically done. It all starts with a secret. Oftentimes this secret is your hard disk; it could be some data that a service has. We have something that we want to keep secret, and we usually encrypt that secret using an
encryption key. Now, as soon as your secret grows in size, one of the things you quickly realize is that you don't want to constantly re-encrypt your data, because that's a lot of work. In fact, there was a really great talk about LUKS2 disk re-encryption; they're going to be doing it live, so it's really cool stuff. Check that out on the web if you didn't go to the talk. But trying to re-encrypt anything of a sizable amount is just very difficult.

So what do we do in the case of compromised keys? Well, we wrap this in another key. We have the encryption key, which protects the data itself, and then we have a key encryption key, which protects the encryption key. The reason we do this is that if the outer key is compromised, we can just change the outer key without having to change the inner key, and then you don't have to re-encrypt your data.

The typical way this is deployed, as you saw at the beginning of my slides, is that you start up your computer and type in a password. The password is the key encryption key. The key encryption key is then used to decrypt the encryption key, and the encryption key is used to decrypt the data. And then of course this password is distributed out to all of the admins who are supposed to have access to this device. This immediately comes with a problem: you have multiple people all sharing this password. There are lots of ways to get around this. LUKS gives you multiple slots, for instance, and that's a great solution, but it has its limitations: it doesn't scale.

So one of the ways we can try to automate this is to generate something that's a little stronger cryptographically. Here we have a cryptographically strong random key, and we store it in some remote system, from which we can fetch it at a later time. And this is pretty much the
standard escrow model. Or is it? Can anybody tell what I'm missing here? What's that? More encryption keys. You see, we can't just transfer this key over the wire, because if we did, anybody who was listening could immediately figure out what our key was, and then we'd be sunk. So we've got to encrypt the channel over which we distribute the key, and this is typically done with TLS or with GSSAPI.

And now we're done, right? Oh yeah, we actually have to authenticate the parties, because we can't just send this magical key to some server if we don't know that the server is the server it claims to be. So the server itself has to have a key as part of this authentication, some kind of identity. And we're done, right? No, because we're going to fetch the key back, so now the client also has to have a key, so that the server can verify that this client is supposed to get the key it is asking for. Anybody notice a tendency here? We're getting a lot more what? Keys.

So now we need a central place to manage all of these keys. This is typically done with either a KDC, in the case of using Kerberos with GSSAPI, or some kind of certification authority if you're using TLS. And of course you have to handle the issuance of the keys and the revocation of the keys, because if those keys are compromised, then your key encryption key is compromised, and so on down the chain.

And of course we also have to have backups, because we can't deploy this now complex system without having some way to restore this very complex state. Keep in mind this escrow server is probably storing lots of keys from lots of different places, and if we all of a sudden lose this escrow server's data, we're really sunk. So we have to keep really accurate backups, up to the minute, with all the keys that are changing hands. This is a fully
stateful process here. So finally we are done. We can rest assured that there is not going to be any kind of problem whatsoever.

Well, you see, we have this problem now: we're using encryption all over the place to try to protect our keys, but all we're doing is increasing complexity in the stack, and any time we increase complexity in the stack, we increase our attack surface. Every little link in this diagram is now a potential point of vulnerability for our system. So what we've done here is make it truly easy to get the key back, but we've also opened ourselves up to a lot of new risk.

So we've learned a few lessons. We've learned that complexity increases the attack surface. We've learned that escrow is difficult to deploy. And we've learned something else, which is that speed matters. If we go back a slide, notice all of this complexity: every single hop in this chain adds latency. So when you have a data center of a hundred thousand servers all coming up at the same time after a power outage, what's going to happen to this server right here? We're going to have a massive bottleneck, a denial of service, as all of these servers come online at once and try to get their keys back. So speed matters.

Question number two: can asymmetric cryptography help us here? The answer is actually yes. A year ago, after the last DevConf, in some brainstorming meetings, we came up with a new model, and we called it the Deo project. Deo is Greek for "I bind", so binding things together is the idea. One of the things we wanted to do was move the state out of the escrow server. In this case, we take the key encryption key and encrypt it again using the public encryption key of the server. Notice we're doing asymmetric cryptography now. This step can actually be done offline, as long as the client has the server's public key. And the end result is that all
of our secrets still reside on the client in this system, so we no longer have any complex state on the escrow side of the equation. Then, when we want to decrypt this key, we send the encrypted key encryption key to the Deo server. The Deo server just uses its local key, which can be held in memory, so it's all very fast and stateless, and it returns the decrypted key. Notice it didn't store anything, so we no longer have a central point of compromise in the server. We've also reduced the load significantly, because we're not calling out for disk I/O every time a request comes in. So this is definitely an improvement.

However, we have some keys here as well; two keys in particular. First, we have a key for encrypting the channel the key is transferred across, and second, we have the asymmetric key pair for doing the public key crypto. So this is better. We don't have the full assortment of keys, but we still have some keys, which presents some complexity. And we still have the sort of Heartbleed problem: when the key is coming back to the client in its decrypted state, anybody who can penetrate this outer layer of encryption can get the key.

So we also still have a certificate authority and backups. One of the things we discovered when writing this software, and in fact there were releases of this software, was that this part here made it pretty complex to deploy. Even with fairly step-by-step instructions, people had difficulty when they tried to reproduce this setup, and we thought that was not a great way to start. So we began to look again, and we learned some lessons here. Asymmetric crypto makes the server stateless, and we really like this feature because it means that we can
get high performance. It also means that we can reduce attack points, say on the disk and whatnot. Asymmetric crypto also allows for offline provisioning, which is a really nice feature, particularly in cases where you might have spotty internet connectivity: you can have the public key on a flash drive, provision a whole bunch of systems, and when the network comes back up you can continue to work. But one of the things we identified was that sending the keys over the wire is a risk, and we'd like to ask whether we can do better than that. We also learned that X.509 takes a lot of effort. So our project died. We killed it, and we moved on to something which we think is better.

So let's move to question number three: must the key go on the wire? Now, my intention today is to kill someone with math, so I want to see a hand if you're dead later.

This is just standard ElGamal encryption. This is not me doing anything special. The server generates a key pair and sends the public key to the client. The client generates its own key pair and encrypts some data using it. Now the data is encrypted, and when you want to perform decryption, the client sends this a value and this k value to the server. The k value is the encrypted value. The server performs a mathematical operation and returns the uppercase K here, which is the plaintext data. So this is going to be the basis for the new crypto in this project.

We're going to move one step over. Watch very carefully; we'll go back and forth a few times. Notice that nothing on the left-hand side changes. The left-hand side, which is ElGamal encryption, stays exactly the same in this key exchange. We add only one thing: we take this x value here and generate an ephemeral key pair, and by mixing this additional ephemeral key pair, at decryption time, into the value
that we send to the server, we can send it to the server, the server performs its side of the mathematical equation and returns the result, and we can now calculate K, rather than the server calculating K. The interesting thing is that there's no K ever on the server. That means the server never knows anything about what's going on in the client. It simply says: I can perform this mathematical operation, and if you can contact me, I can perform it for you. But it knows literally nothing about the client, at least algorithmically.

So we created a new project called Tang, and this is the Tang model. It looks very similar to what we had before. We have our secret in the middle, we have our encryption key, and we have our key encryption key, which is generated as part of the exchange with our Tang server. There's a lot of stuff missing in this slide now. Notice that there's no longer a TLS channel here, and that's because this is done completely over the wire. It has the same properties as a Diffie-Hellman exchange or public ElGamal encryption: by transferring these values over the wire, nobody gets any advantage, and this is provably secure. So the end result is that our key encryption key always stays local and the server never sees anything.

Of course we still have to have backups, but our backups are much more limited in scope now. The reason is that in Tang we have a set of keys for doing encryption. These keys can be generated at will, whatever your key rotation policy states, and at that time they can be put into backups. But aside from key rotation there is absolutely no state whatsoever on the server, and this makes for a very limited attack footprint. The size of the server is extremely small.

This is the project page here, and this includes the server-side daemon along with the Clevis pin, which we're going to talk about in a minute. Don't get too excited. We are nearing
our first release, and it has an extensive test suite. It's really fast: we can handle north of 30,000 requests per second. So if you think of a data center where you've got a hundred thousand machines all coming up across the span of about five seconds, we can handle this on one computer. So: very lightweight, very fast, very small attack surface, and substantially tested.

Now that we have accomplished this task of taking that secret, that data, and binding it to a third party so that it can only be decrypted in the presence of that third party, let's do a little brainstorming about some of the other kinds of things we could do. Obviously the one that Intel would be very interested in us using is the Trusted Platform Module, and this would guarantee that, for instance, the disk is in the computer, because physical connectivity is what we're testing for there. Start thinking now not in terms of keys but in terms of relationships. In this first case, we have a relationship between a disk and a chassis.

In the second case, we have a Bluetooth LE beacon. We can implement this Tang protocol over Bluetooth Low Energy. Up in the ceiling of your office you could have this Bluetooth Low Energy beacon, and only the people within range, 30 feet, 50 meters, well, not 50 meters, but whatever the range is on your Bluetooth beacon, will be able to perform decryption. If they go outside of that range, they can't perform decryption anymore.

Another case might be that during provisioning we generate a random key and print a QR code, and you take that QR code and stick it in a safe. This is your last-ditch recovery. At this point all other methods of decrypting the data have failed, so you take the physical key, unlock the safe, get back your
QR code, and scan it with the webcam. Obviously there are techniques like facial recognition and fingerprint scans; mobile phones can do the same thing over Bluetooth, using the same protocol; and then there are the standard smart cards and RFIDs. So these are all different kinds of relationships that your data can have with objects.

Now, how many of you went to Josh Bressers' talk yesterday, on security, security everything? It was a fantastic talk, and one of the things he said is really worth highlighting, which is that security is not binary. Oftentimes, particularly when doing encryption, we think in these binary terms: is the data secure or is it not secure? One of the things I wanted to do in shifting us to thinking about relationships between data and third parties is that we can now begin to think in terms that are not necessarily binary. This should not surprise us at all, because this is precisely the way we human beings think. When we walk into a room, we can quickly assess many factors in the room. Let's say I find a white fridge in a room. That may not be anything on its own, but if I find a white fridge and a bunch of other white stuff and a surgical table, I might start to piece together that maybe this is a hospital. So we gather all of these pieces of data, weigh them collectively, establish the relationships between them, and based upon this we can reason about things.

This is oftentimes the way we reason about security. When you're asking the question, am I in a safe environment, am I in a good part of town, there's not a sign that says you are now entering the bad part of town. If there were, lots of people would get sued, probably. So there's no sign telling us that, and the way we reason about it is that we enter that environment and we begin to have relationships with all of the objects
that are around us, and we begin to reason, in ways that are not necessarily binary, about our relationships with these objects.

So one of the questions we want to ask next is: how do we make unlock policy non-binary? There's actually a way to do this. It's a technique called Shamir's Secret Sharing. The way Shamir's works is that it allows you to take a key, one of your security keys here, and split it up into an arbitrary number of subkeys. During this operation you also define a threshold. The threshold can be one, in which case if any one of these keys is present we can recalculate this key. If the threshold is two, we would need two out of these five to recalculate the main key, and so on all the way up to five. So with Shamir's we can now express complex relationships between objects, and we can begin, as a computer, to reason in a way that is fairly similar to the way humans reason. One of the other things is that this can be nested: you can take Shamir's and apply it again to split a subkey out, so you can have nested policy. Using this, we can create fairly complex sets of relationships between objects.

Let's take the example of a simple laptop. This laptop has been issued to you by corporate IT. If you're using Linux, you're probably encrypting with LUKS, and LUKS has a great facility called slots. You can have so many slots, and there's an OR relationship between the keys. We can express the same kind of relationship with Shamir's by simply specifying a threshold of one. In this case, either one of these keys will allow us to reconstitute this key. As a result, the admin can log in with his or her password, or the user can log in with his or her password. This makes a lot of sense, because you don't want the admin knowing what your password is; you want to be able to use your password on a day-to-day basis, but the admin also needs to be
able to recover your data if something were to happen. So we can now express this as a Shamir's relationship with a threshold of one.

Now let's add Tang into the mix. This is the same exact setup, but now we're automated. We still have the admin password for recovery, and we still have the user password for the case where the Tang server is not available, but if the Tang server is present, we're going to automatically unlock the laptop, which is a really useful feature. Think of how this might work in an office setting. You walk into the office on Monday morning, you turn on your laptop, and it boots and you don't type anything, because it was on the corporate network and was able to talk to the Tang server. It has a relationship with that Tang server, and therefore the decryption can work. But then on Friday night you decide to take your laptop and go sit at a coffee shop. You're no longer on the corporate network; you're on some Wi-Fi hotspot, and you do not have access to Tang. So when you turn on your laptop, you get prompted for a password. So these are fairly simple ways to begin to approach these complex relationships.

Let's say we have a high-security system. In this case we want to guarantee that no one user can decrypt the system by him or herself, so we've set the threshold to two. We have three user passwords, and so long as any two of the users are there to type in their passwords, they can unlock the system.

Here's a complex policy, but one that actually has some use, and now you can begin to think about the way we human beings relate to these objects. In this policy we have three layers of nested Shamir's. The first layer has a threshold of one and has a QR code as its method of input. This is the recovery step: if all else fails, you should be able to scan that hard-coded QR code and get into the system, and the QR code is kept locked in the
safe somewhere. But if you can't do that, we move on to the next level. At this next level we have a threshold of two, which means that both of these branches must hold true. So we must have the TPM; in other words, the disk must be in the chassis. This allows the admin to pull the disk out of the chassis and use the QR code, but in the normal case the disk must remain in the chassis. Now we go down our next branch. Our threshold here is two again, which means we need two methods of authentication, and in this case we have four options: we can type in our password, we can scan our fingerprint, we can contact this Tang server, or we can do Bluetooth proximity.

So let's say you're sitting at your desk in your office, underneath the Bluetooth beacon, and you're on the corporate network. When you boot your system, you get in automatically, because these two are going to return values. But now let's say you walk into the conference room and you're not sitting at your desk. You still have access to Tang, because you're on the corporate network, but you're not in proximity to your Bluetooth beacon, so now you need to provide one of the other two. Typically you'd probably provide your fingerprint, because it's easy: you don't think about it, you just swipe your finger. Now you go out to the coffee shop, and we still need two methods of authentication; in this case you can provide your password and your fingerprint. So this is a fairly complex policy, but you can now think about the scenarios in which it provides a very good user experience in high-security environments, which we've determined by analyzing our surroundings.

So this is the quote I want to leave us on: we need to let business policy drive crypto policy, not vice versa. The reason I say this is that oftentimes we get into this conversation where we're asked, well, how can we keep our data secure, and we say, well, you have this one option or
these two options. What we really need to be able to do is describe scenarios where it's not just binary, where in high-security environments we can provide high usability and don't even have to enter passwords, but as we transition out of those high-security environments into lower-security environments, our authentication policy gets stricter and stricter. Note that this is not binary; we actually have multiple steps in these tiers.

This is implemented by a project called Clevis, and you can see the URL there. This is client-side pluggable key management. So before, when I said that Tang had a Clevis pin: a Clevis pin is a plug-in for this framework. It provides HTTPS, which is the standard escrow case that lots of people are already using, so Clevis can just go into that environment and work out of the box. We also have support for Custodia, which is a really awesome key transfer framework and API that Simo is working on. Where are you, Simo? In the back. If you have questions, talk to him; it's really cool. So we support that out of the box. We have support for Tang, although it's not in this repo; it's in the Tang repo, and we do that to avoid circular dependencies. We have support for password and support for Shamir's. It has minimal dependencies: we only require OpenSSL and libjansson, but if you want HTTPS, we currently require libcurl. We are also working on early boot integration. This project is a little bit behind the Tang project: where Tang is about to get its first release and is extensively tested, this is still undergoing active development. However, I can give you a demo.

We're going to start off by bringing up our Tang server. You can see that the Tang server is already running, but there are no keys. So the first thing we're going to do is generate a key. Let me make that full size. To do this we run the
Tang gen command. Capital A means that we're going to advertise this key, we pick a crypto group to use, and this is a signature key. We also have to create a recovery key. So that's it; I don't have to do anything else. My server has already picked up that those keys are available and will use them for any incoming connections.

Let's switch now to Clevis. We're going to go to the provisioning step, and we'll start off pretty simple, with just a plain old password. What we have here is the clevis command, and we're going to provision. This is our pin layout; we're specifying only one pin, and that's password. Then we store the provisioning metadata in a file called xxx. Now we're being prompted for the password we want to use for this provisioning. I'm going to type in foo, and it generates our cryptographically strong master key. We're currently using AES-128, so this is a 128-bit key, but it's encrypted using the password. Now we can look in the file and see that it's really just a little bit of JSON, where we're encrypting the cryptographically strong key using the password, with key derivation and a variety of other parameters there. By the way, if you were in the LUKS talk, you heard about the LUKS metadata area, which is now going to be in JSON. Notice that this is in JSON. Wonder what that might be about.

So that was setup. We've now set up, say, our partition, and the next thing we want to do is unlock it. This process is called acquiring. So we run the acquire, type in our password again, and get back our cryptographic key.

Now we want to do this in a completely automated way, so we provision again, but this time we use Tang. Again we specify just one pin, of type tang, and the host to contact is localhost. We run this, and it asks us if we want to trust our keys. This is the
same exact type of behavior you would get in, say, SSH, so we're going to do trust on first use. We trust those keys, and it generates our secure crypto key there. We run the acquisition process again, and without any password we get back the same key. This key, by the way, would be piped into your disk encryption or whatever else.

Let's look at one more example before we move on to nesting. I'm going to start this little HTTP server. All it does is let you post data to it; it retains the data, and then you can get the data back. So it's like a fairly typical escrow service. We provision again, and this time we provision using our HTTP plug-in right there, and we specify the format, which is binary, meaning we're just sending the binary blob to the server. We run it, and you notice that we got a PUT request over here on the server, so we've stored that in the HTTP server. Now we run acquisition, we get that object back from the server, and we have the same exact crypto key again.

All right, now we're going to look at nesting these with Shamir's. Here's an example. Our root plug-in is Shamir's, with a threshold of one, and it has two child pins: one is password and one is Tang. This handles the case where, if the Tang server is available, you don't type in your password, but if the Tang server is not available, you type in your password. So we provision this. It asks us for a password and contacts the Tang server, we trust the keys, and it outputs our cryptographic key. We run acquisition, and now watch this: it's prompting for the password, but I'm going to do nothing, and as soon as it gets the result back from the Tang server we get back the same key immediately. We do not have to wait and type in a password, because our threshold was one. We can now change this threshold to two, and now we're going to require both,
so we type in our password again, trust our keys from the server, and now we unlock using the acquisition step. Now I'm going to wait again. I'm getting the result from the server, but it's not going to continue, because our threshold is two. At this point we have to type in our password in order to complete the chain, and once the chain is completed we get our key back.

Question? What happens if what? Oh yeah, we can do that, not a problem. So let's stop... I guess we can't... let's do systemd. We'll stop the service, and we'll stop the socket that activates the service, and now we run our acquisition step again. I type in my password, but it's going to time out and not print a key, because the Tang server is not available, and our exit status is non-zero.

Yes? I'm sorry, what was the question? I can't hear. Okay, so rotation is not currently implemented; it's going to be coming, hopefully soon. The way it works is that the key we trusted at the beginning of that phase was actually not the key we're using for distribution; that's actually the signing key. If you remember, at the start of the demo I created two keys, a signing key and a recovery key. The signing key is used to sign the advertisement of what keys the server has, and that's the key we trust. So our trust follows the chain of signing keys: signing keys can sign other signing keys, and they can also sign recovery keys. When you want to rotate the keys on the server, you just generate new keys, like we just did, and then you modify the previous keys to not be advertised. This means that new clients will get the new keys, but old clients using the old keys will still continue to work. Then, when the clients running the old keys perform their rotation, they will see the new keys in the advertisement and upgrade to the new keys, and after all the clients are updated you can finally remove the old keys from the server.

What's that? Yes, this is
where the names come from, by the way. They come from the old techniques for binding things together, like old handcuffs, for instance. If you've ever seen old Roman handcuffs in museums, you'd put your wrist through here, and there would be a C-shaped thing that's called the clevis, and the thing that it gets hooked to is the tang, and the pin is what binds it. So that's where our terminology comes from.

Any other questions? Yes. Nothing, and that's the whole point: because you're sitting at your desk, we consider that to be a high-security environment, and this is part of that acceptable risk trade-off. So again, security is not binary. People want some trade-off for security, so they want to identify situations in which a higher security policy is required and other situations in which a lower security policy is required. So the answer is that nothing prevents that, and that choice was not made by me. That's really important: the choice was not made by me as a developer; it was made by the administrator who deployed it, based upon business justifications, which is why I highlighted this phrase: let business policy drive crypto policy, not vice versa. They realized it was an acceptable trade-off and they chose that behavior.

Other questions? Yes. Yes, it's theoretical, and we haven't implemented it. Well, if you have a ThinkPad you probably have a fingerprint scanner, so the practicality is pretty high. You would use the fingerprint to encrypt a random key, and the random key would then be handed back to the chain to undo the Shamir's chain. Yes, that's exactly the way the fingerprint scanning works.

The protocol itself, I believe you're talking about the key exchange, was published on several crypto lists. You were actually on those lists and you commented, so I think that's why you're asking. So yeah, it
was published to several crypto lists. We also advertised it at the Storage Developer Conference, and there was a crypto guy there who looked at it. So the answer is that it's not been as well reviewed as I would like, and you are welcome to review it; we will definitely take that review. Can you think of some? Give me some ideas; I'll take them.

Okay, looks like we're out of time. Thank you very much for coming. If you are one of the people who asked questions, I forgot to give out these scarves, but they're up here; if you want one, come get one.

I think it went well.
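
For readers following along without the slides, the key exchange described in the talk (ElGamal-style provisioning, plus an ephemeral key pair mixed in at decryption time so that K is only ever computed on the client) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Tang's code: the function names, the modulus `P = 2**127 - 1`, and the generator `G = 3` are all made up for the example, and the group is far too small for real use.

```python
import secrets

# Toy group parameters: ASSUMPTIONS for illustration only, not what Tang uses.
P = 2**127 - 1  # a Mersenne prime; much too small for real security
G = 3           # arbitrary base chosen for the sketch


def keypair():
    """Return (private, public) with public = G^private mod P."""
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)


def provision(server_pub):
    """Client side, can be done offline: derive K from the server's public key.

    Only the client's public value is kept; the client's private
    exponent is discarded when this function returns.
    """
    c, client_pub = keypair()
    k = pow(server_pub, c, P)  # K = S^c
    return client_pub, k


def blind(client_pub):
    """Client side at recovery: mix an ephemeral key into the stored value."""
    e, eph_pub = keypair()
    return e, (client_pub * eph_pub) % P  # X = C * E


def server_recover(server_priv, blinded):
    """Server side: one exponentiation, nothing stored, K never appears."""
    return pow(blinded, server_priv, P)  # Y = X^s


def unblind(server_pub, e, reply):
    """Client side: strip the ephemeral factor S^e to get K back."""
    mask = pow(server_pub, e, P)
    return (reply * pow(mask, -1, P)) % P  # K = Y / S^e
```

With a server key pair `s, S = keypair()`, the client computes `C, K = provision(S)` once; later `e, X = blind(C)`, the server answers `Y = server_recover(s, X)`, and `unblind(S, e, Y)` yields the same `K`. The server only ever sees `X` and `Y`, neither of which is `K`. The modular inverse via `pow(mask, -1, P)` needs Python 3.8 or newer.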