Hello. Excellent. Everybody hear me? Sorry? Further up? OK. Better? OK. All right. I'm the leader of the karaoke today, and we'll be starting off, speaking about NetPGP and signed execution of binaries. I'm fairly old school. I put up a paper here. I don't think there are any proceedings from this conference, but feel free to go and have a look at it and tell me if it's a 404 or whatever. OK. The agenda for today. The building blocks: what we're trying to do, what we're trying to achieve, how we're going to achieve it. An overview of digital signatures for those who've been off on Mars the last five years or whatever. PGP identities and trust, how we do it. Contrast that to PKI, how it's done there. Talk a bit about the implementation that I did. And a crypto update if there's time for it at the end. I'm starting five minutes late. I've been told I can finish five minutes late as well. But I overran massively on Friday, so bear with me. Anybody been to any of my talks before? Yes. It's all coming flashing back, repressed memories here. OK. So, can anybody guess what this is? Shall I start you off? This is a boat that goes from San Francisco to Marin County. Anybody got a guess? No? Sorry? Well done. Well done. The one at the end is difficult. That's Brick Heck from the series The Middle. But yeah: ferry, eggs, Heck. Veriexec. So, that's what we're going to be talking about today. It gets better, honestly. So, Veriexec is a subsystem that's been in NetBSD since 2004, done by a guy from Australia called Brett Lymn. It loads digests into the kernel and it does lazy evaluation on these digests, whenever you encounter a file that's about to be opened or exec'd or anything like that. There are three ways of doing it. One is direct, so you're going to say, I'm going to open this file. Secondly, it's indirect, and that's for things that get invoked by other things. So, we'd have a shell there for one. We'd have interpreters there. And files, which are other things. 
Configuration files, maybe shared libraries, something like that. These are the examples that we have here. You probably can't see them. If you want to come towards the front and kind of join hands and try and raise the living, please feel free. There are aliases in there called program, script, interpreter and library as well. So, Veriexec is a very full subsystem within NetBSD. It's also been ported to other operating systems as well. I believe there's a major router manufacturer in California with Veriexec on FreeBSD, or a derivation thereof. The types of digest: the usual standard ones. And we'll come on to digests a lot later on as well. Okay, so what do we have? We've got NetPGP. It was originally a GPG clone using the OpenSSL bignums to do the MPI arithmetic, multi-precision integers. There's a new branch in NetBSD. It's called agc-netpgp-standalone. You can see I've got into naming things like that. That's how we do branches within our repository. And that uses its own MPIs. It doesn't use OpenSSL at all. So, the barrier for entry for this has now gone way down. We no longer have to have OpenSSL in place before we can actually verify signatures, which, believe me, is a huge benefit. You can think of applications for it like packaging subsystems, where you want to check signatures but you don't have OpenSSL in place as part of the bootstrap. And this is one of the things that I'll be looking at doing over the next few months. And this will allow us to do signing; principally verification of signatures, but we could also do signing as well. And encrypting, which you don't need the private key for. So, again, that's another thing that's just made for this kind of stuff. Okay, so signing. I didn't... This is Flash and it doesn't play well on the Mac, or anything like that. But if ever you want to know what signing is, go to YouTube and have a look at this. It's a marvelous video, it really is. 
It's done by a girl as her final in her American Sign Language class, and she does the American Sign Language to Cee Lo Green's "Forget You", I think is the politically correct way of describing that song, right? Okay, so let's go on to signing, what it does. It calculates a digest on the data. It adds hash material from the signature. When you're signing, it adds hash material into the signature. And it uses the private key to produce one or two MPIs, depending on whether you're RSA or DSA. The signature has validity dates as well, so that's from and to. So you can have signatures that are only valid from a certain date. If you want to release a piece of software or something, and you don't want it to be used until a later date, you can set the from date to a month from now, something like that. Also it can expire. We've seen that before with various things where people's keys expire, they don't notice, and interesting things happen. Verification will fail if the signature is out of date. This proves the provenance of something. It's not like a digest, which is just saying, yes, this is what we thought we had. Anybody could write that digest and put something there for you. This actually proves the provenance of something, because you need the secret key and you need the passphrase, hopefully, that people have put on the secret key to unlock that. So let's talk a bit about identity, shall we? Any Polish speakers in the room? Excellent. Right. This is a Polish driving license. There's a marvelous, marvelous, marvelous thing from the BBC from 2007, I think it is. Irish police were looking for a super criminal in Ireland. He was racking up thousands and thousands of speeding fines and parking tickets, giving a different address every time and not actually paying the fines. He was getting on their nerves, the Irish Garda, and so they set out looking for him. 
It was discovered, says the BBC, that the man every member of the Irish police's rank and file had been looking for, a Mr Prawo Jazdy... Is that right? Correct me if I'm wrong. Prawo Jazdy wasn't exactly the sort of prized villain whose apprehension leads to an officer winning an award. Prawo Jazdy is Polish for driving license, and so if we go back to the driving license, they'd been writing down that as the name all the time. Interesting little story on identity. Make sure you are who you are. We do that by looking at the signature that people have put on things. Same as in the real world: a check has a signature on it, that kind of thing. We sign all legal documents with our own signatures, except for me, I write with an X, obviously. We go on to signing files, and so I'm going to go into a bit of the implementation now, what I actually did to enable signature verification of files to take place. I added a new bulk signing utility, and I was very good and thought of a decent name for it: I called it MultiSign. We can do this on a different machine; it doesn't have to be on the same machine that we're running it on. So we can, for example, build an image and sign the binaries that are on there, which is quite useful as well for people who are writing firmware images and things like that, who are treating the BSDs very much as the firmware for an appliance, and there's a fair number of us out there. Why a separate utility? I set it up because of veriexecctl, which is the one that calculates the digest for Veriexec. It's not set up to do public key or secret key signing as well. You probably can't see it here. I actually took a video late last night of this happening. So there we go, and that's me shaking, as you can see. You can see it's not too slow, and that's 22 megs' worth of QuickTime. And I tried trimming it, I tried taking the end off like that, but I failed dismally. 
Okay, so let's go on to something else now that's related to identity. That's a question of trust. Not a dingbat or anything like that. Just a funny picture, or I thought it was, anyway. There are two principal ways that we have of engendering trust, of checking trust, of finding out what the trust levels are. Principally, the PGP way is to do it with the web of trust. With the usual six degrees of Kevin Bacon and things like that, we can all be related to Kevin Bacon in some way within six degrees of freedom. My friend Jan Schaumann has a marvelous slide on this that shows how we are related to him. So you're probably all aware of this. The trusted third party is the model that PKI uses; that's the certifying authority, the one that's in all the SSL certs, things like that. And there are benefits and there are drawbacks to both of these. So look at the issues first for PKI. There have been some very, very well publicized breaches. Comodo, I think they had a cross-site scripting or something like that, but they'd actually got some C# libraries on their website that allowed easy signing of keys. And somebody found that out. DigiNotar is the one that was used last year to forge the Google cert, state-sponsored, we think. Anyway, trusted third party, that's great. It's all fine and good as long as we trust the third party. Have any of these third parties ever had audits done that have been published? How do they keep their keys normally? We're told they keep them in HSMs and secure modules off to one side. Are there air gaps between the secure modules and the internet? Who has access to them? Who has access to the keys? I have absolutely no idea. And yet every day I'm trusting my identity and other things on the internet to certs that have been provided by these guys. People in Iran and China and places like that are trusting their lives to it as well. Anyway, a few issues, maybe. 
There's a single signatory on each of them as well, and it goes in a hierarchy right back up to the root, which is obviously a self-signed cert. What protection do they use for the keys, and who guards these guys? Anyway, a few issues. Onto the web of trust way of doing it. The key servers, everybody knows them: it's the Roach Motel. You check your key into them and you never leave. There are people there who have had keys from 1993 that are just rubbish, not used anymore. There are some people out on the internet who have done studies and things like that. Keys revoked 10 years ago are still being used, which is interesting. Well, certainly people are trying to use them. People will send email with them. There are other things as well that the web of trust doesn't do well. Revocation is one of them, but it's probably better than the way that PKI has of doing it, which is to say, oh, let's go for a new one. So, the revocation of keys as well. Everybody around here has a PGP key, I imagine. Does everybody have a revocation cert? Good. Yeah. An outstanding number of people here have that. Okay, so which is better? You do a threat analysis on this. I'm not going to say which one is better. I know which one I'd prefer, and that's why I've been using PGP for a while. Anyway, netpgpkeys, which is the key management stuff in NetPGP, allows you to get information out of it. There's an option in there to use the --trusted-keys facility, and that gets the key out in a format that we can actually recognize. This is important because we're doing stuff in the kernel. I don't want the kernel to have large libraries of parsing and things like that. It's just not what it's meant for. And so we give it in nice key-value terms. There are things down here like the key itself, the creation time on it, all times done in seconds, obviously. We're not interested in time zones or any other interesting stuff like that. 
The version of the PGP key; algorithm one is RSA, I think. And so on from that. So why am I bothering with this? I mentioned earlier that RFC 4880 ratifies, or codifies, the existing PGP protocols, packets, things like that. And if you're wondering why I'm reinventing wheels and putting it into straight key-value stuff, I'm going to take you through the first two bytes of a PGP packet. So octet zero; I'll call them octets because we're in Europe. The seventh bit is always set on byte zero. Okay, that's fair enough. It's just a bit unusual, but whatever. The sixth bit denotes the new format for the keys. So version four rather than version three. Version three is the old format, version four is the new one. Okay, so you're with me so far. It's not exactly drastic. That's only the first two bits of the first byte. So in the new format, bits five to zero are the packet tag. It's the type of packet that we're having. So it's a signature. It's a public key. It's a secret key. It's a literal data one. It's a one-pass signature, which introduces literal data and then is bounded by the signature at the end. In the old format, bits five to two are the packet tag and bits one and zero are the length type. That's all well and good. Yeah, that's interesting. I'm obviously laboring the point here. The new format lengths. That's interesting as well. And we have to deal with both versions three and four because of the old signatures that are out there. Another drawback of PGP in general. Octet one: if the value is less than 192, that's the length. If it's less than 224, then it's used together with the second byte, it being a two-byte length. And if it's less than 255, it's the partial length type. So that's the same as HTTP chunked encoding. You jump forward by an offset to find the next length. And then you keep on going like this until you get one that isn't a partial length type. 
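The octet zero and octet one rules just described can be sketched in a few lines of Python. This is only an illustration written against RFC 4880 section 4.2, not the NetPGP or kernel code; the function name parse_packet_header and the tuple it returns are hypothetical. The boundaries are the RFC's: one-octet lengths below 192, two-octet lengths for 192 to 223, partial lengths for 224 to 254, and 255 introducing a four-octet length.

```python
def parse_packet_header(buf: bytes):
    """Decode one PGP packet header (sketch after RFC 4880, section 4.2).

    Returns (tag, body_length, header_length, is_partial).
    body_length is None for an old-format indeterminate length.
    """
    first = buf[0]
    if not first & 0x80:
        raise ValueError("bit 7 of octet zero must always be set")

    if first & 0x40:
        # New format: bits 5..0 are the packet tag.
        tag = first & 0x3F
        o1 = buf[1]
        if o1 < 192:
            # One-octet length.
            return tag, o1, 2, False
        if o1 < 224:
            # Two-octet length, combined with the following octet.
            return tag, ((o1 - 192) << 8) + buf[2] + 192, 3, False
        if o1 < 255:
            # Partial body length: a power of two, more chunks follow.
            return tag, 1 << (o1 & 0x1F), 2, True
        # 255: the next four octets hold the length.
        return tag, int.from_bytes(buf[2:6], "big"), 6, False

    # Old format: bits 5..2 are the tag, bits 1..0 the length type.
    tag = (first >> 2) & 0x0F
    length_type = first & 0x03
    if length_type == 3:
        return tag, None, 1, False  # indeterminate length
    n = 1 << length_type  # 1, 2 or 4 length octets
    return tag, int.from_bytes(buf[1:1 + n], "big"), 1 + n, False
```

Even this small sketch has seven return paths, which gives some feel for why handing the kernel pre-digested key-value data is attractive.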
In which case, yeah, then you have the last packet and you've got the whole thing. You might think this is tortuous, but if people are doing signing on input data where you don't know the length, this is typically what's used. GPG will sign like this all the time. And if octet one is 255, then it's a four-byte length. So I really don't want to have to deal with that kind of rubbish in the kernel. So we set it up with a trusted key like we have there. So let's load that into the kernel. And that's what it looks like here. You can see right down the bottom, if I lift this up, there's a piece of green text down there. And that's my public key being loaded into the kernel. There's a screenshot in the paper if you're having trouble seeing these, or if you want to kind of come to the front, then please feel free. Okay. Veriexec has a facility called veriexecctl, which actually loads the digests into the kernel. It does it through a device called /dev/veriexec. Again, stunning names, but very, very useful. I wrote something called SignatureControl, which is now almost exactly the same as veriexecctl. So I'm going to be merging the two back together again when I get a chance. So this is what a signature for a file looks like. Again, this is, I think, in the paper if you want to see it. The first one there is the name of the file. Then we've got, whoops, sorry. And we have the type of signature; that's taken from Veriexec's format. That's where the digest name appears in Veriexec. So I'm just using RSA or DSA as being the signature type here. And then you've got the signature after that. And yes, that's broken out from the PGP packet format of things. So that's what a signature for a file or a script or a library or a configuration file or anything like that looks like in the kernel. Okay. So the next thing we're going to do; that, for those who couldn't see it down at the bottom, is the machine that I'm running to test stuff on. 
That's the kernel in there. So this is actually us loading the signatures there. I've done it on a signature for the ed binary. This is not because I use ed, but it's in /bin. I wanted something that was nice and easy to show people working and things like that. Remember I said there was lazy evaluation, so it says not evaluated yet. And this is quite important as well, because you're probably imagining that there's a huge performance impact from digital signatures or checking digests and everything like that. The digest only gets checked when the file is opened in the first place. The information is cached and reused as and when needed. So yeah, /bin/ed, not evaluated. I've got the RSA key down there. And yeah, you probably can't see it, but it tells you when the signature was created, down at the bottom. Okay. So we actually run ed. And please bear with me. That's just debugging stuff at the top that I put in there. So I edit a file using ed, and then it tells me the flags are direct. We talked about that earlier. And the signature is valid. So now it's a valid signature, that I wrote the signature, because the... Where's the... I can't see it now. Yeah. We know it's a valid signature, and that the time the signature was created was then, and the expiry on the signature is then as well. Okay, so we get that out of the kernel by using SignatureControl. It does a request into the kernel to get it out. The kernel formats it and sends it back up through /dev/veriexec. This isn't done in userland. It's done in the kernel and pushed up through properly. Right. That's what happens when everything works well. Okay. So we've got all of that there. That's fine. Next slide. I decided to go and change the last digit down here from a six to a seven. Things like that. And I saved that in a file called signature.bed. It's not because I come from Chelsea. 
It's just I wanted to remember it as an easy name. Right. So what happens when we try running that? Come down here and it says entry status mismatch. Now, this is just doing it in debugging mode. It's just doing it with the Veriexec strict level at zero. Veriexec has four levels it can have. Zero is the one where you're debugging. You just want to see whether it's working or not, and nothing happens. Level one is as an IDS. So it'll tell you that something's happened. IPS, which is level two, won't allow you to open the file in that case. And level three is called lockdown, and it gets a bit more fascist about things. So you're not to do stuff and all this kind of thing. Nothing to do with me. It's all Veriexec's way, and it's built into the kernel authorization system like that. Okay. I think the next signature is exactly the same. And that's showing me trying to change the strict level down here. You probably can't see this. The strict level on Veriexec, to say, all right, well, I do want to change it. Can't do it, because the kernel's up in a secure level. When that's in place, you can't change the strictness level of Veriexec. So you have to do it early on in the boot process. Veriexec has an rc.d script, and signed exec has exactly the same kind of thing. You load up the signatures, then you set the strict level, then the secure level after that. Okay. So you're thinking I'm getting old, I've forgotten about these things or something like that, but no, no, I haven't. That's a bit blurry, isn't it? Any ideas for this one? Want me to give you a hand? One. Go on. Some people in church singing a hymn. Thank you. Second one. Play. What's this one? Men in Black, so there's men. Hymn, play, men. What's this? T, and Sean Connery. Oh, Sean. So: hymn, play, men, T, Sean. Oh, you're still here. Good. It does for me. Okay. So NetPGP at the current time uses OpenSSL bignums. 
So the version that's in the repo in the main branch uses OpenSSL bignums. It inherited a callback mechanism for reading, which is bizarre and arcane and all that kind of stuff, and needs to change and is being changed at the moment. And the problems that we have with it: like, we wanted to enable IDEA, and that's not an easy fix to do. There's a problem where I signed the hashes file for NetBSD 6 and we can't use NetPGP to verify it. That's all down to PGP checksumming and the way that GPG does its calculation of the digest, which is wrong and doesn't conform to the standard, but GPG is the main implementation, so we have to kind of go with their bugs, I'm afraid. So I talked a bit earlier about the separate branch. It's the next generation of NetPGP, and I am rewriting it from the RFC up. I've got the verification done already and that's in tree. And it now uses MPI functionality from LibTomMath, and you say, no, no, you can't do that. Nobody else uses it. Well, Dropbear uses it, and I know certain virtualization manufacturers that used to use Dropbear as an SSH in there. Heimdal uses LibTomMath as well. What we do is write a thin layer on top of that to give it the API of OpenSSL bignums, and that gives us an easy way of getting the functionality without all the OpenSSL stuff. We don't get the speedups, true, but we also don't have the bring-up problems that we have whenever we try and use OpenSSL. So this is the new NetPGP verify in userland. You can see the libraries it uses. Just one main one, which is the NetPGP verify library, and I do an ldd and a size on those to find out what they are. The big kicker is obviously libc, and that's about it, really, big libc. The rest are fairly small. The text size for the whole thing... Oh, yeah. Because we're using a small program that calls a library, the text size is tiny. It's not really representative. But look at the library itself, which is there. 114k, something like that. Not too bad. 
Given that libcrypto itself is about 2 meg, I'm quite happy with that. Performance impact, talked about earlier. As with Veriexec, it's negligible. We can't actually see any difference in performance between running with signed execution and an ordinary system without it. I was going to have a demo up here showing you two systems booting side-by-side and actually running things. I'm afraid I failed in the time department, and so I apologize about that. So, lessons learned. What did we learn from doing all of this? Firstly, that malloc, calloc, realloc and free are used over and over again in userland. Especially with the bignum library as well. It's from the 1990s, and it returns pointers to structures as part of the API. We've tended to move on a bit since then, but OpenSSL hasn't caught up. Trying to use realloc in the kernel is an interesting experience. You really should be doing all the stuff in advance, pre-computing sizes or maximum sizes, and trying to do that. The problem with that is kernel stack sizes: the stack sizes are limited. Sometimes quite extremely, depending on the depth of the call stack within the kernel. So I did find out that I've got to use small steps, and lots and lots of them. That's essentially a rule I have with kernel development, and I keep relearning that rule every so often as well. So what other things do code signing? That's in the commercial world. Almost everything does it these days, especially if you've got a PDA or an appliance. Android, iPhone, iOS, all that kind of stuff, all of them use it. Microsoft Authenticode is based on CAs and PKI. Obviously we don't really want to do that kind of thing, or I don't want to do that kind of thing, because I don't believe that PKI is the right vehicle for it, especially with all the problems that have been around just recently. We're going to see more of those problems coming in the future as well. So in order to take advantage of this, what are we going to do? Trusted boot; don't particularly want to go there. 
If any of you have looked at TrustedGRUB and things like that, you'll know it's horrible. Absolutely horrible. You've got to tie things down way in advance. I actually sat down with Jared McNeill, one of the developers, about two years ago and tried to work out how we could do trusted boot on x86 for NetBSD, as compared to some of the set-top boxes that manufacturers are using out there, which have got signatures in firmware for the hardware manufacturer and firmware authors. And we decided that on x86, because of the three- or four-stage loader process, we've got to secure all of those stages in series. It was going to be an immense amount of work. So I don't particularly want to go there, but I will have to in the future if we want to actually use this from booting. I also sat down with the chief architect at Yahoo and tried to work out how we were going to do this for certain machines we were going to deploy. And we sat down for a long time, and we came out and we didn't have a solution. That's the way it is. We've got some time; if you can bear with me for a minute, we'll go into some other stuff on crypto and other things like that. Everybody all right with that? To do so, you're going to have to parse this. Okay, I'll give you a clue. That's Reed Hastings of Netflix. These are some soldiers in Iran at war. So we've got war, Reed. Everybody worried? You should be. Thank you. Okay, crypto news. The kind of stuff that's been going on just recently: NIST have had a competition to decide another digest algorithm, SHA-3, and we've got Keccak as the digest there. Completely different, based on the sponge family. I'll come to that later. I want to have a look at the resistance to collisions of the SHA-1 algorithm. We use SHA-1 in most things these days. Some people have gone on to SHA-256, some to 512. They're still the same family, but they're thought to be much, much more resistant to collisions than SHA-1, mainly because of the size. 
But the fact that they're in the same family is also a bit of a concern for some people. And this is based on the MD5 family of digests as well. So they're all the same kind of thing, and people are finding better and better ways of making collisions for them. Collisions are important because if you can collide, then you can purport to have some data that is not what it actually is. And as we use digests as the basis of signing, this is fairly important. Okay, so I talked about the NIST contest. Keccak was a surprise winner of this. Nobody expected it to be. Mainly because Bruce Schneier and some of the other people were calling for no candidate to be actually announced. When it was announced, Mr. Schneier said, oh yeah, it's a great one. Go for it. Why is it necessary? So, some back-of-the-envelope calculations on SHA-1 and its resistance to collisions. This is a server year. A number of cycles. This is a set of commodity hardware, x86, today. One core; eight cores per processor, probably state of the art; four processors of those. Two to the 36 so far. This is the number of cycles per second. So roughly two to the 25 seconds in a year, which gives us two to the 61 cycles every year from one server. We'll call that a server year. Moore's Law, it's still in operation, right? We're still getting advances every 18 months or so that roughly double what we have at the current time. So a factor of two to the two by 2015, I reckon. Moore's Law shows no sign of abating at the moment. A factor of two to the four by 2018 and two to the six by 2021. Okay. So how many cycles a year are we going to be doing by 2021? Let's have a look at that. Along with Moore's Law there, we're up to two to the 67. Big number? Okay, let's have a look at SHA-1 collisions. A block of a SHA-1 operation is around about two to the 14 cycles. And Stevens' attack on it requires about two to the 60 operations, SHA-1 operations, to break this. So the cycles for finding a collision come up to two to the 74. 
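The back-of-the-envelope figures so far can be checked by working in base-2 exponents. This is a minimal sketch: the constant names are made up for illustration, 2012 is assumed as the "today" baseline, and the Moore's Law step is read as two doublings every three years, which is the reading that takes the 2^61 cycles per server year up to the 2^67 quoted for 2021.

```python
# All quantities are log2 values, taken from the talk's estimates.
CYCLES_PER_SECOND_LOG2 = 36    # one server: cores x processors x clock
SECONDS_PER_YEAR_LOG2 = 25     # roughly 2^25 seconds in a year
SHA1_OP_CYCLES_LOG2 = 14       # cycles per SHA-1 operation
ATTACK_OPERATIONS_LOG2 = 60    # SHA-1 operations for the collision attack

# Assumed Moore's Law gain over the 2012 baseline, as log2 factors.
MOORE_GAIN_LOG2 = {2012: 0, 2015: 2, 2018: 4, 2021: 6}


def server_years_log2(year: int) -> int:
    """log2 of the server years needed to find a collision in a given year."""
    cycles_per_server_year = (
        CYCLES_PER_SECOND_LOG2 + SECONDS_PER_YEAR_LOG2 + MOORE_GAIN_LOG2[year]
    )
    attack_cycles = SHA1_OP_CYCLES_LOG2 + ATTACK_OPERATIONS_LOG2
    return attack_cycles - cycles_per_server_year


for year in sorted(MOORE_GAIN_LOG2):
    print(f"{year}: 2^{server_years_log2(year)} server years")
```

Working with exponents keeps every step an addition or subtraction, which is how the slide arithmetic is done in the talk.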
Okay, you worried yet? Server years. Thank you. We talked about them earlier. So in 2021, it's two to the seven server years that we'll be needing to break SHA-1. We can do it at the moment, but we'd need two to the 13 server years. What's all this going to cost if we were going to do this in hardware? Amazon charges four cents an hour, US cents, which is roughly $350 a year. Roughly, yeah. So around about 2021, we're going to be able to do it on x86. And this is still nine years off, mind you, but we'll be able to do it for $43,000. Now, that's quite within the reach of lots of people, university departments doing that as research. In 2018, $173,000. That's well within the bounds of somebody laying out that money just to do that kind of thing. So these are the costs that are involved in breaking SHA-1. You might think that we're good for another few years with it. It depends what data you're actually protecting with SHA-1. And to summarize, anyway, SHA-1 can be broken by organized crime by 2018, they reckon, and by university research by 2021. This is only on commodity hardware on x86. We haven't taken into account that ARMv8 has a SHA-1 instruction, or the GPUs that are out there, which are marvelous at this kind of thing, the hardware involved there. And just as a cautionary note at the bottom as well, SHA-2 is the generic name for the 256, 224, 384 and 512s. The later family. Same family as SHA-1. Just more bits. So let's bring in Keccak. It's not quite such a superfluous thing anymore. It's becoming more of a necessity now. And so we need to start looking at how we're going to implement this and how we're going to bring it into our libraries and our applications. It's a sponge function. That means that it kind of sponges up the input and squeezes out the output. To me, that seems like kind of a marketing term, but it's the way that they've been referring to it in all the literature as well. There is a reference implementation. I'm not sure how good it is. 
I'm about to have a look at it. And so we'll see what it looks like. But it's a completely different family of hashes. So everybody's quite excited about this kind of thing. What size key should I be using? Well, it depends what you're doing. Fairly obvious. For RSA, DNSSEC is using 2K bits. That's way overkill. Because the attacks that are going to go on in this kind of thing are... Well, it's not the attacks. It's more that the overhead that's used to calculate signatures all the time at 2K is... You're doing lots of cycles, lots of churn, for absolutely no gain. At the same time, 1024 is too small these days. And it's getting easier to crack. And so we'll move on from there. New kid on the block: elliptic curve DSA. OpenSSH 5.7 implements this. And it's codified in RFC 5656. And that's not a stutter. Three curves. OpenSSH has just implemented the ones that are required, not all of the ones that are in the RFC. There's Diffie-Hellman and there's DSA based on elliptic curves. And the reason for using elliptic curves is that the keys are much smaller and it happens much quicker than both DSA and RSA. And, we think, for about the same level of security. So ECDSA is about a quarter the size of the keys of DSA. And DSA is around about half the size of RSA for the equivalent level of security. RSA is the old kid on the block, if you can have such a thing. It's the kind of gold standard as far as these things are concerned. And with that, I am done. Any questions? Yeah. I'm still worried about SHA-1, actually, because you can't be sure that Stevens' attack is going to stay the best attack possible. And I expect that the complexity is going to come down much lower, because people are going to find other stuff in that hash. It's basically broken now. Yeah, that's a good point. I didn't mention any of the advances that people are making in actually finding new ways to attack the collision resistance of SHA-1. So yes, thank you very much. I completely agree. 
You mentioned that with the signature control there was a level that you could set, and it was obviously system-wide. Would it make any sense to have this per file system or even per file? Would it be easy to implement? Would it make sense or not? It's an interesting thought. The way it's done at the moment is not by file system. But yes, I can see that you might want certain things on the root file system which are much more protected than things in my home directory. For example, I don't really care if anybody looks at that kind of thing. I think they need their head seeing to. Yeah, I think that's a good idea. It's not implemented that way at the moment. Although there is a mount point in there if you look at the veriexecctl output. So it may well be possible to add. Thank you. I shall have a chat to Brett about that and see what happens. Any more? Well, thank you very much for attending.