Hi, yes, my name is Dominic, and this is the audit-driven approach to security design. As mentioned in the introduction, I live on a sailboat back in New Zealand, and there are a great many ways to die when you live on a sailboat. You have to get very good at risk management, and that is directly applicable to computer security. I've been working as a security auditor at a company called Least Authority for the last eighteen months or so, and apart from that I have a project called Secure Scuttlebutt, which is not the subject of today's talk.

Security is more important than ever. Hacks are becoming international headlines. Nation states are trying to hack each other, mostly by hacking companies and the individual people inside them. Most of our software isn't really designed to be secure: a lot of it was designed long before the internet existed, or at least before people had thought hard about security. And soon literally everything is going to be a computer. We already carry multiple computers in our pockets, and within a decade or two computers will be so cheap that takeaway cups could have computers in them. So security is only going to become more important.

Let's define security. Software engineering is about correctness: did what should happen actually happen? There's a silent qualifier in there: when the software is used as it was intended to be used. Normally we limit ourselves to that. Security is about what shouldn't happen when the software is used in ways that weren't intended, which usually isn't defined anywhere. In security we have to work out whether the thing that happens under unintended use is something we want to happen or not.

Some ways that something can fail, roughly from best to worst. Nothing happens: that's the best one. An error message, if you're lucky; that's the sort of failure we want. Performance degrades and resources are wasted, which can amount to a denial of service. Valuable information is leaked. The attacker gains some control over the system, for example they become authenticated when they shouldn't have been and can interact with it in some way. And worst of all, they gain total control of the system, where they can execute arbitrary code and make it do anything they like. (I made this slide late last night and I notice now that I left a couple of things off it.)

There are a number of approaches to this security problem. Academia's proposed answer is formal methods: you define a logical model of the correct behaviour, and then an automatic system derives, or at least checks, that your implementation is provably correct. This sounds like a really good idea, but it's far, far too complicated, and nobody really uses it except for very expensive, very special systems; it is a long way from mainstream. And it has problems of its own: what if you didn't specify the correct model? If the spec has a bug in it, you can correctly implement that bug and still end up with the wrong thing. So you still need a human to audit the model, and formal methods say nothing about social engineering or other human attacks.
Another problem is that security is often treated with a waterfall model. The waterfall model is a theory of software development where you start with a requirements phase, then do design, then implementation, then verify that you implemented the right thing, and then maintenance. The waterfall model was not described by people advocating it; it was described as how not to do it, because software is not like building a bridge. You learn a lot while you're developing software, so you can't just do the implementation after the design is finished: you'll end up doing some redesign later. You actually need an iterative model. The same goes for security. You can't just rubber-stamp "this is secure now" at the end. You might learn things you should have guarded against earlier, and if you really want a good security model, which is necessary, it's better to go back and redesign.

So, how I ended up becoming a security person. I was interested in the problem of database replication. The idea is that I have a database and copy it to someone else; they can edit their copy while I keep editing mine, and later we synchronise our changes and carry on editing. This requires a completely different security model from typical application design. In an ordinary application you have one database and secure network connections to it: you connect, you're authenticated, and then you can make changes. All of the authority lives in the database, and everything depends on cryptographically secure connections to the server. Instead, you can take that cryptography and put it into the database itself, so that every update written to the database is signed. Then the network no longer needs to be secured. It's still good to have a secure network, but you don't need it: even if you didn't receive the updates directly from the people who made them, they're signed by them, so you can still verify that they made those edits (there's a code sketch of this idea just below). That gives you a completely different kind of system, and there are a couple of projects exploring this space, but you essentially have to start again and rethink everything.

So I had to figure out how to build a new kind of secure system, and there really wasn't an engineering practice that described how to go about designing something secure. I had to figure that out for myself, but I think you'll agree that what I came up with is fairly obvious, and that you would have reached the same conclusions. I call it audit-driven security. The idea is to let the needs of the audit drive the design. It was inspired by test-driven development: in test-driven development you end up redesigning code to make it easier to test, and I likewise advise redesigning code to make it easier to audit. Put the focus on knowing rather than having, because knowing is how you establish that you have the thing you want. An analogy is navigation-driven shipping. Navigation is knowing where you are. Focus on navigation rather than just getting there; let the needs of navigation drive decision-making; know where you are at all times. It's better to arrive late than dead.
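To make the signed-update idea concrete, here is a minimal sketch using Node's built-in Ed25519 support. This is purely illustrative, not Secure Scuttlebutt's actual message format: the SignedUpdate shape and the helper names are invented for this example.

```typescript
import { generateKeyPairSync, sign, verify } from "crypto";

// Hypothetical shape for a replicated edit: the authority travels
// with the data instead of living in the server connection.
interface SignedUpdate {
  author: string;    // author's public key, PEM-encoded
  payload: string;   // the edit itself, e.g. a JSON string
  signature: string; // base64 Ed25519 signature over the payload
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

function makeUpdate(payload: string): SignedUpdate {
  return {
    author: publicKey.export({ type: "spki", format: "pem" }).toString(),
    payload,
    // Ed25519 doesn't take a digest algorithm, hence the null
    signature: sign(null, Buffer.from(payload), privateKey).toString("base64"),
  };
}

function verifyUpdate(u: SignedUpdate): boolean {
  // Anyone holding the update can check it, even if it arrived
  // via an untrusted peer over an unencrypted connection.
  return verify(null, Buffer.from(u.payload), u.author,
                Buffer.from(u.signature, "base64"));
}
```

The point is that verifyUpdate needs nothing but the update itself, which is exactly why the transport no longer has to be trusted.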
Here's an example of a time someone did not use navigation-driven shipping: the sinking of the SS Wairarapa. She was heading from Sydney, Australia to Auckland, New Zealand, a voyage of roughly 2,000 kilometres. With about 100 kilometres to go, they were driving at full speed, at night, in heavy fog. They had drifted off course by a few miles, and they crashed into a cliff. Around 140 people died, in what is still one of the worst maritime disasters in New Zealand's history. They had a commercial incentive to go quickly: the more passages you make, the more fares you collect, and the more profit you make. They didn't actually know exactly where they were, but they didn't act on that. And they should have realised it, because it was foggy and dark. They knew there was a good chance they weren't where they thought they were, but they did nothing to reduce that uncertainty. They just assumed they were safe. They were not focused on knowing.

The way we focus on knowing in computer security is by auditing. Auditing is mostly reading the code and the specs, understanding the code and its intended use, and then asking yourself: can this be misused in an interesting or evil way? That's the fun part. The best thing a security audit can tell you is that your code wasn't very interesting to audit; that means you did a good job. It's very important to study reported vulnerabilities, to get ideas about the ways things might possibly be attacked. In the security community, if you discover a new kind of vulnerability you get a lot of kudos for describing it, often in a talk at a security conference, so there's a lot of material out there to be studied.

Another thing you can do is look for signs of weakness. A sign of weakness isn't a vulnerability in itself, just a clue about where a vulnerability might be. One really simple sign I look for: I start with the files that are too big, because those are usually more disorganised, written by people who ended up confused and didn't really understand how everything works. A really good understanding of a problem tends to produce a small file. So big files and messy code are good places to start looking. And fixing even really tiny things that aren't big problems in themselves keeps you further away from bugs, because those are the places where bugs could have been.

Real attacks are always combinations of these things. Take ransomware. For ransomware to work, first you need to fool people: a social engineering attack to get them to open an attachment. Then that compromises their system via some technical vulnerability. Then it encrypts the hard drive and demands payment. The encryption step is interesting because the design of our file systems is what enables it. If you had a different kind of file system, one where it was always possible to roll back to a previous state, this wouldn't work. It only works because file systems are mutable: you can overwrite files, with no way back to the old version.

This brings us to a very important concept: error chains. Whenever a plane crashes, investigators examine everything that led up to the crash.
All of the things that came together to make the plane crash, the whole chain of things that went wrong, is called the error chain, and then you look for ways the disaster could have been prevented. So: the Wairarapa was driving at full speed; the crew couldn't see where they were; the ship had drifted east. Change any one of those, and it wouldn't have happened. The crew knew they were driving at full speed and that they couldn't see where they were; they didn't know they had drifted east. This disaster was in the 1890s and GPS hadn't been invented yet, but they could certainly have reacted to the things they did know. Slowing down would have been the easiest, or even waiting until the fog lifted.

So it's easy to prevent things by attacking the error chain. It's like wearing a belt and suspenders at the same time. When I was looking for this image, I found many articles saying it's a big fashion mistake to wear a belt and suspenders at the same time and that you should not do it. But from a security engineering perspective, when I see someone wearing a belt and suspenders, I think: that's a genius, who has engineered a much, much lower risk of his pants falling down. For security problems, you should definitely go for belt and suspenders.

An example is two-factor authentication. With two-factor authentication, to log in you need your phone as well as your password. This makes the error chain longer, because an attacker has to compromise not only your password but also your phone. One caveat: if your phone has access to the email account that can reset your password anyway, then having the phone is really all an attacker needs. But it does mean that to compromise you, someone has to be physically near you to steal your phone; someone far away has no chance. On the other hand, you can quite easily lose your phone. It's still an improvement, because it protects you from most of the people on Earth, who aren't near you. (The usual phone factor, a TOTP code, is sketched just after this passage.)

The next thing is avoiding, where you can, entire classes of errors. Can we design things so that complete types of attack simply don't apply? This idea came to me via the work of Daniel J. Bernstein, or djb as he's known. And this is something I want to stress: the way you keep abreast of what is important and happening is by following the security community. I put more trust in individuals than in institutions: look at who the people considered reputable trust, and whose work they acknowledge as important, and evaluate the institutions based on the individuals. Bernstein is widely respected because he's a leading academic who also writes libraries that are useful to practitioners.

There are two of his papers I learned a lot from. One is "Some thoughts on security after ten years of qmail". qmail is a security-oriented mail transfer agent, basically an email server, written in 1995, and it used a variety of approaches to get really good security. Interestingly, the main one was just a software engineering perspective: vulnerabilities are just bugs, so try to write fewer bugs. Ideally, write zero bugs.
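The talk doesn't spell out how the phone factor works, so as one concrete example, here is the standard TOTP algorithm (RFC 6238) that most authenticator apps use. A minimal sketch: the second factor is just an HMAC of the current 30-second interval, keyed by a secret shared with the phone at setup time.

```typescript
import { createHmac } from "crypto";

// Derive the six-digit code an authenticator app would show right now.
function totp(secret: Buffer, now = Date.now()): string {
  // 8-byte big-endian counter: number of 30-second steps since the epoch
  const counter = Buffer.alloc(8);
  counter.writeBigUInt64BE(BigInt(Math.floor(now / 1000 / 30)));

  const mac = createHmac("sha1", secret).update(counter).digest();

  // Dynamic truncation (RFC 4226): take 31 bits at an offset given by
  // the low nibble of the last byte, then keep six decimal digits.
  const offset = mac[mac.length - 1] & 0x0f;
  const code = mac.readUInt32BE(offset) & 0x7fffffff;
  return String(code % 1_000_000).padStart(6, "0");
}
```

Stealing the password alone doesn't let an attacker compute this code, which is exactly the extra link in the error chain.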
As it's happened, qmail has had very, very few bugs. The main technique it used to get fewer bugs was to write less code, because the more code you have, the more bugs you can have. Then there were a few particular things Bernstein singled out as especially problematic. A major one is parsing, or any situation where you go from one format of encoding to another, which was the subject of the previous talk: when data goes from a database into HTML, that transition is the dangerous part. Another interesting suggestion in the paper, one that qmail didn't actually implement, is to take complicated operations, such as converting one image format to another, and put them into a separate process that's locked down so it can't communicate with the internet or anything else. Then if it breaks or has a vulnerability, the attacker can't get out. It's like the previous talk's advice about an outgoing firewall: put the thing inside a box it can't escape from, and if it gets hacked, nothing happens.

The other really interesting paper is "The security impact of a new cryptographic library". This was about NaCl, whose now-maintained version is libsodium. It's a collection of cryptographic primitives that is the go-to library for building new systems; I think it was released in 2009. It was systematically designed to avoid whole classes of errors. For example, there are side-channel attacks: when an algorithm is running, observing it in particular ways, usually by measuring how much time it takes, can reveal the secret values inside it. AES, which is part of TLS and very widely used, has a side-channel attack where, if you can observe how long decryption takes, you can reconstruct the secret key and then decrypt everything. So far this is largely an academic problem; I haven't really heard of practical attacks using it. But attacks of this class will probably improve and become a practical danger at some point in the future. NaCl avoids it completely by using designs that simply don't have the side channel. The side channel exists because the program branches on secret information: sometimes it goes one way, sometimes the other, and those paths take different amounts of time, so the running time gives hints about the secret. The algorithms in NaCl execute exactly the same instructions every single time, so the running time never varies and reveals no information at all.

It's also as fast as possible, so users aren't tempted to disable security, and it includes only the best algorithms: one of each kind, with no configuration allowed. Once you've chosen libsodium, you've got the best algorithms, and it doesn't give you the opportunity to shoot yourself in the foot. The algorithms are also carefully selected so that the implementations are straightforward and don't require you to remember special caveats. If it works, it's probably secure.
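To illustrate the branch-on-secret point from a moment ago, here is a sketch in the style of NaCl's crypto_verify functions (an illustration, not NaCl's actual code), contrasting a naive equality check with a constant-time one:

```typescript
// Why branching on secrets leaks: the naive compare returns as soon as a
// byte differs, so the response time reveals the length of the matching
// prefix, and an attacker can guess a secret MAC one byte at a time.
function naiveEqual(a: Buffer, b: Buffer): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false; // early exit: time depends on secret
  }
  return true;
}

// Constant-time version: XOR every byte pair and OR the differences
// together, never exiting early, so every call takes the same time.
// Node ships the same idea as crypto.timingSafeEqual(a, b).
function constantTimeEqual(a: Buffer, b: Buffer): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a[i] ^ b[i];
  return diff === 0;
}
```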
And that makes it easy to check that the implementation is also correct and secure.

A negative example, which I learned about yesterday, is JSON Web Tokens. Sid did a great job explaining some of the things you need to know to use JSON Web Tokens safely. One is that there's an algorithm called "none" that lets you completely turn off security. Why would you want to do that? The spec says: in case you want to put the token inside something that's already authenticated. But does anyone actually need that? It seems like a really bad idea, and maybe it just shouldn't be part of JSON Web Tokens. It's also possible to enable a combination of algorithms that is insecure: if you accept both an HMAC algorithm and a signing algorithm, an attacker can trick the verifier into using the public key as the HMAC key, which means the attacker can generate tokens that appear to be validly signed. All of this was described in a blog post in 2015, so it's been known for a while. This is simply bad design, because you shouldn't have to know about the internals to use something safely.

If we apply an error chain analysis: the libraries correctly implemented the spec. There were too many algorithms; the spec gave you extra options and allowed you to change them, and without all those options you wouldn't have the problem. If the libraries had implemented the spec incorrectly, they could have been more secure. And HMAC and signatures are completely different things, so it's strange to have them in the same system: just HMAC alone, or just signatures alone, would have been fine. Removing "none" would have been great too.

So I went and actually read the RFC to see what it says, and I found this interesting part: even if a JSON Web Token can be successfully validated, unless the algorithms used in the JWT are acceptable to the application, it SHOULD reject the JWT. Hang on, what does SHOULD mean again? It's in capital letters, so it's SHOULD as defined in RFC 2119: "This word, or the adjective RECOMMENDED, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course." Were there really valid reasons in particular circumstances for a library not to reject an unacceptable algorithm? I think the spec should have used the word MUST here. This is a somewhat pedantic case, but it's a case where the spec is actually wrong: the spec is the cause of the vulnerability.

While researching this I also discovered Simple Web Tokens, which the JSON Web Token spec cites as an influence, and I was quite impressed: Simple Web Tokens are great. The S might as well stand for secure. A token is a URL query string plus an HMAC. That's it. It's like JWT, except it only does HMAC: there's no "none", and there's nothing to configure, so it's much harder to use insecurely. It also doesn't deceptively look encrypted. A JSON Web Token is base64-encoded, which makes it look unreadable; it looks encrypted, but it is not encrypted, so don't put anything secret in there. A Simple Web Token is obviously not encrypted, because it's just a query string, so it doesn't mislead you into using it the wrong way.
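Coming back to the algorithm problem for a moment, here is what treating the RFC's "should reject" as "must reject" looks like in practice. This is a hand-rolled sketch for illustration only (in production, use a maintained library), and it assumes a recent Node with base64url support:

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Verify an HS256 JWT, pinning the algorithm before anything else.
function verifyJwtHS256(token: string, key: Buffer): object {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [header64, payload64, sig64] = parts;

  const header = JSON.parse(Buffer.from(header64, "base64url").toString());
  // Reject "none", RS256, and everything else outright. This also closes
  // the public-key-as-HMAC-key confusion, because no RSA key is ever
  // handed to an HMAC here.
  if (header.alg !== "HS256") throw new Error("unacceptable algorithm");

  const expected = createHmac("sha256", key)
    .update(`${header64}.${payload64}`)
    .digest();
  const actual = Buffer.from(sig64, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("bad signature");
  }
  return JSON.parse(Buffer.from(payload64, "base64url").toString());
}
```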
So my suggestion for a secure JSON Web Token would be the same as JSON Web Tokens, but without "none" and without the HS (HMAC) algorithms. Simple Web Tokens already exist, so you can use those if you want HMAC; there's a sketch of that idea at the end of this section. Most people could probably just use that.

JSON Web Tokens are an example of configurable security. This is actually how most airline disasters happen: human error. In this case, there's a switch, wings stay on, wings fall off, and somebody accidentally switches the wings off. So don't make security configurable. Configurable security is like a convertible: you can leave the top down, you can leave the windows open, and in any case someone can cut through the roof with a knife. But with an APC, there are no windows to leave open, the armour plating can withstand rocket fire, and there are eight wheels, all driven, so a landmine can go off under a wheel and you can still drive home. (Actually, as I was researching this slide, I thought tracks would be better than wheels, but it turns out tracks are quite susceptible to mines, and everyone has gone over to using wheels instead.) That's the kind of software security we want everyone to have.

Another way of looking at this problem: don't leak internal details. If there's something about a module or protocol that you need to understand in order to use it securely, then it's likely that you won't understand it, and then you'll use it insecurely. I joke that every three-letter-acronym algorithm has a secret bug that you need to understand to be completely sure you're using it securely. I'm not going to go into what all of these are; I'll leave that as an exercise for the reader. But all of them have such bugs, and they're all in very common use, probably in the software you're writing as well, so it's good to actually know what those things are. The ones with four or five letters stand a better chance of being safe. Certainly all of the algorithms in libsodium were explicitly designed not to have any such bugs and to be much safer to use.

The goal is to be secure by design. You should be able to just look at the design and say: yes, this is definitely secure. Obviously secure, even. And for something obviously secure, making it insecure should be obvious as well: if there's something extra that shouldn't be there, you can see it. You shouldn't be able to make something insecure just by leaving a piece out and still have it work; if you leave something out, it should fail completely, and then it's obvious that it isn't correct.

An example of this is train brakes. Train brakes have a really interesting design where big springs hold the brakes against the wheels. It's the opposite of a bicycle or a car, where you pull a lever to engage the brakes: on a train the brakes are engaged by default, and you have to push a lever to release them before you can start moving. That lever is controlled by compressed air, generated in the engine and carried down a pipe. And here's the really clever thing: if the train becomes decoupled, the pipe breaks, the part of the train no longer connected to the engine loses its air, the brakes come on, and it safely stops.
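To close the loop on the Simple Web Token suggestion above, here is a minimal sketch in its spirit: a URL query string plus an HMAC, one algorithm, nothing to configure, and obviously not encrypted. The HMACSHA256 parameter name follows the SWT draft, but this is not a spec-exact implementation:

```typescript
import { createHmac, timingSafeEqual } from "crypto";

const TAG = "&HMACSHA256=";

// Mint a token: readable claims, followed by a single keyed MAC.
function mint(claims: Record<string, string>, key: Buffer): string {
  const body = new URLSearchParams(claims).toString();
  const mac = createHmac("sha256", key).update(body).digest("base64url");
  return body + TAG + mac;
}

// Verify and parse. There is exactly one algorithm, so there is
// nothing for a caller, or an attacker, to misconfigure.
function check(token: string, key: Buffer): Record<string, string> | null {
  const i = token.lastIndexOf(TAG);
  if (i < 0) return null;
  const body = token.slice(0, i);
  const mac = Buffer.from(token.slice(i + TAG.length), "base64url");
  const expected = createHmac("sha256", key).update(body).digest();
  if (mac.length !== expected.length || !timingSafeEqual(mac, expected)) {
    return null;
  }
  return Object.fromEntries(new URLSearchParams(body));
}
```

For example, mint({ user: "dominic", exp: "1700000000" }, key) produces something like user=dominic&exp=1700000000&HMACSHA256=..., where anyone can read the claims but only a key holder can alter them.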
And it's pretty obvious that this will happen. To check that a train is going to automatically stop like this, you just have to check that there's nothing extra, like the brakes having been welded open or something, which would be much easier to spot. So: obviously safe.

Wrapping up, the audit-driven design process is: study vulnerabilities in similar systems for inspiration. Define what the system does, and also what it doesn't do. Then try to think of ways you could misuse the system, and if you find something, add "doesn't do this" to the previous step, then go back and redesign. Consider small problems and bad patterns to be vulnerabilities: if it turns out there's some tiny piece of information that can be extracted, decide that you don't want it to do that either. Repeat until there are no vulnerabilities. And the documentation should clearly state what the system does and what you specifically designed it not to do.

So, thank you. Remember: less code, fewer bugs. Separate things into well-defined components so you can build a secure system from secure components. Explicitly document what should and shouldn't happen. And study reported vulnerabilities and the patterns behind them.