Okay, our next speaker will be trying to tame the chaos, Peter Sewell from the University of Cambridge. A warm round of applause please. Thank you. Okay, so here we are, 35C3 again, with a program full of fascinating and exciting and cool stuff. And if we look just at the security talks, all manner of interesting things. So I'm going to go to lots of these. You should too. Let's see, though. If we read some of the titles, exploiting kernel memory corruptions, attacking end-to-end email encryption, what else have we got? And Windows user space exploitation, compromising online accounts. Okay, so a lot of these talks, as usual, they're explaining to us how our computers are not doing what we would like. And this, as usual, is the merest tip of a tiny iceberg. And we have hundreds of thousands of known vulnerabilities. How many unknown vulnerabilities? Let me do, it's a bit dark here, but let me try some audience participation. How many of you have in the last year written at least 100 lines of code? And of those people, keep your hands up if you are completely confident that code does the right thing. I would like to talk to you later. And so would the people in the self-driving car that you're probably engineering. So how many unknown vulnerabilities, then? Well, you can take what's in your mind right now and multiply it by, oh, this doesn't work very well, no, no, no, computers, they do the wrong thing again and again and again. We can multiply by this estimate of about 100 billion lines of new code each year. So if we take all of this, these talks, these numbers, these should make us not happy and excited, these should make us sad, very sad, and frankly, rather scared of what is going on in the world. So what can we do? We can, option one, give up using computers altogether.
In most audiences, this will be a hard sell, but here, I'm sure you're all happy to just turn them off now, or we can throw up our hands in the air and collapse in some kind of pit of despair. Well, maybe. Maybe not. My task today is to give you a tiny smidgen, a very small amount of hope that it may be possible to do slightly better. If we want to do better, we first have to understand why things are so bad. If we look at our aeronautical engineering colleagues, for example, they can make planes which very rarely fall out of the sky. Why is it that they can do that, and we are so shit at making computers? Well, I'm going to reuse a picture that I used at 31C3, which is still the best description I can find of the stack of hardware and software that we live above. Here we go. It's, there's so much information in this, it's just ace. So we see down here all of this transparent silicon, it's the hardware we live above. We see a stack of phases of a C compiler. We see all kinds of other bits and pieces. I think it might be a slightly hostile wireless environment in this room for some reason. If we look at this and think about the root causes for why our systems are so bad, we can see several things. So the first is, obviously, there's a lot of legacy complexity that we're really stuck with. If you pull out one of those pieces from the middle and try and replace it with something which is not exactly the same shape, the whole pile will collapse. So we somehow need to find an incremental path to a better world. And then this is the most fundamental reason that these are not like aeroplanes. These systems are discrete, not continuous. If you take an honest girder made out of a piece of steel, and you push on it a little bit, it moves a little bit, basically in proportion. If it's 1% too strong, 1% too weak, basically it doesn't matter. But in these things, one line of code can mean you're open to a catastrophic exploit and one line in many, many million. 
Okay, next thing. This really doesn't work. They're very complicated. But the scary thing is not the static complexity of those lines of code and the number of components, although that's pretty intimidating. The really scary thing is the exponential number of states and execution paths. So these two together, they mean that the testing that we rely on, testing is the only way we have to build systems which are not instantly broken. Testing can never save us from these exploitable errors. And that means ultimately we need to do proof, like honest machine-checked mathematical proof. And it also tells us that we need to arrange for our systems to be secure for simple reasons that do not depend on the correctness of all of those 100 billion lines of code. Then another thing that we have here, all these interfaces, the architecture interface, the C language interface, the Sockets API interface, the TCP/IP interface, all of these we rely on to let different parts of the system be engineered by different organizations. But they're all really badly described and badly defined. So what you find is for each of these, typically a prose book, varying in thickness between about that and about that, full of text. Well, we still rely on testing, limited though it is, but you can never test anything against those books. So we need instead interface descriptions, definitions, specifications that are more rigorous, mathematically rigorous, and that are executable, not in the normal sense of executable as an implementation, but executable as a test oracle. So you can compute whether some observed behavior is allowed or not and not have to read the book and argue on the internet. And we also need interface definitions that support this proof that we need. And then all of this stuff was made up in the 1970s or the 60s. And in the 70s and the 60s, the world was a happy place and people walked around with flowers in their hair and nothing bad ever happened.
And that is no longer the case. So we need defensive design, but we need defensive design that is somehow compatible or can be made enough compatible with this legacy investment. So that's quite hard. And then, I can't say very much about this, but we have at the moment very bad incentives because we have a very strong incentive to make things that are more complex than we can possibly handle or understand. We make things, we add features, we make a thing which is just about shippable, and then we ship it. And then we go on to the next thing. So we need economic and legal incentives for simplicity and for security that we do not yet have. But it's hard to talk about those until we have the technical means to react to those incentives, all right? Okay, so what I'm gonna do now, I'm gonna talk about two of these interfaces, the architecture interface and the C language interface and see what we can do to make things better. A lot of this has to do with memory. So whoever it was that picked the subtitle for this meeting, refreshing memories, thank you, because it's great, I love it. Let's refresh your memory quite a way, all right? So I invite you to think back to when you were very, very young in about 1837, all right? When cool hacking was going on, Charles Babbage was designing the analytical engine. And even then, you see, there was this dichotomy between a mill performing operations and a store holding numbers. This is a plan view of the analytical engine. Well, it was vaporware, but a design for the analytical engine. And you see here, these circles, these are columns of numbers each made out of a stack of cogs, maybe a 50-digit decimal number in each of those columns. And this array, really, he imagined it going on out to, I don't know, over there somewhere with about 1,000 numbers. So even then, you have a memory which is an array of numbers, all right? I think these were not, I don't think you could do address computation on these things.
I think addresses were only constants, but still, basically an array of numbers. So, okay, what have you got now? Let's look a bit at C. Of those people who have programmed a hundred lines of code, how many of you were C programmers? Quite a few, or maybe you're just embarrassed. Ah, I forgot to say, yeah, all of those that didn't put your hands up for that first question, you should feel proud, and you should glory in your innocence while you still have it, all right? You are not yet complicit in this train wreck. Oh, it worked then. Okay, so here's a little piece of C code which I shall explain. I shall explain it in several different ways. So we start out, it creates two locations, x and secret, ah, don't do that. This is so bad. Creates x, storing one, and secret key, which stores, I might get this wrong, your PIN. And then there's some computation which is supposed to only operate on x, but maybe it's a teensy bit buggy or corrupted by somebody. And then we try and, we make a pointer to x. Okay, and in this execution, x just happened to be allocated at 14. This pointer contains the number 14. And then we add one to that pointer. Ah, it's C, so actually that adding one to the pointer, it really means add the four bytes which are the size of that thing. So we add four to 14, and we get 18. Okay, and then we try and read the thing which is pointed to. Okay, what's gonna happen? Let me compile this and run it in a conventional way. Oh, it printed the secret key. Bad. And this program, rather distressingly, this is characteristic by no means of all security flaws, but a very disturbingly large fraction of all the security flaws are basically doing this. Okay, so does C really let you do that? Oh, yes and no. So if you look at the C standard, which is one of these beautiful books, right? It says, you have to read moderately carefully to understand this, but it says that this program has undefined behavior.
And many of you will know what that means, but others won't. So that means, as far as the standard is concerned, for programs like that, there is no constraint on the behavior of the implementation at all. Or, said another way, and maybe a more useful way, the standard lets implementations, so C compilers, assume that programs don't have this kind of stuff. And it's the programmer's responsibility to ensure that. Now, in about 1970, 75, maybe in 1980, this was a really good idea. It gives compilers a lot of flexibility to do efficient implementation. Now, not so much. But how does this work in the definition? So the standard somehow has to be able to look at this program and identify it as having this undefined behavior. And indeed, well, if we just think about the numbers in memory, this 18, it points to a perfectly legitimate storage location. And that plus one was also something that you're definitely allowed to do in C. So the only way that we can know that this is undefined behavior is to keep track of the original allocation of this pointer. And here I've got these allocation identifiers, @3, @4, @5. And you see here what I've still got in this pointer, despite the fact that I added four to it, I've still remembered the allocation ID. So then I can check, or the standard can check, when we try and read from this, whether that address is within the footprint of that original allocation, i.e. is within there. And in fact, it's not, it's over here, it's not within there at all. So this program is deemed to have undefined behavior. Just to clarify something, people often get confused. So we detect undefined behavior here, but it isn't really a temporal thing. The fact that there is an undefined behavior anywhere in the execution means the whole program is toast. Okay, but this is really interesting because we're relying for this critical part of the standard on some information which is not there at runtime in a conventional implementation.
So just to emphasize that point, if I compile this program, let's say to arm assembly language, I get a sequence of store and load and add instructions, store, load, add, store, load, load, no. And if I look at what the ARM architecture says can happen, these blue transitions, well, one thing to notice is that we can perfectly well do some things out of order. At this point, we could either do this load or we could do that store, yeah. That would be a whole other talk. Let's stick with this sequential execution for now. What I want to emphasize is that this load of this address really was loading a 64-bit number which was the address of x and adding four to it and then loading from that address the secret key. So there's no trace of these allocation IDs, no trace at all. Okay, let me step back a little bit. So I've been showing you some screenshots of C behavior and ARM behavior. And I claim that these are actually showing you all of the behaviors allowed by what one would like the standards to be for these two things. And really what I've been showing you are GUIs attached to mathematical models that we've built in a big research project for the last many years, REMS, rigorous engineering of mainstream systems, sounds good, aspirational title, I think I would say. And in that, we've built semantics, so mathematical models defining the envelope of all allowed behavior for quite big fragments of C and for the concurrency architecture of major processors, ARM and x86 and RISC-V and IBM Power, and also for the instruction set architecture of these processors, the details of how individual instructions execute. And in each case, these are specs that are executable as test oracles. You can compare algorithmically some observed behavior against whether the spec says it's allowed or not, which you can't do with those books. So this is an idiot-simple idea, but for some reason the industry has not taken it on board any time in the last five decades.
It's not that hard to do. These are also mathematical models, I'll come back to that later. But another thing I want to emphasize here is that in each of these cases, especially these first two, when you start looking really closely, then you learn that what you have to do is not build a nice clean mathematical model of something which is well understood. What you learn is that there are real open questions, for example, within the C Standards Committee and within ARM a few years ago about exactly what these things even were. And that is a bit disturbing, given that these are the foundations for all of our information infrastructure. There's also a lot of other work going on with other people within this REMS project on WebAssembly and JavaScript and file systems and other concurrency. Don't have time to talk about those, but Hannes is gonna talk a little bit about TCP later today. You should go to that talk too, if you like this one. If you don't like this one, you should still go to that talk. Okay, so this is doing somewhat better, certainly more rigorous engineering of specifications, but so far only for existing infrastructure. For this C language, ARM architecture, what have you? So what if we try and do better? What if we try and build better security in? So the architectures that we have, really they date back to the 1970s or even 60s with the idea of virtual memory. And that gives you, I don't need to go into what it is, but that gives you very coarse grain protection. You can have your separate processes isolated from each other and on a good day you might have separate browser tabs isolated from each other in some browsers sometimes, but it doesn't scale, right? It's just too expensive. You don't have lots of little compartments. One thing we could do, we can certainly design much better programming languages than C, but all of that legacy code, that's got a massive inertia, right?
So an obvious question is whether we can build in better security into the hardware that doesn't need some kind of massive pervasive change to all the software we ever wrote. And that's the question that a different large project, the CHERI project, has been addressing. So this is something that's been going on for the last eight years or so, mostly at SRI and at Cambridge, funded by DARPA and EPSRC and ARM and other people, mostly led by Robert Watson, Simon Moore, Peter Neumann and a cast of thousands and me a tiny bit. So don't do that. I should learn to stop pushing this button. It's very hard to not push the button. So here's the question in a more focused way that CHERI is asking, right? Can we build hardware support, which is both efficient enough and deployable enough that gives us both fine-grained memory protection to prevent that kind of bug that we were talking about earlier, or that kind of leak, at least, and also scalable compartmentalization. So you can take bits of untrusted code and put them in safe boxes and know that nothing bad is gonna get out, right? That's the question. The answer, the basic idea of the answer is pretty simple and it also dates back to the 70s, is to add hardware support for what people call capabilities, right? And we'll see that working in a few moments. The idea is that this will let programmers exercise the principle of least privilege, so with each little bit of code having only the permissions it needs to do its job, and also the principle of intentional use. So each little bit of code, when it uses one of those capabilities, will be required to use it explicitly rather than implicitly. So the intention of the code has to be made plain. So let's see how this works. So these capabilities, they're basically replacing some, or maybe all, of the pointers in your code. So if we take this example again, all right?
In ISO C, we had an address and in the standard, well, the standard is not very clear about this, but we're trying to make it more clear, an allocation ID, something like it. Now in CHERI C, we can run the same code, but we might represent this pointer, not just with that number at runtime, but with something that contains that number and the base of the original allocation and the length of the original allocation and some permissions. And having all of those in the pointer means that the hardware can do at runtime, at access time, a very efficient check that this is within this region of memory. And if it's not, it can fail stop and trap. No information leak. Then you need a bit more machinery to make this actually work, right? So it would be nice if all of these were 64 bit numbers, but then you get a 256 bit pointer and that's a bit expensive. So there's a clever compression scheme that squeezes all this down into 128 bits with enough accuracy. And then you need to design the instruction set of the CPU carefully to ensure that you can never forge one of these capabilities with a normal instruction. You can never just magic one up out of a bunch of bits that you happen to find lying around. So all the capability manipulating instructions, they're going to take a valid capability and construct a new capability, possibly smaller in memory extent, or possibly with fewer permissions. They're never going to grow the memory extent or add permissions. And when the hardware starts at hardware initialization time, the hardware will hand to the software a capability to everything and then as the operating system works and the linker works and the C language allocator works, these capabilities will be chopped up into ever finer and smaller pieces to be handed out to the right places. And then you need one more little thing. So this is C language world we're living in here or really a machine code language world.
So there's nothing stopping code writing bytes of stuff. So you have to somehow protect these capabilities against being forged by the code either on purpose or accidentally writing bytes over the top. You need to preserve their integrity. So that's done by adding, for each of these 128 bit sized and aligned units of memory, just one extra bit that holds a tag, and that tag records whether this memory holds a currently valid capability or not. So all the capability manipulating instructions, if they're given a valid capability with the tag set, then the result will still have the tag set. But if you write, say you just wrote that byte there, which might change the base, then the hardware will clear that tag. Now these tags, they're not conventionally addressable. They don't have addresses. They just sit there in the memory system of the hardware. So there is no way that software can futz with these. So this is really cool, right? We've taken what in ISO was undefined behavior in the standard and in implementations was an information leak and we've turned it into something that in CHERI is, in both the specification and the implementation, guaranteed to trap, to fail stop and not leak that information. So there's another few things about the design that I should mention. So I think all these blue things I think I've talked about, but then a lot of care has gone in to make this be very flexible, to make it possible to adopt it. So you can use capabilities for all pointers if you want to or just at some interfaces. You might, for example, use them at the kernel user space interface. It coexists very nicely with existing C and C++. You don't need to change the source code very much at all, and we'll see a tiny bit of data about this, to start using these. It coexists with the existing virtual memory machinery. So you can use both capabilities and virtual memory if you want or you can just use capabilities if you want or just use virtual memory if you want.
And then there's some more machinery which I'm not gonna talk about for doing this secure encapsulation stuff. What I've talked about so far is basically the fine-grained memory protection story. I should say the focus so far is on spatial memory safety. So that's not in the first instance protecting against reuse of an old capability in a bad way, but there are various schemes for supporting temporal memory safety with basically the same setup. Okay, so... So how do we... Okay, so this is some kind of a high-level idea. How do we know that it can be made to work? Well, the only way is to actually build it and see. And this has been a really interesting process, right? Because mostly when you're building something either in academia or in industry, you have to keep most of the parts fixed. I mean, you're not routinely changing both the processor and the operating system because that would just be too scary. But here we have been doing that and indeed we've really been doing a three-way co-design of building hardware and building and adapting software to run above it and also building these mathematical models all at the same time. This is, well, A, it's a lot of fun because you often get groups of people together that can do all of those things but B, it's really effective. So what we've produced then is an architecture specification, in fact, extending MIPS because it was conveniently free of IP concerns. And that specification is one of these mathematically rigorous things expressed in a domain-specific language specifically for writing instruction-set architecture specifications, and we've designed and implemented actually two of those. And then there are hardware processor implementations that actually run on FPGAs and a lot of hardware research devoted to making this go fast and be efficient despite these extra tags or whatnot. And then there's a big software stack adapted to run over this stuff.
So including Clang and a FreeBSD kernel and FreeBSD user space and WebKit and all kinds of pieces. So this is a very major evaluation effort that one is not normally able to do. This is why this combination of things is why that list of people up there was about 40 people. I say this is based on MIPS but the ideas are not specific to MIPS and there are ongoing research projects to see if this can be adapted to the ARM application-class architecture and the ARM microcontroller architecture, and to RISC-V. We'll see how that goes in due course. Okay, so then with all of these pieces we can evaluate whether it actually works. And that's still kind of an ongoing process but the data that we've got so far is pretty encouraging. So we see here first performance. So you see maybe a percent or two in many cases of runtime overhead. There are workloads where it performs badly. If you have something that's very pointer heavy then the extra pressure from those larger pointers will be a bit annoying but really it seems to be surprisingly good. Then is it something that can work without massive adaptation to the existing code? Well, in this port of the FreeBSD user space, so all the normal programs that run there, there was something like 20,000 files and only 200 of those needed a change. And that's been done by one or two people in not all that large an amount of time. Some of the other code that's more like an operating system needs a bit more invasive change but still it seems to be viable. Is it actually more secure? How are you gonna measure that? We like measuring things, and as a whole we try and do science where we can. Can we measure that? Not really, but it certainly does, the whole setup certainly does mitigate against quite a large family of known attacks. I mean if you try and run Heartbleed you'll get a trap. It will say trap. I think it will say signal 34 in fact. So that's pretty good.
And if you think about attacks, very many of them involve a chain of multiple flaws put together by an ingenious member of the C3 community or somebody else. Very often at least one of those is some kind of memory access permission flaw of some kind or other. Okay, so this is nice, but I was talking a lot in the first part of the talk about rigorous engineering. So here we've got formally defined definitions of the architecture and that's been really useful for testing against and for doing test generation on the basis of and doing specification coverage in an automated way, but surely we should be able to do more than that. So this formal definition of the architecture, what does it look like? So for each instruction of this CHERI-MIPS processor there's a definition, and that definition looks a bit like normal code. So here's the definition of how the instruction for capability increment cursor immediate goes. So this is a thing that takes a capability register and an immediate value and it tries to add something onto the cursor of that capability. And what you see here is something that looks like code, that defines exactly what happens and also defines exactly what happens if something goes wrong. You can see there's some kind of an exception case there, that's raising a processor exception. So it looks like code, but from this we can generate both reasonably high-performance emulators, in C and in OCaml, and also mathematical models in the theorem provers. So three of the main theorem provers in the world are called Isabelle and HOL and Coq and we generate definitions in all of those. So then we've got a mathematical definition of the architecture, then finally we can start stating some properties. So CHERI was designed with some kind of vague security goals in mind from the beginning, but now we can take, let's say, one of those goals and make it into a precise thing.
So this is a kind of a secure encapsulation kind of a thing, not exactly the memory leak that we were looking at before, sorry, the secret leak that we were looking at. So let's imagine some CHERI compartment, I haven't told you exactly what that means, but let's suppose that that's understood, and inside that compartment there's a piece of software running, and that might be maybe a video codec or a web browser, or maybe even a data structure implementation maybe, or a C++ class and all of its objects maybe. So imagine that compartment running, and imagine the initial capabilities that it has access to via registers and memory, and via all the capabilities that it has access to via the memory that it has access to, and so on recursively. So imagine all of those capabilities. So the theorem says no matter how this code executes, whatever it does, however compromised or buggy it was, in an execution up until any point at which it raises an exception or makes an inter-compartment call, it can't make any access which is not allowed by those initial capabilities, and you have to exclude exceptions and inter-compartment calls because by design they do introduce new capabilities into the execution. But until that point you get this guarantee, and this is something that we've proved, actually Kyndylan Nienhuis has proved, with a machine-checked proof, in fact in the Isabelle theorem prover. So this is a fact which is about as solidly known as any fact the human race knows. These provers, they're checking the mathematical proof in exceedingly great detail with great care and attention, and they're structured in such a way that the soundness of the prover depends only on some tiny little kernel. So conceivably cosmic rays hit the transistors at all of the wrong points, but basically we know this for sure. I emphasize this is a security property.
We have not proved, we certainly wouldn't claim to have proved, that CHERI is secure, and there are all kinds of other kinds of attack that this statement doesn't even address, but at least this one property we know it for real. So that's kind of comforting and not a thing that conventional computer science and engineering has been able to do on the whole. Okay, so have I been taming the chaos then? Well, no, sorry. But I've shown you two kinds of things. So the first was showing how we can do somewhat more rigorous engineering for that existing infrastructure. It's now feasible to do that. And in fact, I believe it has been feasible to build specifications which you can use as test oracles for many decades. We haven't needed any super fancy new tech for that. We've just needed to focus on that idea. And then, so that's for existing infrastructure, and then CHERI is a relatively lightweight change that does build in improved security. And we think it's demonstrably deployable at least in principle. Is it deployable in practice? Are we gonna have in all of our phones five years from now, will these all be CHERI-enabled? Well, I can't say. I think it is plausible that that should happen and we're exploring various routes that might mean that that happens and we'll see how that goes. Okay, so with that, I aim to have given you at least not a whole lot of hope, but just a little bit of hope that the world can be made a better place. And I encourage you to think about ways to do it because Lordy knows we need it. But maybe you can at least dream for a few moments of living in a better world. And with that, I thank you for your attention. Thanks very much. And with that, we'll head straight into the Q&A. We'll just start with mic number one, which I can see right now. Yeah, so thanks for the very encouraging talk. And I have a question about how to secure the part below that.
So the theorem that you stated assumes that the machine description matches the hardware, at least that it describes the hardware accurately. But are there any attempts to check that the hardware actually works like that? That there are no holes in the hardware, that some combination of machine commands work differently or allow memory access that is not in the model? So is it possible to use hardware designs and check that it matches the spec, and will Intel or AMD try to check their models against that spec? Okay, so the question is basically can we do provably correct hardware underneath this architectural abstraction? And the answer is, so it's a good question. People are working on that. And it can be done, I think, on at least a modest academic scale. Trying to do that in its full glory for an industrial superscalar processor design I think is right now out of reach. But it's certainly something one would like to be able to do. I think we will get there maybe in a decade or so. A decade is quick. Well, thanks. Mic number four is empty again, okay? So we'll take the internet. Where is the internet? The internet is right here and everywhere. So someone on the internet would like to know: is the effort and cost of building in security to hardware as you've described in your talk significantly greater or less than the effort and cost of porting software to more secure languages? So is the effort of building new hardware more or less than the cost of porting existing software to better languages? I think the answer has to be yes, it is less. And the difficulty with porting software is all of you and all of your colleagues and all of your friends and all of your enemies have been beavering away, writing lines of code for 50 years really, I don't know what to say, effectively, really numerously. There's an awful lot of it. There's a really terrifying amount of code out there. It's very hard to get people to rewrite it. Okay, mic two? Yeah.
So given that most of the problems that we can solve today are on the software level and not on the hardware level, is it possible to use your system during development as a kind of oracle, running the code base on the secure hardware to find bugs, but then ship essentially the same software on ordinary hardware? Right. So this question, if I paraphrase it, is: can we use this hardware implementation really as the perfect sanitizer? And I think there is scope for doing that. And you can even imagine running this not on actual silicon hardware, but in a QEMU implementation, which we have, in order to do that. And I think for that purpose it could already be a pretty effective bug-finding tool. I think that would be worth doing. But, as for any testing, it will only explore a very tiny fraction of the possible paths. So we would like to have it always on if we possibly can. Okay, the internet again. Okay, we would like to know: how does your technique compare to Intel MPX, which failed miserably, with the toolchains largely ignoring it? Will it be better or faster? Could you talk a little bit about that? For a real answer to that question, you would need one of my more hardwarey colleagues. What I can say is that they make nasty faces when you say MPX. Well, that's that. Mic number one. Somewhat inherent to your whole construction was this idea of having an unforgeable bit, which describes whether your pointer contents have been changed. Is this even something that can be done in software, or do you have to construct a whole extra RAM or something? So you can't do it exclusively in software, because you need to protect it. In the hardware implementations that my colleagues have built, it's done in a reasonably efficient way. It's cached and managed in clever ways to ensure that it's not terribly expensive to do.
But you do need slightly wider data paths in your cache protocol and things like that to manage it. Okay, mic seven. How well does the system secure control-flow integrity compared to other systems, like hardware-assisted data-flow isolation or code-pointer integrity on ARM now? Well, that's another question which is a bit hard to answer, because it depends somewhat on how you're using CHERI and how you're using the capabilities. So you can arrange here that each, say, C-language allocation is independently protected, and indeed you can also arrange, where this is sensible, that each sub-object is independently protected. And those protections are going to give you protection against trashing bits of the stack, because they'll restrict the capabilities for those objects. Okay, the internet again? Twitter would like to know: is it possible to run CHERI C without custom hardware? So can you run CHERI C without custom hardware? And the answer is you can run it in emulation above this QEMU. That works pretty fast, I mean, fast enough for debugging, as the earlier question was talking about. You can imagine, you know, somewhat faster emulations of that. It's not clear how much it's worth going down that route. I mean, the real potential big win is to have this be always on, and that's what we would like to have. Okay, mic one. Do you have some examples of the kinds of code changes that you need to the operating system and userland that you mentioned? So what kind of changes do you need to adapt code to CHERI? Well, if you look at C code, there is this C standard that deems a whole lot of things to be undefined behaviour, and then if you look at actual C code, you find that very much of it does depend on some idiom that the C standard does not approve of.
So there are, for example, quite a lot of instances of code that constructs a pointer by going more than one element out of bounds and then brings it back within range before using it. In fact, in CHERI C, many of those cases are allowed, but not all of them. More exotic are cases where there's really some crazy inter-object stuff going on, or where there is pointer arithmetic between objects, which the C standard is certainly quite down on, but which does happen. And there's code which is, say, manipulating the low-order bits of a pointer to store some data in them. That's pretty common. You can do it in CHERI C, but you might have to adapt the code a little bit. Other cases are more fundamental. Say, in operating system code, if you swap out a page to disk and then you swap it back in, then somehow the operating system has to reconstruct the capabilities from a larger capability which it kept for the purpose. So that needs a bit more work. So it's kind of a combination: some things you would regard as good changes to the code anyway, and others are more radical. Okay, another one from the internet? Last one. The internet has gone quiet! Bliss! Wow! Yes! Last question from the internet. Are there any plans to test the impact on Linux, or just FreeBSD? If anyone has a good number of spare minutes, then it would be lovely to try the impact on Linux, not just FreeBSD. The BSD code base is simpler and maybe cleaner, and my systems colleagues are already familiar with it, so it's the obvious target. And for an academic project, doing that is already a humongous amount of work. So I think we would love to know how Linux, and especially how Android, could be adapted, but it's not something that we have the resources to do at the moment. Mic number one again? How likely or dangerous... Your mic number one is not working. How likely or dangerous do you think it is for a programmer to screw up the capabilities they specify?
So, how often will the programmer screw up the capabilities? In many cases, the programmer is just doing an access with a C or C++ pointer or a C++ object in a normal way. They're not manually constructing these things, so a lot of that is going to just work. If you're building some particular secure encapsulation setup, you have to be a bit careful, but on the whole, I think this is a small risk compared to the state of software as we have it now. Okay, mic eight. So, existing memory-protection techniques are quite patchworky, like canaries and address space layout... Yes, address space layout randomization. Yes. Is this system designed to replace or supplement these measures? So, again, it depends a bit on how you use it. If you have capabilities really everywhere, then there's not much need for address space layout randomization or canaries to protect against explicit information leaks. I imagine that you might still want randomization to protect against some side-channel flows, but canaries not so much, and whether you actually do for side-channel flows, I don't know. That's a good question. Mic four, please. Thanks for the very interesting talk. So, you presented a system for verifying existing C software sort of post hoc and improving its security via runtime mechanisms. Improving or proving? Improving, yes. So, what's your opinion on the viability of using static analysis and static verification for that? Would it be possible to somehow analyse software that already exists and is written in these unsafe languages, and show that it nevertheless has some security properties, without writing all the proofs manually in, like, HOL or Isabelle? Well, if you want to analyse existing software, static analysis is already useful for finding bugs. If you want to have assurance from static analysis, that's really very tough.
So, you certainly wouldn't want to manually write proofs in terms of the definitional C semantics. You would need intermediate infrastructure. And if you look at the people who have done verified software, so verified compilers and verified hypervisors and verified operating systems, all of those are now possible on significant scales. But they use whole layers of proof and verification infrastructure for doing that. They're not writing proofs by hand for every little detail. And some of those verification methods either are, or you can imagine them being, basically like static analysis. You might use a static analysis to prove some relatively simple facts about the code and then stitch those together into a larger argument. Any kind of more complete thing, I think, is hard to imagine. These analysis tools mostly rely on making some approximation in order to be able to do their thing at all. So it's hard to get assurance out of them. Okay, mic one. You said you modified the MIPS architecture to add some logic to check the capabilities. Do you know what the cost of this is regarding computational power and power consumption? Sorry, can you repeat the last part of that? What's the cost of this modification regarding computational power and power consumption? So, what's the energy cost? I gave you a performance cost; I carefully didn't give you an energy cost estimate, because it's really hard to do in a scientifically credible way without making a more or less production superscalar implementation, and we are sadly not in a position to do that. Although if you have 10 or 20 million pounds, then I would be happy to accept it. Mic number seven, please. How does the class of problems that you can address with CHERI compare to the class of problems that are excluded by, for example, the Rust programming language? So, how does the class of problems addressed by CHERI relate to the class of problems excluded by Rust?
So, if you are happy to write all of your code in Rust without ever using the word unsafe, then maybe there would be no point in CHERI at all. Are you happy to do that? I think someone is shaking their head sideways. Mic number one. What do you think about the following thesis? We are building a whole system of things with artificial intelligence and the like that sits above this technical level and is creating another kind of chaos that isn't healed by these techniques. It's dreadful, isn't it? So, in fact some of my colleagues are interested in this question: you might well want to bound what your artificial intelligence can access or touch, and for that you might want this kind of technology. But with machine learning systems, we are intrinsically building things that we on the whole don't understand, and that will have edge cases that go wrong, and this is not speaking to that in any way. Okay, does the internet have a question? No. Good. I don't see anyone else, so let's conclude this. Thank you very much. Thank you.