Bryan is an associate professor with a joint appointment in the Electrical and Computer Engineering and Computer Science departments at Carnegie Mellon University, where he also got his PhD in 2010, going on to win the ACM Doctoral Dissertation Award that year. He then worked at Microsoft Research for six years before coming back to CMU. He's received numerous awards and accolades; to mention just a few, he's been listed in the Forbes 30 Under 30 in Science, and he's received best paper awards at the IEEE Symposium on Security and Privacy and the USENIX Symposium on Networked Systems Design and Implementation. Today he's going to talk to us about designing tools and methodologies to help us create software libraries that can be automatically verified, and so help our job of designing more secure systems. So, Bryan, thank you.

My name is Bryan Parno, and I'd like to thank all of the organizers for asking me to come out and talk to you today about some of the work we've been doing on formally verifying the security of critical software, particularly cryptographic software.

This work was motivated by looking at the HTTPS ecosystem, which, as many of you probably know, is a critical piece of our software infrastructure today. It's used by over 40% of all internet connections, and it's used not just for the web but also for VPNs, for email, for just about everything. And why is that? It's because we as security professionals tell people: don't invent your own crypto, don't invent your own protocols, you will inevitably screw it up. Instead you should use something that's well tested and understood, like HTTPS. Unfortunately, HTTPS is actually a very complex beast.
It's not just the HTTPS layer at the top: it's also the standards for certificates, X.509 and ASN.1; it's the TLS protocol itself; and then it's all the cryptographic algorithms and the low-level operations that underlie them. And if you look at it in terms of volume, there are over a hundred pages just informally specifying the TLS protocol. If you look at the implementations, OpenSSL is almost 300,000 lines of C and assembly code, and even BoringSSL, which is supposed to be a slimmed-down version, is still almost 200,000 lines of C and assembly.

So it probably shouldn't surprise us that historically we've seen bugs in every single one of these components, and these bugs range from low-level buffer overflows like Heartbleed all the way to fundamental problems with the protocol itself; even the standards themselves were flawed. And this is not just one implementation. I'm going to mention OpenSSL a lot, but it's not that OpenSSL is particularly bad; it's that these flaws affect everybody. And we don't actually see a plateau either: it's not that the software is generally getting better and sort of aging like wine.
It seems more like milk right now.

So, inspired by this problem, we started the Everest project, and our goal is to develop verified replacements for this entire ecosystem, so we can put a stop to this continual cascade of high-profile vulnerabilities that we see in practice. More concretely, our goals are to develop verified replacements for each one of these components, and furthermore to verify that the collection of components together gives you the abstraction that you want, namely a secure, authenticated channel to the remote party.

Now, we also don't want this to be just an academic project. We want it to actually achieve widespread deployment, to the point where it's a two-line change to your current deployment and you can start running our verified software. And so we want to make sure that we can interoperate with existing systems, we want to be standards-compliant with the latest versions, and we want to be fast. In fact, we want to be at least as fast, if not faster, than unverified implementations, so nobody has to choose between fast and secure; the goal is that they can have both.

Finally, we're developing all of this software using a number of verification tools, and along the way we're trying to make the verification tools easier to use and more trustworthy. Not just to make our lives easier, but so that somebody else can come along and say, "hey, you've done TLS and HTTPS, but I really care about QUIC or Kerberos or some other protocol," and these tools can see wider use as well.

Along the way there are a number of challenging research questions that we're trying to address. Things like: how do you assess whether a protocol is actually secure, especially when you have to run it alongside protocols like TLS 1.0 that we know are insecure? I mentioned performance before, so we want to know how we can get good performance despite designing our code to be verified. We want to handle advanced threats like side channels and other attacks that are
discussed at conferences like CHES. And of course we want to figure out ways in which we can instill confidence in these verification tools and develop an ecosystem where they can be sustainable: if we finish the project and then wander off, we want other people to be able to pick up these tools and keep moving forward with them, rather than having code that just bit-rots on the side.

So this is a very large-scale effort, primarily spread between different Microsoft Research labs, Inria in Paris, and now Carnegie Mellon as well. It's still an ongoing effort; we're partway through, and if this is an area that interests you, you're more than welcome to join in.

This gives you some sense of how far we've gone. We've devoted a lot of attention to cryptographic software and to the TLS protocol, as well as the parsing and handling of X.509 and ASN.1, and we are still working our way up the stack to the point where we'll address HTTPS itself. Along the way we've generated all kinds of academic publications, which is nice. We've also had a number of spin-offs. We've had some work on QUIC, which is a related protocol that came out of Google, and we've been working on verified reference implementations and actually informing the standardization process of TLS 1.3 itself. Our team, along with a number of academic groups, has been looking at the TLS 1.3 standard as it was being developed, and as a result it's one of the most thoroughly vetted standards we've ever seen. It's great that we can actually get ahead of the curve and start patching the standard before it goes out to the wider world, rather than five or ten years later.

We've also had some success deploying portions of our verified code in actual real-world software. It's being used at Microsoft and inside of a couple of blockchains; it's also being used by the Linux WireGuard VPN and Mozilla Firefox. So if there are any Firefox users out there, you've probably used some of our verified cryptographic software. So today I'd like to
focus on one portion of this overall project, namely EverCrypt, which is our verified cryptographic provider.

Some of you may be asking: why verify cryptographic software, particularly things like AES? It seems like it can't be that hard to get right; you run some test vectors, and if they come out the way you expect, everything should be good. But of course cryptographic software is critical, so bugs anywhere in that software are going to undermine the entire enterprise of developing secure software. And furthermore, we see a steady stream of vulnerabilities in this cryptographic software, so it's not like we've completely solved the problem or that we know how to write these algorithms effectively and securely.

If you look at the way these bug reports are written, you'll notice a couple of themes. First, a lot of these vulnerabilities only affect a specific platform: it's a bug in the 64-bit version but not the 32-bit version, or it's a bug in one of the advanced instruction sets like SSE or AVX and not some of the other flavors. People respond to this by saying, well, let's do some more random testing. But of course we all know that with these cryptographic algorithms, where you have gigantic inputs, we're never going to exhaustively test anything by exploring randomly. So we need something more systematic.

And of course, as you're all aware, we also have the problem of side channels. In some areas of security we like to brush side channels under the covers and say, oh, that's not really a big problem, we'll focus on other things. But when it comes to cryptographic algorithms, we've actually seen attacks based on side channels, and so our job of verifying and developing software is even harder, because we need to take these side channels into account.

So how do we wind up in this mess? Why do we see so many vulnerabilities year after year?
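To make the random-testing point from a moment ago concrete, here is a toy sketch (mine, not code from the talk) of the kind of carry-handling corner case these libraries have actually shipped: the buggy branch misbehaves only when one intermediate value lands on a single exact 32-bit pattern, so uniform random testing is all but guaranteed to miss it, while one directed test exposes it immediately.

```python
import random

MASK32 = 0xFFFFFFFF

def add64_buggy(a, b):
    """Toy 64-bit add built from two 32-bit limbs, with a deliberate carry bug."""
    lo = (a & MASK32) + (b & MASK32)
    # BUG: the carry test should be `lo > MASK32`; this version misfires
    # only when lo == 0xFFFFFFFF, i.e. on roughly 1 in 2**32 input pairs.
    carry = 1 if lo >= MASK32 else 0
    hi = (a >> 32) + (b >> 32) + carry
    return ((hi << 32) | (lo & MASK32)) & ((1 << 64) - 1)

def random_testing_misses_it(trials=100_000, seed=0):
    """Random testing: count mismatches against Python's exact arithmetic."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        a, b = rng.getrandbits(64), rng.getrandbits(64)
        if add64_buggy(a, b) != (a + b) & ((1 << 64) - 1):
            failures += 1
    return failures
```

Here `random_testing_misses_it()` comes back with zero failures even after 100,000 trials, yet the directed pair `(0xFFFFFFFF, 0)` yields `0x1FFFFFFFF` instead of `0xFFFFFFFF`. A proof of functional correctness rules out this whole class of bug at once, for every input.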
Well, if you take a look at the OpenSSL code (I don't know how well you can see this on the right), it's actually a mix of Perl, assembly, and C preprocessor macros, and it's customized for over 50 different hardware platforms. If you take a look at some of this code, you start to wonder: why are we trusting 40% of the internet's traffic to code that looks like this?

The common answer is that we want performance. The world wants performance; they want their code to go as fast as possible, and the cryptographic software is typically the bottleneck for a lot of these applications. And so we choose performance. If you look at OpenSSL and compare it to other popular open-source libraries, it usually meets or beats their performance, and it does that in part by taking advantage of platform-specific optimizations. AES-NI, for example, buys you at least a 4 to 8x improvement over a vanilla C implementation, or even a vanilla assembly implementation. And so we have this combination: the desire for performance leads us to create complex, ugly code, which then leads to vulnerabilities.

So, stepping back a level, what does an average programmer want when they reach out to grab a cryptographic provider, something like OpenSSL? Why are they turning to OpenSSL? Well, they want something that's usable: something written in C or assembly that they can easily plug into their project, not something written in some esoteric research language that they've never heard of. They want something that's comprehensive, right?
They don't want to have to go out and find: oh, this library will give me RSA, this library will give me AES. They want something that provides all their cryptographic needs in one place. Ideally, they'd also like something that does auto-configuration and multiplexing: if it's running on an Intel CPU that supports AES-NI, it should automatically choose an optimized version for that, and if it's not, it should fall back on some more generic implementation. And that should all happen under the covers; it shouldn't be the application developer's responsibility.

Ideally, we would also like to be moving towards cryptographic libraries that offer agility. That means you should have a single unified API for something like hashing, where maybe you specify which algorithm you want, but other than that the API is perfectly uniform. That should make it easy to switch to a new hash algorithm if we discover a flaw in SHA-256, rather than taking 5 or 10 years to migrate away from SHA-1, as we saw in the past.

From a research perspective, we have a slightly different set of criteria. Nobody actually wants to verify C or assembly, because it's a mess; we'd much rather verify languages that are designed for verification. We'd also like to focus on programmer productivity, right?
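The agility idea from a moment ago, a single unified hashing API where you just name the algorithm, can be sketched in a few lines of Python. This is my illustration of the concept, not EverCrypt's actual API; the standard `hashlib` module stands in for the underlying verified implementations.

```python
import hashlib

# One uniform entry point: the algorithm is just a parameter, so moving a
# client from SHA-1 to SHA-256 is a one-argument change, not a migration.
SUPPORTED = ("sha1", "sha256", "sha384", "sha512")

def agile_hash(alg: str, data: bytes) -> bytes:
    """Hash `data` under `alg`; callers never touch algorithm-specific state."""
    if alg not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {alg}")
    # Dispatch to the right implementation under the covers.
    return hashlib.new(alg, data).digest()

# Switching algorithms is just a different first argument:
d1 = agile_hash("sha1", b"hello")     # 20-byte digest
d2 = agile_hash("sha256", b"hello")   # 32-byte digest
```

The point is that the caller's code shape never changes; only the algorithm name does, which is what makes migrating away from a broken hash cheap.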
We're all busy people; we'd like to program as efficiently as possible and not spend a lot of time customizing for each individual platform. Of course, we'd also like auto-configuration, because it's very embarrassing if you have verified software, and you hand it to somebody, and the first thing they see is a blue screen because it used an illegal instruction. Finally, we'd like to have deep integration, meaning that even though we might have multiple implementations, one written for AES-NI and one written in C, they should both verify against the exact same specification, so that the client can be blissfully ignorant of what's happening underneath. In particular, we can enforce that through abstraction: we say, we're not going to tell you what's happening inside, we're just going to promise you that we've correctly implemented this crypto, and you don't need to worry about which particular implementation or what kind of optimizations we've done internally.

So ultimately our goal with EverCrypt is to produce a comprehensive verification result without giving up on performance; we want to give you the best of both worlds.

Looking at it in a little more detail: at a high level, EverCrypt is designed to serve both unverified C clients, so you can use the crypto directly, and higher-level verified software. So we want to be able to have a verified TLS layer run on top of EverCrypt, as well as other clients like a Merkle tree library. Internally, EverCrypt is written in a combination of verified C-like code and verified assembly; the assembly lets us go fast, and the C code gives us a generic fallback. Overall, EverCrypt aims to offer features like the agility I talked about before, to make it easy to switch to new algorithms; multiplexing, so that we can optimize for individual platforms; and abstraction, both to allow us to change things in the future and to hide extraneous details from verified clients that come along later.

To give you a sense of what
we've achieved so far: I think we feel fairly good about claiming that EverCrypt is comprehensive. It offers a wide variety of functionality, from authenticated encryption to hashing, message authentication, symmetric crypto, key derivation, and other popular crypto functionality. And we've got both generic implementations in C that should run on just about any platform, as well as targeted optimizations for hardware-specific features.

So that gives you a broad view of the project. In the next couple of slides I'd like to step through how we verify each layer, starting at assembly, working our way up through C and cryptographic constructions, all the way up to the applications that might use something like EverCrypt.

I think I've alluded to these requirements before, but when we're developing cryptographic software we want several properties. First, we'd like it to be correct: if we think we're implementing AES, or we think we're implementing ECC, it should really do that and nothing else. People in formal methods refer to that as functionally correct; it's actually implementing the function that we intended and not something else. We'd also like it to be secure. By that I mean that it follows the intended control flow, and that it does not inadvertently leak information about our secrets, say through timing, or through leaving state lying around after the program finishes executing. And finally, we want it to be fast; we want to be able to take advantage of both platform-agnostic and platform-specific optimizations.

Historically it's been difficult to meet all three of these requirements, and so we've wound up with implementations that are either fast but unverified, like OpenSSL, or verified but slow. There have been a number of verification experts in the academic community, and many of them have had to give up on performance in order to get the verification results to go through. Just as an example, if we look at SHA-256, there have
been a couple of efforts to verify the correctness of implementations, some of them even taken from OpenSSL, but typically taken from the slower variants in OpenSSL. So the result has been something that leaves a substantial performance gap compared to unverified code. And historically, the world cares more about performance than about the security of their software, particularly when you're talking about 2 to 5 to 10x overheads.

If you look at how OpenSSL achieves this speed and start to dig into the details, you'll notice that it's actually mixing assembly and Perl: it builds up a giant string inside of Perl representing the assembly code, so it emits instructions based on the values of Perl variables, and then it also embeds C preprocessor macros to customize based on which platform you're running on. So in this case, for a particular variety of ARM, it's going to emit some instructions, and in other cases it won't. It also uses macros to specialize the code: in the 15th round of this particular algorithm it's going to emit some instructions, and in the other rounds it won't.

And there's more complexity. There's a Perl for loop in order to do loop unrolling: it's much more succinct to write it as a for loop, but the actual emitted instructions are going to be unrolled so you can save a register. Furthermore, it assigns register names based on Perl variables, and then it uses some Perl tricks to shuffle the names of those variables so that you can avoid extra moves during the SHA algorithm. These are all fancy tricks. They also do some things where they actually take the string that's emitted by Perl and interpret it with another Perl function in order to do some mathematical operations inside.

So this is not the kind of thing that's designed to make you feel warm and fuzzy about the security of the software. I'm actually amazed that they get this code right, and I have a great deal of respect
for the people who develop software this way, but I think their lives are very hard. And the result is something that looks like this. If you are the developer of this software, maybe you have some chance of understanding it, but for most of the rest of us, it's actually very hard to look at this and convince yourself even of what algorithm it's implementing, let alone that it's correct, or to debug it and ultimately prove that it's doing the right thing.

So we developed a tool called Vale that's designed to give you a firmer foundation for developing code like this. In particular, the goal is to give you a flexible framework that can target multiple architectures and can prove that the software is correct and secure, without giving up on performance. How do we do that? Vale supports flexible syntax that allows you to adapt the tool to different platforms; the tool itself is agnostic about the platform, so you can target any platform you choose. It's high-performance in the sense that we can actually generate code that's identical to OpenSSL's, so we can match their performance, and then we can tweak it to go further and in some cases actually exceed their performance. And finally, it's high-assurance, because we can generate formal proofs about all the code that we write in Vale and check that they're actually correct, using a machine to do the checking for us.

So let me give you some examples of constructs in the Vale language. At the base we have standard assembly instructions: you can define what it means to do a move on your architecture, what it means to do a shift, or something more complicated, like the AES key generation assist instruction here. We also use structured control flow, since that seems to be what most cryptographic implementations go with, and it makes the verification process easier. And finally, we have optimization constructs that allow us to play some of the same tricks that OpenSSL was doing, but in a more principled way. So let's
take a look at one of those examples. We have the notion of an inline if statement: an if statement that's evaluated not at execution time, but while we're actually generating the code, while we're emitting the assembly that will then be assembled into the executable. So for example, we can check which platform we're emitting code for: we can emit different code for plain x86 instructions or for platforms that support advanced instructions. We can also use it to do loop unrolling, similar to what OpenSSL was doing, but in a way that allows us to verify the loop before we do the unrolling.

Let's look at a concrete example. To emit an arbitrary number of add instructions, we can write that as a single recursive loop here and invoke it with, say, 100. When we do that, it's going to generate an abstract syntax tree (AST) that consists of 100 add instructions, and that will then produce 100 add instructions to be processed by our assembler. The nice thing is that, as a developer, you can write four or five lines and verify just those four or five lines, and then underneath it's provably sound to unroll that into this large, hopefully more performant, executable code.

So that gives us fast code. It still leaves the question of how we prove that it's actually correct. To do that, we developed the Vale tool and the Vale language that you write your cryptographic implementation in; it should look fairly familiar to standard assembly programmers. The tool is going to produce an AST representing the program, and it's going to automatically produce a number of lemmas about properties of that program. The programmer will also hand-write some lemmas to help the verification tool with some of the hard bits. And we're going to feed all of that into a proof assistant, along with the specification of the cryptographic algorithm: we'll say, "here's a succinct description of what it means to compute SHA; please check that this AST actually computes SHA." And
the proof assistant is going to answer us, either yes or no. The assistant we're using is sound, meaning that it will never accept a program that is incorrect, but it's incomplete, and that's why we have to provide these extra lemmas to help it along, to show that our hopefully correct code is actually correct. Of course, to prove that our implementation is correct, we have to have some semantics for the machine we're going to run on; we have to know what it means to do an add instruction on this particular architecture. So we have to develop semantics for the architectures that we're targeting and feed those into the proof assistant as well.

The actual verifier we're using at present is called F*. It's a functional language that looks kind of like ML, but enriched to support general proofs. It's based on the Z3 solver, it's automated, and it's able to discharge a lot of the low-level proof steps automatically, so we only help it with the higher-level proof steps. We also have a backend for Dafny, which is a verification language targeted more at C# or Java developers, and in theory we could target other backends as well, like Coq or Isabelle.

Once we've verified our program and the verifier has signed off, we have a trusted printer that's going to print the actual assembly code, and then an assembler, either something from the GNU toolchain or something from the Microsoft toolchain, can process that and produce the executable we're going to run.

At a high level, we can divide this into various levels of trust. We carefully designed the system so that we don't actually trust the Vale tool: the tool itself, and the code and the lemmas that the programmers write, are all untrusted. If there's a bug anywhere in there, it'll be found when we feed it into the verifier. Of course, we have to trust that we got the semantics of the platform right: if we say that an add instruction actually does subtraction, then obviously when you actually execute the program we're
going to get a different result. We have to trust the cryptographic specification; we have to trust that we correctly took the RFC for SHA and translated it into something machine-readable. And we trust the assembly printer. Overall, the trusted computing base is relatively small, and it can be used to verify arbitrary amounts of cryptographic code on top of it.

When I say that we're verifying software, that often sounds somewhat esoteric; it sounds like something that monks do in some exotic village, and not something that anybody would do in their day-to-day life. So I think it helps to demystify the process if I show you a small example. This is the language Dafny that I mentioned earlier. It's designed to look kind of like C# or Java, and I'm using it because it's nice for demo purposes, but the experience of verifying assembly looks fairly similar.

Here, hopefully you can see that we've got a small procedure where we're hoping to duplicate an array: we're taking an array of integers and a length, and we're going to copy it over. It's pretty straightforward. Just like in Java or C#, we're going to start by allocating an output: we say that we return an output array, and we allocate that output array. In the background, the verifier is running and automatically checking whether we've made any obvious mistakes. You can see here it's highlighted the fact that we might have just tried to allocate an array using a negative value. Length is an int, so this is clearly an illegal operation, and it might result in undefined behavior. So, unlike in a standard language, where you might add a dynamic check to say, "hey, if length is non-negative then do the allocation, otherwise do something else," in Dafny and other verification languages you can actually add a precondition. Up here we can say that we require the caller to prove that length is non-negative. That is going to become an obligation on the caller, but given that fact, we can
actually prove that it's safe to do this allocation. And that's not something that will be checked at runtime; it's something that's statically checked by the tool.

Now that we can allocate our array, we can write a standard little loop here, say while i is less than length. And because I've been doing this for a while, I know that Dafny needs a little bit of help keeping track of where this index variable is, so I'm going to add a loop invariant to keep track of what we're doing at each stage: I'm going to say that output at i is equal to input at i. (These typos are convincing you this is a live demo.) And again, in the background, more or less in real time, we're getting feedback about the correctness or incorrectness of this code. So we can take a look and see: in this case, Dafny is worried that we might have just dereferenced a null pointer. That's a very common way to make an error in code, and so again, rather than adding a dynamic check, I can add another precondition that says, "please don't call us with a null input." So it's the caller's responsibility to filter out nulls, and that gets rid of that problem.

The next problem is a little more subtle: it says that we might have an index out of range. Some of you may be wondering: that seems a little odd. We said that i starts at zero, we loop until we get up to length, and we're indexing into input. But of course the problem is that, while I called this "input" and I called this "length," there's nothing binding the two. I had something in mind when I wrote that, but there's nothing actually connecting them, until I add one more requirement that says that the actual length the program statically knows the array to have is equal to length. Given that, we now know that this is a safe access, and the program is happy.

Finally, there's one other problem here, which is that Dafny isn't convinced that this program is going to terminate, and that's a property that we probably want from our cryptographic code. And so here I've
forgotten to add the increment. There we go.

Okay, so now we've proven that we don't have any obvious bugs in the program, but we haven't actually said anything about what this procedure is doing. Presumably the person calling this has some expectations about what it'll actually do, and to provide those properties, we need to add an ensures clause to say: here's what we're going to return, and here are the properties of what we're returning. In this case, we want to ensure that the output array is equal to the input array. Unfortunately, Dafny can't see that that's true, and this is a case illustrating the incompleteness of the tool: this is perfectly correct software, but Dafny can't see that on its own. So we have to give it one more little bit of help, which is to add one more invariant that tells it what we've been doing in this loop at each iteration. I'm going to say that for all j between zero and the value we're about to process, we've correctly done the copy. So we're kind of keeping track of our progress through the course of this loop. This invariant is untrusted: Dafny will verify that it's true when we get to the while loop, it'll then verify that the loop preserves the invariant, and then it'll attempt to use that invariant to prove the postcondition. In this case, it goes through.

So hopefully this gives you some sense of what it's like to develop verified software. You're writing software kind of like you normally would, except that, as you go along, you're adding pre- and postconditions, either to prove that you're not making mistakes, or to provide properties to the caller, so that you can ultimately prove some high-level property. And now, back to our regularly scheduled program here.

Hopefully that gives you a sense of how we prove that our software is correct. But correctness is not quite the same as security. For security, we want to talk about information leakage. In particular, we don't want secrets leaking, either through digital side
channels, so through the timing the program takes or through the memory accesses that it makes, nor do we want to leave our secrets lying around in memory; after the program executes, we'd like to leave memory in a nice clean state.

When I say digital side channels, what I mean is: if the cryptographic program takes a public input and a secret input, and we allow the adversary to see some side-channel observations, there should be no correlation with the secret input. More formally, we can say that if we imagine two runs of the program where the public inputs are held constant, so you get the same public input, and we give two different secret inputs, then the observations that the adversary sees should be indistinguishable. If that's the case, then he can't have any idea which secret we're using, and so we can infer that our secrets are actually being protected. So, more formally: for all pairs of secrets S1 and S2, and for all possible public values P, the observations you get from running the program on inputs P and S1 should be equal to the observations you get running it on P and S2.

So what does that look like in our general framework? I told you that we're specifying hardware. For functional correctness, we say that the hardware consists of some number of cores, a flat array of memory, and some I/O state. The cores have registers; they have segments and paging and other gory details. And then we're also specifying instructions: for example, we say that the add instruction, if it takes two registers, is going to read the two values, do addition modulo 2^32, and store the result back in one of the registers.

Now, to talk about information leakage, we need to augment that with additional information. In particular, we're going to add a trace field to the state to represent all the things that the adversary might have observed as we execute the program, and then we're going to expand our semantics to add things to this trace. In particular, every time we hit a branch,
we're going to record which branch we took, either the true branch or the false branch. And every time we have an instruction that reads or writes memory, we're going to add the address to the trace. So you can think of this trace growing and growing as the program executes, and what we have to prove is that it's independent of the secrets the program might be processing. In particular, we're going to take this hardware specification and prove that we meet the non-interference leakage specification I showed you earlier, and we want to do that for the cryptographic code that we've developed.

Of course, we're going to develop a whole bunch of cryptographic code, and that starts to sound a little painful: we'd have to do this complicated non-interference proof for every cryptographic algorithm we develop, and we're lazy people. So instead of doing that, we actually wrote an analyzer program in our verifier tool, in this case F*, that takes in an AST representing the program and outputs yes or no, saying either this is leaking secrets or it's not. We then write a proof that this analyzer is sound, and that proof is checked against the leakage specification I showed you earlier. Given that proof, we can trust the output of the analyzer: if it says the program doesn't leak, then we're in good shape. The proof effort is one-time, so we can run the analyzer over multiple implementations of different cryptographic algorithms. The specification is trusted, but it's very succinct; it more or less fits on one slide in math, and it's not too much worse when we make it mechanically verifiable. So now that we've got this verified leakage analysis, we can run all of our different cryptographic algorithms through it and amortize the effort we put into that one-time proof.

Now, when you're doing this kind of leakage analysis, historically there's been a major challenge, which is aliasing. Consider this
Suppose we have a program that stores a zero to the address held in RBX, and then later we read from the address stored in RBX and stick it into RCX. In between, I might insert an additional store of the value ten to the address in RAX. Okay, so then if I ask you what value RCX has, does it have a zero or does it have a ten? The only way you can answer that is if you know whether RBX and RAX are aliased, whether they overlap. Right. Okay, so this is a historical problem, and historically people have chosen different compromises along the way. One option is to do the analysis in some higher-level, simpler language where it's easier to see where the aliases happen. Unfortunately, historically we've seen that compilers are prone to inserting side channels as they do compilation; most of them are not designed with side channels in mind, and so that seems a little bit dicey. Another option is to implement pointer analysis; there's loads of literature on doing pointer analysis. But those techniques are inherently imprecise, so you wind up rejecting programs that you might otherwise accept. Or you could go the other way and say, well, you're probably not doing any aliasing, and we'll just assume that you're not. Unfortunately, cryptographic software, including the OpenSSL cryptographic software that we're trying to emulate, actually does use aliasing all over the place, so this would not be a sound assumption. Fortunately, with Vale we're able to take a very different approach, based on the fact that we're already doing a bunch of work for functional verification, and we figured out a way to leverage that effort and turn it to the purposes of leakage analysis. In particular, if you're doing functional verification, say you want to prove the output is going to have a particular value, you already have to prove the fact that RAX and RBX are different; otherwise you won't be able to talk about the output in a sensible way. And so what we do is we have the developer, as they're writing the software, actually add annotations.
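The aliasing puzzle above (does RCX end up holding zero or ten?) comes down to whether the two registers hold overlapping addresses. A toy sketch, with the addresses as explicit parameters:

```python
def final_rcx(rbx_addr: int, rax_addr: int) -> int:
    """Store 0 to [RBX], store 10 to [RAX], then load [RBX] into RCX.
    The result depends entirely on whether RAX and RBX alias."""
    mem = {}
    mem[rbx_addr] = 0     # store 0 to the address in RBX
    mem[rax_addr] = 10    # the extra store, to the address in RAX
    return mem[rbx_addr]  # load from [RBX] into RCX
```

`final_rcx(100, 200)` gives 0, while `final_rcx(100, 100)` gives 10: a sound analysis has to know which case holds, and that is exactly the fact a functional-correctness proof already establishes.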
For a given store, the annotation says: I believe this is storing a secret value; for a given load: I believe this is loading a public value. Those annotations are checked along with all the other functional verification, and once they've been checked, they can be soundly used as part of the analysis. So that gives us a system that gives us fast, secure, and correct software. We've used it on a variety of implementations, just to prove the diversity and to support the comprehensiveness of EverCrypt. So we've used it for things on ARM and on x64, and along the way we've actually found some vulnerabilities in the OpenSSL code, which we reported back upstream. Some of the key lessons we took away: as we ported these algorithms to different platforms, the lemmas and other proofs that we developed for one platform actually translated fairly nicely to other platforms, so we were able to amortize our effort. And it actually was non-trivial; one of the most non-trivial parts was understanding the invariants that the OpenSSL developers had in their heads and that unfortunately are not terribly well documented. Along the way, we were also able to leverage our automation to actually get a lot of those optimization proofs through correctly without a lot of work from the developers. In particular, developing the Vale tool and the analysis around it was fairly intensive, but most of the implementations, particularly the ones that were ports from other platforms, were fairly straightforward. So to give you a quick summary: Vale allows us to produce correct, fast, and secure software. It's very flexible and allows us to match the optimizations that are used in popular, fast, unverified systems, and it allows us to do more exotic verified analyses, such as the leakage analysis I showed you.

Okay, so that's the assembly code. Let's quickly look through the other parts of EverCrypt. In particular, we don't want to write everything in assembly; it'd be nice to write the easy parts in C, and also to write the parts that are going to be generic across
platforms in C. So we do that by developing in a fragment of F*, an imperative-looking fragment that we can actually extract to C code, and verify that against the same high-level specifications that we've already written in F*. Those high-level specifications themselves can be extracted to OCaml, and we can use that to test the specifications for obvious errors that we may have introduced along the way. So for example, you might write SHA looking something like this; if you squint a little, and if you've done some functional programming before, this shouldn't look too unfamiliar. Here we're computing some bitwise operation on X, Y, and Z, and we're pulling values out of the hash state, and each one of these lines corresponds pretty directly to the C code that gets emitted on the other side. Of course, this raises a lot of problems if we're interoperating between C and assembly: they have different memory models, different ways of thinking about memory; they have different calling conventions when you're calling from C into assembly, depending on what platform, what OS, and what compiler you're using; and of course there are different ways of reasoning about side channels. You can see our paper for the various solutions that resolve this tension and allow us to soundly interoperate between these two pieces.

So of course, that tells you how we're writing low-level cryptographic primitives, but ultimately we want to assemble these into cryptographic constructions. So let me give you a flavor of what we're doing, taken from the TLS record layer. Here we're going to be verifying the property of stream encryption, so we need to have a formal definition for what that means. Hopefully everybody remembers: we're going to take some plaintext message, we're going to fragment it into pieces that will fit into network packets, each of those fragments will be encrypted, we'll send the ciphertext over, and the other side is going to decrypt and reassemble some prefix
of that message, because some fragments may be lost or may not have arrived yet. The property we would like is defined by an ideal log: when you do encryption, we stick the plaintext into the log and we randomly sample the ciphertext, and that's what actually gets sent on the wire; and then decryption just consists of looking in the table, finding one of those values, and returning it to the caller. And so what does it mean for this to be secure? It means that an adversary should not be able to distinguish between the real and the ideal, except with some negligible bound. But of course, the implementation we have in practice is much more complex, with all these different packets being fragmented; it's a very hairy piece of code, and yet we want to verify it against that high-level, simple description. The way we do that is that we assume we have a block cipher and we model it as a PRF, and we formally specify what that means, and then we go through a number of steps to prove that we actually meet that high-level cryptographic definition. We prove functional correctness, and we actually prove the cryptographic soundness of the implementation itself, in a way that looks very similar to a standard cryptographic proof, except that it's about the implementation rather than about a paper description of the system. And this involves not just the kinds of proofs I've showed you earlier, but also proofs about memory safety, proofs about injectivity of the way in which we handle messages, all kinds of intermediate theorems that we have to prove in order to get these high-level properties. One of the nice things that comes out of this is that we can actually get very concrete bounds, for particular implementations, for particular constructions, for the code that's actually going to execute. So we can say, for this implementation, here's exactly when you need to start rekeying your connection: it's only safe up to this number of messages.
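The ideal functionality described above is small enough to sketch directly. This is a toy Python model of the log-based ideal world, purely illustrative of the definition, not the F* development:

```python
import os

class IdealStreamEncryption:
    """Ideal world: encryption logs the plaintext and returns a uniformly
    random ciphertext; decryption is just a lookup in that log."""
    def __init__(self):
        self.log = {}                             # ciphertext -> plaintext

    def encrypt(self, plaintext: bytes) -> bytes:
        ciphertext = os.urandom(len(plaintext))   # sampled, not computed
        self.log[ciphertext] = plaintext
        return ciphertext

    def decrypt(self, ciphertext: bytes):
        # Unknown ciphertexts decrypt to nothing in the ideal world.
        return self.log.get(ciphertext)
```

Security then says that no adversary can tell this object apart from the real fragment-and-encrypt implementation, except with negligible probability; that is the statement the implementation is verified against.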
So once we're given all these cryptographic constructions, we need to stitch them together in a way that's going to offer a clean, agile interface to people who are developing software on top. So how do we do that? One way is through abstraction. When you're using these automated tools, automation does really well when you give it just the right information; if you give it too much, everything bogs down and development gets rather miserable. So we actually wind up abstracting not just the implementations, but even the specifications themselves. At a high level, the caller only cares that, say, I implemented a compression function; they don't care about all the details of exactly how SHA is going to operate, so we abstract at that level. We also use generic programming all over the place. I showed you this example of compress earlier that takes in an algorithm and has state for that algorithm. Unfortunately, that's not something you can easily extract to C. Why not? Because the internal state of the SHA function actually depends on which algorithm you're doing: for some SHA algorithms you have 32-bit state, for other algorithms you have 64-bit state. In C you could in theory do that via a union, but that's going to be kind of ugly code, and it's not going to be very efficient. So instead what we do is we rely on partial evaluation: before we emit code, we actually evaluate it for individual algorithms. That is efficient, and it can be extracted to idiomatic-looking C. Finally, internally we actually do a variety of multiplexing steps, so that we can choose between an optimized implementation done in Vale and the generic version in C. And that choice can be based both on static configuration, so you can statically choose and say, I don't want to use assembly, maybe because you're doing testing, and on dynamic discovery of CPU features using things like CPUID. Fortunately, despite all the complexity happening underneath, we have a single agile specification that we export to clients. So all of the cryptographic code, whether it's written in C or
assembly or a mix of the two, is verified against the same exact specification, and so the caller knows that, byte for byte, the output is going to be exactly the same. They don't have to care about what's happening internally. So let's very briefly touch on what we can do with software written this way. We've developed a number of higher-level constructs: things like HMAC, HKDF, Merkle trees, and a portion of QUIC that does transport encryption. The nice thing about this is that each one of these can in turn offer an agile, abstracted interface to its own clients. So the Merkle trees, for example, can be instantiated with any hash function that we've implemented, and it's a single-line change to say I want to do SHA-2 or I want to do SHA-3, and the entire Merkle tree implementation still continues to function. The actual Merkle tree implementation is an optimized version designed to be incrementally constructed, so you only have to do on average one hash computation to extend the tree, and we keep a whole bunch of the recent nodes in memory so that it can be even faster. Despite this additional complexity for performance reasons, we can actually prove that it's implementing a Merkle tree, and give a cryptographic proof showing that if you find a collision on the Merkle tree, we can reduce that to a collision on the underlying hash function, which is exactly the property you'd like to have. And this is a non-trivial property: the Bitcoin implementation of Merkle trees like this actually had a flaw that sat in the implementation for about three years before anybody noticed.

So I've talked a lot about performance, and I'd like to give you some more specifics about what that means, but the high-level takeaway is that EverCrypt is able to match or exceed the performance of the best implementations out there, either verified or unverified. As an example, here's some performance for SHA-256; the first two bars are comparing
OpenSSL's and EverCrypt's portable implementations; these are the ones that will run on any platform, and here they're fairly close. But once we get to the point where we're targeting specific platforms, you can see that we're at exact parity with OpenSSL across all of these different sizes. A similar story holds for authenticated encryption: here we're looking at the targeted implementation from EverCrypt in blue and the targeted implementation from OpenSSL, and you can see that we can actually beat OpenSSL's AES-GCM, and also their ChaCha20-Poly1305 implementation, and we're at the point of sub-cycle latencies per byte on these implementations. So we don't have to say, oh, we're sorry, we verified the software but it's really slow; it's actually some of the fastest software out there. As another example, if we look at Curve25519, a popular elliptic curve these days, there's a variety of implementations in C and assembly: some are verified, shown in green, some are unverified, shown in red. And you can see that our portable implementation in C is actually beating the existing C implementations, as well as some of the assembly implementations, and our assembly implementation, which is using some of the latest instructions from Intel, beats the other implementations out there that we're aware of and that we've tested. So we no longer have to choose between performance and security; we can actually get both using techniques like this. And this also translates into good performance for the applications that we build on top of it. So this is showing that we can get over 2.7 million insertions per second into these incremental Merkle trees, and that performance is consistent as the tree grows, so we really are getting this incremental property, and you can see how that compares against Bitcoin's implementation. So again, these are not just some micro-benchmarks; it's actually translating into real-world performance at the application level.
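A minimal, non-incremental Merkle root makes the agility point concrete: the hash really is a one-line choice. This Python sketch is illustrative only, and its duplicate-last-node rule for odd levels is the same design detail behind the Bitcoin flaw mentioned earlier:

```python
import hashlib

def merkle_root(leaves, hash_name="sha256"):
    """Merkle root of a list of byte-string leaves; swapping the hash
    algorithm is a single-argument change (e.g. "sha3_256")."""
    h = lambda data: hashlib.new(hash_name, data).digest()
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                # odd level: duplicate the last node
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Note that duplicating the last node makes `[a, b, c]` and `[a, b, c, c]` hash to the same root, exactly the kind of subtle malleability that a collision-reduction proof forces you to confront.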
So just to summarize: I've argued that cryptographic software needs to be both fast and secure if we want it to be used in the real world without a steady stream of vulnerabilities. We've developed a number of new tools and techniques that allow us to realize this desire in practice, and with EverCrypt we can actually offer an agile, multiplexed, high-performance cryptographic provider to the world. We're hoping that the larger Everest project will showcase the power of verification and its applicability to real-world software. And with that, all of our tools and software are available; they're all open source, so you're welcome to check it out. I'd like to say thank you for your attention. There is time for questions. So, do you expect that your implementation, for example on Curve25519, will stay the fastest practical implementation? Yeah, so that's a good question, and no, I don't, because for all these algorithms people are innovating and improving all the time. So I think the important thing is not so much that, if you take a snapshot at any given moment, at the moment we happen to be the fastest; I don't think that's the important part. I think the important part is that we've shown that these tools are able to adapt and pick up these clever constructions as they appear, either academically or out in the OpenSSL world, port them into something that's verified, and prove that they actually got it right. So, I mean, wouldn't it make sense to be able to take the output, I mean, the output assembly language, for something like a given program like Curve25519, and just prove the output result? Yeah, so one option would be to take the assembly that OpenSSL emits and say, okay, there's 20,000 instructions here, we're going to just verify it in one shot.
We found that, as a developer, it's much more pleasant to develop it in a modular fashion, to verify, say, the loop before it gets unrolled rather than after, and that seems to be, at least in our experience, a friendlier, more scalable way of doing the verification. Okay, thank you. Sure, thank you. So, very nice talk, thank you. I was wondering: when you're verifying the correctness of the assembly programs, you need to have a specification somewhere of the semantics of each assembly instruction that is being used. Yes. And there's, well, lots of them, and some of them are really, really complex, so I was wondering, where do they come from? Are you sharing that code base with other projects, or are you handwriting all of those semantics? Yeah, that's a great question. I mean, if you look at the Intel manuals, they stack up at least this high. Fortunately, what we found is that we don't actually need that many assembly instructions for the kind of cryptographic software we're doing, so we've only specified something on the order of 50 or 60 assembly instructions, and that covers about all the algorithms I showed you, and over time we gradually expand that as needed. We're able to do that in part because, if you look at something like XOR, Intel specifies, I think, 22 different flavors of XOR; we specify one, and so we just say you can't use the other 21 varieties, and that's enough to make progress. In terms of complexity, yes, some of the instructions get complex, but we are able to avoid some of that complexity, again, by just saying, okay, you're going to use the instruction in this particular way, or by writing some of the properties in a more functional style, which gets us some benefits as well. But so far, at least, all of our specifications have been handwritten.
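To give a feel for what a hand-written instruction semantics looks like, here is a toy Python rendering; the real specifications are written in Vale/F*, and the single register-register XOR entry mirrors the point above about specifying one of Intel's many flavors:

```python
# Toy instruction-semantics table: one entry per supported instruction,
# each mapping operand values to (result, flags). Illustrative only.

MASK32 = 0xFFFFFFFF

def sem_add(dst: int, src: int):
    """add r32, r32: addition modulo 2^32; carry flag set on overflow."""
    total = dst + src
    return total & MASK32, {"cf": total > MASK32}

def sem_xor(dst: int, src: int):
    """xor r32, r32: the one XOR flavor we choose to specify."""
    return (dst ^ src) & MASK32, {"cf": False}

# On the order of 50-60 entries like these covered all the algorithms
# shown in the talk; unsupported flavors are simply rejected.
SEMANTICS = {"add": sem_add, "xor": sem_xor}
```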
There have been some projects at, say, ARM about releasing their own mechanical specifications, and I'd be more than happy to use those; they just weren't available when we started the project. Similarly, I think some folks at UIUC just released a massive specification for x64, and I think that is also an awesome project, and we would be more than happy to validate our specifications against theirs, and eventually start using those specifications as well. To the extent that the whole world can start agreeing on specifications for the hardware, and all start testing and contributing back, I think that's a good place to be. Thank you. Hey, thank you for your talk, it was very interesting. It's a very clever approach, taking the input from this tool, rendering assembler, and assembling it with standard tools. Do you also take the step of verifying the binary machine code that was generated by the tools, to make sure the tool actually generated what you intended? Yeah, it's a good question. So right now we're trusting the assembler to correctly emit machine code. We have not yet taken that extra step to say, let's specify how to interpret the machine code and check that back against the original assembly, partly because it's tedious, and partly because the specification for what that's supposed to look like might not be that much smaller or simpler than the assembler itself. If you take a small, simple assembler, what is it doing? It's basically looking at the assembly instruction and mapping it to a series of binary values. And what is your specification going to look like? It's going to say: take this assembly instruction, map it to a binary value. So you might get some leverage because, again, as I said earlier, we don't have to include all the instructions; we can write an assembler that just targets ours. So you might get a little bit of leverage, but historically, I guess, we're more concerned about higher layers. Okay, thank you. Sure.
I actually expected you had already said it, but I didn't catch it: are you compatible with OpenSSL at the source code level? I mean, can a client switch between Everest and OpenSSL? That's a great question. To do that, we're working on creating an OpenSSL engine, which is their way of adding a plug-in to the code. They changed that recently, so we have somebody working on mapping their interface to our interface. I think we're always going to need some mapping like that, because we've been trying to design our API in a somewhat different fashion than OpenSSL chose to develop their API, but given that mapping, it should be straightforward to plug in. Hi, you mentioned that you have this tool that, when you implement, say, a particular mode for authenticated encryption, gives you also the bounds. I was wondering, what happens if you implement a scheme that is broken, like OCB2? Would you find a failure in the proof? Yeah, so with any of these implementations, what you typically find is that if you implement a broken scheme, or if you write a broken implementation, then somewhere, inevitably, the verifier says, no, sorry, I'm not going to accept this program. And typically you react to that, as you saw with my demo, by saying, okay, I must have screwed something up; let me see if I can figure out how to convince the tool that I actually have something correct. I'll add more invariants, I'll add more preconditions. And if the tool still sits there and says, no, I'm not going to let this go through, eventually you start to think, okay, maybe there's a problem here. Actually, that's how, in the process of developing, say, the TLS layer, we discovered some vulnerabilities in the specification itself: the proof just doesn't go through, and you stare at it long enough and you're like, oh, actually, there is a flaw here. And the same thing would happen on the cryptographic side as well. Thank you.
Hi, Brian, nice talk, thank you very much for it. This is more of a nebulous question, so maybe there's no good answer, but you're sort of coming down from software into the processor, and having to deal with whatever semantics the processor provides. I wonder if there's any experience that you gained from that that gives insight into how we should be building processors. Yeah, that's a good question. I think there's a couple of things. One is that I really like this trend of, say, ARM exposing formal specifications for their ISA. That saves me a ton of work; I don't have to go off and painfully interpret their English prose. And furthermore, those have been extensively validated in a way that we haven't done with our specifications. The second part is that, right now, even those specifications that ARM is providing us with only talk about functional properties. They say, here's what an add does: it takes two values, adds them, and sticks the result back in. Nobody has a formal specification, and very few have even an informal specification, of what security properties the instructions are supposed to have. So when we're talking about non-interference, we start writing the specification of what observations the adversary is making, and that's based entirely on lore from the security world. We say, well, you shouldn't use divide, because we all know that its timing is input-dependent. But there's no actual specification; if you look in the Intel manual, it doesn't say anything about that. And so that's not a great place to be; it's not a firm foundation. What we would much prefer is to have some specification where the hardware people could say, hey, this is the security guarantee we're willing to make to you, here's a formal specification of how it behaves; and then we can write secure software and prove our software is secure against that.
And hopefully the hardware people could be proving that their hardware is actually meeting that obligation. I think right now there's not even an agreement on what form that would take. You could extend our approach and just say, we're going to create this trace and fill it with everything the hardware is doing, but as a software person, that's way more information than I want to have. And as a hardware person, nobody wants to bind their hands that way, right? You don't want to expose all that information, because then you can't change it. So I think we as a community need to come up with a consensus on what language we should specify that in, and how we can meet in the middle there. Cool, thank you. Okay, so let's thank Brian again for a wonderful talk. Lunch is served.