I apologize to everybody now. The way that I do presentations and my cadence and speech is probably a little bit different than folks are used to. But at the very least, I hope you're entertained. So it is exactly 11:45, so we should go ahead and get started. We are here to continue talking about Envoy. Specifically, Bill and myself are going to be talking about some of the OpenSSL handling that we've been doing within Envoy. And to get into this a little bit: that's me. Folks tend to call me Redbeard. Redbeard is not my given name, but if anyone can help me understand why I get called that, that might be useful. Many years ago, I worked in the consulting division of Red Hat, where I was the subject matter expert on a whole bunch of different topics, including OpenShift before it utilized Kubernetes. Then I left Red Hat and helped to get a company called CoreOS off the ground, where I was the chief architect, and I did a little bit of everything. And I love talking about Kubernetes. I'm very slow to respond to email, but if anybody wants to talk to me on WeChat or anything like that, I'm happy to take the time and take it from there. And at this point, I would like to introduce this lovely gentleman to stage left. His name is Bill DeCoste. Hi, everybody. I'm Bill DeCoste. I got a kick out of this picture, because it was also taken back in the OpenShift v2 days, pre-Kubernetes, pre-containers. That picture's really old. I've been at Red Hat since I came over with the JBoss acquisition, so it's probably 14 years now. I've been around for a while. And I've been the worker bee, the engineer who's done the actual development on replacing BoringSSL in Envoy and Istio proxy with the Red Hat certified OpenSSL libraries. So as mentioned, we are both from Red Hat, and we are here to talk about work done on a CNCF project, specifically Envoy.
I should hope that you picked up on this being about Envoy. If that wasn't obvious, there's probably not too much I can do for you. But let's start out with Act 1. This is a bit of a circuitous route: this is actually the tale of two cryptographic libraries. These topics may sound boring. If so, let's blame it on the reason Bill and I are even here talking to you today, which is specifically a library called BoringSSL. If you're like me, you probably find cryptography pretty interesting. Hopefully, during the course of this presentation, some of the nuances we'll be talking about will shed light on how one library can actually matter so much. We'll also be talking about OpenSSL and the differences between it and BoringSSL. We'll need to, if we're going to make any sense out of this. So where are we going with this? For many folks, the differences between these two pieces of software lie somewhere between being transparent and irrelevant. This may even lead you to ask where I'm actually going with this. Is this even important? There's a lot of great content happening here today, and if you don't care about the nuances between BoringSSL and OpenSSL, I'd ask that you seek out a talk that actually resonates with you. But for the folks that do stick around, I'm sure that some of this will be interesting, even if you still don't think it's important by the end. But I've digressed a bit. Let's get everything back on track and go back in time. Specifically, we're going to go back to the 1990s, a simpler time before encryption on the World Wide Web. In the 1990s, Netscape brought us SSL, the Secure Sockets Layer. The intention was actually just to make consumers comfortable with using their credit card on the internet. Then, after that, we end up with TLS: the successor to SSL was TLS, or Transport Layer Security.
While TLS made modest attempts at being backwards compatible prior to TLS version 1.3, it was fundamentally different technology. One of the most important differences is that in TLS, all operations, both clear and protected, happen over a single port, and that connection upgrades to secure communications. SSL and TLS are fundamentally different. This means that there was never really any interoperability between them, which caused a number of problems, the biggest of which, in my opinion, is that SSL and TLS are often conflated. In practice, you pick one or the other, with the majority of web components today using TLS. Predating all of this, though, we have two other sets of standards: PKCS and X.509. PKCS stands for Public Key Cryptography Standards, and these were all introduced by a company called RSA. PKCS were not industry standards; they were there to promote the products from RSA. Fortunately, they paved the way for a pretty broad range of standards we use across computing, like message encryption, transport encryption, secure time stamping, et cetera. I am going to get a drink of water here. Helps if I don't try to drink the mic. So PKCS #1, and this is actually relevant, even if it sounds curious, defines a standardized grammar for the mathematics and formatting of RSA keys. PKCS #7 defines the basis for the subsequent Cryptographic Message Syntax, or CMS, which is the IETF standard for securing messages on the internet. CMS is currently ratified in RFC 5652. Beyond that, we've got PKCS #12. It's a container format used to encapsulate public and private key material. It gets used a lot for bundling those public and private keys with things like Java keystores; it's also known as a PFX file, and it gets used by Microsoft Outlook, web browsers, and a whole bunch of other things. The big thing to remember: it's a container format so that you can bundle the assets together.
But it is not that type of container format. Parallel to all of this, as mentioned, in addition to the set of PKCS standards, we have X.509. X.509 predates SSL by almost a decade. This was a thing before Netscape assembled SSL so that folks could put credit cards on the internet. In the end, it's really just an encoding mechanism. It was built on top of the ideas of public key cryptography and utilizes ASN.1 syntax behind the scenes. Now, as we realize that X.509 is just a structuring and grouping mechanism, it should start becoming clear that X.509 is independent of and orthogonal to SSL and TLS. Sure, both SSL and TLS utilize X.509, but X.509 can also be utilized without SSL and TLS. Why does it matter? We're five or ten minutes in, and I've thrown a wall of acronyms at you. You're probably asking yourself, assuming that you did decide that this was important and you're still here on slide 13: why does this matter? Unfortunately, it does matter. A lot of these terms are thrown around, and folks pretend that they're interchangeable. In some ways, they can be. But if we can't agree on what we're talking about, it makes it harder to understand. At the same time, once we understand them, it unlocks a world of prior art, allowing us to stand on the shoulders of those who have come before us. This is because standards are important. Less than ten minutes of clarification can unlock a world of making all of you more effective. It's an obvious choice to me, because these standards provide interoperability, and the interoperability is what makes it possible for even a small team of people to take on this work. So where do we end up, with BoringSSL and OpenSSL and all of this? Ultimately, the way that we got here, the reason why these exist, comes down to Heartbleed. Heartbleed and Google. So after Heartbleed was disclosed, a collaboration began on a fork of OpenSSL.
The intention was to provide a fork of OpenSSL that had been put on a diet, removing large swaths of code to make it more straightforward for the average user to understand. In addition, it became doubly pressing because of the threats to Google presented by the US government. In parallel to this, it is important to note that Google's desire to create BoringSSL does not stem from actively vulnerable code. Sorry, I wrote all of these notes in to make it easier for the translator, and I've lost my place. It was more that the complexity of OpenSSL makes it hard for users to know that what they're implementing is objectively correct. They were maintaining an increasingly large patch set on OpenSSL and decided that they'd had enough. At Red Hat, completely unrelated to Google, we have to make decisions for millions of customers, not just one customer. Google has more or less one customer: their internal users. We have users that care about many, many different things, and for us, one bug doesn't justify throwing out years of expertise, tooling, et cetera. So, enter Lyft, and by Lyft, I actually mean Twitter, because none of this is straightforward. Twitter had invested a lot of time and effort developing the ideas that we commonly call a service mesh. Matt Klein and some other folks, William from Buoyant and some of the folks, not from Tetrate, what was the other service mesh company that was working on Envoy? I can't remember their name. There were a bunch of folks at Twitter who all worked on service mesh tooling, but the majority of it was never open sourced. So when the developers left Twitter, they had to start over and reimplement all of these ideas. At Lyft, Matt started working on the proxy server that he wished he had when he worked at Twitter, and that was Envoy. And Envoy actually began its life with OpenSSL.
But then, specifically in issue number 152, we end up with "Replace OpenSSL with BoringSSL." And we have this decision coming in from Matt, and he says: I mainly wanted to switch because I know that you all, meaning Google, will help make sure we are up to date and secure if we happen to use BoringSSL. And Piotr says, well, there's no reason why we can't support both; they're mostly API compatible. Fair enough. Sounds reasonable. Personally, I like that: being able to have the Google folks maintain what matters to them, and other parts of the community maintain what matters to them, is perfect. Unfortunately, time is a fickle mistress, because then we get to issue number 3404, where it gets called out that the answer around adding back support for OpenSSL is absolutely not; they're migrating completely away from it. Folks from Akamai made this request, because this is the thing that they need. And this is where the Faustian bargain comes in. Why would the project decide to make such a turn? Well, to look at that, we actually have to look at the charter of BoringSSL, which is that it is designed to meet Google's needs. And while it is an open source project, it is not intended for general use, as OpenSSL is. They do not recommend that third parties depend on it. Doing so is likely to be frustrating, because there are no guarantees of API or ABI stability, which Bill will touch on a little bit. Herein lies the challenge. A switch was made to a project which is not necessarily aligned around what we traditionally think of as community; it's aligned around the interests and expertise of a single corporation. It's understandable, though, and I don't fault Google for it. When the community was much smaller, it made sense to consolidate around the opinions and needs of the specific groups maintaining specific pieces. But as the project grows, the nuances of other members of the community become more apparent.
And this is why we need stewards who understand the technology. That's why knowledge of the standards and how they work is important, which is where we get into Act 2. So at Red Hat, we have products, and we have projects that we collaborate on. This is how we balance supporting the things that we sell as a company and the things that we give away for free to our users. This means every product has one or more upstreams. In the case of OpenShift Service Mesh, the product, we compose a number of projects together: the open source projects Istio, Kiali, Jaeger, Prometheus, et cetera. Envoy is the upstream for Istio proxy, but Envoy is not Istio proxy. And I don't want to focus on Istio, since this is a CloudNativeCon talk and Istio is not a Cloud Native Computing Foundation project, but it carries a patch set atop Envoy to create the component Istio proxy. So there are some differences here, and Bill is going to spend a few minutes talking to us about these changes, the changes that the Istio patches bring along. And now we'll hear from Bill. Okay, can everybody hear me? Yep. Okay, so when I first started doing this about a year ago, and I'll talk about why we did what we did and how we did what we did to bring OpenSSL into Envoy, our mandate was Istio proxy, but 99% of Istio proxy is really Envoy. The core of it is really Envoy. However, if you go take a look at the source up in GitHub under Istio proxy, there are some additions that they drop on top of Envoy that are Istio-specific functionality. The couple that come to mind are some SNI functionality, and these are all the SSL/TLS-specific pieces for SNI, and also support for JSON Web Tokens, JWT. Those are the two pieces.
However, over the last year that I've been working on this, I've seen a whole bunch of that stuff, particularly the JWT stuff, get pulled out of proxy, because now Envoy is doing it natively. So I've seen the differences shrink over time, but I think the distinction that Redbeard made is really important, because a lot of people think that Istio proxy is exactly the same as Envoy. Code-wise, the differences are relatively small, but I think they're significant, though over time they're becoming less and less. So my plea to folks is to ask questions. But really, why does Red Hat care? Envoy, Istio proxy, why does it matter what SSL library is actually used? Tragically, it matters because of this word salad. Within the US, actually the US and Canada, there's this thing called the Federal Information Processing Standards, document number 140, revision number two, also known as FIPS 140-2. What? The correct look for this is a glassy-eyed stare, dreaming of something far less boring than government regulation. But FIPS 140 defines the standards around the way the agencies of the federal government handle information processing. Specifically, FIPS 140-2 focuses on cryptographic mechanisms and, more importantly, which may and may not be used. Beyond that, it also gets into the details of how random number generators can work, how tamper-evidence and tamper-proofing mechanisms work on hardware devices, et cetera. Beyond that, many governments and private companies adopt the guidance of FIPS 140-2, either because they do business with the US or Canadian governments, or because they just don't want to reinvent a wheel.
So security divisions within smaller governments will often just go: okay, we're going to download all of the documents for FIPS, we will start from there, and then we will make the changes, the patches, to the document that we need. So even other nation states use these standards as a rubric for developing their own guidelines. Some examples here: FIPS 140-2 defines things like, hey, you can use AES, which is really an opinionated subset of Rijndael, and you can use certain types of ECC, but you can't use DES. You can't use Triple DES with less than three keys. You can't use RSA with less than 2048 bits. That means things like Ed25519 you can't use. Despite the fact that it may be more secure, you may not use it, because it has not been proven that way. So this all seems completely reasonable. But if you change the code, you have to recertify it. I'll repeat that again: if you change the code, you have to recertify with an independent auditor to get your FIPS certification back. It's kind of weird. This is a massive pain in the behind. Certification is conducted by independent auditors that are contracted by the vendor who produces the software. Due to this, it generally takes a minimum of three to six months and can cost over $100,000. So again: three to six months, and it can cost over $100,000, any time you make a code change. At Red Hat, we have an answer to this that enables us to work around it a little bit. It's called the OpenSSL capsule. It's a special way of running OpenSSL which allows us to simplify this. It separates OpenSSL into two halves. When OpenSSL is running in capsule mode, we express OpenSSL as multiple discrete components. This means that all of the cryptographic code, which for example implements the mathematics behind RSA, lives in one binary, while the code that handles things like sending heartbeat packets in a malformed way lives in another, separate shared object.
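The algorithm constraints listed a moment ago are, in spirit, an allow/deny policy over algorithms and key sizes. Purely as an illustration (the function and its rule set here are hypothetical, not an official FIPS list), that kind of policy might be sketched as:

```python
# Hypothetical sketch of a FIPS-140-2-style algorithm policy. The rules
# below mirror the examples from the talk and are illustrative only.

def fips_allows(algorithm: str, key_bits: int = 0) -> bool:
    """Illustrative (not authoritative) check of a few FIPS 140-2 rules."""
    algorithm = algorithm.upper()
    if algorithm == "AES":
        return True                # approved symmetric cipher
    if algorithm == "RSA":
        return key_bits >= 2048    # shorter RSA moduli are disallowed
    if algorithm == "3DES":
        return key_bits >= 168     # Triple DES needs all three distinct keys
    return False                   # DES, Ed25519, anything unlisted: deny

assert fips_allows("RSA", 4096)
assert not fips_allows("RSA", 1024)
assert not fips_allows("Ed25519")
```

The real point is the last rule: a validated module defaults to denying anything that has not been explicitly approved, even if, like Ed25519, it may well be more secure.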
Splitting things up that way means that the cryptographic code can be certified, and only needs to be recertified if something wasn't implemented correctly, mathematically. This is actually critical for our certification workflow. By doing this, it allows us to certify and audit that single binary for almost all Red Hat products. By running the system, literally booting the kernel, with fips=1, you tell the kernel, the Golang binaries, Apache HTTPD, OpenShift, Ansible, you name it, to bypass any internal cryptographic mechanisms that they have and instead use the OpenSSL capsule. It also means we don't have teams building their own cryptographic implementations and potentially messing that up. So, important points to note about this: FIPS mode is not used by default, and when enabled, it affects the kernel, Go, and other binaries. But enough about that. You've been listening to me talk about the non-technical things, which I'm sure is not why any of you are here. So let's hear some more from Bill about the changes that were required and who else was involved. Thanks, Redbeard. Okay, so what I want to cover, I'm going to go through maybe five or six slides, and please, if there are questions, please jump in. What I really want to cover is what we've done with Envoy, and why we, Red Hat, did it. Because it's not as simple as just being able to plug in OpenSSL instead of BoringSSL. Let me back up. Who here has actually built Envoy? Like, downloaded the Envoy source and built it? One, two, three. Okay. It really is hard, especially if you've been doing Istio, right? Istio builds really, really easily, and building Envoy is not trivial. So when you go and take a look at the changes that we've made, I'm going to reference a whole bunch of Git repositories. Everything we've done is all open and available, so you can go take a look at it and see what we've done.
You can go run the exact same thing and replace all of the BoringSSL with OpenSSL. I'm actually going to step down, if that's okay. Can you guys see me back there? What I want to cover is what we've done, why we did it, and where we're going forward with the upstream community, because I think that's the most important thing. As I mentioned, I've been doing this for about a year, but over the last several months we've really gained a lot of momentum, I think, because it's not just Red Hat. There are a lot of other companies and organizations that are interested in the work that we're doing on being able to plug OpenSSL into Envoy or Istio proxy. I'll just talk about Envoy, but it applies just as well to Istio proxy. They're also interested, for various reasons, in running OpenSSL instead of BoringSSL. And these are sort of Red Hat's priorities, right? Our current focus right now is on Istio proxy, but there's a lot of stuff going on too that maybe is pure Envoy. What we need to do as Red Hat is support the RHEL version, for the reasons that Brian just mentioned. For FIPS and for other reasons, we need to support OpenSSL, because that's the component that we have our expertise in. We've got contributors there. That's what we're comfortable supporting. Going and supporting a whole other crypto library opens up a whole ton of risk for us as a company. Okay, so hopefully that's enough about Red Hat as a company. On FIPS, which he just talked about: BoringSSL, maybe six months ago, actually became FIPS compliant, so that's become less of an issue. But what we needed to do is be able to take Envoy and completely remove all of the BoringSSL. If you go and build Envoy, it pulls in, well, there's this wonderful tool called Bazel. There are maybe some smirks out there from the people who have used it.
It's a tool that Google created that will build almost anything, regardless of language. It supports Java and Go and C++ and all sorts of other things, to go and build whatever kind of software you've got, and pull in and manage all the dependencies. So if you run Bazel, it'll go and pull in all these dependencies, BoringSSL being one of them. When it pulls in BoringSSL, we need to go and completely strip that out. Red Hat needs to have a whole separate build. We need to pull in all the software and build it completely offline, so that for whatever we ship or provide, we know where all that source is and we can tag it and say: look, this is all the source; it's not pulling in dependencies online, right? And Bazel works in a way that assumes you've got all this internet connectivity, and it's going to pull down every damn thing it needs: all the build tools, all the dependencies, all the dependencies of dependencies, and build the whole tree. It's a great tool, but it also can be a little complex. It's really powerful, but it can be complex. We also use a different build tool set, for those same kinds of reasons, than what they run upstream. So if you go take a look at the CI that Envoy runs upstream, they use a whole different set of build tools, different versions of the compiler, that sort of thing. And this is another interesting conversation that came up over the last couple of weeks: BoringSSL, as Brian mentioned, is very dynamic, right? It's changing all the time. The APIs change all the time, because it's really targeted at a different set of use cases than OpenSSL is. But Envoy doesn't care, because they dynamically link in BoringSSL, right?
They're not, excuse me, they statically link BoringSSL into their Envoy binary. So they don't really care if all this stuff is changing. If there's another set of BoringSSL libraries sitting on the operating system, it doesn't matter, because they've statically linked everything. It's all bundled up into their one Envoy binary, and that's it. So you get the version of BoringSSL that's going to work for that version of Envoy, and they're good. And that makes perfect sense. Our case is a little bit different. We actually want to have Envoy use the underlying crypto libraries that are provided by the operating system, so that we can switch out the operating system and then Envoy will automatically pick up the crypto libraries. Do you have a clicker? Thanks. Okay, so Maistra is the community project name for what Red Hat's been doing with Istio. Think of this as a fork; it's our community. Are people here familiar with OpenShift? Yeah, okay. So if you think of OpenShift: Kubernetes is sort of the upstream, Origin is the community version that Red Hat's behind, an extension of Kubernetes, and then OpenShift Container Platform is the product that Red Hat sells. In the same way, Maistra, which is where in GitHub all the work that we've done lives, is analogous to Origin. This is all the community stuff; we go and do builds for Istio, and images and builds and all that stuff are up there. Again, this is talking about Istio, but it really does apply directly to Envoy, because 99% of the work is Envoy, not Istio. Right now, if anybody wants to go take a look and run any of the proxy, all the stuff's available. 0.11 is the current tag that's out there. That's the first release that actually has OpenSSL; all the previous releases were running BoringSSL. And you can see the versions there that correspond to the upstream.
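One way to see the "use the operating system's crypto libraries" model in action: CPython's `ssl` module is itself, on most platforms, dynamically linked against whatever OpenSSL the OS provides, so from Python you can ask which library you actually picked up:

```python
# CPython's ssl module is (on most platforms) dynamically linked against the
# crypto library the operating system provides. This is the model described
# above: update the OS's OpenSSL and the application picks it up on restart,
# without rebuilding the application itself.
import ssl

print(ssl.OPENSSL_VERSION)       # the linked library's version string
print(ssl.OPENSSL_VERSION_INFO)  # the same version as a tuple of numbers
```

On a RHEL box the version string would name the Red Hat OpenSSL build; a statically linked binary like stock Envoy, by contrast, reports whatever BoringSSL revision was baked in at compile time.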
The images that we're shipping are now UBI-based; that's the Universal Base Image, the Linux layer that sits underneath all these images. Red Hat as a company is contributing a whole bunch of stuff to both Envoy and Istio, but the key piece is what we've been working on for over a year now with upstream: doing everything we can to get as much as possible easily pluggable upstream. And I think we're getting to a pretty happy place for everybody. The upstream wants to keep moving as fast as possible. Envoy's moving incredibly quickly, and BoringSSL's moving fairly quickly. So they don't want to slow down the additions to Envoy by having their developers have to support, and become experts in, potentially two different crypto APIs and two different crypto libraries. And that makes perfect sense. We had originally discussed actually coming up with an abstraction layer for all of the crypto that you could plug into. And the decision was made that because BoringSSL has diverged, and is going to continue to diverge at a pretty good rate from OpenSSL, it doesn't really make any sense. It would put such a burden on Envoy developers, who want to keep moving forward and adding features to Envoy that have nothing to do with crypto, to have to become experts in both of those and deal with this API. Maintaining the API for the crypto abstraction would become a nightmare over time. So what we've decided to do is take a least common denominator of what we can pull out of the core of Envoy and move that into extensions. We've done a lot of that work, and we're going to continue to do it, so that anything that really touches the SSL libraries is now going to live in extensions and hopefully becomes fairly pluggable. Okay, next one? Wow, this time went by fast.
Okay, so right now, if you go take a look at what's up in Envoy, we've had a whole bunch of PRs merged that pull almost all of the crypto stuff out of the core, so it now lives in extensions. The bulk of that right now is in the transport socket and the TLS listener, 90%, maybe 95%. There are a couple of things right now that are in common, and one of the challenges that I personally have had, and Brian has talked about this, is that upstream is moving so fast, with lots and lots of developers and companies adding new code, that every time we go grab the latest version, stuff has leaked back in, right? It's written against the BoringSSL API, and it breaks the work that we've done, and I have to go back and refactor all the replacement work. So it's been a cyclical process, but I can see the light at the end of the tunnel at this point. A perfect example of that is the QUIC protocol. They've had support for QUIC, but they recently went and pulled in QUICHE, which is a library that the Chrome folks came up with that is very BoringSSL-specific. They literally include boringssl/ headers; it's not even openssl/something that happens to be a BoringSSL library. It's specifically BoringSSL, right there in the code. So this is a problem for us. But this is just one of the challenges we've got, because there are dependencies that Envoy pulls in that themselves have BoringSSL dependencies. Here's the really good news. This is last Friday, right? Look at the date: on Friday, the upstream folks, Matt Klein, agreed, and there's now a repository for an OpenSSL Envoy. We're going to put all the work that we've done up there, and every time there's a merge into Envoy, they're going to trigger a CI build of the OpenSSL extension.
They're not going to stop anybody; it's not going to block a merge if it fails, but it's going to start to flag stuff, right? And this is huge, because then if you have a conscientious developer who goes and says, hey, maybe I shouldn't have leaked this SSL code, I put it in the wrong place; they accepted my merge because they want to keep things moving, but maybe I can rework my pull request and put it where it belongs. So that's a big one for us. This is literally what I'm going to be doing next week: actually populating that repository, because if you go look at it, it's empty; I've been on a plane coming here since the repository was created. But this is really big news for us. Okay, next one. Down to three minutes. These are all just for reference, right? For each one of the components and for each one of the dependencies, we've created a similar pattern for how to replace the BoringSSL with the OpenSSL. We've got a process on the next slide, which we're probably not going to have time for, because I'll leave a couple of minutes for questions. Can you hit the next one? And these are two other repositories that we've had to create. It became so problematic to talk directly to the OpenSSL API for what they can do in BoringSSL that we've had to simulate some BoringSSL functionality on top of OpenSSL, and that's what these two are. So what happens is, we'll go and pull Envoy, and we run a script for each one of the core components and for each one of the dependencies. That will go and update things: it'll change the Bazel configuration for the build, it'll pull out all the BoringSSL specifics, replace them with the corresponding OpenSSL code, and then go do the whole build.
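The scripts themselves live in the Maistra repositories. Purely as a hypothetical sketch of the substitution step Bill describes (the regex and mapping here are illustrative, not the actual tooling), the include-path part of such a rewrite might look like:

```python
# Hypothetical sketch of one substitution step: rewriting BoringSSL-specific
# #include paths to their OpenSSL equivalents. The real Maistra scripts also
# patch Bazel build files and swap API calls; this only shows the idea.
import re

INCLUDE_RE = re.compile(r'#include\s+[<"](boringssl|openssl)/([\w./]+)[">]')

def rewrite_includes(source: str) -> str:
    """Point boringssl/ (or already-openssl/) include paths at OpenSSL headers."""
    return INCLUDE_RE.sub(r'#include <openssl/\2>', source)

before = '#include "boringssl/ssl.h"\n#include <boringssl/hmac.h>\n'
print(rewrite_includes(before))  # both lines now read #include <openssl/...>
```

A textual rewrite like this only gets you so far, which is exactly why the two shim repositories exist: where BoringSSL's API has diverged, the missing functions have to be simulated on top of OpenSSL rather than merely renamed.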
So this is definitely for reference, and the key point of all this is: if you're interested, all this stuff is upstream, it's all available, it's all in Maistra, up in GitHub, and hopefully very soon, by the end of next week, a lot of this stuff will actually be in Envoy Proxy upstream, available there, and hooked into their CI soon thereafter. I think we've got one minute. Okay, so Bill has documented the replacement process here and talked a little bit about that. We are going to publish the slides on the schedule. I do want to take a second to say this work is not just being done by Red Hat. As you saw, there was the request from folks at Akamai for this sort of support, and there's also some support coming from Intel and VMware. There are a number of organizations that want to see this for reasons very similar to the FIPS reasons that we talked about. So to summarize what we've been talking about here today: the choice of SSL or TLS implementation that you use has profound effects. Understand why you're making the choice that you are, and join us in helping to work on this and collaborate. Thank you very, very much. Very honored to present to you here today. Yeah, thank you. And I just want to emphasize that everything that's going on here is not vendor-specific, right? When it was just Red Hat saying, well, we need OpenSSL, upstream really didn't care, right? That's one vendor. We really didn't get any momentum until VMware and Intel and all these other folks said, hey, we've got our own reasons. They're different from Red Hat's, but there's enough commonality right now, and enough momentum gaining, that now we've got a repo right up in Envoy Proxy. So I really want to thank those guys.
And if there's anybody who has the same sort of OpenSSL drivers, wherever you're working, please let us know and get involved, because the more people we have, the more momentum we can get for getting OpenSSL upstream, the better. Thank you. Yeah, thank you.