Thank you for joining my session. Today we're going to be talking about injecting security into the cloud. My name is Susan Hinrichs, and I am from Verizon Media, formerly Yahoo. I work with the Edge team at Verizon Media and have been there six to seven years. I work on web proxies, routing, and some networking tools. I'm a committer on the Apache Traffic Server project, and in my copious spare time, I practice and teach yoga. My contact information is here if you want to reach out to me later.

The goal of this talk is to present some of the TLS scenarios that we encountered as we were moving our applications into a hybrid cloud environment. Specifically, I'm going to talk about how we use two open source projects to help that transition: a proxy, Apache Traffic Server, and an access control system, Athenz. We're supporting our proxy in kind of an application delivery network environment. And as with any significant project, this is not my work alone. I've worked with a lot of my colleagues at Verizon Media and other members of the open source community on this work. Some of them are listed below.

Let me start by saying a few words about these two open source projects. First, Apache Traffic Server, as I mentioned, is a web caching and proxying tool, and it's been out in the world for quite a while. It's been an open source project for over 10 years, and it was used in a proprietary form by Yahoo and Inktomi before that. The unique thing about Apache Traffic Server is that it allows the injection of business-specific logic via plugins: separately written, dynamically loadable libraries that trigger code on specific hook points in the lifecycle of an HTTP transaction or session. This enables developers to create and specialize how we handle web traffic for our company environments. I've worked in the area of TLS for ATS since I started, and specifically in the last three to four years, we've spent a lot of our effort enhancing our support for mutual TLS, with the goal of supporting our hybrid cloud environments.

The other project that we have leveraged is Athenz, which is a role-based access control system. It leverages mutual TLS to perform the authentication part of that. It was contributed to open source by Yahoo/Verizon Media, and it's been the basis for most of our Verizon Media authentication and authorization work. As I go through this talk, we'll talk a little bit more about these projects, but that's the high level of what they are.

So, motivation: why are we going through this effort? In the good old days, we had corporate data centers. An organization would have one data center, maybe a couple of data centers, where all of their servers running their front end, back end, and middle tier would be in the same physical location. So we could be kind of loosey-goosey in terms of authentication and encryption. We could maybe just rely on IP addresses to verify that the client was coming from within the data center. It might not have been a great idea, but in fact, that's what folks would do. And encryption within the data center wasn't a huge concern.

But more recently, we've had public clouds come out into the world. In that case, an organization, instead of building its own data centers and hiring employees for the care and feeding of those machines, rents virtual resources from third parties.
And this saves money and time and makes it very easy to dynamically bring up an application, and if that doesn't work out, bring it back down. So it's very good for startups, and even for larger companies who are launching speculative applications. But if you have a large, well-known, dedicated application like, say, Yahoo Mail, deploying that in a public cloud can be a very expensive proposition.

So of course, we mix the best of both worlds, and now we have the hybrid cloud. In that case, you're going to have dedicated company data centers, or on-prem resources, and you're going to have one or more public clouds, maybe an AWS cloud and a Google Cloud or an Azure cloud. And depending on what stage of its lifecycle an application is in, it may run on-prem, it may run in a cloud, and it may move over time from on-prem to the cloud and maybe back again.

In this kind of environment, we have very dynamic and complex trust relationships. A user agent may enter your application through an on-prem system or through a third-party cloud. Within the back end, a service running on-prem may access some service that's running in a cloud, and vice versa. So we can't get away with just checking the IP address to make sure that our requester is in our corporate data center, and we have to be very aware of encrypting things over the internet, protecting our data as it transits unsecured network space. We've done that by taking advantage of TLS authentication.

Let me take a little bit of time to quickly review the classic TLS authentication models. In the classic configuration, only the server is proving its identity. In my little diagram, I'm showing a client talking to a server. It sends a client hello message. The server sees that and sends back a server cert. The client will verify the signature on the server cert to make sure it's signed by an entity it trusts, and then through the rest of the key exchange, the server is going to prove to the client that it owns the corresponding private key. If the client needs to authenticate itself to the server, that's done at a different level, via password or two-factor authentication.

But TLS also provides mutual authentication. In that case, both the client and the server exchange certificates. So again, in our scenario, the client will send a hello. The server will send its server cert, and it also sends a message saying, hey, give me your cert. The client will send its cert over. The client will check the server cert's signature, the server will check the client cert's signature, and then through the rest of the key exchange, each will prove to the other side that it owns the corresponding private key.

This mutual TLS authentication mode, this client cert mode, isn't great for humans, but it is a pretty decent model if your client is another program, another service. That's how Athenz handles authentication: Athenz certs are provided, and that proves each side's identity to the other.

But when we started, most of our clients and most of our servers didn't support mutual TLS. This was a new concept for them. We could have waited and rewritten all of our servers and clients to perform mutual TLS, and we would probably still be waiting. Instead, you can put in a proxy and use that to help retrofit your clients and servers to support mutual TLS. In the case of retrofitting a client, we could stick a proxy like ATS in front of the client.
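As a quick aside, if you want to poke at a mutual TLS handshake by hand, openssl s_client can play the client side and present a client cert. This is just an illustration with placeholder host and file names, not part of our setup:

```
# Connect with SNI set, offering a client cert and key if the server asks for one
openssl s_client -connect pci.server.com:443 -servername pci.server.com \
    -cert client.pem -key client.key
```

If the server requests a client certificate, you'll see that reflected in the handshake output, for example in the list of acceptable client certificate CA names.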
Four years ago, you could already configure ATS to always provide the same client cert when requested. In that case, the client would talk through the proxy to the origin, the origin would ask for a cert, and the proxy would always provide the same certificate. That works, but in a lot of cases your client is going to need you to provide different client certs to different origins, depending on what they're asking for. So we added within ATS the ability to do fine-grained client certificate selection. We added logic to our main routing configuration that specifies, for this particular mapping, use this cert. Here we're showing that if a client comes in requesting bank.yahoo.com/PCI, we'll map it to pci.server.com and use the client cert that's in the file PCI.pem. But if the client comes in to yahoo.com/datastore, we'll send that request on to datastore.server.com and provide the client cert that's in datastore.pem. (I'll show a rough sketch of this configuration in a minute.)

Similarly, we have issues with the server, and we can use our proxy to retrofit a server to support mutual TLS. In that case, we put a proxy in front of the server, the origin. Again, when we started four years ago, we could configure that proxy, our ATS box, to always request a client certificate, and that would work. But in a more complex server environment, you'd want to ask for a client cert only in certain situations. Now, it's a little harder in this server scenario, because this decision happens at the very beginning of the handshake. All we know at this point is the client's IP address and whatever is in the client hello message. One of the things that is in the client hello message is the server name indication, or SNI. That gives the domain name of the server the client is indicating it wants to talk to, and all modern clients do provide this TLS extension. So we can take advantage of that, and we've augmented ATS to have a configuration where, based on the SNI name provided by the client, we can do certain things: we can adjust which HTTP protocols we're going to offer, maybe HTTP/2 or not; which TLS protocol versions we're going to allow, say TLS 1.0 or only TLS 1.2 and 1.3; which ciphers we're going to offer; and, in particular for this case, whether we're going to require a client certificate.

So that works. However, SNI is not really a security thing. The client can put whatever it wants in for the SNI. So we're taking this user-provided input and using it for a security decision, and that's a bit sketchy. What happens if we have a user agent that doesn't play by the rules? Here we're showing a malicious user talking to a server. Say our server is configured to require a client certificate for accesses to secure.com but not for insecure.com. Our malicious user will send an SNI in its client hello saying insecure.com. The server will see that, look through its policy, and say, I don't need a client cert for that, and complete the handshake. Then the malicious user, later on that connection, sends a GET request with a Host header set to secure.com. Our server has already set up the connection, so it merely sends that request off to secure.com and sends back the sensitive response from secure.com. We've made the assumption that the SNI corresponds to the host name, but we're not checking it, and that's dangerous. And this is not a super tricky hacker kind of thing; there's a curl command to do exactly what I've described here.
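To make those configurations and that attack a bit more concrete, here are a few rough sketches. On the client-retrofit side, the fine-grained client cert selection is driven from the remap configuration. Assuming your ATS build lets you override the client cert record per remap rule through the conf_remap plugin, it looks something like this, with the host and file names just being the examples from the slide:

```
map https://bank.yahoo.com/PCI https://pci.server.com/PCI @plugin=conf_remap.so @pparam=proxy.config.ssl.client.cert.filename=PCI.pem
map https://yahoo.com/datastore https://datastore.server.com/datastore @plugin=conf_remap.so @pparam=proxy.config.ssl.client.cert.filename=datastore.pem
```

On the server-retrofit side, the SNI-keyed policy lives in what recent ATS releases call sni.yaml (older releases used ssl_server_name.config). The exact field names vary a bit by version, but the shape is roughly:

```yaml
sni:
  - fqdn: secure.com
    verify_client: STRICT   # require and verify a client cert for this name
  - fqdn: insecure.com
    verify_client: NONE     # no client cert requested
```

And the spoofed-SNI request really is a one-liner, because curl takes the SNI from the URL but lets you override the Host header independently:

```
# SNI = insecure.com (from the URL), Host header = secure.com
curl https://insecure.com/account -H 'Host: secure.com'
```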
So we can't rely solely on the SNI for our security policy decisions, and we have several options. Option one, we could just back off and say, ah, this SNI thing was a bad decision; to be truly secure, a proxy should either always require a client cert or never require one, and you'd run a dedicated proxy for each case. You could do that, but again, that's a pretty tedious kind of thing, and there may be other SNI-based actions that you want to distinguish on. So we couldn't really do that.

Option two, we could always check that the SNI and the host name in the request match. But that's probably overkill in the other direction. If the SNI name isn't used for a policy, it doesn't really matter. And with connection reuse in browsers, it's pretty common that you'll have a mixture of requests on a connection with different host names, and those would fail. As I show in the diagram, the browser will negotiate with an SNI of foo.com. There's no policy associated with that, so we just do the handshake. The browser makes a request with a Host header of foo.com; that's great, we make the check, they match, and we send the response. Then it makes another request with a Host header of bar.com. We make the check, the server sees that bar.com is not equal to foo.com, so it rejects it. And that's probably not necessary, because again, in this case the SNI values weren't triggering any security-sensitive policy.

So with ATS, we took a middle ground: only check the Host header against the SNI if the Host header value would have triggered a security-sensitive SNI policy. (I'll show the rough decision logic in a moment.) In our malicious user case, they come in with insecure.com, we handshake without a client cert, and they make a request for secure.com. Our ATS is going to check and see that secure.com would have triggered an SNI policy that requires a client cert. It's going to check whether the Host header value matches the SNI value, and they don't, so it's going to fail the request and return a 403. This solution has worked out pretty well for us: it catches problems, and we haven't had very many false failures.

In the community, there have been proposals for other solutions to address this. Specifically, in the HTTPbis working group there has been a draft RFC floating around for secondary certificate authentication in HTTP/2. What they propose is dynamically renegotiating for client certs. This works out and could be implemented today with HTTP/1, because the underlying TLS does support requesting client certificates after the initial handshake, for both TLS 1.3 and earlier. In that scenario, the diagram shows we would get our SNI for insecure.com and do the handshake. We'd get a request with a Host header of secure.com, the server would recognize that it needs a client cert for that, and it would renegotiate the connection, requesting a client cert. The browser would send the client cert, it would validate, and we'd all be great. In HTTP/1, requests are made in series, one after another, so there's always a sense of what the current transaction on the connection is. We can leverage that lower-level TLS to bootstrap us up and get new client certs to match the transactions as they come. But in HTTP/2, there can be multiple parallel active transactions, so there is no sense of just one current transaction, and leveraging the current client certificate for the underlying TLS connection just doesn't make sense. The RFC recognizes that and proposes additional logic at the HTTP/2 level to track all of this.
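Before moving on, here's the decision logic of that middle-ground check as a rough sketch. This is not the actual ATS code, just an illustration of the rule: only compare Host to SNI when the Host value would have triggered a security-sensitive SNI policy.

```python
# Illustrative sketch only -- the real check lives inside ATS.
# SNI policies keyed by fqdn; "security sensitive" here means the policy
# would have required a client cert.
SNI_POLICIES = {
    "secure.com":   {"require_client_cert": True},
    "insecure.com": {"require_client_cert": False},
}

def host_sni_check(sni: str, host_header: str) -> bool:
    """Return True if the request should be allowed through."""
    policy = SNI_POLICIES.get(host_header)
    if policy and policy["require_client_cert"]:
        # The Host value would have triggered a client cert requirement,
        # so it must match the SNI the handshake was actually done with.
        return host_header == sni
    # No security-sensitive policy for this Host, so don't bother checking.
    return True

# The malicious case: handshake with SNI insecure.com, then Host: secure.com.
assert host_sni_check("insecure.com", "secure.com") is False   # rejected with a 403
# Ordinary connection reuse with non-sensitive host names still works.
assert host_sni_check("foo.com", "bar.com") is True
```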
Once that secondary certificate RFC gets approved and implemented, it would be a good approach and we'll reconsider it. But at this point, our current approach supports both HTTP/1 and HTTP/2, so we're continuing on with that.

So with this technology, we've implemented a secure edge to proxy TLS as traffic is entering and leaving our various clouds. It's a lot like retrofitting the TLS server. A server in a cloud is something we may not want to expose directly to the outside world, so we have this edge that's proxying requests towards the servers. We don't want to expose those cloud servers directly because we may not trust the implementation completely, and also our security team finds it easier and more direct to be able to concentrate on one implementation to ensure that the security policy is consistently enforced. I'll talk about three options for how we're proxying TLS through our secure edge: TLS delegation, tunneling, and bridging.

Before I go into TLS delegation, though, let me say a few more words about how Athenz implements authentication and authorization. As I said earlier, Athenz builds on top of mutual TLS authentication. The client provides an X.509 Athenz client cert that's signed by our Athenz CA, and inside the cert, the common name identifies the Athenz principal, the Athenz user. Later requests over that connection will include an Athenz token in a header of the client request. That Athenz token is an OAuth2 token; it defines the principal and the roles that the principal is authorized for. Based on that token, the server can verify that the request that principal is making is allowed based on those roles. The token itself has a limited lifetime, and it's signed.

With that kind of background, here is how we are supporting TLS delegation for Athenz. In this case, we have a client, a proxy, and a server, and the proxy is a member of our secure edge running ATS. The client will come in and make a connection to the proxy, and it will provide its Athenz client cert to the proxy. The proxy will perform an F1 function to verify the client cert, mostly verifying that it is signed by the Athenz CA. Then the client will make a request on that connection, and that request will include an Athenz token in the header. An F2 function will be performed by the proxy; in this case, it's implemented in a plugin off the read request header hook. That function will take the client cert and the authorization header, and it's going to make sure that the authorization header, the access token, is bound to the client cert, so that the principal names match. It will also verify that the access token is otherwise well formed. Assuming that all checks out, it's going to add a verification header to the request and turn around and make the connection to the server. It cannot reuse the client cert, because it doesn't have access to the client cert's key; instead, it's going to provide a proxy Athenz cert to the server. The server will verify the goodness of the proxy cert, and assuming that works out, complete the connection, and the proxy will forward the request on to the server with the original access token A and its validation header V. Then the server is going to perform its own validation step, ensuring that the validation header is present and that the access token is well formed.
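To make that proxy-side F2 check a little more concrete, here's a rough sketch of the idea in Python. The real check runs inside an ATS plugin on the read request header hook, and the actual Athenz token claims and header names differ, so treat the names here as placeholders:

```python
# Illustrative sketch of the F2 binding check, not the production plugin.
import jwt                              # PyJWT, to decode the signed access token
from cryptography import x509
from cryptography.x509.oid import NameOID

def f2_check(client_cert_pem: bytes, access_token: str, token_signing_key) -> dict:
    # Principal asserted by the mutual TLS layer: the CN of the Athenz client cert.
    cert = x509.load_pem_x509_certificate(client_cert_pem)
    cert_principal = cert.subject.get_attributes_for_oid(NameOID.COMMON_NAME)[0].value

    # Decode and validate the signed token (signature, expiry, and so on).
    claims = jwt.decode(access_token, token_signing_key, algorithms=["RS256"])

    # The principal named in the token has to match the principal in the cert;
    # "sub" is a placeholder for whichever claim carries the principal.
    if claims.get("sub") != cert_principal:
        raise PermissionError("access token is not bound to this client certificate")

    # All good: return the verification header to add before forwarding upstream.
    return {"X-Verified": "true"}       # hypothetical header name
```

The key point is simply that the token and the cert have to name the same principal before the proxy vouches for the request.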
So it's going to take the proxy's word for it that the request came from a client that had a client cert bound to that access token. Assuming that all checks out, it'll send a response to the proxy, the proxy will send the response back to the client, and all is good.

So this works out well. The TLS connection is terminated on the proxy, but now we've added another step. From an attacker's perspective, this is another point to go after: a malicious user could try to bypass the proxy completely, pretend to be a proxy itself, and connect up to the server. Since the malicious user, if it's acting as a proxy, doesn't have to have a client cert that matches the access token, that maybe makes things a little easier. The malicious user would make a connection to the server, assuming that it could, and it would provide its own version of a client cert, the M cert. If that M cert checks out, the handshake will complete, and it can make a request with an access token and a verification header. The access token is signed, so presumably our malicious user has stolen an access token from someplace. Assuming that all checks out, then it could get the response from the server. So there are a couple of levels of protection here: the malicious user has to find a signed access token someplace, and then secondarily, it's going to have to create or steal a client cert. This is the area where we need to be careful about the client cert, about what makes a good client cert. In our case, it has to be signed by the Athenz CA, so that's a bit more difficult. But if you set up your server so that any cert signed by a common CA, like DigiCert, works, then all our malicious user would have to do is go to DigiCert, pay a little bit of money, and get a valid cert. So that's one of the areas where, as you're designing these kinds of systems, you need to be careful.

Moving on, another option is TLS tunneling. As I mentioned, in TLS delegation we're terminating the TLS connection at the proxy, so clear text is sitting at the proxy. For some traffic, that is unacceptable, particularly for things that have compliance issues, like PCI and payment systems. We just cannot terminate on the edge; we need to limit the exposure of that data. In that case, we can provide a TLS tunnel. The TLS is end to end; it goes from the client to the server. And we have a policy that says, based on the SNI name, instead of terminating the connection, we should send it on to this other destination. Here we have an FQDN of supersensitive.example.com. If we see that as the SNI, we are going to forward that client hello directly to supersensitive-origin.example.com, and then all the other traffic on that connection is just going to be blindly passed back and forth through the proxy. (I'll show a rough sketch of this configuration in a minute.) So that works; we don't terminate the connection, but we don't really have any control or understanding at the proxy level of the strength of the TLS. Are they doing mutual TLS? What protocol version are they using? We just don't know, and you have to go back and audit the client and the server to find out.

Bridging also avoids that decryption on the proxy, but it allows us to make some guarantees about the strength of our TLS connection, independent of the user agent or the server stack. In that case, we're going to build a separate tunnel between two gateways, two ATS boxes, and potentially double-tunnel the traffic.
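Before I get to the bridging diagram, here's roughly what that SNI-based tunneling policy looks like in ATS. This is a sketch assuming the tunnel_route key in sni.yaml; the exact key and value format depend on your ATS version:

```yaml
sni:
  - fqdn: supersensitive.example.com
    # Don't terminate TLS here; blindly forward the client's bytes to this origin.
    tunnel_route: supersensitive-origin.example.com:443
```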
So here's a diagram of how the bridging setup works. In this case, our client is configured to proxy through the ingress ATS. It's going to send a CONNECT request for the service, and that's going to get intercepted by our TLS bridge plugin on the ingress ATS, which is configured to say: any CONNECT request going to this service, I instead want to send through a TLS tunnel that I'm going to set up with a peer ATS. So it sees the CONNECT to the service, and it starts setting up and negotiating a TLS tunnel with the peer ATS, which we have complete control over. Once that's set up, it's going to blindly tunnel everything it gets from the client into the tunnel to the peer ATS, where it comes out of the tunnel and goes to the service. Return traffic similarly goes to the peer ATS, into the tunnel, to the ingress ATS, out of the tunnel, and back to the client. So here again, our traffic from client to server is not decrypted on either of our ATS boxes, but between our two ATS boxes, going over the untrusted network, we have control over the strength of that TLS connection. We can enforce our mutual TLS, which protocols we're providing, et cetera. And here's another diagram showing basically what I just talked about, another way of looking at that flow.

So, wrapping it up: the secure edge approach has been very instrumental for Verizon Media in being able to move our applications to the hybrid cloud. The proxy layer, that secure edge layer, provides one place to ensure that security policy is applied consistently. It also enables us to leverage our knowledge; not every team has to become a mutual TLS expert, and we provide support for teams as they're onboarding to the secure edge. It also means that our clients and servers for different applications can move between clouds over time without dramatically affecting the security policy. We're still going to authenticate with Athenz regardless of whether the client comes in from an on-prem data center or a third-party cloud, and similarly with the servers: we'll still require client certs, Athenz client certs, regardless of which cloud they're sitting in. Leveraging open source technology has really enabled us to move nimbly into the hybrid cloud while providing us with a very secure stance.

And with that, I have a few minutes left, so I'd be happy to answer any questions.