I'm Emily, I'm from the Chrome security team, and I'm here today to talk to you about whether certificate transparency is usable. If you're not familiar, certificate transparency (CT) is an emerging system for improving the WebPKI by making it possible to detect malicious, misissued, or otherwise bad certificates. When I talk about whether certificate transparency is usable, I mean things like: can it be gradually deployed? We can't just flip a switch on the Internet and have it magically deployed everywhere, so can it be deployed gradually and provide incremental value along the way? I'm also talking about how much pain it imposes on the existing participants in the WebPKI today, and also on end users, for whom we don't want to show lots of warnings and breakage; we want things to generally work on the Web for them. So I'm going to go into some more background about what CT is and what I mean by usable, but first I want to give a couple of motivating examples of why CT is important. You may be familiar with this incident from 2011, in which Google became aware of a number of misissued certificates being used against Google users. These certificates had been issued by a root CA called DigiNotar, which at the time was widely trusted in client trust stores. Fortunately, this attack didn't actually succeed against most Chrome users, because of a feature called certificate pinning, but it was basically only by sheer luck that Google found out about the attack. Once it was discovered, the certificates were revoked and eventually DigiNotar was removed from client trust stores, but it goes to show how important it is to find out about attacks like these. That's maybe even just as important as preventing them from succeeding, because when we find out about them, we can discover CA misbehavior or other security problems, fix those, and prevent attacks like this going forward.
So discovery of attacks like this is basically what CT is designed to facilitate, and in fact it's already doing its job, even though it's not fully deployed yet. This is an example from a Facebook incident in 2016, which they blogged about, where they used CT to discover some certificates that were not malicious and weren't misissued; rather, a vendor managing some subdomains had issued certificates that weren't really supposed to be issued, so it was a case of this vendor falling through the cracks of the security team's policies. Even though this wasn't an attack, it was still really valuable for Facebook to find out about it, and it says something interesting about the process of CT deployment: this was years before CT was fully deployed, and yet Facebook was still fairly easily able to get this value out of CT by finding these certificates. So hopefully that sets up why CT is important and why we care about it. I'm going to go into some more background now about what it is, a high-level overview of how it works, so that even if you haven't heard of CT before, you'll be able to follow the rest of the talk, and I'm going to talk a little more about what it means for a system like this to be considered usable. In starting to give an overview of CT, let's first talk about what the WebPKI looked like before CT. At a very basic level, an entity called a root certificate authority, or root CA, issues a certificate to a server after the server proves that it controls the domain it's claiming to be. Then the server provides the certificate to the browser in the course of setting up a TLS connection, and the browser validates that this certificate chains to one of the root CAs that the client trusts, that is, one in the client trust store.
If that validation fails, so if the certificate doesn't chain to one of these root CAs, then the browser says, this is no good, I can't validate this certificate, and will generally throw up a full-page error telling the user that a secure connection can't be made to this server. There are a couple of unfortunate properties of the WebPKI that I want to highlight. Obviously the root CAs are trusted third parties, and for the most part they are completely equally trusted, meaning that, with some caveats and exceptions, any CA can issue a certificate for any domain. That's a lot of power that these CAs have, and there's a set of policies governing how they're supposed to wield that power and how they're supposed to operate, but that good behavior is basically enforced by a combination of compliance audits and luck. Neither of those works very well, and there aren't a lot of technical means built into the system to guarantee that we'll discover bad behavior by CAs who hold all this power on the Internet. So the goal of CT is to allow anyone to discover what certificates are being issued, and "anyone" here can be a researcher, for example, who is trying to understand how CAs are behaving and make sure they're behaving properly, or it could be a domain owner, like Facebook in the example I gave, who is just trying to keep an eye out for suspicious certificates for their own domains. I'm going to give a very high-level overview of how CT works. I'm not going to go into the details, but I think it's enough to analyze the usability of the system. CT works with a group of logs, operated independently by different organizations. Google operates some logs, CAs operate some logs, and other organizations, like Cloudflare, are spinning up logs. These logs are publicly auditable, append-only records of all the certificates that have been issued.
When a certificate gets issued, it gets submitted to one of these logs, and the log gives back in return what's called a signed certificate timestamp, or SCT. This submission can be done by various parties. Sometimes the CA will submit the certificate to the log at the time the certificate is issued. Sometimes the holder of the certificate will submit it, or it could be anyone else who happens to encounter the certificate and submits it to the log so that it can be publicly monitored. The SCT, the signed certificate timestamp, can be thought of as a signed, verifiable promise that the log is going to actually log this certificate within some period of time. In the deployed CT system, that period is 24 hours. We have to do something with these promises, and the end goal of CT is that browsers won't accept certificates as valid unless they come with these promises that the certificates will be publicly logged. So when a browser receives a certificate, it comes along with a set of SCTs from various logs, and the browser validates the signature on those SCTs. If the signatures validate, then the browser knows that, if the logs are behaving honestly, those certificates will be publicly logged so that anyone can see they're out there. This is kind of cool, but as I've described it so far, this is basically just another layer of trusted third parties on top of the existing PKI, right? We're trusting these logs to behave honestly and actually log what they say they're going to log. CT wasn't really designed that way; it was designed for the logs to be untrusted. So there is a set of protocols by which entities called auditors can cryptographically check that logs are behaving as they're supposed to, namely that they're logging what they say they're going to log.
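To make the "signed promise" concrete, here is a rough sketch, in Python, of the byte string a v1 log signs for an SCT over an ordinary X.509 certificate, following RFC 6962 section 3.2. The function name and the sample input are my own; a browser verifying an SCT rebuilds this structure from the certificate and the SCT fields and checks the log's signature over it with the log's public key.

```python
import struct

def sct_signed_data(timestamp_ms: int, cert_der: bytes, extensions: bytes = b"") -> bytes:
    """Reconstruct the data a v1 CT log signs for an SCT over a plain
    X.509 certificate (RFC 6962, section 3.2). Sketch for illustration."""
    out = b"\x00"                              # sct_version: v1 (0)
    out += b"\x00"                             # signature_type: certificate_timestamp (0)
    out += struct.pack(">Q", timestamp_ms)     # uint64 timestamp, ms since the epoch
    out += struct.pack(">H", 0)                # entry_type: x509_entry (0)
    out += len(cert_der).to_bytes(3, "big")    # ASN.1Cert carries a 3-byte length prefix
    out += cert_der
    out += struct.pack(">H", len(extensions))  # CtExtensions, 2-byte length prefix
    out += extensions
    return out

# Fake DER bytes, purely for illustration; a real caller passes the
# end-entity certificate exactly as received on the connection.
data = sct_signed_data(1514764800000, b"\x30\x82\x01\x00")
```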
If they issue one of these SCTs, these promises to add something to the log, they actually follow up and do the logging, and they also present a consistent view to all clients. I'm not going to go into the details of these auditing protocols, but I will mention that an auditor can be anyone: a web browser, the software manufacturer that makes the web browser, a researcher, et cetera. This is a very important part of CT, so I'll come back to it a little later in my talk. Often when we talk about usable security, we tend to think about end users, things like: can an end user figure out how to use a password manager, or can a user make sense of this warning they're seeing? But CT is not really an end-user-facing system, so you might be wondering what it even means for a system like this to be usable. The usability properties of CT that we care about are, first, that it can be deployed gradually, with a gradual ramp-up where we get more and more value as we go, because we need to get value along the way; otherwise, why would anyone do it? And it needs to be gradual because we can't just take a system like this and deploy it across the whole web all at once. Closely related to that, we want it to be a layer on top of the existing WebPKI. We don't want to try to replace the WebPKI wholesale; that's a pretty daunting task. Going hand in hand with that, the existing stakeholders in the WebPKI today have to be OK with adopting CT: they have to find the amount of work they need to do acceptable. And finally, in Chrome, we worry a lot about warnings shown to users causing warning fatigue and making the web seem scary, broken, and confusing, and we don't want to do that.
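As a sketch of what the core auditing check looks like, here is RFC 6962-style Merkle inclusion proof verification in Python: given a logged entry, its index, the tree size, and an audit path of sibling hashes, an auditor recomputes the root and compares it against the signed tree head the log published. This is a simplified illustration of the algorithm, not production code.

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962 hashes leaves with a 0x00 prefix...
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # ...and interior nodes with a 0x01 prefix, so leaves
    # can never be confused with internal nodes.
    return hashlib.sha256(b"\x01" + left + right).digest()

def verify_inclusion(entry: bytes, leaf_index: int, tree_size: int,
                     audit_path: list, root_hash: bytes) -> bool:
    """Recompute the Merkle root from a leaf and its audit path and
    compare it to the log's signed root hash."""
    if leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf_hash(entry)
    for p in audit_path:
        if sn == 0:
            return False
        if fn % 2 == 1 or fn == sn:
            r = node_hash(p, r)
            while fn % 2 == 0 and fn != 0:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash
```

For example, in a three-entry log with leaves a, b, c, the proof for c at index 2 is just the hash of the (a, b) subtree: hashing it together with c's leaf hash reproduces the root.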
So we want CT to be deployable without causing a lot of pain and confusion for end users, without getting in the way of them just using the web as they normally would. Because we want CT to be usable as this layer on top of the WebPKI, it has to be appealing to all the entities operating in the WebPKI today, plus a new set of stakeholders, like log operators and auditors and monitors of the CT logs. This zoo of stakeholders is one of the main reasons why it's challenging to design a usable system like CT. OK, I'm now going to go into the state of CT adoption today, what the ecosystem looks like right now, and what it says about the usability of CT overall. This will be some qualitative observations, but I'm also going to be throwing some data at you, so I want to say up front where that data comes from. Some of it is from standard Chrome telemetry, though unfortunately some of these metrics haven't hit the stable channel yet, so some of them are from beta. If you're interested, I will post the slides and update them with the data from stable when it comes in in a few weeks. We also have a separate data set from users who encounter certificate errors and opt in to send reports about the errors they encounter. We have a CCS paper describing this data set, not about certificate transparency specifically, but about the data set in general, so you can check that out if you're interested. Finally, independent of Chrome, we can also look at CT adoption by looking at how popular sites are adopting it. To do that, we look at the Alexa Top 10,000 list and also another, newer list of popular sites called the Chrome User Experience Report, which is a public data set of 10,000 top sites as measured from Chrome usage. So in examining the state of the CT ecosystem today, we'll look at these three big usability properties that we care about.
Starting with how CT is being gradually deployed and what kind of value we're getting from it even before it's fully deployed: Chrome has been the main driver of gradual deployment of CT. There are other clients supporting CT in various ways now, but I'm going to focus on Chrome as a case study for what this gradual ramp-up looks like. Here is a roughly chronological list of the steps Chrome has taken to gradually roll out more and more CT. We started a few years ago by requiring CT for the extended validation UI. I'll show a screenshot of that in a minute, but that's the green bar you get when you visit PayPal or another site that has undergone extended validation to get this special kind of certificate. We started requiring CT in order to show that special extended validation UI. A little later, we started requiring CT for certificates from certain CAs that had been known to misbehave in the past. That was a way of shedding some light on how these CAs were behaving, by requiring certificate transparency for all their newly issued certificates. More recently, we added a way for sites to opt in to CT enforcement themselves. Sites that want more security benefits from CT even before it's required can opt in and tell the browser: I always want CT to be required for my site. And coming right up in April of this year, we're going to start requiring valid CT for all newly issued public certificates. I say "requires valid SCTs" here, and there's actually a lot of nuance hidden in there: there's a fairly complex policy governing how many SCTs have to be present, what logs they have to come from, and so on. I'm not going to go into those details, but that policy is also an important part of the gradual deployment of CT.
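As a rough illustration of the kind of rule that policy contains, here is a simplified sketch of Chrome's CT policy for embedded SCTs at roughly the time of this talk: the number of SCTs required scales with certificate lifetime, and the SCTs must come from a diverse set of operators (at the time, at least one Google log and one non-Google log). The thresholds and the operator check below are my paraphrase of the published policy, not an authoritative implementation; the real policy also accounts for log qualification dates and delivery method.

```python
def embedded_scts_required(lifetime_months: float) -> int:
    """Minimum number of embedded SCTs by certificate lifetime
    (simplified paraphrase of Chrome's CT policy, circa 2018)."""
    if lifetime_months < 15:
        return 2
    if lifetime_months <= 27:
        return 3
    if lifetime_months <= 39:
        return 4
    return 5

def complies(lifetime_months: float, sct_log_operators: list) -> bool:
    """Check SCT count plus operator diversity; 'google' marks a
    Google-operated log. Illustrative only."""
    enough = len(sct_log_operators) >= embedded_scts_required(lifetime_months)
    diverse = ("google" in sct_log_operators and
               any(op != "google" for op in sct_log_operators))
    return enough and diverse
```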
The result of this multi-year process of CT rollout is that right now about two thirds of connections in Chrome are CT compliant, and almost three quarters of requests. This is actually quite exciting, because CT is still required on only a relatively small subset of connections, so it's exciting to see this much traffic already CT compliant even where it isn't required. When we look at popular sites, we see about a quarter of the Alexa Top 10,000 are CT compliant right now, out of the little over half of them that support HTTPS at all, and we see pretty similar numbers from the Chrome User Experience Report. One interesting difference between these two data sets in general is that the Alexa Top 10,000 contains domains that are popular as third-party requests, whereas the Chrome User Experience Report contains domains that users spend time on as first-party sites. If we compare these numbers to the connection numbers, like the two thirds of connections, we can guess that CT adoption is concentrated in the more popular sites where users are spending most of their time. So this is good, but the question is: is it actually getting us any value? There is a very interesting crop of tools that make it easy to monitor logs and see what's going on in the CT logs, and because of these tools, we've gotten a lot of value out of CT so far. Some of these tools are used by researchers, who have discovered misissued certificates or other types of CA misbehavior, and some are used by domain owners like Facebook, who have found suspicious certificates for their own domains. So even though CT is not fully deployed yet, it is definitely deployed enough to start getting a lot of value out of it.
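Much of that monitoring tooling boils down to paging through the public HTTP API that RFC 6962 requires every log to expose. A minimal sketch (the log URL here is a placeholder, and a real monitor would handle retries, rate limits, and decoding of the returned Merkle leaves):

```python
import json
import urllib.request

def get_sth(log_url: str) -> dict:
    """Fetch a log's current signed tree head (tree size, root hash,
    timestamp, signature) from the RFC 6962 get-sth endpoint."""
    with urllib.request.urlopen(log_url.rstrip("/") + "/ct/v1/get-sth") as resp:
        return json.load(resp)

def get_entries_url(log_url: str, start: int, end: int) -> str:
    """Build the get-entries request a monitor pages through to read
    every logged certificate between two indices (inclusive)."""
    return "%s/ct/v1/get-entries?start=%d&end=%d" % (log_url.rstrip("/"), start, end)

# A monitor loops from entry 0 up to the tree size reported by get-sth,
# decodes each leaf, and flags certificates for the domains it cares about.
```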
And that's partially because of this gradual deployment and partially because of this crop of very usable, accessible tools for seeing what's going on in the CT logs. So one might wonder how much pain has been imposed by this gradual rollout. What burden have we placed on the existing stakeholders in the WebPKI? The two stakeholders we care about are CAs and server operators. CAs have had to do a bit of work to support CT over the past few years. As I mentioned, Chrome requires CT in order to show the EV UI, and as a result, most CAs started embedding SCTs into their certificates. So when you go and buy an EV certificate, it will generally come with SCTs embedded in it. There's a bit more work coming down the line for CAs in the spring, when they'll have to start embedding SCTs in all newly issued certificates as Chrome starts requiring that in April. So this is a bit of work they've had to do to support CT, but we think that's appropriate: they are the trusted third parties and the holders of all this trust on the Internet, so it's appropriate that they should have to do a little work. It's also worth noting that the whole design of SCTs was built to accommodate CA issuance processes; a lot of flexibility was built into the design to make the system acceptable, if maybe not appealing, to CAs. So now we wonder: what do server operators have to do in this new world of CT? Today, most server operators don't, and won't, have to do anything special to support CT. As I mentioned, in the case of an EV certificate, you just buy the certificate, it comes with SCTs embedded, and the server operator doesn't even really need to know about it. We expect this to be the case come April, when we start requiring CT for all certificates: you'll just get your certificate, it will come with SCTs embedded, and you won't have to do anything special.
This is great, because we can't get every site on the Internet to do anything; we have spent something like five years just trying to get every site on the Internet to support HTTPS. It's impossible. But there are advanced modes that server operators can use. Some sites choose to support CT themselves by getting their own SCTs and delivering them on the TLS connection instead of having them embedded in the certificate. Today, most of the sites doing this are larger sites, doing it either for performance reasons, because they don't want to send SCTs to clients that don't want them, or for agility reasons, to be able to respond quickly to changes in the log ecosystem, or just because they are very invested in the health of the WebPKI and want to support CT and move it forward. What's kind of cool here is that because SCTs can be delivered by these various methods, over the TLS connection or embedded in the certificate, there's a lot of flexibility in deployment. A server can adopt CT without its CA knowing anything about it, and a CA can adopt CT without its customers having to do anything special. And for bonus extra-credit points, a server can do a little more work to get a lot more value out of CT. We recently launched a header, similar to HSTS if you're familiar with that, which allows the server to opt in to CT enforcement. The server can say: I always want CT enforced on my connections. So servers that really want the security benefits of CT can opt in and get those today, even though CT isn't universally required yet. This is a nice situation for server operators: by default they don't have to do anything special, but there are these extra-credit things they can do if they want to do CT in advanced mode. Finally, we do care a lot about what end users see.
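For reference, the opt-in mechanism described here is the Expect-CT response header. A minimal sketch of what a server might send, where the max-age value and the report URL are illustrative placeholders:

```http
Expect-CT: max-age=86400, enforce, report-uri="https://example.com/ct-reports"
```

With `enforce` set, a supporting browser fails connections to the site that don't meet its CT policy for the given number of seconds; without it, the browser only reports violations to the report-uri.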
We don't want the rollout of CT to cause a lot of warnings, certificate errors, or breakage of the sites that end users are using. It might not be obvious why this is something that keeps us up at night, so here are some hypotheses of things that could go wrong when we put on our pessimism hat. One is just bugs: CA software has bugs, and the servers that implement CT themselves have bugs. The policy I mentioned that Chrome has, governing when SCTs are required, what logs they have to come from, and how many there have to be, is complex, and it's not trivial to implement correctly. In general, and this is not really specific to CT, errors can multiply: a single broken connection can break many hundreds of requests, and if a single broken resource, like a popular JavaScript library on a CDN, breaks, that can break thousands of sites. And finally, we have seen incidents of intentional non-compliance, where a site is subject to CT requirements but chooses not to comply, usually for domain privacy reasons, and that ends up showing errors to users. Fortunately, this doesn't seem to be a huge problem today. Virtually all of the time that CT is required, meaning the connection would fail if it weren't CT compliant, we actually see successful CT compliance. Similarly, it's very rare that we drop the EV indicator because of CT. Another way we can look at it is as a slice of the overall pie of certificate errors, and as you might guess from the numbers I just showed, CT is a very small slice of that pie. It is interesting to look at that slice, though. About 15% of the errors caused by CT are not bypassable, meaning the user has no option to click through, and that's because the site has opted in to stronger security settings.
Of the errors where the user can click through, the click-through rate is a little higher than for certificate errors in general, so that's kind of interesting. We can't take this for granted, though. When we rolled out CT requirements for a large CA, there was a spike of errors due to a confluence of events, so this is something we have to be vigilant about as we roll out CT more broadly. Overall, I think this is a pretty optimistic picture of the usability of CT. We see the gradual deployment we want to see, we see a lot of value out of it even early on, and not a lot of pain for existing stakeholders or end users. But I want to conclude the talk by highlighting some of the open areas, especially open research areas, as they relate to usability. I mentioned in my overview of CT that there are auditing protocols for making sure that logs are behaving honestly. It turns out that in real life these are a little hard to get right and hard to deploy in practice, and it becomes especially tricky when we think about end-user privacy and data consumption. So I want to highlight this as an open, active area of development. It could be a whole talk in itself, but I've included some references here on what the open problems are and what the current state of things is, if you're interested. The business of operating a log also turns out to be tricky. It has to be a high-availability service that presents a consistent view to all clients at all times, and only recently has there started to be more transparency into how logs are operating and how well they're doing. As you can see from this slide, which shows a monitoring dashboard for various CT logs, there's definitely some open work to be done here in understanding what kinds of log failures can happen, how the ecosystem should react to them, and how vulnerable the ecosystem is to these different types of failures.
And finally, I mentioned this complex Chrome policy. You're not meant to read this slide; it's just to say the policy is a little complicated, and it might become even more complicated as more clients roll out policies of their own. So there is an opening here for debugging and visualization tools to help the owners of certificates understand if and how their certificates conform to these policies and what, if anything, they need to do to fix them. So there's definitely some work left to do in these open areas, but overall I think CT is in an exciting state. We see the gradual deployment we want to see, and it's really exciting to see so much value gotten from CT even before it's anywhere near universally deployed. It will be exciting to see whether we're able to continue these trends as we roll out CT more broadly this April. So with that, I want to thank some colleagues and collaborators who helped with this talk, and I'm happy to take any questions. Thank you. Questions for Emily, please. Hi, Emily, thanks for the talk. It's really great to see you working on this; I think the Internet really needs it. I was wondering if you have any thoughts on the bandwidth cost of CT. If I'm a website, say Facebook, or Google, or Microsoft, I need to monitor my public key, and I need to look in this plethora of logs that CT supports, and some of these logs are really big. I can trust a monitor, of course, but then I'm introducing trusted third parties, and if I really want to be sure, I guess I need to download the logs myself. Do you have any thoughts on that and how to improve it? On the Chrome side, we tend to think more about the client bandwidth side of CT, the web client side, which is also a concern. This is part of why servers like Google don't deliver SCTs on every connection; they deliver them to clients who want SCTs.
From the log operator side, yes, anyone monitoring the log needs to see every entry. The guy behind you, who's running a log right now, might be able to talk about that a little more. But these are mostly large organizations right now, so for Facebook to monitor the CT logs, I don't think it's a huge burden. Thank you. Hi, Emily, thanks for the talk. At the beginning of the talk, you mentioned DigiNotar and the compromise that kind of led to this line of thinking, and you said that people were safe because there was pinning at the time of the DigiNotar hack. Now some browsers are moving away from pinning as a mechanism and toward CT as a replacement. You mentioned how there's enforcement and there's detection, and how these are potentially equally valuable. Without pinning, can you talk through what would happen today if we have CT and there's another DigiNotar-style compromise? Yeah, I did want to have a section in this talk where I compared CT to pinning, for that reason and also for usability reasons, but I didn't have time. I think there are a couple of things wrapped up in there. One is that we find in practice that when something doesn't work in Chrome, that doesn't actually protect users that much: if an attack fails in Chrome, users are going to switch to a browser where the site works. That's one of the reasons why we focus on discovery. But to answer your question directly: the attack would work, and the user would accept the misissued certificate. When the attack is discovered, the certificate would have to be revoked in some way, and revocation is not a solved problem; that's something that needs to be worked on as well. What would happen in practice today is that the discoverer of the certificate would alert the browser vendors, who would generally deliver a revocation via an update.
And if the certificate was misissued, which is one of the scenarios we care about the most, then action would be taken against the CA, and eventually it might be removed from trust stores. So how long would people be vulnerable in this situation? It depends on a lot of things, but for users of a browser that is able to get an update with the revocation, it could be six to 12 hours; for something to be removed from trust stores, we're talking more about weeks or months, probably. Thank you. I think we only have one more question. Sorry, OCSP. This is not my field. Has OCSP gone away? You mean SCT delivery in OCSP? Yeah. No, it's there. I just didn't talk about it, but there are three ways to deliver SCTs: in a TLS extension, embedded in the certificate, and in an OCSP response. OK. Let's thank the speaker again.