Hello, welcome to the second part of this lecture on public key infrastructures. Let's have a look at the principles next. If you recall, the goal that we have is to enable key distribution in public key cryptography. In simple terms, Bob needs to know which public key belongs to Alice before he can securely encrypt messages to her. Analogously, Alice needs to know which public key belongs to Bob before she can verify any signatures that may be from him. PKIs, public key infrastructures, are a form of key distribution, and they eliminate the need for a central directory, as you may know it from key distribution in symmetric cryptography. The essence of PKI is the certificate, a term that was actually coined in the 1970s. The definition is fairly straightforward: a certificate is a cryptographic binding between an identifier and a public key that is to be associated with that identifier. An issuer creates such a binding, and the issuer assumes the role of a trusted third party. How do we do this? How do we create certificates? Well, we do this by issuing certificates between entities; that creates a PKI. We call the entity that is responsible for creating a certificate an issuer, or I. The issuer I has a public key, which we call pubI, and a private key, which we call privI. X is an identifier that is going to be bound to a public key pubX. Now, let I create a signature, sigI, on X concatenated with pubX. In practice, you probably wouldn't use a plain concatenation operation like that, but for simplicity, we write it like this here. In that case, the tuple of X, pubX, and the signature sigI on X concatenated with pubX is our certificate. In practice, we are going to add much more information, but that is really the essence. Note that you can actually also create chains, right?
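The issue-and-verify step just described can be sketched in code. This is a minimal toy sketch, not real cryptography: it uses textbook RSA with tiny hardcoded primes and no padding, purely to make the tuple (X, pubX, sigI(X || pubX)) concrete; all names and values are illustrative.

```python
# Toy sketch of certificate issuance and verification. NOT real
# cryptography: textbook RSA with tiny hardcoded primes, no padding,
# purely to make the tuple (X, pubX, sigI(X || pubX)) concrete.
import hashlib

# Issuer I's toy RSA key pair: modulus n, public exponent e, private d.
p, q = 1000003, 1000033
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)

def digest(data: bytes) -> int:
    # Hash, reduced mod n so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def issue_certificate(identifier: str, pub_x: str) -> tuple:
    # The certificate is the tuple (X, pubX, sigI(X || pubX)).
    binding = (identifier + "||" + pub_x).encode()
    sig = pow(digest(binding), d, n)  # "sign" with I's private key
    return (identifier, pub_x, sig)

def verify_certificate(cert) -> bool:
    # Anyone holding pubI = (n, e) can check the binding.
    identifier, pub_x, sig = cert
    binding = (identifier + "||" + pub_x).encode()
    return pow(sig, e, n) == digest(binding)

cert = issue_certificate("alice.example", "pubAlice")
assert verify_certificate(cert)
# Tampering with the bound identifier invalidates the signature:
assert not verify_certificate(("mallory.example", cert[1], cert[2]))
```

A real certificate carries much more, as the lecture notes: validity periods, extensions, issuer metadata, and a proper signature scheme with padding rather than bare textbook RSA.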
For example, you might have an issuer I1 who may certify I2, who then certifies X; then you have a chain I1 to I2 to X. Each arrow implies that a certificate is issued, reading this from left to right. What about the binding? Well, the semantics of the binding are defined externally, by us. They are not inherent to the cryptography. The identifier can be the name of a person or a business, and very often it can also be a domain name, as on the web, for example. It is rare, but it can also actually just be an attribute, for example some access right, an access privilege. That has an implication. It means that we must always verify that the identifier and the corresponding key really belong together before we issue a certificate. If the identifier is a name, then we need to verify that the entity behind the name really is the entity it claims to be. Otherwise, any kind of perpetrator could come along, claim to be some entity, and get a certificate for that entity. So we suddenly have the aspect of operational security at the very core, deep in the center of PKI. We need correct execution of verification procedures. That is crucial to the correct functioning of any PKI and to establishing a binding that has an actual semantic meaning. Now we are in the nice situation that we can classify PKIs. Using our terminology, we can ask: who are the issuers? Which issuers must be trusted, in other words, which trusted third parties exist in our PKI? We can also ask: how do issuers verify that X and pubX belong together, or that X really is X? You will find that, depending on the PKI that you're looking at, different words are used for issuer. When you have a hierarchical PKI, as is very often the case, especially on the web, issuers are called certification authorities. Let's have a look at the theory behind these hierarchical PKIs. Here is the super duper naive form of a hierarchical PKI.
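The chain I1 → I2 → X just described can also be sketched: verification walks from a trust anchor, using each certified subject key to check the next link. Again a toy sketch with textbook RSA and tiny primes; the helper names are my own, not part of any real API.

```python
# Toy sketch of chain verification, I1 -> I2 -> X. Same caveat as before:
# textbook RSA with tiny primes, for illustration only.
import hashlib

def make_keypair(p, q, e=65537):
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), d  # (public key, private key)

def digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def issue(identifier, subject_pub, issuer_priv, issuer_pub):
    # Certificate: (identifier, subject key, signature over their concatenation).
    n, _ = issuer_pub
    binding = f"{identifier}||{subject_pub}".encode()
    return (identifier, subject_pub, pow(digest(binding, n), issuer_priv, n))

def verify(cert, issuer_pub):
    identifier, subject_pub, sig = cert
    n, e = issuer_pub
    return pow(sig, e, n) == digest(f"{identifier}||{subject_pub}".encode(), n)

def verify_chain(chain, anchor_pub):
    # Walk from the trust anchor: each certified subject key must
    # verify the next certificate in the chain.
    current = anchor_pub
    for cert in chain:
        if not verify(cert, current):
            return False
        current = cert[1]
    return True

i1_pub, i1_priv = make_keypair(1000003, 1000033)  # root issuer I1
i2_pub, i2_priv = make_keypair(1000037, 1000039)  # intermediate I2
x_pub = "pubX"  # X's key is opaque here, since X signs nothing

cert_i2 = issue("I2", i2_pub, i1_priv, i1_pub)  # I1 certifies I2
cert_x = issue("X", x_pub, i2_priv, i2_pub)     # I2 certifies X
assert verify_chain([cert_i2, cert_x], anchor_pub=i1_pub)
```

Note the design: trust flows from the anchor downwards, so a verifier only needs to know pubI1 in advance; everything else travels inside the chain.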
In essence, we have one global certification authority, and that one issues certificates for everyone who wants one, following the verification procedures. So we have a global CA and certified entities. The trouble with almost everything that is naive is that it is not practical. Why? You may want to think about it a little bit; you can stop this video and then resume. I'll be quiet for a moment so you can stop. Fine. After this poker face, let's continue. This is infeasible, and I'll give you a few questions that demonstrate why. Who is going to decide which authority is trustworthy and can assume the role of a global certification authority, or CA? Just imagine how difficult it is to get the governments of the world together just to agree on very simple things. Now you want them to agree on one single body with such power. And what are the agreed verification steps? Every country actually has a different notion of how to verify an identity. Which steps are we going to use? I'll give you an example. In many countries, so-called identity cards certify an identity, and they can be used to demonstrate that you are who you claim to be. In other countries, such identity cards are simply not used. There you might need to show credit cards, bills, or driving licenses to establish your identity. Who is going to agree on the verification steps that apply? Then we actually have a tricky little problem. Theorists call it the global namespace problem. Think of the following: who is John Smith? I hear you asking "which John Smith?", and that's exactly the problem, the "which John Smith" problem. How do we identify entities around the globe uniquely? Not relying on some kind of arcane bit string or anything, but uniquely, with a human-readable identifier. That's very hard. Finally, there is a little more than this: this setup makes it easier to trick the global CA into misissuing a certificate to the wrong entity. Why?
Simply because the global CA has to consider so many factors before issuing a certificate. The more factors, the higher the complexity, and the more attack points for an attacker. And finally, it's really hard to imagine that any government would rely on an authority outside its own jurisdiction, or at least its legal reach. So this form is utopian. How can we do a PKI on a global scale? Well, we could maybe add another level of indirection. Computer scientists like to say that another level of indirection always helps, at least in reducing the immediate problem. It does not solve it completely, however. For example, you could introduce so-called RAs. We'll talk about them in a second, but the term stands for registration authorities that help the CA, and the final certificates are then issued via them in some way. Registration authorities would be responsible for doing the verification step. You could imagine them each being responsible for one country, knowing the local laws exactly, and identifying X and verifying that it holds the corresponding private key, for every entity that needs to be certified. They don't actually issue certificates themselves; they are mere proxies for the global CA. However, a few problems remain. For example, you still have a single trusted authority that can issue certificates for anyone. Governments wouldn't be happy about that. And the namespace still remains global, unless you construct very, very complex names that break it down into country and so on, and become quite impractical to handle. So we still don't have a solution. Well, when the web was built, or rather when the web became popular, people realized all this, and they took a really pragmatic approach. They in essence said: why insist on one global CA if we cannot agree on it, even with registration authorities? We can just accept that many certification authorities are going to exist in different legislations, and we accept them all as equal.
The problem is, this introduces serious weaknesses in our model. If you want to think a little bit about it, I'll again give you my poker face, and I'll stop for a few seconds. Well, welcome back. What are the weaknesses we're talking about? First of all, if we say we treat them all as equal, we also say we define them all as trusted. In practice, you would probably go ahead and say operating systems and software like browsers come preconfigured with a set of trusted CAs. The problem now is that any CA may issue for any domain name, at least on the World Wide Web, which means the CA with the weakest protection of its own is the weakest link in the entire system, and the compromise of one CA means a compromise of the entire system. This is known as the weakest link argument. Still, as you're going to see, people actually stuck with this. CAs and RAs define a hierarchy; it is the roles that are important, not the graph structure. So you have a set of CAs at the top. You might have sub-CAs below, remember the chaining of certification, and you might have registration authorities, reporting to the CAs, to distribute the workload of identity verification. If you look at it as a graph, well, you don't have a single tree anymore. What you have is a disjoint set of trees, a forest. You could, in theory, build something that is not hierarchical. And I say in theory because it has been done, but it never reached any really large scale. The idea, for example, would be that of a web of trust, where every participant may issue certificates. Have a look at our figure here, for example, at Frank. Frank has been certified by Charlie, and Frank has certified Charlie in turn. Frank also has a signature by Daniel, although Frank was not so generous and did not certify Daniel in return. But with Henry, he was okay again and has both received and given a certification. And then we might have these little cycles, like between Frank, Jane, Carla, and Laura.
Or we might have outliers like Alice, who is really only connected via Bob. Now, you can do this, and it has been done. In fact, this is a fairly flexible structure. But you need a few more things now to make it useful, and that's where it really becomes difficult and complex. You now need to reason about the authenticity of bindings, because you don't have a clear anchor point that you trust. You need trust metrics, in essence algorithms that tell you how to assess whether you believe a certain certification to be genuine, or a chain of certifications to be genuine. You need rules for how you compute trust, be they global or local, and no one has so far been able to come up with generally accepted rules or standards for this. And nothing actually precludes you from introducing CAs as special participants, so you can have hybrid models. In practice, however, these models have not been very successful. Looking at the bottom of this slide, webs of trust are used a little bit, in the case of OpenPGP for email. And if you have a look at the research on that topic, you'll find we're talking about a few hundred thousand people, or a few million at best. But that's nothing against the size of the World Wide Web, for example, or the number of internet users. A fairly more important application is code signing. Many Linux distributions have actually taken to using a web of trust and OpenPGP for code signing. These webs of trust are ultimately extremely simple: the distributions just declare themselves to be the root authority, and then they use OpenPGP to sign. It's not really a web of trust, or rather it's a web of trust in its simplest possible form. Let's have a look at the hierarchical PKIs at the top of this slide. In practice, on the web, we have hierarchical PKIs with many equal CAs. This is based on a standard that is called X.509. This standard is so old, it is older than the Domain Name System.
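The web-of-trust structure described above can be modeled as a directed graph, where an edge A → B means "A has certified B". Here is a minimal sketch of the first step any trust metric needs: finding a chain of certifications between two participants. The exact edge set is partly assumed from the figure in the lecture.

```python
# Sketch: finding a certification path in a web of trust.
# Edge A -> B means "A has certified B". The edge set loosely follows
# the lecture's figure and is partly assumed for illustration.
from collections import deque

certifications = {
    "Charlie": ["Frank"],
    "Frank":   ["Charlie", "Henry", "Jane"],
    "Daniel":  ["Frank"],
    "Henry":   ["Frank"],
    "Jane":    ["Carla"],
    "Carla":   ["Laura"],
    "Laura":   ["Frank"],
    "Bob":     ["Alice"],
    "Alice":   ["Bob"],
}

def certification_path(start, target):
    # Breadth-first search for a shortest chain of certifications.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for nxt in certifications.get(path[-1], []):
            if nxt == target:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain exists: we cannot assess this binding at all

print(certification_path("Daniel", "Laura"))
# -> ['Daniel', 'Frank', 'Jane', 'Carla', 'Laura']
print(certification_path("Charlie", "Alice"))
# -> None: nobody in Charlie's component has certified Bob or Alice
```

Finding such a path is only the beginning; as the lecture says, a trust metric would then have to decide how much confidence the path actually conveys, and that is exactly the part where no generally accepted rules exist.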
What we call the web PKI is actually using X.509 in the cryptographic protocol TLS, Transport Layer Security. There is a lot of governance, meaning semi-political rules, bylaws almost, that govern how certificates may be issued for web domains and used in HTTPS. By the way, they can also be used for email protocols like IMAPS or SMTPS, but there is a lot less governance for them, and I just mention it in passing. The same goes for an email standard called S/MIME, where we also have much less governance. X.509 is indeed also used in code signing. These are the principles. Let's have a look at a few more details next.