We'll start our discussion of mathematical cryptography by looking at the ultimate and perfect ideal cryptographic system. In other words, nothing we do after this is actually important unless you want to use cryptography in the real world. And this approach is motivated as follows. Suppose that Eve intercepts some encrypted message, and she wants to know what it says. Well, she also has some additional information. She knows that the message is an answer to a yes or no question. And in most actual applications, the importance of the information is that it is an answer to some particular question. And let's say that she also knows that 90% of the time, the answer to the question is yes. So what can she do? Well, rather than going through all of the effort of trying to figure out what this message really says, she could save herself a lot of time by simply guessing that the message is yes. And if she does that, she'll be correct most of the time. And this is an important idea that we want to keep in mind. The goal is not really to find out what this message is saying. The goal is to determine the information that's conveyed by the message. And so we don't necessarily have to decrypt this message if we can guess effectively what the content is. And this suggests that there's a natural limit to the effectiveness of any cryptographic system. And it leads to the following definition, which is due to Claude Shannon, one of the founders of information theory. Let M be the event that a system produces some particular message, and C be the event that our cryptosystem outputs a particular ciphertext. A cryptosystem has perfect secrecy if and only if the probability that the message is M, given that the ciphertext is C, is equal to the probability of the message: P(M | C) = P(M).
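As a quick sanity check on Eve's strategy, here is a small simulation sketch (not from the lecture; the prior of 0.9 and the trial count are just illustrative choices): Eve never decrypts anything, she just always guesses "yes."

```python
import random

random.seed(0)

P_YES = 0.9        # assumed prior: 90% of answers are "yes"
TRIALS = 100_000

# Eve ignores the ciphertext entirely and always guesses "yes".
# She is correct exactly when the true answer is "yes".
correct = sum(1 for _ in range(TRIALS) if random.random() < P_YES)
print(correct / TRIALS)  # close to 0.9
```

With no cryptanalysis at all, her success rate matches the prior probability of the message.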
In other words, even if you have the ciphertext, it tells you absolutely nothing about what the message is beyond what we already know about the nature of the message. And this leads to the ultimate cryptographic system, known as the Vernam cipher, also known as a one-time pad. And this was invented way back in 1918, well before Shannon developed his ideas of information theory. And the basic idea of the Vernam cipher is the following. We'll assume that our message consists of a sequence of 1s and 0s. And what we'll do is we'll take a key that consists of a random sequence of equally likely values, 1s and 0s again, that's as long as necessary. So again, our plaintext consists of a bunch of 1s and 0s. Our key is also going to consist of a bunch of 1s and 0s, as long as the plaintext is. So I produce this random sequence of equally likely 1s and 0s. I have my plaintext message also consisting of 1s and 0s. And I'm going to produce the ciphertext by performing bitwise addition between the key and the plaintext. Bitwise addition if you're a computer scientist, mod 2 addition if you're a mathematician. So I'm going to add 1 plus 1, that's going to be 0 mod 2. I'm going to add 0 plus 1, that's going to be 1 mod 2. 1 plus 0 is 1. 1 plus 1 is 2, which is 0 mod 2. And so that produces my ciphertext. Now the Vernam cipher has perfect secrecy. And we'll prove this for a one-symbol message. The formal proof actually follows by induction or any of a number of other ways. And if you're not familiar with the theory of probability, don't worry, we'll give a different proof following this. So let's take a look at this. So assume that the probability of our symbol 1 is p. And so the probability of our other symbol is going to be 1 minus p. And let's consider the following table. This includes all of our possibilities. The symbol is either 1 or 0. And the key is either 1 or 0. So now let's fill in the probabilities.
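The bitwise-addition step above can be sketched directly in code. This is a minimal illustration, not a production one-time pad; the function names are mine, and the key is drawn with Python's `secrets` module to get cryptographically strong random bits.

```python
import secrets

def vernam_encrypt(plaintext_bits, key_bits):
    """Bitwise addition mod 2 (XOR) of plaintext and key."""
    return [p ^ k for p, k in zip(plaintext_bits, key_bits)]

# The same operation decrypts, since (p ^ k) ^ k == p.
vernam_decrypt = vernam_encrypt

plaintext = [1, 0, 1, 1]
# A random key of equally likely 1s and 0s, as long as the plaintext.
key = [secrets.randbits(1) for _ in plaintext]
ciphertext = vernam_encrypt(plaintext, key)

assert vernam_decrypt(ciphertext, key) == plaintext
```

Note that encryption and decryption are the same operation, which is a convenient consequence of working mod 2.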
The probability we can start with is the case where the symbol is 1 and the key is 1. The probability that the symbol is 1 is p. The probability that the key is 1 is, by assumption, 1 half. So the probability that the symbol is 1 and the key is 1 is going to be p over 2. Likewise, the probability the symbol is 0 is 1 minus p. The probability that the key is 1 is, again, 1 half. And so this probability is going to be 1 minus p over 2. Again, the probability the symbol is 1 is p. The probability the key is 0 is 1 half. So there's our p over 2. And then finally, the symbol is 0 with probability 1 minus p, the key is 0 with probability 1 half, and so our probability is going to be 1 minus p over 2. Now our goal is to show that the probability of a given message is independent of what the ciphertext is. So let's consider one case here. Suppose our ciphertext is 1. Now that's going to occur in one of two ways. Either the symbol is 1 and the key is 0, and our bitwise addition 1 plus 0 will give us 1, our ciphertext. The other possibility is that the symbol is 0 and the key is 1. So again, if our symbol is 0 and the key is 1, our bitwise addition is going to produce ciphertext 1. So now let's consider this. If we receive ciphertext 1, we know that one of these two is going to be the case. So the probability that the symbol is actually 1 is this many cases, p over 2, out of a total of p over 2 plus 1 minus p over 2. So given that the ciphertext is 1, the probability our symbol is 1 is going to be this fraction, which after all the dust settles is going to be p, and that's the probability the symbol is 1 to begin with. In other words, knowing the ciphertext is 1 didn't tell us anything extra about the message. We already knew the message had probability p of being 1. In effect, we don't have to go through the problem of decryption, we can just guess the message. A similar argument is going to hold if our ciphertext is 0.
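The table calculation above can be checked numerically. Here is a short sketch (my own, not from the lecture) that fills in the joint probabilities for the ciphertext-1 cases and verifies the conditional probability collapses back to p; the value p = 0.9 is just an example.

```python
p = 0.9  # P(symbol = 1); each key bit is 1 or 0 with probability 1/2

# Joint probabilities from the table: P(symbol, key) = P(symbol) * 1/2.
p_s1_k0 = p * 0.5          # ciphertext 1 via symbol 1, key 0
p_s0_k1 = (1 - p) * 0.5    # ciphertext 1 via symbol 0, key 1

p_cipher1 = p_s1_k0 + p_s0_k1              # = 1/2 regardless of p
p_symbol1_given_cipher1 = p_s1_k0 / p_cipher1

print(p_symbol1_given_cipher1)  # equals p: the ciphertext reveals nothing
```

Changing p to any other value leaves the conclusion the same: the conditional probability always equals the prior, which is exactly Shannon's definition of perfect secrecy.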
Let's take a concrete example of this to see how it works. Suppose that we send 10,000 messages, and again assume 90% of the messages that we send are going to be 1s, and 10% of them are going to be 0s. And let's use a Vernam cipher. So again, we know that 90% of our messages, 9,000 of those messages, are going to have symbol 1. And the remainder, 1,000 messages (10%), are going to be symbol 0. Now we're using a Vernam cipher, so half the time the key is 1. If the symbol is 1 and the key is 1, the encrypted value is going to be 0. And so of those 9,000 message 1s, 4,500 of them are going to have key 1, and the rest will use key 0. Likewise, for the 1,000 message 0s, half of them are going to use key 1, and half of them are going to use key 0. So now let's suppose that Eve intercepts a message, but just before she's about to begin work on decrypting it, she spills coffee on the message and obliterates it. Now this is a serious problem, because her job depends on her ability to decrypt the message. So what can she do? Well, she knows that 90% of the messages are going to be 1. So rather than try to decrypt the message, she can simply guess that the message is 1, and 90% of the time, she's going to be correct. Now, maybe she isn't quite so confident that this will work, so she cleans up the message and finds that the intercepted value was 1. That tells her one of two things. Either the symbol was 1 encrypted with key 0, that's this group of 4,500 here, or the message was 0 encrypted with key 1, and that's these 500 messages here. So she knows that she has one of the 5,000 messages that fit into these two categories, and the symbol is 1 in 4,500 out of 5,000, or 90%, of those cases. And so again, even with this information that the intercepted value is 1, if she simply guesses that the message is 1, she has a 90% chance of being correct, and it really wasn't worth cleaning up the coffee.
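The counting argument above is simple enough to write out explicitly. This sketch (variable names are mine) just reproduces the 10,000-message bookkeeping and the resulting conditional probability.

```python
# 10,000 messages: 90% are symbol 1, and key bits are fair (half 1, half 0).
ones, zeros = 9_000, 1_000

ones_key1, ones_key0 = ones // 2, ones // 2      # 4,500 each
zeros_key1, zeros_key0 = zeros // 2, zeros // 2  # 500 each

# Ciphertext 1 arises from (symbol 1, key 0) or (symbol 0, key 1).
cipher1 = ones_key0 + zeros_key1                 # 5,000 messages
p_symbol1_given_cipher1 = ones_key0 / cipher1

print(p_symbol1_given_cipher1)  # 0.9, the same as the prior
```

Seeing the ciphertext moved Eve's estimate from 90% to exactly 90%, so the intercept gave her nothing.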
So here's an important question: what if our system isn't perfect? Again, we use the same assumptions, but we're not using a Vernam cipher. Instead we use key 1 30% of the time, and key 0 70% of the time. Now, just as a note here, these are still randomly determined keys. What makes this not a Vernam cipher is that the two key values are not used with equal frequency. So the cipher is going to produce the following results. Again, we have 9,000 messages with symbol 1 and 1,000 messages with symbol 0, but this time 30% of the time we're going to use key 1. If the symbol is 1 and we use key 1, that's 2,700 messages encrypted that way, and the remaining 6,300 are going to be encrypted using key 0. Likewise, for our symbol 0, 30% (300 messages) are going to be encrypted using key 1, and 70% (700 messages) using key 0. As before, if Eve takes the intercept but accidentally spills coffee on it and wipes it out, she can still guess. And again, 90% of the time, the message is actually 1, so if she guesses the message is 1, she has about a 90% chance of being correct. On the other hand, suppose she finds out the encrypted message is actually 1. She cleans up the coffee and discovers that the encrypted message is 1. Again, this can happen in one of two ways: either the symbol is 1 and the key is 0, or the symbol is 0 and the key is 1. So that's 6,300 plus 300, which makes 6,600 times that the encrypted message is 1, and in 6,300 of those times, the actual message is going to be 1. So that's going to be about 95% of the time. So this time, with the extra information that the encrypted message is 1, she can make a guess and her probability of being correct has increased, or more generally has changed from what it was without that extra information. And so this time it's worth cleaning up the coffee, because it does give her some extra information.
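The biased-key calculation can be sketched the same way as the fair-key one. This is my own illustrative code; the 30/70 key split is the one assumed above.

```python
# Same 10,000 messages, but a biased key: key 1 is used 30% of the time.
ones, zeros = 9_000, 1_000
p_key1 = 0.3

ones_key1 = int(ones * p_key1)     # 2,700 encrypted with key 1
ones_key0 = ones - ones_key1       # 6,300 encrypted with key 0
zeros_key1 = int(zeros * p_key1)   # 300 encrypted with key 1
zeros_key0 = zeros - zeros_key1    # 700 encrypted with key 0

# Ciphertext 1 again arises from (symbol 1, key 0) or (symbol 0, key 1).
cipher1 = ones_key0 + zeros_key1          # 6,600 messages
p_symbol1_given_cipher1 = ones_key0 / cipher1

print(round(p_symbol1_given_cipher1, 4))  # ≈ 0.9545, up from the 0.9 prior
```

Because the posterior (about 95%) differs from the prior (90%), the ciphertext leaks information, and the system fails Shannon's perfect-secrecy condition.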