In this lesson, we'll look at how to use a known plaintext attack to determine the seed value for an LFSR stream cipher. Let's start with how we're going to go about doing this. As I mentioned, we're going to use a known plaintext attack, meaning we're going to make a reasonable guess about part of the plaintext's contents, even without knowing it ahead of time. We'll just make a guess and see how it plays out. So we're going to guess a piece of the plaintext. We don't have to get it all right; we'll see that getting just a part is enough for us to obtain the seed value. We'll generate a piece of the key stream by XORing our guess with the ciphertext that we've intercepted. Then we'll try to use that key stream to determine the seed value that was used in the LFSR. Once we know the seed value, we should be able to generate the entire key stream and decipher the entire message.

So suppose we've intercepted the following ciphertext, which we've converted to binary. We know that our message sender used a seven-bit LFSR with taps at bits two and five. We're assuming that we know this, and in practice it's a reasonable assumption to make for a lot of cryptosystems out there. This type of information is often effectively public because the LFSR is often built into hardware, so all you have to do is capture one of these pieces of hardware and you know this portion of the system. The seed value, however, is the secret. We never really know what the seed value is, so it's the secret information that we're after with this known plaintext attack.

Now it's worth pointing out that this is a somewhat trivial example, because this seven-bit LFSR only has a period of 10, meaning that the pseudorandom ones and zeros that pop out at the end start repeating every 10 digits.
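To make the LFSR definition concrete, here is a minimal Python sketch of a seven-bit register with taps at bits two and five. The function name `lfsr_keystream`, the list layout (bit 1 first), and the seed value are assumptions made for illustration; they are not taken from the lesson.

```python
def lfsr_keystream(seed, length, taps=(2, 5)):
    """Generate `length` key stream bits from a shift register.

    `seed` lists the register contents as [bit1, bit2, ..., bit7].
    Each step outputs bit 1, shifts every bit down one position,
    and loads the XOR of the tapped bits into the last position.
    """
    state = list(seed)
    out = []
    for _ in range(length):
        out.append(state[0])          # bit 1 is the key stream output
        new_bit = 0
        for t in taps:
            new_bit ^= state[t - 1]   # taps are numbered from 1
        state = state[1:] + [new_bit] # shift, then fill bit 7
    return out


# Hypothetical seed, chosen only for illustration.
seed = [1, 0, 0, 1, 0, 1, 1]
stream = lfsr_keystream(seed, 20)
print("".join(map(str, stream)))
```

Note that the first seven output bits are simply the seed read from bit 1 to bit 7, which is what makes the attack in this lesson possible.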
So this is not a particularly challenging system that we're attacking here, but it should make it easy for us to understand the general idea, which can then be applied to a much more complicated LFSR system with more bits.

Now, in order to make our known plaintext guess, we usually need to know a little bit about the subject of the message. Since this is a programming course along with some mathematics, let's guess that this plaintext has something to do with a programming language, and that one of the programming languages we know about appears in it. Note that when this is used for other types of files, not just text on a computer, there's a lot of structure and known information about the way files are stored on a hard drive that would make a known plaintext attack much more reasonable to do, even if you don't know exactly what the topic of the message is. If you knew it was a particular type of file, you might be able to guess some sequences of ones and zeros that show up pretty commonly.

So let's make a guess, given that we're using text here. Again, here's our ciphertext represented in binary, and we're going to guess that the message is about the word Ruby. Ruby is another programming language that we have not talked about in this course. Here's how we're going to go about implementing our attack. We're going to take our plaintext guess and convert it to binary using the same ASCII system we've been using. I'm just going to focus on the first four bytes of the message here, to make this a little more digestible. Now we take our ciphertext binary and our guessed plaintext binary, and remember, we can XOR those two streams of ones and zeros to recover the key stream. That works because XOR is its own inverse operation: XORing the ciphertext with the plaintext undoes the XOR with the key stream that produced the ciphertext. So there's our ciphertext.
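The XOR step just described can be sketched in Python as follows. The helper names `text_to_bits` and `xor_bits` are hypothetical, and `toy_keystream` is a made-up bit pattern, not the key stream from the lesson; it only shows that XORing ciphertext with the matching plaintext gives back the key stream.

```python
def text_to_bits(text):
    """Convert ASCII text to a list of bits, most significant bit first."""
    bits = []
    for byte in text.encode("ascii"):
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits


def xor_bits(a, b):
    """XOR two bit lists position by position (stops at the shorter one)."""
    return [x ^ y for x, y in zip(a, b)]


# Made-up 32-bit key stream, used only to build a toy ciphertext.
toy_keystream = [1, 0, 1, 1, 0, 0, 1, 0] * 4
ciphertext = xor_bits(text_to_bits("Ruby"), toy_keystream)

# XOR is its own inverse: ciphertext XOR plaintext gives the key stream back.
recovered = xor_bits(ciphertext, text_to_bits("Ruby"))
print(recovered == toy_keystream)
```

In a real attack the plaintext is only a guess, so the recovered bits are a candidate key stream that still has to be checked against the LFSR definition.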
There's our guessed plaintext. We XOR those and we get our guess for what we think the key stream is. Now all we need to do is figure out whether we guessed right. So let's think about how we could do that and see if we can verify the information that we've deduced.

So here's our guessed key stream, 110111 and so on. As a reminder, our definition is a seven-bit LFSR with taps at bits two and five. So we take the bits stored at those two taps, those two registers, and add them together mod two, or just XOR them, to determine the new value that goes into bit seven.

Let's start by putting our guessed key stream down the bit one column. Remember, if that was in fact the key stream, that's the order we would have found it in this table. Now we can work some of this information back upwards. If we take each bit in column one and work it up and to the left, we can figure out the rest of the information that would have been in this table. No arithmetic even needed. So we've actually been able to recover the first row of this table, which would have been the seed value, assuming that we got our key stream right. So keep that in mind: if this is correct, our seed would be 1111011.

Now let's figure out if it's correct. The way we're going to do that is by looking for any flaws in our calculations. If this key stream was correct, it should abide by the rules of the definition, meaning that if we look at bits two and five in one row and add them together mod two, that is, XOR those two bits, we should get the next row's bit seven. The first row seems to work okay: one plus one is two, and two mod two is zero. Move on down to the next row: one and zero XOR to a one, so that looks good. Then we go to the next row and we've got an error. One XORed with one does not return one.
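The row-by-row check above can be sketched in Python. The function names are hypothetical, and the table layout (bit 7 on the left, bit 1 on the right) is an assumption based on the "up and to the left" description. Only the first six bits of `ruby_guess` are quoted in the lesson; the remaining four are an assumption chosen to match the three checks it describes.

```python
def first_violation(keystream, taps=(2, 5), size=7):
    """Return the index of the first table row that breaks the LFSR rule.

    Row r holds keystream[r : r + size] as bit 1 .. bit 7, so the XOR of
    its tapped bits must equal keystream[r + size], the next row's bit 7.
    Returns None if every checkable row is consistent.
    """
    for row in range(len(keystream) - size):
        expected = 0
        for t in taps:
            expected ^= keystream[row + t - 1]
        if keystream[row + size] != expected:
            return row
    return None


def seed_from_keystream(keystream, size=7):
    """Top row of the table, written with bit 7 on the left (assumed layout)."""
    return keystream[:size][::-1]


# First six bits are from the lesson; the last four are assumed.
ruby_guess = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]

print(seed_from_keystream(ruby_guess))  # candidate seed, bit 7 first
print(first_violation(ruby_guess))      # 0-based row index of the first failure
```

With these bits, rows 0 and 1 pass and row 2 fails, which mirrors the two successful checks and the error on the third row.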
The only thing that could have caused this error is that the key stream was in fact incorrect. So we have enough information here to figure out that this was an incorrect guess at the plaintext, because it yielded a key stream that could not have been produced by the LFSR that we know generated the real key stream. Incorrect guess.

So let's try again. Say we've got the same ciphertext, but now we think it's about Python, the language that we've been using in this course. We'll follow the same steps. We turn our ciphertext and our plaintext into binary, and again we focus on just the first four bytes, or 32 bits, of information. We take the ciphertext and our guessed plaintext and XOR those streams of ones and zeros together to yield a guessed key stream. To verify that key stream, we put it down the column for bit one and follow those bits back up and to the left to recover the rest of the table. Now again we have our top row, which would be the seed if we got this key stream correct. Let's verify: one and one is zero, one and zero is one, zero and one is one, zero and one is one again, and all the rows that we have appear to be correct.

We mentioned at the beginning of this lesson that the key stream repeats every 10 digits. So if we wanted to fully verify this, we would have to keep recreating the rest of the table, maybe with some more plaintext and ciphertext that we've recovered. Or we could just follow this out, assuming that these four verifications imply that the rest of the key stream is correct. We could fill out the rest of the table just by using the rules, and then see what that gives us for our actual plaintext if we use this key stream for the rest of the message. Let's go ahead and try that.
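Filling out the rest of the table "just by using the rules" amounts to extending a key stream fragment with the LFSR recurrence. A minimal sketch, assuming a fragment of at least seven bits; the function name `extend_keystream` and the example fragment are illustrative assumptions, not values stated in the lesson.

```python
def extend_keystream(fragment, length, taps=(2, 5), size=7):
    """Extend a key stream fragment to `length` bits using the LFSR rule.

    Each new bit is the XOR of the tapped bits of the register state
    that started `size` positions earlier in the stream.
    """
    if len(fragment) < size:
        raise ValueError("need at least one full register of bits")
    stream = list(fragment)
    while len(stream) < length:
        start = len(stream) - size    # row whose taps produce the next bit
        bit = 0
        for t in taps:
            bit ^= stream[start + t - 1]
        stream.append(bit)
    return stream[:length]


# Illustrative seven-bit fragment (an assumption, not quoted from the lesson).
fragment = [1, 1, 0, 1, 1, 1, 0]
print("".join(map(str, extend_keystream(fragment, 16))))
```

The same function can produce a key stream as long as the ciphertext, which is what the next step of the attack needs.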
Let's just assume that these four rows that seem to check out are indicative that the rest will as well. So let's finish this all out here. There's the full ciphertext. There's the key stream that I generated by taking the first 16 digits that I was able to produce and then continuing to run with it using the definition. We've got a lot more now, in fact a key stream equal in length to the ciphertext. If I XOR those together, the ciphertext and the key stream, that should recover the plaintext, again assuming that the rest of the key stream ended up being correct. And how would I know if the rest of that key stream is correct? Let's convert all of these bits back to characters and see if the result is actually readable. In this case, we get the ASCII message "Python is the best". So it appears that this was in fact the correct key stream and the correct guessed plaintext, because it returned an entire plaintext message that was readable in a format that we expected.

So in general, that's how we do a known plaintext attack on this LFSR stream cipher. We make a guess about the location and the contents of a piece of the message. Then we XOR that piece with the corresponding piece of the ciphertext that we've intercepted to get part of the key stream, or at least our best guess at it. We work that guessed fragment of the key stream back up through the system to determine the seed value, and then verify as best we can that the seed value generates a key stream that could actually have come from the LFSR system as defined.
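Putting the steps together, here is a small end-to-end sketch in Python. It is an illustration under stated assumptions, not the lesson's actual data: `secret_seed` is a hypothetical seed used to build a toy ciphertext, all function names are made up, and the guess is assumed to sit at the very start of the message.

```python
def to_bits(text):
    """ASCII text to a list of bits, most significant bit first."""
    return [(b >> s) & 1 for b in text.encode("ascii") for s in range(7, -1, -1)]


def to_text(bits):
    """Bits (multiple of 8) back to text; unreadable bytes are replaced."""
    out = []
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out).decode("ascii", errors="replace")


def lfsr(seed, length):
    """Seven-bit LFSR, taps at bits 2 and 5, seed given as bit 1 .. bit 7."""
    stream = list(seed)
    while len(stream) < length:
        n = len(stream) - 7
        stream.append(stream[n + 1] ^ stream[n + 4])
    return stream[:length]


def consistent(ks):
    """True if every checkable bit obeys the LFSR rule."""
    return all(ks[i + 7] == ks[i + 1] ^ ks[i + 4] for i in range(len(ks) - 7))


def attack(cipher_bits, guess):
    """Try a plaintext guess at the start of the message.

    Returns the decrypted text if the implied key stream fragment is
    consistent with the LFSR, otherwise None.
    """
    fragment = [c ^ g for c, g in zip(cipher_bits, to_bits(guess))]
    if not consistent(fragment):
        return None
    keystream = lfsr(fragment[:7], len(cipher_bits))
    return to_text([c ^ k for c, k in zip(cipher_bits, keystream)])


# Toy setup: hypothetical seed, message taken from the lesson's result.
secret_seed = [1, 1, 0, 1, 1, 1, 0]
message = "Python is the best"
cipher = [m ^ k for m, k in zip(to_bits(message), lfsr(secret_seed, 8 * len(message)))]

for guess in ("Ruby", "Pyth"):
    print(guess, "->", attack(cipher, guess))
```

A guess that passes the consistency check is still only a candidate; as in the lesson, the final confirmation is that the decrypted output is readable.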