In this lesson, we'll be looking at a very well-known encryption scheme called CSS. CSS stands for the Content Scramble System. CSS was a digital rights management system, abbreviated DRM, which often goes hand in hand with cryptography work, and it was used for pretty much all DVD-Video discs that were produced back in the day. I know DVDs are less popular nowadays, though they're still out there, but every DVD-Video disc that was made had the Content Scramble System implemented on it. The way it worked was that it used two LFSRs, like we've seen in the past. One of them was a 17-bit LFSR with the taps at 1 and 15, and the second was a 25-bit LFSR with four different taps. Together that makes 42 bits of information, but like we've seen in the past, we always set the lowest bit in each of the LFSRs to 1 to make sure that we don't get stuck in an all-zeros condition. That means there are 40 other bits that you need to know in order to seed these two LFSRs. Now, what makes this LFSR system different from the ones we've seen in the past is that it's not a single LFSR, and it's not two LFSRs combined with the XOR operation. It takes the output from these two LFSRs and combines them in a new way that we haven't seen before, in order to create a system that avoids the linear congruences that we've seen in the past, which allowed us to determine the seed. Now, the fact that we're talking about this implies that there is a way to crack this cipher, but it is much, much harder than what we've seen in the earlier systems, and this is why this method was used for so long before somebody figured out how to ultimately get to the seed. How you do that is beyond the scope of this course, but what we will look at is why the system is fairly good and what prevented the attack from being discovered for so long. In order to do that, let's look at a somewhat simpler version of CSS.
We'll call it baby CSS. It has a lot of the same principles, but it shrinks the size of the system down so we can better understand the workings. We're going to use the same two LFSRs that we saw in an earlier lesson: a three-bit LFSR where you get the new third bit by adding the previous row's first and second bits, and a five-bit LFSR where you get the new fifth bit by adding together the previous row's first, second, and fourth bits. The key works exactly the same way, with the same kind of operating procedure. That means we need six bits. The first two of those go to seed LFSR3 and the next four go to seed LFSR5. So all of this looks identical to the old LFSR sum system we saw previously. Now, let's see how it's different. The difference is in the way that we generate the key stream. With the LFSR sum, we would just XOR the outputs of the three-bit and the five-bit LFSRs to generate the key stream. What we're going to do for our baby CSS, and this is the way it works with real CSS, is add the two LFSR output streams with carrying. So it's not the bitwise exclusive or, XOR, like we saw before. We're going to actually take blocks of these bits and add them together using the standard addition algorithm that you've probably learned for decimal, or base 10, numbers. We've actually seen how to do this in previous lessons, when we first learned about the binary number system. So let's take a look at how this would work. We've got our three-bit LFSR, and we can quickly generate its output like we've done in the past. We've got our five-bit LFSR, and we can generate its output the same way. Now let's take those outputs and group them into blocks of eight bits, or one byte. So we're going to take the first 16 bits from LFSR3 and the first 16 bits from LFSR5, lay them out here, and then group them together.
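As a rough sketch of how those two output streams might be generated, here is a small Python function. The details are assumptions for illustration, not something the lesson pins down: the state is a list of bits, each step outputs the first bit of the row, drops it, and appends the XOR (addition mod 2) of the tapped bits. The name `lfsr_stream`, the example seeds, and where the fixed 1 sits in each seed are all hypothetical.

```python
def lfsr_stream(seed, taps, n):
    """Return the first n output bits of a simple LFSR.

    Assumptions (illustrative only): the output bit is the first bit of
    the current row, and the new last bit is the sum mod 2 of the bits at
    the 0-indexed positions listed in `taps`.
    """
    state = list(seed)
    out = []
    for _ in range(n):
        out.append(state[0])          # emit the first bit of the row
        new_bit = 0
        for t in taps:
            new_bit ^= state[t]       # add the tapped bits mod 2
        state = state[1:] + [new_bit] # shift and append the new bit
    return out


# LFSR3: new third bit = first + second bits of the previous row.
lfsr3_bits = lfsr_stream([1, 0, 1], taps=(0, 1), n=16)

# LFSR5: new fifth bit = first + second + fourth bits of the previous row.
lfsr5_bits = lfsr_stream([1, 0, 0, 1, 1], taps=(0, 1, 3), n=16)
```

Each call produces the 16 bits that get laid out and grouped into two bytes in the next step.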
And we will add those together to generate our key stream. Now remember, we add each group as a whole in base 2, with carrying. So the first output of the key stream there would be a one, and then a one, and then a one, and then a zero, because one plus one is two, which is represented by one zero. So the zero goes down in the key stream, and then we carry the one up. And we follow that through the rest of the way. You'll notice that in this particular grouping of bits, we have an extra one that comes off the end that we don't have anything to add it onto. So we bring that over to the next group of eight bits in our output. We save it and carry it forward, so it appears here when we start adding the next grouping of eight bits. Here we've got one plus one plus one, which you know is three, so we'll have a one down below and then a one that gets carried up. Just by happenstance, we have a lot of carry bits in this grouping of eight bits from LFSR3 and LFSR5. And again, we have an extra one that gets carried at the end. Normally we would carry that onto the next grouping of eight bits, but if there is no next grouping of eight bits, we just toss it out into the trash. So you can see that we are always working in these groups of eight bits: eight bits from LFSR3 paired up with the corresponding eight bits from LFSR5. We add those, we might carry an output bit to the next eight bits that we work with, and we just keep doing that over and over again to generate our key stream. Now, once we've got the key stream, we take our plaintext and just XOR it with the key stream, exactly the same way we've seen with all of our LFSR systems. That part doesn't change. It's just how we generate the key stream that differs. So we'll go ahead and XOR that and get our ciphertext.
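The byte-by-byte addition with carrying, followed by the XOR with the plaintext, can be sketched in Python as below. This is a minimal illustration under stated assumptions: the leftmost bit of each eight-bit group is treated as the most significant, so the carrying runs from the right end of the group, and the function names and example values are hypothetical rather than taken from the lesson's worked example.

```python
def bits_to_bytes(bits):
    """Pack a list of 0/1 values into integers, eight bits per block.

    Assumption: the leftmost bit of each block is the most significant.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit count must be a multiple of 8")
    blocks = []
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        blocks.append(value)
    return blocks


def add_with_carry(a_blocks, b_blocks):
    """Add paired blocks, carrying any overflow bit into the next block.

    A carry left over after the last block is discarded.
    """
    carry = 0
    keystream = []
    for a, b in zip(a_blocks, b_blocks):
        total = a + b + carry
        keystream.append(total & 0xFF)  # keep the low eight bits
        carry = total >> 8              # 1 if the sum overflowed, else 0
    return keystream


def xor_with_keystream(data, keystream):
    """XOR each data byte with the matching key stream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream))


if __name__ == "__main__":
    # Made-up blocks, chosen so the first pair produces a carry.
    ks = add_with_carry([0xF0, 0x0F], [0x20, 0x01])
    ciphertext = xor_with_keystream(b"Hi", ks)
    # XORing again with the same key stream recovers the plaintext.
    print(ks, ciphertext, xor_with_keystream(ciphertext, ks))
```

Because encryption is still just an XOR with the key stream, running `xor_with_keystream` a second time with the same key stream decrypts.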
And we can see we do get a slightly different ciphertext for the two characters that we've used in previous examples. We have our NC, which now goes to a nonprintable character, hexadecimal 19, and then an uppercase letter D. So this method does result in a different ciphertext. Now, how can we think about cracking this message? Let's say we try the same method that we used for the LFSR sum. We would take our theoretical LFSR3 with unknown bits in the seed, or in the key, K1 and K2, and follow that through to create an output stream from LFSR3 that doesn't have actual numerical values, but rather theoretical placeholders for what those numerical values would be. And we would do the same thing with our 5-bit LFSR. If you need a refresher on how we did that, I would consult our previous lesson on the LFSR sum cipher to make sure you understand where these output streams for the 3-bit and the 5-bit LFSRs are coming from. Now we'll take those and suppose that we have an intercepted key stream of 0, 1, 1, 1, 0. We could take our outputs from LFSR3 and LFSR5 and line them up with the corresponding bits in the key stream. We could add them together to generate what we think would be the theoretical output of LFSR3 and LFSR5 combined. For the first bits, here on the very end of our grouping, K1 plus K2 added together with K3 gives us an output of K1 plus K2 plus K3. But when we get to the next column of bits, the output there would be the sum of 1 plus K2, that's the LFSR3, and K4, that's the output of the LFSR5, plus, there in the red box, the carry over from the first column that we just finished computing. Now how would we know if there was a carry over from the sum of K1 plus K2 and K3? Well, there'd only be a carry over from that column there in the blue box in one case.
We get that carry over only if both of those outputs, the one from the 3-bit and the one from the 5-bit, were equal to 1. So if K1 plus K2 and K3 were both equal to 1, then we'd know we'd have a carry over. And it turns out there's a pretty easy way for us to compute that carry over using that logic. We can compute it by just taking the product of K1 plus K2 with K3. Think about it: if K1 plus K2 were 0, then that product would be 0. And if K3 were 0, that product would also be 0. But if we multiply those two things together and they both have a value of 1, then the output is 1. So if we just replace that "plus carry over" with K1 plus K2, times K3, we have a nice way to compute either a 0 or a 1 depending on whether or not we needed the carry over. Now, this right here is the crux of what makes CSS and our baby CSS system so hard to crack. If we expand out that product of K1 plus K2 with K3, you see that our key terms are now being multiplied together. We have the product of K1 and K3 added to the product of K2 and K3. Whenever we have our unknowns, our variables, being multiplied together, we no longer have a linear system. We now have a nonlinear system. So if we were to try to follow through on the next steps like we did with the LFSR sum and set up a system of congruences, you see we'd have our first output, K1 plus K2 plus K3, which is congruent to 0 mod 2. That's because the corresponding bit in that column of the key stream we intercepted was 0. But our next one would be much more challenging. It would be K2 plus our carry over, which is K1 plus K2, times K3, plus K4. That carry over piece is our nonlinear term, because K3 is being multiplied by other key bits. And as we continue to follow more and more potential carry over bits into the subsequent columns that we didn't even look at yet, these terms are only going to get more complicated and have more and more nonlinear terms in them.
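The claim that the carry over from the first column equals the product of the two bits being added, and that this product expands to K1 times K3 plus K2 times K3 mod 2, can be checked exhaustively. The short Python sketch below tries all eight combinations of K1, K2, and K3; the function names are hypothetical and only exist for this illustration.

```python
from itertools import product


def carry_by_addition(k1, k2, k3):
    """Carry out of the first column: add the LFSR3 bit (K1 + K2 mod 2)
    to the LFSR5 bit (K3) as ordinary integers and keep the overflow."""
    lfsr3_bit = (k1 + k2) % 2
    return (lfsr3_bit + k3) // 2


def carry_by_product(k1, k2, k3):
    """Same carry written as the product (K1 + K2) * K3 mod 2."""
    return (((k1 + k2) % 2) * k3) % 2


def carry_expanded(k1, k2, k3):
    """Expanded form: K1*K3 + K2*K3 mod 2, with key bits multiplied."""
    return (k1 * k3 + k2 * k3) % 2


def all_forms_agree():
    """Check the three expressions on every combination of key bits."""
    return all(
        carry_by_addition(*bits) == carry_by_product(*bits) == carry_expanded(*bits)
        for bits in product((0, 1), repeat=3)
    )
```

The expanded form is where the trouble shows up: `k1 * k3` and `k2 * k3` are products of unknowns, so the equation for the second column is no longer linear in the key bits.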
You might have quadratic or cubic terms that arise. This nonlinear complexity is what makes it very difficult to solve the system efficiently. It's not impossible mathematically, but it takes far, far more computing resources and far, far more time than the linear congruences that we saw when we were working with the LFSR sum cipher. This is a good lesson: whenever you're working with cryptography and trying to devise a new cryptographic system, a nonlinear system is going to be much more complicated and much harder to crack than a nice, simple linear system. Now let's take a look at how we could still break the CSS cipher, and let's think first about our baby version of it. What options remain? We could certainly brute force this. With our six-bit key, we only have two to the six, or 64, possible keys that we could use to seed our two LFSRs. That's very quick. It would take very little time for your computer to try all 64 keys: seed the LFSRs, create the outputs, add them together, decode the message, and see if it makes any sense. That would be pretty easy. But certainly the real CSS must be better than that, right? It has larger keys and bigger LFSRs. So let's take a look at breaking real CSS. The 40-bit seed that we talked about earlier gives us about a trillion or so possible keys. That's a lot. With modern computers, it's actually pretty trivial, but when this was being used in the mid 90s, the processing power of our computers was much, much lower. In fact, the processing power was low enough that the U.S. government's export rules at the time limited encryption schemes in exported products to keys of no more than 40 bits, because with anything more than that, the government wouldn't be able to break them if it had to, say if you were passing along government secrets or threats of terrorism.
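The key counts quoted here are easy to check. The Python sketch below enumerates every possible six-bit baby CSS key and computes the size of the 40-bit key space for real CSS. It only counts keys; the variable names are hypothetical, and it does not implement the cipher or any attack on it.

```python
from itertools import product

# Every possible six-bit key for baby CSS, as tuples of 0s and 1s.
baby_css_keys = list(product((0, 1), repeat=6))
baby_key_count = len(baby_css_keys)   # 2**6 keys

# Size of the 40-bit key space for real CSS, "about a trillion or so".
real_key_count = 2 ** 40

print(baby_key_count, real_key_count)
```

A brute-force search over baby CSS would simply loop over `baby_css_keys`, seed the two LFSRs from each key, and check whether the decrypted message makes sense.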
The government wanted to be able to read everybody's messages, and it decided that 40 bits was secure enough that most people couldn't read them, but that the government had enough computing power that it could if it had to. That rule has since changed. But if you ever look at historical ciphers and you see what looks like an arbitrary decision where everything used 40-bit keys, that's the reason why: products covered by those rules were legally not allowed to use anything more than that. Well, it turns out that it didn't remain secure very long, both because of advances in technology and because of advances in mathematics. A cryptographer studied this system for quite a while and was able to find a way to attack it that brought the work down from having to try all two to the 40th power keys to just two to the 25th power. Now, why 25? Remember, one of the two LFSRs was a 25-bit one, and that's exactly why. They found a way that you could just work with one, the more complex of the two, and still get enough information out to break it. It might not seem like going from two to the 40th power to two to the 25th power would be that much better, but it really is. You have to remember we're working in powers of two here, so starting from a trillion, we have effectively halved the number of possible keys 15 times. So that's pretty substantial. In fact, a modern computer at the time, with a 450 megahertz Intel Pentium III processor, only took about 18 seconds to crack the keys for your DVDs. I think that certainly led to more people being able to back up their movies to their computers, but perhaps more nefariously, it may have led to an increase in piracy of DVD videos back in the 90s and early 2000s. A quick historical note about that Intel Pentium III.
If you've got an iPhone 13, the processor in that iPhone 13 that fits so nicely in your pocket has at least 100 times more processing power than that Intel Pentium III processor from back in the day did. This is one of the real reasons why modern cryptography has had to up its game a little bit from these initial systems back in the 70s, 80s and 90s: computing power has grown very, very quickly in modern times. So if your whole system relied on computers being slow, that system is probably out the window. That about covers it for the basics of CSS encryption. We're not going to go into how to break the full system. In fact, that is not legal. There are many court cases on the books about trying to hide all of the nuances and details of this, because it's still being used today despite the fact that it has been cracked. So we are not going to break the law in this course, but there's certainly a lot of interesting discourse out on the web about the ethics and legality of CSS. I would encourage you to dig in, see what that tells you, and decide where you fall in that debate.