Oops, it started today. I'm a little out of it because I made a cup of coffee at my desk in my office, and it is not with me. So it's going to be fun today. Let's get into it. We had a homework due last Thursday. We will have another one assigned very shortly on what we're talking about now, the crypto, so that is going to be super fun. I'm actually really excited about this. Questions? Fewer people in here, I noticed.
Are you going to put out your tests for the first assignment? Oh, the tests. I won't put out the tests. Oh, the secret test? Well, all of the ones we missed, so we can check and find out why we missed what we missed. What did you miss? We missed the secret test and the rekey one. Everything I could think of trying all worked, and it didn't work for yours. And then the one that says input really long. So the input-really-long one was the basic test case, but with insanely long people names. If you're using a fixed-length buffer to read the input, that would fail, because the names could be arbitrarily long. But other than that, the format, I believe, was exactly the same as the basic example. Or, actually, I don't remember, maybe it was the same as the rekey example. That would probably be why. I missed them at the same time, so it sounds like they're tied together. Yeah, the secret one was just rekeying the lock while there's essentially a key in the lock, so then the lock changes. The idea is that the access control check occurs when you try to enter, not when you put your key in, similar to a normal door.
What's up? We were talking about how to deal with transposition ciphers. So we looked at substitution ciphers, which substitute one character for another based on some kind of algorithm.
And we looked at transposition ciphers, which rearrange the positions of the input characters. They don't actually change the characters that appear. And so we can't use the one-gram, the plain letter frequency of the English language, to help us solve this. Why? For which one? Any type of transposition cipher, so anything that switches the letters around. We can't use the frequency of letters in English because the letters have not changed. It is still English text; it's just all jumbled up and mixed around, so the frequency distribution should be exactly the same. But as we saw, what we can do is look at the two-gram frequency: what is the probability of one character following another character? And even the trigram, the three-gram: which characters follow each other, one after another, in English. And just like before with the one-letter frequency of the English language, you can download these values, or you can build them yourself off of your own training data. Maybe that would even be better, if that training data were more closely tied to the plaintext you're trying to decrypt.
Cool, so we looked at this and we saw that, based on our examples here, H followed by E was probably the most frequent, so that is probably how we should try rearranging it. And W, H: in case H was the very last letter of our ciphertext, we want to check for any two-grams ending with H. And so we tried that out, and this is obviously a very simple ciphertext, so this breaks it.
So how do we encrypt something? There are different ways to do this. We can have a three-by-four matrix and put in our plaintext, "ASU is awesome." I totally forgot everything.
So we write this in a three-by-four matrix, and we read off the letters column by column, in the order given by the key. So the first one will be UWM and then SAO, and so on, and this will be our ciphertext. It's just a different way of doing transposition. Again, we're not going to get into these too much because we already did a lot with the substitution ciphers, and we can do similar types of things here with this columnar transposition cipher.
But the key question is, when you have some ciphertext, how do you decide which algorithm produced it? Let's say I were to give you some ciphertext. How would you know which algorithm it was? Sure, go ahead. For the Caesar cipher, you could look at the one-gram distribution, and if you see that it's very close to the regular English distribution, then you can make an assumption that it might be the Caesar cipher. Say that again? If you took the one-gram distribution of your ciphertext and it was relatively close to the normal English distribution, then you could make an assumption that it is the Caesar cipher. But it would not be the same, right? Yeah, it's not the same. It would still have the frequencies and the peaks that we would expect based on English, but shifted. So yeah, Caesar, we can easily test that.
How else should we approach it? Assume it's one type and try to break that, and if it doesn't produce something, try another type, and keep going through all of them? All the ones you know. Yeah, you could try that. It's going to be a little bit more difficult, right, because you have to think each one through. If you want to try all of them, that's a lot. There are a lot of different ways you can do this. So we can use some of the statistical tests that we talked about, right?
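To make the columnar transposition described above concrete, here is a minimal sketch. The column order `[2, 1, 0, 3]` is a hypothetical key chosen only so that the first two columns read out are UWM and SAO; the key actually used on the slide is not recoverable from the recording, and the function names are made up for illustration.

```python
def columnar_encrypt(plaintext, key):
    """Write the letters row by row into len(key) columns, then read whole
    columns in the order given by key (0-based column indices; this key
    convention is an assumption of this sketch)."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    ncols = len(key)
    # Column i holds every ncols-th letter starting at offset i.
    columns = ["".join(letters[i::ncols]) for i in range(ncols)]
    return "".join(columns[k] for k in key)


def columnar_decrypt(ciphertext, key):
    """Invert columnar_encrypt for the same key."""
    ncols = len(key)
    nrows, extra = divmod(len(ciphertext), ncols)
    # When the text does not fill the last row, the first `extra` columns
    # are one letter longer than the rest.
    col_len = [nrows + (1 if i < extra else 0) for i in range(ncols)]
    columns = [""] * ncols
    pos = 0
    for k in key:
        columns[k] = ciphertext[pos:pos + col_len[k]]
        pos += col_len[k]
    out = []
    for r in range(nrows + (1 if extra else 0)):
        for c in range(ncols):
            if r < len(columns[c]):
                out.append(columns[c][r])
    return "".join(out)


key = [2, 1, 0, 3]  # hypothetical column order
ciphertext = columnar_encrypt("ASU IS AWESOME", key)
print(ciphertext)
print(columnar_decrypt(ciphertext, key))
```

Note that the ciphertext contains exactly the same letters as the plaintext, which is why single-letter frequencies are no help against this kind of cipher.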
We can use the index of coincidence, which is the probability that two characters drawn from our ciphertext will be the same. This can help us decide if it's a Vigenère cipher. We can use the correlation metric that we came up with for the Caesar cipher to try to determine the closest fit. And like we talked about, we can exploit the English language: the distribution of letters, bigrams, trigrams, n-grams. Like we did when we were trying to break that ciphertext, we can exploit information like Q is almost always followed by U. Or, what was that example that we found? Was it R? I think it was "you", not the letter U. There was a three-letter word and we tried breaking that ciphertext based on that. I think it was R, like A-R-E, and we thought one character would go to R. So we look at the ciphertext as we go, piece by piece, in order to decrypt it.
Some fun real-world examples. Modern cryptosystems, as we'll see, don't use a Caesar cipher. Why is that? Well, it's too easy, and why else? Yeah, you can brute force it with only 26 tries. Why else? What is a Caesar cipher? When we modeled the cryptosystem, what was the space of messages? What's the set of messages that you can send? Yes, the English alphabet only, and we really only looked at uppercase characters, so you can only send messages that are composed of uppercase characters, right? Now, in a real system, do you only want to send messages of uppercase letters? What else would you want to encrypt? Digits, numbers, yeah, what else? Symbols, yeah, anything. You want to encrypt any arbitrary bits of data, right? It's important to remember that the data in our digital systems is just ones and zeros on the hard drive, right?
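As a rough illustration of the index of coincidence mentioned above, here is a small sketch. It uses the usual formula, the sum of f(f-1) over letter counts divided by n(n-1), and the function name is made up for this example. As a commonly quoted rule of thumb, English text comes out near 0.066 and uniformly random letters near 0.038, so a value well below the English figure hints at a polyalphabetic cipher such as Vigenère.

```python
from collections import Counter


def index_of_coincidence(text):
    """Probability that two letters drawn without replacement from text
    are the same letter. Non-letters are ignored."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


# A Caesar shift or a transposition only relabels or reorders letters,
# so neither changes this statistic.
print(index_of_coincidence("ATTACK AT DAWN"))
```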
So we want to actually encrypt those ones and zeros. A Caesar cipher doesn't make sense because we only have a key size of 26. So what would you use then? How would you change the Caesar cipher to be relevant? Use Unicode? You could do that, although that's tricky because with Unicode you don't have a one-to-one correspondence between bytes and characters, so I don't know that shifting would actually give you an even distribution. That would be really interesting; I don't know how that would affect it. They all do get mapped to a code point, so you could shift based on that. And what would your key size be in that case? I don't even know how large the Unicode space is. It's very large, so that's a little bit too big. What would be a good smaller unit? Do we want it at the bit level, so we just encrypt a zero or a one? Why doesn't that make sense? How many possible keys are there? Two keys. Two keys: either it doesn't do anything or it just flips all the bits, right? So clearly a key length of one bit doesn't make sense, a Unicode-sized key is probably too large, and I don't want to go down the Unicode path.
You do it at the byte level. You do what? At the byte level. Why the byte level? I don't know, I mean, you have a decent key size. There's like 256 keys that you can choose from if you're going to do a Caesar cipher, which is decent. And how would we do the shift? Well, I would just treat each byte as a decimal number between zero and 255 and then shift around like that. But how do you actually perform the shift? That's what I mean. You take your input byte, take your key byte, and you add them together, right?
Let's say your key is all Fs and you add them together. Then what? What do you worry about? You overflow, but then it just wraps around, I believe? It does, and there's a super efficient operation for that. What is it? Okay, well, when you do addition, I'm not a hardware person, but when you do addition on a regular CPU, it'll just wrap around naturally. It will, although I guess it depends on how you do the addition in the assembly language, because I think you can get exceptions or flags that get set if there's an overflow, and in this case we would essentially be overflowing. Well, it's equivalent to addition in binary. CSE majors, help out the CSE majors, we don't ever think about hardware.
Let's go with just two bits: zero zero, zero one, one zero, one one. Zero shifted by zero? Zero. So these are all the same, right? What about zero shifted by one? One. And zero shifted by two? Two. And zero shifted by three? Three. So this column is the same as the first row. One shifted by one? Four? What's one shifted by one? Two. So does that work with OR? One OR one is one, and zero OR zero is zero, so no. One shifted by two? Three. And one shifted by three? Zero. Two shifted by zero is two, so two shifted by one should be three. Two shifted by two? Zero. Two shifted by three? One. So maybe we shouldn't have started doing it this way, but that's fine. We're in it.
Let's go back to our basic, super simple example with single bits. Zero shifted by zero is zero, and zero shifted by one can't possibly be the same thing in the column, right? So this is a process like a Sudoku puzzle: the column has to have unique numbers. All right, one shifted by zero, one shifted by one. What is this? What operation is this? XOR, our super handy friend, XOR. And what is this bigger table? Should also be XOR. So, let me go ahead and check. What's the last one?
The bottom right should be zero. Yeah, should be zero. Did you guys mess up on doing this? No, it should be two. If you just added three to three, it should be two. I'm blaming this on you guys. You're not my hero. And why is that? XOR doesn't carry, but addition does. Correct. Okay, well, so this table is addition. Yeah, and the mod drops off the carry, because it still wraps around: three plus zero is three, and three plus one is four, which should be zero. Yeah, four wraps to zero here. Zero, one, one, zero: if we'd just stuck with this guy, it would've been fine. Okay, I think we can salvage this situation.
So the properties that we need are that every character is mapped to some other character, and that you can go backwards. We're going to make another table now. What properties did we want from this table? What properties emerged? How many values do we have in one column of our table? Four, and each is a distinct value, right? We can't have two zeros, we can't have two ones; we have to have one of each value in each column, and also in each row.
Now, this was addition. How do we go backwards? Subtraction. We need to subtract, and we need to make sure we do it in the proper order, because subtraction depends on which order you do it in. Now, if we make this XOR instead, based on our table, which we can use here if we've forgotten what XOR is: if the bits are the same, it's zero, and if they're different, it's one, right? So zero zero XOR zero zero: they're all the same, so that's zero zero. Zero one XOR zero zero is zero one, one zero XOR zero zero is one zero, and one one XOR zero zero is one one.
So our first column is actually the same as the input. Let's think: do our properties hold for this column? Is there one of each of the values zero through three that we're interested in? Yes. And if I tell you that the output is one one and our key was zero zero, would you be able to tell me, using XOR, what the original value was? Yeah, by performing the XOR: you take one one, you XOR it with zero zero, you get one one, right? Cool. Okay, next column, key zero one: zero zero gives zero one, zero one gives zero zero, one zero gives one one, and one one gives one zero. Does the same thing hold? Does it hold on the rows so far? Yeah. And then we'll do it one more time, key one zero: this will be one zero, this will be one one, this should be zero zero, and this should be zero one. Is that correct? Yes. And we'll do the last one, key one one: that'll be one one, one zero, zero one, zero zero.
Our properties hold, right? Every column and every row still has each value exactly once. So this is doing a different kind of shift, but the great thing about XOR as opposed to addition is that it is easily reversible no matter which way you go. You give me any two of the plaintext, key, and ciphertext, I XOR them, it doesn't matter in which order, and I get back the third one. And it's super fast because you don't have to worry about overflow; computers have dedicated XOR circuitry, so this stuff is incredibly, incredibly fast. And as we'll see, XOR essentially forms the basis of most of the modern cryptosystems.
And okay, I should have brought up this example. There's one of these new cryptocurrency initial coin offering things that implemented their own crypto and did their own hashing algorithm, and it turns out they completely messed it up and have completely ruined their coin. So, as we've said many times, don't do your own crypto. There are many, many, many ways to get it wrong.
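The two tables worked out on the board can be regenerated with a few lines. This is only a sketch of the two-bit example: `add_mod4` is the wrap-around "shift" table and `xor_table` is the XOR table, both indexed as `table[key][plaintext]`, and the helper names are invented for illustration.

```python
def make_table(op, n=4):
    """Build table[key][plaintext] for a two-argument operation on 0..n-1."""
    return [[op(p, k) for p in range(n)] for k in range(n)]


add_mod4 = make_table(lambda p, k: (p + k) % 4)   # addition with wrap-around
xor_table = make_table(lambda p, k: p ^ k)        # bitwise XOR


def is_latin_square(table):
    """True if every row and every column contains each value exactly once."""
    n = len(table)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in table)
    cols_ok = all({table[r][c] for r in range(n)} == full for c in range(n))
    return rows_ok and cols_ok


for name, table in (("add mod 4", add_mod4), ("xor", xor_table)):
    print(name, table, is_latin_square(table))

# Going backwards: addition needs subtraction in the right order,
# while XOR is undone by applying the same XOR again.
p, k = 3, 3
assert ((p + k) % 4 - k) % 4 == p
assert (p ^ k) ^ k == p
```

Both tables have the "one of each value per row and column" property; the difference is the bottom-right entry (3 + 3 wraps to 2, while 3 XOR 3 is 0) and the fact that XOR is its own inverse.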
One of the ways, and it's actually super cool if you want to look at more real-world types of attacks, is side-channel attacks. This is essentially where some type of information is being leaked through a different channel. Anybody have a pretty old computer at home, like a tower with a fan that spins? Is it always spinning and making the same noises all the time? No, it makes different noises depending on your task, on what you're doing, right? I have a desktop at home that I basically converted into a server, and every time I SSH into it, I can hear the hard drive click and move and the fans start spinning. People have actually demonstrated that you can extract secret information from a side channel as simple as the noise that the system is making. You can also use power analysis: you measure how much power is being drawn by the CPU in order to figure out what kind of secret keys it's using to encrypt.
One of the key side-channel attacks is the timing attack. This is a little bit different, but what's the naive way to do a password check? How would you check that two strings are equal? You check that the first bytes are equal, and the second bytes are equal, and the third, and the fourth, right? And then as soon as one check fails, you return, because you already know that the strings are not equal. So if you do this on a remote web server and compare people's passwords byte by byte, what this means is that there's a timing difference. Let's say there are 256 possibilities for the first byte of the password.
So if you tried 256 times with different bytes as the first one, 255 of them will execute one check, see that it fails, and stop, whereas one of them will execute one additional check on the second byte. You can actually measure this timing difference remotely, especially if you do 10, 100, 1,000 tries per guess, and you can crack the password byte by byte just by using this timing attack. So in basically all crypto, you need constant-time operations. You need that password check to take the same amount of time no matter whether the password is correct or incorrect. This way you're not leaking any information. Yeah, so crypto is incredibly difficult to get right.
I'm going to talk very briefly about a challenge from DEF CON. The DEF CON capture the flag is like the Olympics of capture the flag, and they had a qualifying event in 2011 that was open to all teams. There was a category called binary weakness. I don't know why it was called that; it didn't really have much to do with binaries. And this was pretty early in Android's life cycle. So there was a tarball, an archive that they gave you, with a .dex file and some .jpgs files. What's a DEX file? If you look it up, it's kind of like what a jar is for a Java app: it holds the compiled bytecode of the Android app. (The APK that an app ships in is the zip file, and it carries the DEX along with additional resources.) And these .jpgs files, you're like, this is dumb, there's no .jpgs file extension.
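Going back to the timing attack on password checks described above, here is a sketch contrasting the naive early-exit comparison with one that always examines every byte. The function names are made up for illustration. Pure Python gives no hard timing guarantees, so the hand-written loop only demonstrates the idea; in real code the standard library's `hmac.compare_digest` is the intended tool.

```python
import hmac


def naive_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time depends on where the first
    mismatch is, which is exactly what a timing attack measures."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Illustrative comparison that touches every byte regardless of
    where (or whether) the inputs differ."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences without branching on them
    return diff == 0


secret = b"hunter2"
print(naive_equal(secret, b"hunter3"))
print(constant_time_equal(secret, b"hunter3"))
print(hmac.compare_digest(secret, b"hunter2"))  # stdlib constant-time compare
```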
So what you had to do was analyze this Android app, and you could see that it was an encrypted-pictures app. What was super interesting at the time, and apparently this app still exists, is that the organizers did not create an app for this. They found a real-world app, and this was what you had to reverse. You can see it's a lite version from the ID string: com dot whatever dot whatever dot lite. And this is why it's the lite and not the paid version: when you dig in, you find out the pictures were encrypted with XOR and an eight-byte key. So how many possible values are there if we wanted to brute force this?
Let's step back. With a Caesar cipher operating on bytes, with a one-byte key, how many possibilities would you have to try? Two, because it's either unencrypted or encrypted, right? So you either have the original data, so you can throw that one out, or what else? That's one bit. Sorry, one byte. Oh, one byte. Yeah, like we just talked about. So now we want to do, essentially, the Caesar cipher with XOR at the byte level. Let's say a one-byte key, because that would be essentially similar to a Caesar cipher: you take that one-byte key and you XOR it with every byte of the input, the plaintext, to get the ciphertext. 255? 255, but let's go with 256, because that's a power of two. It's just easier to think about that way, right? But yes, technically it's 256 minus one, because who would use a key of zero? So that's a one-byte key. A two-byte key? Yeah, it's 65,536. So with a one-byte key, to brute force it you need 256 tries; you need to try all possible values of that key, and that would be super easy. With the two-byte key, now you need to do what? 65,536. Yeah, but how'd you get there? The raw number doesn't matter. Oh, two to the 16. Yeah, so you square this, right?
Because you need to try 256 values for the first byte while also trying all 256 values for the second byte, so it's 256 times 256. But how would this work? How would you use a two-byte key? Let's say, I don't know, that this is the key. How would you encrypt the plaintext with it? Wrap around. So here's our analogy: the one-byte key was analogous to the Caesar cipher, so what would a key of two or more bytes be analogous to? The Vigenère cipher, right? You basically just repeat this key over and over again and encrypt each byte of the plaintext with it.
And it turns out that this encryption used an eight-byte key. We knew that from analyzing the app itself. So how many possible values are there for an eight-byte key? 256 to the eighth. Now I am interested in the number. What's the number? It's a very large number. Yeah, about 1.8 times 10 to the 19th. So how would you break this? Should you try brute forcing? You will never finish; this is a lot. Well, maybe you would, actually. I'd have to look at some things, but you could potentially brute force it. But how else would you do it? What could you treat this as? Treat it the same way as Vigenère, where you take it a byte at a time, line it up into eight different columns, and try to break each column. You'd then run into the same issue, where you'd have to match them all up properly. Yes, you would, definitely.
And what is being encrypted? A picture, right? It's a JPEG, so it's not English text, and you can't use the frequency of English text. But what can you use about the JPEG? Don't JPEGs and other image files have headers? They do. Why is that important? Well, they're standard, so you'd know for sure what the first couple of bytes are.
Let's see if this will work. Yes, you would look at the JPEG file format. Is it JFIF? I actually don't know. Let's say it's the same, but you would look it up. So we know that the first byte is always FF in a JPEG. And the second byte, I believe that's D8, the start-of-image marker, but again, you would look it up. So what we can do is use this information to already break some of the alphabets, because for those positions we know what the plaintext is and we know what the ciphertext is. So how do we get the key? Our super handy friend XOR that we just saw. We XOR the two and we get the key bytes, right? Done. And from there, I actually can't remember exactly how we did this, but once you can decrypt part of the header, you can use that to find more parts of the header that are predictable. In the end, I think we then had to brute force the rest. I just wrote a program to spit out all of the candidates, I think it was probably like 65,000 images, and saw which ones were actually valid JPEGs. So that was the test that we actually did. And when we opened it up, they had written the flag on a whiteboard and taken a picture of it. That was the flag that we were supposed to submit.
The point here is that these ways we're talking about for breaking these things are actually real. These aren't just old ciphers that we study because they were used at some point. People actually do this, like in this app. I can't remember, maybe the site does say that it doesn't use good encryption, but you would probably agree that if you can break it in a day, it's not good encryption. And I think the full version that you actually pay for uses a real crypto algorithm, so that would be much better. But there are a lot of people who think, well, this is great: I have some key, a password or whatever, and I just XOR it repeatedly with the text.
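The known-plaintext step described above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the eight-byte key and the fake image bytes are made up, and the only fact borrowed from the JPEG format is that files start with the bytes FF D8 FF. This is not the code used in the actual challenge.

```python
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR: the same function encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key_prefix(ciphertext: bytes, known_plaintext: bytes) -> bytes:
    """Where the plaintext is known, key byte = ciphertext byte XOR plaintext byte."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_plaintext))


key = bytes.fromhex("0123456789abcdef")                       # hypothetical 8-byte key
fake_jpeg = bytes.fromhex("ffd8ffe0") + b"...rest of the image data..."
ciphertext = xor_repeating(fake_jpeg, key)

known_header = bytes.fromhex("ffd8ff")                        # predictable JPEG start
recovered = recover_key_prefix(ciphertext, known_header)
print(recovered.hex())                                        # first three key bytes

# Each key byte recovered this way shrinks the remaining brute-force space by 256x.
remaining = 256 ** (len(key) - len(recovered))
print(remaining)
```

With six of the eight key bytes pinned down by predictable header fields, only 256 squared, or 65,536, candidates remain, which is in line with the roughly 65,000 images mentioned above.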
I mean, it's incredibly easy to implement. And then you can claim that you have something that is secure, but it's not really. Questions on this? Did you qualify? I think we did that year. That's when I was at Santa Barbara, and we qualified every year I was there, so yes, we did. I can't remember how the actual DEF CON CTF went. I remember that pretty much every year they threw us huge curveballs. One year the entire network was IPv6, and we'd set up all these awesome network defenses that assumed it was IPv4, so nothing was working and we had to disable everything. That was a heck of a day. One year it was ARM. This was, I think, probably 2012, and ARM wasn't super popular, so you're looking at binaries and you just don't know what's going on, and you have to learn a whole new computer architecture just to understand what's happening. So yeah, it's always fun to learn new stuff.
Cool. Okay, so modern symmetric cryptosystems are basically combinations of substitution and transposition ciphers. That's why we look at them: so we can get an intuitive understanding of what they do, how they work, what kind of properties they have, and what we can learn from them. There's a long history here, and they're an active area of development. Again, I do not do any research here, but we have crypto courses you can take to learn more about this kind of thing.
So what properties would you want from a modern symmetric encryption system, as the designer or even as the user? I mean, you're now an intelligent user; you're not just a random person. Hard to crack. Hard to crack in what sense? Either it takes a long period of time or it's basically impossible to crack. And crack what, specifically? Whatever they're able to try to crack. Yeah, okay. So there are a couple of different ways they can do that, right? They can try to brute force the key.
So the size of the key matters: that eight-bit, one-byte XOR key is not very good because it's very easy to brute force. So they can do that. What other kinds of attacks would we want it to be resistant to? In addition to not being able to crack the key, you want the encryption to obfuscate the text to where they couldn't figure it out by inspection. Yeah. There are formal ways of defining this, but you essentially want the output to look random compared to the input, right? And there's usually another property that you want: if you change one bit of the input, the output should not change in a correspondingly small way. That would be part of not being able to link things together. Yeah, that's a good one. What else?
You don't want anything about the key or the plaintext to leak. Yeah, and this is actually something that's really tricky. You don't want anything about the key or the plaintext being leaked into the ciphertext, so you can't easily work backwards in that sense. Or, like we saw, if you know a certain byte is a certain value, you shouldn't be able to use that to extract the key or to break it. What other properties? You don't want two different plaintexts to have the same encryption. Why? Because then you could kind of map out how it works. Say that again? If you have two different plaintexts, you don't want them to produce the same ciphertext. Yeah, and there's a tricky related one, right? If we encrypt the same file twice, do we actually get different ciphertexts, or do we get the same one? Cool, all right, let's think about these things.
So DES, the Data Encryption Standard, was the first widely standardized, used, and studied cryptosystem for symmetric encryption.
And this is super interesting. In the 70s, IBM proposed this, I think it was in like 72 or 73, as a standard for encrypting sensitive but unclassified government information. The government was actually looking for a standard to do this, and IBM proposed this one. It was standardized in 1976 slash 1977, although a super interesting thing is that there were tweaks made to the algorithm between what was submitted and what was actually standardized, after some consultation with the NSA. So what do you think they did? Think like conspiracy theorists. They didn't want IBM to know how to encrypt it? Well, ideally, right, with an ideal cryptosystem it doesn't matter if you know the algorithm. As long as you don't have the key, you still can't decrypt it.
If I remember right, it was to help them: IBM was still having trouble with the way the substitution was being done, and what the NSA helped with were the S-boxes, so how that substitution was done from one value to the other, to make the output sufficiently random, so that when you make a one-bit change, it changes the rest of the ciphertext sufficiently. Anything else? Come on, I need a conspiracy theorist here. Yeah, we'll talk about that in a second. I'm building suspense. I mean, you could claim that they put a backdoor in there somehow, that they tweaked it to make it easier for them to decrypt, right? That would be one theory. Although I guess at this time maybe they didn't know it would be such a widely used standard, but after the standardization process it was used in tons of places. This became kind of the de facto standard of encryption. So it's interesting to think about: if you were a government agency that's dedicated to spying and getting intelligence, and you could tweak something, what would you tweak and why?
So yeah, we'll get to this in a second. Important facts: the data block size is 64 bits. So you think of some plaintext that's split up into 64-bit blocks. This is actually fairly common with these types of encryption algorithms: they have a defined input size, so the input here is 64 bits, and the key size is 56 bits. So if this is the length of the key, what would be the size of the space you have to guess? Two to the 56, yeah, whatever that is; we'll look at that later. Right, because all you have to do is try all those possible keys, and one of them will give you the plaintext that you're looking for. Cool.
Okay, so we're going to go into some of the details here because I want you to have a feel for what these things look like and what modern cryptosystems look like. We're not going to go super deep and dissect every aspect. I think it's important to know at a high level what something like this does. So at a high level, you have your 64 bits of plaintext. Remember, at this point we're only focusing on 64 bits of plaintext; we don't care about anything else. At the start and the end there are these blue boxes, the IP and the FP, the initial permutation and the final permutation. They're exact inverses of each other, and there's no cryptographic benefit to having them here. Allegedly it has something to do with the way data was loaded on old machines. I don't know; apparently this helped performance-wise, but it has nothing to do with security.
After this initial permutation shifts things around, you split the plaintext along these two output edges into, what is this, 32-bit blocks. The lower half goes into this function F, and the upper half then gets XORed with the output of F, and that is considered one round of DES. But what are we missing from the picture?
Right, a crypto system takes plain text as input, and a key, and produces some ciphertext, right? We don't have a key here, we need a key. But we'll see the key gets fed into this F function. So this F function takes in 32 bits plus some key material, and whatever that output is gets XORed with the upper half. That is considered one round, so one step here is called a round, and there are 16 of these rounds. In the next round you can see that the result of the upper half XORed with the F output is what gets passed into the F function with the key, and the original lower half from the last step is XORed with that output, and you can see this kind of happens over and over. So the key derivation. And again, I'm not a crypto person, so I don't know how much of this is for practical reasons, why they're not using the same key over and over again, or why they specifically have a 64-bit key that they then churn around and move around to generate a different 48 bits for every round of this F function. But this standard is out there, and this is one of the most widely studied crypto systems, so if you're interested in this, there's tons of information to dive into. But basically the key, oh that's right, we said the key was 56 bits but you'll notice there are 64 bits here. There's actually a parity bit for every byte. So you technically have 64 bits, but the parity bits didn't add any actual key information, so the size of the key was still 56. And then you split that in two, and then there are these left shifts: each round left-rotates the halves, so the shift wraps around, by a certain number of bits, one or two depending on which round you're in.
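The round structure described above can be sketched in a few lines of Python. This is only an illustration of the shape of the rounds: `toy_f` is a made-up stand-in for the real F function (it just hashes its inputs), the subkeys are arbitrary numbers rather than the output of the real key schedule, and the initial and final permutations are left out. The names `toy_f` and `feistel` are assumptions for this sketch, not part of any standard.

```python
import hashlib

MASK32 = 0xFFFFFFFF

def toy_f(half, subkey):
    """Hypothetical stand-in for the DES F function (not the real one).

    Takes a 32-bit half and a 48-bit subkey, returns 32 scrambled bits.
    """
    data = half.to_bytes(4, "big") + subkey.to_bytes(6, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def feistel(block, subkeys):
    """Run the rounds over one 64-bit block, one round per subkey."""
    upper, lower = (block >> 32) & MASK32, block & MASK32
    for subkey in subkeys:
        # The lower half feeds F, F's output is XORed into the upper half,
        # and then the two halves trade places for the next round.
        upper, lower = lower, upper ^ toy_f(lower, subkey)
    # Undo the last swap so the same function decrypts with reversed subkeys.
    return (lower << 32) | upper

subkeys = list(range(1, 17))      # 16 made-up round keys, one per round
plain = 0x0123456789ABCDEF
cipher = feistel(plain, subkeys)

# Decryption is the same structure with the subkeys in reverse order.
assert feistel(cipher, subkeys[::-1]) == plain
```

The point of the sketch is that F itself never has to be invertible: because each round only XORs F's output into the other half, running the rounds again with the subkeys reversed cancels everything out.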
So you shift it, and then you take it and it gets passed into PC2 to produce this sub key. PC2 is a permutation thing, so it's gonna mix things around. And I will show that in a second, but this sub key one, these 48 bits, are all derived from the key. Oh yeah, I'm sorry, I forgot to say, PC1 I think also shifts the bits around in its output. Well, we'll look at it in a second. But anyways, these 48 bits then get fed into this F box, so the inputs to this F box are some part of the plain text and some part of the key, and it outputs some shuffled-around bits. Same with sub key two. And so this is what PC1 looks like. I think it's mapping 64 bits to 56, and I think it's mostly just taking out the parity bits, so I don't think it's doing anything that interesting here. You can kinda see that in the structure here if you're standing up here and looking very closely at your computer. And PC2, these are the boxes that actually do a lot of shifting and permutation, so this is permuting the key, shifting the key around, and the output of that is used as the sub key. And you can see that the output space here is also smaller than the input: it's creating 48 bits from, what did we say the key size was, 56? So I believe PC1 is taking 64 to 56 and PC2 is taking 56 to 48 bits. So for what we've seen so far, all we have are permutations, right, so our transposition ciphers. All these permutations are transposition ciphers. And so inside this F box is where the actual substitution happens. So the F box takes the 48 bits from the sub key that we saw gets generated from those permutations. It takes the 32 bits, half of the block that we're trying to encrypt.
It uses this expand function, which you can kind of see here actually widens the plain text half from 32 to 48 bits. I mean, I can't quite see from here, but it doubles up some of the bits and switches bits around. That's to match it up with the sub key of 48 bits, because the sub key is 48 bits and our handy-dandy operator here in the middle is what? Good old-fashioned XOR that we know and love, right? These get XORed together, so now you essentially have something from the plain text being XORed with something from the key. Then those 48 bits are split into eight groups of six bits and sent to different what are called S-boxes, which are substitution boxes. These do the actual substitution, and then that is passed to another permutation to shift things around again, and that's the output of one F function, right? And remember the output of this is then XORed: the input here in the first round was the second half of the input data, so that second half of the input data gets mixed in with the key and then XORed in with the first part of the input data, and this keeps going. So these S-boxes are super interesting. These are the actual values you need to know, and none of this is hidden information, right? All of these permutations, and what substitutions actually occur, are publicly available and known because that is part of the encryption algorithm. Because the idea is I can send you ciphertext that I've encrypted with this algorithm, and if you know the key, you can take the key and decrypt it with your own copy, even a different implementation, of this algorithm. Questions so far before we look at the tables? So you can see how they're using substitutions, well, we'll get to the substitution, but also permutations, right?
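To make the widening step concrete, here is a small sketch of the expand-then-XOR part of F. It assumes the duplication pattern of the published DES expansion table (each group of four bits is extended with the neighboring bit on either side, wrapping around at the ends); that pattern is reproduced here as an assumption, so check it against the standard before relying on it. The function names are made up for this sketch.

```python
def expand(half):
    """Widen a 32-bit half to 48 bits by duplicating the edge bits of each nibble."""
    # bits[0] is the most significant bit ("bit 1" in the standard's numbering).
    bits = [(half >> (31 - i)) & 1 for i in range(32)]
    out = []
    for group in range(8):
        start = 4 * group
        out.append(bits[(start - 1) % 32])   # last bit of the previous group
        out.extend(bits[start:start + 4])    # the four bits of this group
        out.append(bits[(start + 4) % 32])   # first bit of the next group
    value = 0
    for bit in out:
        value = (value << 1) | bit
    return value

def mix_with_subkey(half, subkey48):
    """Expand the 32-bit half to 48 bits, then XOR it with the 48-bit sub key."""
    return expand(half) ^ subkey48

# A single set input bit shows up twice in the 48-bit output.
print(bin(expand(0x80000000)).count("1"))  # 2
```

Sixteen of the 32 input bits appear twice in the output, which is where the extra 16 bits come from, and the 48-bit result is what gets split into eight six-bit groups for the S-boxes.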
So you can see they're basically, I mean, I believe this is kind of the key here, doing a substitution followed by a permutation, and they keep doing this over and over again. As far as why this works to encrypt something and to decrypt something, that's beyond my pay grade. Let me kind of zoom in here, okay. So the way to read these S-boxes is this. For S-box one, and they're obviously all different, you can see the input here is one, two, three, four, five, six bits. The row depends on what the outermost bits are. Can you all see this? It's literally the most it will zoom in. So this row is if the first bit and the last bit are zero, this row is if the first bit and last bit are one, and then to figure out which column, you look at the inner four bits. So if they're all zero, the output will be 15. If it's zero, zero, one, zero, the output will be eight, and so on and so forth. So this is essentially doing the substitution part of the cipher. And then they have these for all of them. So these are called S-boxes. Questions? Do we have to memorize this? That'd be super mean. Well, it'd probably be doable. Not that I would ever ask you to do it. I was thinking theoretically, would it be possible to memorize all these values? Probably. So this is the big scary DES. I mean, this is everything that it's doing. So you can implement this. It's not crazy. There are descriptions of the algorithm. You can definitely do all of this. So now that we have this, how do we actually use it? If I want to send you a message that's 64 bits long, how would I do it? Let's say we've already exchanged the key. How would I do it? Assume you already have this implemented, right? You have some function you can call, DES encrypt. What does the function take in? 64 bits of plain text. What else does it take in? The key.
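The S-box lookup described a moment ago (outer two bits choose the row, inner four bits choose the column) can be sketched like this. The values read off the slide, 15 for inner bits 0000 and 8 for inner bits 0010 in the first row, match the first row of what the published standard lists as S-box 2, so that table is used here. Treat the table contents as an assumption to verify against the standard; `sbox_lookup` is a name made up for this sketch.

```python
# Table transcribed as an assumption (believed to be S-box 2 of the standard);
# each row is a rearrangement of the values 0..15.
SBOX = [
    [15, 1, 8, 14, 6, 11, 3, 4, 9, 7, 2, 13, 12, 0, 5, 10],
    [3, 13, 4, 7, 15, 2, 8, 14, 12, 0, 1, 10, 6, 9, 11, 5],
    [0, 14, 7, 11, 10, 4, 13, 1, 5, 8, 12, 6, 9, 3, 2, 15],
    [13, 8, 10, 1, 3, 15, 4, 2, 11, 6, 7, 12, 0, 5, 14, 9],
]

def sbox_lookup(table, six_bits):
    """Map a 6-bit input to a 4-bit output using one S-box table."""
    first = (six_bits >> 5) & 1          # outermost bit on the left
    last = six_bits & 1                  # outermost bit on the right
    row = (first << 1) | last            # outer bits pick one of 4 rows
    column = (six_bits >> 1) & 0b1111    # inner four bits pick one of 16 columns
    return table[row][column]

print(sbox_lookup(SBOX, 0b000000))  # 15: outer bits 00, inner bits 0000
print(sbox_lookup(SBOX, 0b000100))  # 8: outer bits 00, inner bits 0010
```

Six bits go in and four bits come out, so eight S-boxes take the 48 bits back down to 32.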
The key? What's that output? 64 bits of ciphertext. So I just send you that ciphertext, and I know that you're the only person who should be able to read that message, because this process is reversible. You throw in the key and the ciphertext to decrypt, and it will spit back out the plain text. What if I want to send you 65 bits of information? How do I do that? One idea: you could do two ciphertexts. So what, take the first part, then put the second part through, and combine them? Let's go there. So we have a function DES that takes plain text and a key and outputs ciphertext. Now the important point here is the plain text is 64 bits. That is what we said earlier. Yeah, 64 in and 64 out, right? So it's very clear what to do if my string is 64 bits at most, right? Say my string is S, then I can just do DES(S, key), and then I can send you that, right? So we just said, cool. What if now S is 65 bits? Actually, that's super annoying. What is it, 128 bits? Yeah. It's hard to multiply by two on the fly. So 128 bits. How would we then send that? Can we just call DES? It's too big. DES only operates on 64-bit inputs, right? It would throw an error or exception or whatever programs back in the 70s would do when they were fed too much input, right? So the first thing it would do is check and say, hey, you can only call me with 64 bits of information. I refuse to operate on anything else. So how do I send this message S to you? So you break it in half. Let's write it like this, we're gonna do this in like Python syntax. So we can call DES with the first 64 bits, S[0:64]. Let's say that's how you do it. Actually, I don't remember exactly. Let's just do it like this. It's my own language now. And then we'll set all of this equal to, we'll call it message, the first 64 bits of the message, and then we'll call DES again on what? From 64 to the end, S[64:], I think it will work like this, right?
Yeah. And we wanna set that to the last 64 bits. So what should be the length of the message M now? It should be 128. Now if I send this to you, how do you decrypt it? Split it in half, cool. So yes, this is ECB mode. The mode is a separate thing from the underlying crypto system, which is just a primitive, right? It only encrypts one block-sized piece of data. We wanna send arbitrary amounts of data, and so there are different modes we can use. So we're gonna use the DES block like this. This is the nice part. We can just abstract it. We don't really care. And we can substitute in any block-based crypto system, so any symmetric crypto system that will encrypt based on the key and some data and output ciphertext. So the idea is we take our plain text and split it up into blocks of the size that the cipher encrypts. And then we take the key K, feed it along with each block into DES to get some ciphertext, do that for all of the blocks, and concatenate the results together to get the ciphertext that we send over. And to decrypt, we do the opposite, right? Go block by block and just decrypt each one. So what's good about this method? Simple, I'll say that first, right? This is literally the first thing that we just came up with, right? It's simple, why is that a good thing? Yes, complexity, and I'll tweak it a little bit: complexity breeds vulnerabilities, right? If something's incredibly simple to implement, it will likely not be messed up when it's implemented. Does this mean it's secure? Not necessarily. What are some other good aspects of this mode? Scalable, in one sense. Yeah, it's scalable in the sense that you can send any amount of data. But we'll see that with all the modes you can send any amount of data, because that's essentially the problem we're trying to solve here. So how is this scalable in a different way? Scalable is a good way to think about it. Can you parallelize it? Yes, it's stupidly parallel, right?
And in the modern era where CPUs are not getting faster but they're able to put more and more cores on one chip, right, the fact that you can do this in parallel may be very nice, because now if you have something very large you want to encrypt, you can break it up into chunks and each processor can run the DES algorithm on its chunks in parallel. And then the results get concatenated together again. So you can split this up that way. Any other reasons why it's good? Or some of the downsides? So let's go back to the properties we talked about that we want. What happens if I put in the same plain text with the same key? Same ciphertext. Same ciphertext, yeah. So that's not necessarily what we want. That's one thing. How do you deal with just having one bit of data and nothing in the rest of the block? Yeah, I was debating whether to talk about this or not. I think we'll talk about that on Thursday. In general, using these kinds of block-based schemes, we need some way to add padding to the end. To be able to send 65 bits, we need to send the first 64 bits, then the one extra bit, and then pad 63 bits at the end to actually be able to do that. So yeah, we'll look at that, I think, because it definitely comes up. So let's think about it this way. Let's say you're trying to decrypt this. You have the ciphertext, and one of our properties is we don't want anyone to be able to learn anything about the plain text from it, correct? That's what we said. So let's say I take the ciphertext, I split it up into blocks like this, and let's say I notice that block one and block three are the same ciphertext. What does that tell me? They must be the same in the plain text, right, because this is a deterministic process. If the plain text blocks are identical and the key is the same, the ciphertext will be the same. So that actually can tell me something super important.
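Here is a minimal sketch of ECB mode showing that leak. Python's standard library has no DES, so this uses a small made-up block cipher (`toy_encrypt_block`, a few Feistel rounds built on SHA-256) as a stand-in with the same 64-bit block size. It is an assumption for illustration only and not secure; the mode logic is the part to look at. No padding is handled, so the message length must be a multiple of the block size.

```python
import hashlib

BLOCK = 8  # bytes, i.e. 64 bits, matching the DES block size

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def round_f(half, key, rnd):
    # Hypothetical round function for the toy cipher; NOT the DES F function.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]

def toy_encrypt_block(block, key, rounds=4):
    left, right = block[:4], block[4:]
    for rnd in range(rounds):
        left, right = right, xor(left, round_f(right, key, rnd))
    return left + right

def toy_decrypt_block(block, key, rounds=4):
    left, right = block[:4], block[4:]
    for rnd in reversed(range(rounds)):
        left, right = xor(right, round_f(left, key, rnd)), left
    return left + right

def split_blocks(data):
    if len(data) % BLOCK:
        raise ValueError("length must be a multiple of the block size")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(message, key):
    # Every block is encrypted independently with the same key.
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(message))

def ecb_decrypt(ciphertext, key):
    return b"".join(toy_decrypt_block(b, key) for b in split_blocks(ciphertext))

key = b"demo key"
message = b"AAAAAAAA" + b"BBBBBBBB" + b"AAAAAAAA"
blocks = split_blocks(ecb_encrypt(message, key))
print(blocks[0] == blocks[2])  # True: equal plain text blocks leak through
print(blocks[0] == blocks[1])  # False: different plain text blocks differ
```

Someone who only sees the ciphertext learns that blocks one and three of the plain text are identical, without ever touching the key.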
And there's actually also another interesting thing here about what happens, well, I guess we won't talk about that. Ignore integrity for now. So this is actually a major flaw, right? Because let's say you're encrypting a file that, I don't know, has a lot of zeros in it, or has a lot of repetition in it. These are things that are likely to happen, and really you're leaking much more information about the plain text than you would want to in this output. So you should burn into your minds that ECB mode is horrible and you should never use it. And I'm going to help burn that into your brain. It's actually insane, if you think about things from a programming and API design standpoint, that there are a lot of crypto libraries that default to using ECB, even though it's known to be horribly, horribly, terribly insecure. So one way we can try to deal with this is CBC mode, which we'll look at, and which essentially has the idea of: what if we chain these together? So what if we actually use this ciphertext, which happens to be 64 bits? We XOR that with the next plain text block that we want to encrypt before we pass it to DES. So the idea is we encrypt the first block, and then we XOR that ciphertext with the next block, because the ciphertext should be sufficiently random, right? And then the result goes into DES with the key to get some ciphertext, and that is fed into the XOR with the next block before it goes into DES, and so on and so forth. So now you've created this chain, in some sense, between all the blocks. So now, even if you repeat the same content, let's say blocks one and two are exactly the same, right? Block two was XORed with the ciphertext output of block one before it was passed to DES, therefore the ciphertext will be different for each of them. Does that make sense?
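The chaining idea can be sketched as follows, again with a made-up toy block cipher standing in for DES (an assumption for illustration, not secure). At this point the first block is encrypted directly, exactly as described above, so the chain starts from an all-zero "previous" block.

```python
import hashlib

BLOCK = 8  # bytes, i.e. 64 bits

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(block, key, rounds=4):
    # Hypothetical stand-in for DES: a few Feistel rounds built on SHA-256.
    left, right = block[:4], block[4:]
    for rnd in range(rounds):
        f_out = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor(left, f_out)
    return left + right

def split_blocks(data):
    if len(data) % BLOCK:
        raise ValueError("length must be a multiple of the block size")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def chained_encrypt(message, key):
    previous = bytes(BLOCK)  # all zeros, so the first block is encrypted as-is
    out = []
    for block in split_blocks(message):
        # XOR the previous ciphertext block in before encrypting this one.
        previous = toy_encrypt_block(xor(block, previous), key)
        out.append(previous)
    return b"".join(out)

key = b"demo key"
message = b"SAMEDATA" * 3          # three identical plain text blocks
first = chained_encrypt(message, key)
second = chained_encrypt(message, key)
print(len(set(split_blocks(first))))  # number of distinct ciphertext blocks
print(first == second)                # True: same message, same ciphertext
```

Repeated blocks inside one message no longer line up in the ciphertext, but the whole thing is still deterministic: encrypting the same message twice gives the same bytes.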
But is that true for all possible messages? What if you encrypt, let's say, the message foo, which is less than 64 bits, with this system? Say I encrypt it and you encrypt it. What's the ciphertext going to be in that case? So let's think of just 64 bits, just the case of one block, right? Let's say I encrypt this block, it's hello world or foo, right? That's the data that it contains. And then I encrypt the same block tomorrow. What's the ciphertext going to be? The same. The same, which was the problem we were trying to avoid in the first place, of the same plain text being output as the same ciphertext. So the way this is normally done is, essentially, you can think of it as a fake starting block. Let's just randomly create some fake starting block that we will XOR in with the first block before encrypting it. And this is called the initialization vector. Remember, the idea is you can't go backwards: you can't take the ciphertext and derive the plain text, because you don't know the key. So in this case, the initialization vector, this IV, is the size of a block, it is randomly generated, and it's publicly known. So you would give somebody: here's my message and my initialization vector. And you know the key, so you can decrypt it. Somebody else can't do that. And because you have random initialization vectors, the same message will encrypt to a completely different ciphertext each time. So here is what I want to burn into your mind about why ECB is bad. Anybody know this guy? This guy is Tux, the Linux penguin. So this is the original, and this was ECB encrypted with the password ANNA, ANNA. But does it look encrypted? A little bit, right? You can see, if you think about the block size, right, the blocks are each encrypted, right? But what can you still tell? Yeah, you can still tell that it's Tux.
It doesn't matter that you used some encryption algorithm, you can still see the content of the image, because the blocks are all the same, right? Because all this white space encrypts to this pattern, right? And all of the other white space encrypts to the same pattern. And all the black encrypts to the same pattern. So this is super cool. There's research on this that you can look into. And if you used CBC mode, it would look completely random. I just realized that I don't have a slide for that. So we'll learn about the fall of DES on Thursday.
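Putting the pieces together, here is a sketch of CBC mode with an initialization vector, including decryption. As before, the block cipher is a made-up toy stand-in for DES (an assumption, not secure), and `cbc_encrypt` and `cbc_decrypt` are names chosen for this sketch. The IV comes from Python's `secrets` module unless one is passed in, and it is returned alongside the ciphertext because it is public.

```python
import hashlib
import secrets

BLOCK = 8  # bytes, i.e. 64 bits

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(half, key, rnd):
    # Hypothetical round function for the toy cipher; NOT the DES F function.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]

def toy_encrypt_block(block, key, rounds=4):
    left, right = block[:4], block[4:]
    for rnd in range(rounds):
        left, right = right, xor(left, _f(right, key, rnd))
    return left + right

def toy_decrypt_block(block, key, rounds=4):
    left, right = block[:4], block[4:]
    for rnd in reversed(range(rounds)):
        left, right = xor(right, _f(left, key, rnd)), left
    return left + right

def split_blocks(data):
    if len(data) % BLOCK:
        raise ValueError("length must be a multiple of the block size")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def cbc_encrypt(message, key, iv=None):
    if iv is None:
        iv = secrets.token_bytes(BLOCK)  # random "fake starting block"
    previous, out = iv, []
    for block in split_blocks(message):
        previous = toy_encrypt_block(xor(block, previous), key)
        out.append(previous)
    return iv, b"".join(out)

def cbc_decrypt(iv, ciphertext, key):
    previous, out = iv, []
    for block in split_blocks(ciphertext):
        # Decrypt the block, then XOR out what was chained in on the way in.
        out.append(xor(toy_decrypt_block(block, key), previous))
        previous = block
    return b"".join(out)

key = b"demo key"
message = b"hello world, foo"          # 16 bytes, two blocks
iv1, c1 = cbc_encrypt(message, key)
iv2, c2 = cbc_encrypt(message, key)
print(cbc_decrypt(iv1, c1, key) == message)  # True
print(cbc_decrypt(iv2, c2, key) == message)  # True
```

With two different IVs the two ciphertexts differ even though the message and key are the same, and each one still decrypts correctly for anyone who has the key and the matching IV.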