Oops, it's noon. Let's get started. Actually, there's a surprising amount of people here, so thank you for coming when you have an assignment due. That's always a weird strategy. I've done it multiple different ways. When the assignment's due that day, some people don't come. I made the mistake one time of having the assignment due right before class, and literally nobody showed up in that case. So I guess midnight is far enough away that you're all feeling very confident, which is good. Questions on the homework? So I subscribed to the mailing list before we actually... Yes, we marked everything, so if there are any problems, we will contact you. OK, cool. And the first part said to subscribe with the ASU address, right? I didn't subscribe with my ASU one. We can usually find you by the ASU ID that you use. It's usually very clear. If there are any problems, we will contact you. Thank you for providing the first output. Oh, was that helpful? Yeah. Yeah, I should have done that. I didn't think about that. But for the other ones, it's very clear that the output is basically exactly what the input is, so you can derive the test cases almost exactly from the input. So hopefully that helps. I should have posted that on the mailing list. I had a problem with my policy critique: somehow I got a Unicode character in it that I couldn't see. Yes. And even though I was just using gedit, the submission server was rejecting it. Yes. So I had to use some online service to strip those out. Yes, be careful. Kind of frustrating. Another thing that I know comes up, especially if you try to do this in Word, is that it automatically turns a single ASCII quote character into a smart curly quote, and the same with double quotes. So if you copy and paste that into a text file, you will end up with problems. So you're learning awesome things about file encoding. It's a side benefit. It's a perk. Yes.
I think it's a good one. Yeah. My new 4K monitor and VirtualBox don't get along. Say that again? I just got a 4K monitor, and it and VirtualBox don't get along. I can't hear you. What are you saying? 4K monitors and VirtualBox don't get along at all. Oh, monitor. Oh. Yeah. Basically, you drag the Ubuntu window down like this, and paste fails because it doesn't want to do that. Ah, well, you know what the fix to that is, right? Just run an Ubuntu server image. Then you SSH into it, and you don't need the GUI at all.
The way you should approach that is think like me, right? Pretend you're me, pretend you are making a secret, more difficult test case. That should help guide you. It's not a crazy test case. I mean, I'm not breaking the laws of physics: there's not a person inside and outside of the house with the same name, right? The same name is the same person, so it's not doing anything like that. All of these tests are valid inputs, so there's nothing that's going to be crazy input-wise. So you probably need to think it through. One thing you could do is think about it in terms of events, because there are all these different events that could happen, right? Think about which combinations of events you have tried and which ones you have not tried, or which ones would be tricky and actually make sense based on what the description says. I think after, probably, next Tuesday it won't matter, so we can talk about what that test case was. I'm sure it'll be a thing where you're like, oh yeah, I'm doing it this way, and that means that this fails in certain cases. But I'm fine with that. All right.
That was probably the biggest thing of all that you just said: "I'm doing this thing in this certain way, and that's why it failed." So you need to think about it. But that's what always happens. That's what you realize. Literally every failing case is something that you did wrong. You either have an infinite loop, or you misread the policy, or something, right? But if that helps, then I think that's awesome. Are you going to send out a clarification? About what? You said, on the policy and the two written ones, you were going to... I mentioned it in class, so let's leave it at that. That's more of a hint to the people who are here. It wasn't a clarification, it was a reiteration of what I already talked about.
All right, can we break this ciphertext? The message says, you guys have homework to work on or something? What about giving extra credit for doing this? Was one of you guys trying to solve it? Even if it's only a half point or a quarter point, it doesn't matter. Extra credit is extra credit. You'd have had bragging rights: I told you to do it, and then we did it. So none of you can brag. If you're the only person who's bragging, then show me the bragging part. Trying is good.
Okay, so this is where we left off. We left off with the Vigenère cipher. To break this cipher, what was our first approach? What's the first thing we needed to do here? Figure out the key length. Why? Well, if you know the period of the repetitions, you divide the text into chunks of that length, then you take it column by column, and you can solve each column the way you do the Caesar cipher. Yeah, so we need to know the key length because every character at the same position modulo the key length is encrypted with the same key letter, and therefore we can break it.
We're kind of reducing the problem: instead of the Caesar cipher, where we only had one thing, here we attack six different Caesar ciphers, or 10 different Caesar ciphers, at once, depending on what the key length is. The way we did this, and that's where we left off at the end, was to look for repetitions in the ciphertext, because it is likely, especially with a long ciphertext, that the reason the ciphertext repeats is that the same plaintext is being encrypted by the same part of the key. So we saw that here, and usually this tells us that the key length is some factor of nine, or a combination of the factors of nine, which makes sense because this key was three. So what was the really long repetition we had in here? What's the length? It was the O, E, Q, O, O, G. Six. And what's the distance between that and the next repetition? 24? Not quite: you need to go from beginning to beginning, from the first character of one to the first character of the other. That makes sense, because you're trying to figure out the distance there, so it includes the six characters of the one we just talked about. That gives 30. Cool, so we can do this for every single repetition in the ciphertext, right, and this gives us some idea. So what do we start with? What are some strategies here? I think Brutus mentioned on Tuesday that, hey, let's start where the repeated letters are the longest, because the longer the repetition is, the less likely it is to be due to random chance. So let's say we do that, and we have a distance of 30. Now we think, well, the key length could be a prime factor of 30, or it could be any product of those factors. So what do we actually try? Do we try all of them? I mean, we've narrowed it down kind of a lot, right?
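The repetition search described above can be sketched in a few lines of Python. This is only an illustration, not the tool used in class: the function name `repeat_distances` and the toy string are made up here, and the toy string is not the lecture's ciphertext. It just places OEQOOG twice so the start-to-start distance comes out to 30.

```python
from collections import defaultdict

def repeat_distances(ciphertext, length):
    """Return {substring: [distances]} for every substring of the given
    length that occurs more than once. Each distance is measured from the
    start of one occurrence to the start of the next one."""
    starts = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        starts[ciphertext[i:i + length]].append(i)
    return {
        chunk: [b - a for a, b in zip(pos, pos[1:])]
        for chunk, pos in starts.items()
        if len(pos) > 1
    }

# Toy string (hypothetical): OEQOOG, then 24 filler letters, then OEQOOG.
toy = "OEQOOG" + "ABCDEFGHIJKLMNOPQRSTUVWX" + "OEQOOG"
print(repeat_distances(toy, 6))  # {'OEQOOG': [30]}
```

Measuring start to start is what turns the tempting answer of 24 (the gap between the two copies) into 30.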
It could be 30, it could be 10, it could be 15, it could be five, it could be three, it could be two, it could be six: any of the combinations. So from all of those, what do we choose and why? I think it was kind of touched on at the end last time: if we're limiting ourselves to two, three, or five, try five first, because it's the most restrictive. So let's think about this. Our reasoning behind choosing this repetition was that, in the ciphertext, a string of length six repeating itself by random chance is much less likely than two characters repeating themselves. But what you're getting at is a different question: what's more likely as a key length? Is it two, three, or five? Or one of the multiples here, six or 10 or 15 or 30? Right, so I don't know that you can necessarily say. Like, how do you decide between two and three, what the key length is likely to be, just based on that information? I'd probably go with three, just because it'd be easier to identify if something fits a multiple of three or five. So I'm thinking of just going with three first since it's shorter; you maybe could find more meaningful three-letter words. What do you mean, more meaningful three-letter words? Sorry, not meaningful, but you could find three-letter words; there are more of them on average. And so you might get better combinations, instead of randomness, just stuff that falls in, and you could search for three-letter words pretty fast through the text. But you still have to split the text into three different ciphertexts, right? And try to break each of those as a Caesar cipher. But you don't know whether you broke each of those Caesar ciphers correctly, so how do you know to look for the three-letter words? Magic? There's no magic, magic doesn't exist.
Sorry, shattering illusions, can't it? I think it's sufficiently complicated so it's not to be understood if it's true it's magic. Yeah, so how do we do this? No magic. Well two and three occur roughly the same amount of time, like two is a factor of eight out of the 11 and three is a factor of seven out of the 11. So does that tell you, or what, what does that mean? Why is that important? Well, I mean if there are recurrences, possibly I mean some of these could be due to just the random thing. I think maybe the seven. Right, yeah, that's a wire. But so two or three, two maybe isn't enough, right? Cause like A, A next to each other. Well I guess it would be enough, it was all two, but. Two, two or 11, so it's 44. Yeah, so this is kind of the core idea here is rather than just looking at one, right? Because we can look at this one and we can try to determine it, but this maybe indicates that okay, the key is one of three, 30, 15, 10, five, no 10, six, five, three, two. But among those we don't really know. But if we assume that this key length will, the same key length is obviously used for all of these repetitions. And so some of these repetitions are likely due because of the actual plain text and not due to random chance. So therefore those should have either similar factors or be some multiple of the key length, a factor or yeah, or a multiple of. Right, so we could maybe think about that and look at what's the most common factors here, right? So the problem is we have all, and I guess we didn't talk about one, but we've already kind of ruled out one based on key length because we did an analysis and we said it doesn't look like an English distribution, right? So this is exactly what we look at. We look and we say, hmm, a lot of them have two in their factors and a lot also have three in their factors. So we try to, I mean, yeah, will we try two? Every even number has two as a factor, right? 
So just because two is in everything doesn't mean that it's a good idea, but the fact that two and three are everything, that would make me want to pick like six. Why? Because two and three are both factors of six. And six is, I feel like a reasonable length to start with because you don't have too many combinations that you'd have to go through. It's still a lot, but for a computer it's not that bad. Yeah, so I think here's the way I think about it, right? So you think about two, right? So if, let's say, assume the key is two, then how many of these three factors would we expect there to be? Would we expect there to be equally as many? No, because not every even number is not evenly divisible by three, right? I don't know the exact, I'm not a number theory person, so I don't know these exact numbers, but six would be, nine would not be, 12 would be, 15 would not be, right? So, and similarly with three, right? If three was the key, you wouldn't necessarily expect, you know, not necessarily expect all of these twos to show up, so then the question exactly becomes, well, it looks like it's indicating two times three, right? So that two and three are factors of the key, and so if the key is six, then it actually covers all of these kind of cases, and we would expect twos and threes and roughly ish the same amount. And the tricky part is some of this is due to randomness, right? The twos and the threes, so you can't take it hard and fast. And of course, what we're doing here is, I know the answer, we are guiding towards the answer, you will not know the answer, so you don't know if you're going in the right direction. So, oftentimes we will guess and you'll guess wrong, and maybe you'll have to then backtrack to say, okay, let's read a diagram and maybe try a different key size. So what do we do then? How do we check this? Do we just go ahead and try to break it? Why do we want to check? Why do we want to check? So you have to iterate through all the possibilities? 
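The factor-counting argument above (twos and threes both show up in most distances, which points toward six) can be tallied mechanically. The sketch below is an illustration under assumptions: `factor_counts` is a made-up helper name, and the distance list is hypothetical, not the table from the slide.

```python
from collections import Counter

def factor_counts(distances, max_key_len=20):
    """For each candidate key length k (2..max_key_len), count how many
    of the repetition distances are evenly divisible by k."""
    counts = Counter()
    for d in distances:
        for k in range(2, max_key_len + 1):
            if d % k == 0:
                counts[k] += 1
    return counts

# Hypothetical start-to-start distances between repeated chunks.
distances = [30, 12, 18]
counts = factor_counts(distances)
print(counts[2], counts[3], counts[5], counts[6])  # 3 3 1 3
```

A candidate that divides most of the distances is only a guess to verify, since some repetitions are due to chance and will not be multiples of the key length.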
Yeah, to double check our intuition, right? So we try to derive the key size from the information that we have, based on the factors of the distances between repetitions in the ciphertext. Maybe there's a lot of randomness. Maybe they're choosing really weird words that just happen, by chance, to have repetition. So this is a good habit throughout, I mean, not only in breaking crypto but in all types of security areas: try to verify your assumption. So you make some kind of guess, like you say, okay, I'm going to guess that the key size is six. But then how can I verify and try to convince myself that, yeah, that key size actually makes sense, without doing the full decryption and breaking, because that's kind of an expensive thing? So how would you do that? How would you try to convince yourself that this is a reasonable key size? And this applies in general, right, not just for us. Do the same statistical analysis we did on the other cipher, breaking it into sixes. And then do what? And then compare it against the Wikipedia percentages. Yeah, so we could break the ciphertext into six alphabets, taking every sixth letter, and then run the frequency analysis we already ran, and compare it to the English distribution. We can try the correlation to see how close it is, or we can try graphing it to see how close that is. Can we use bigrams? Two-letter and three-letter frequencies? Because we can do that in the Caesar cipher, right? With the Caesar cipher, we don't have to look only at the frequency of a single letter; we can look at pairs of letters, like we looked at the LL in hello, or world, right?
There are pairs of letters that are kind of likely to occur, and there's a frequency with which they're likely to occur, yeah. But doesn't the key break that up, since the characters in each alphabet are six characters apart? Exactly. The key breaks it up so that each alphabet gets every sixth letter, right? So letters that were next to each other in the plaintext end up in different alphabets, and bigrams really don't tell us anything. So we don't want to do that. Okay, so what we're going to do is develop a new metric, and I think this is a good reason why. So here we did the frequency analysis of the cipher with "the boy has the ball" encrypted with a three-letter key, right? We did the frequency analysis, we calculated the correlation between this and the English language, and the top ones were 0.08. Ah, yes, the top ones were very high, and they were actually just as high as for the Caesar cipher, 0.0539, right? And the actual one was 0.0518, and if we looked in our cipher, these wrong ones actually had really high correlations, right? So that may not tell us whether we're on the right track. We really want a different metric. I guess that works, can you all see this? So we still need to use, and we still want to use, essentially the distribution of letters in English. But let's say these are 100% random letters, like we have a 26-sided die that has all of the letters of the alphabet, and we just keep rolling that die and putting characters out there, right? So I'm not going to do that, but let's assume we have some, well, okay. Maybe think about this as random. See, but this is even biased, because I had my left hand by the shift key while I was hitting numbers, and I alternated left and right sides when I was hitting, so you could probably create a heat map of the keys I was likely to press. So it's definitely not random. So assuming it's completely random, how many Zs would we expect there to be?
So, okay, let's rephrase it in a different way. I randomly pick one character from the string. What's the likelihood that it's Z? I just changed this so you can be right. Okay, so then, let's go back to the original one. If I ask you how many Zs we would expect there to be in here, it would be one over 26 times the length of this string, right? And spoken another way, the likelihood that we grab any random character in here and it's a Z would be one over 26, right? And that would be true for basically any letter that we could choose. The distribution, assuming this is completely random, is one over 26 for all characters. Everybody agree? And we do know that the statistical distribution of English is not this way, right? But we need some kind of a measure. So another idea would be: randomly grab one character in this string, then grab another; what's the likelihood that they match? I'm not a big statistics person, by the way. Yeah, okay, good. So the idea is that we pull one random letter from the ciphertext, and then we pull another random letter from the ciphertext. What's the likelihood that those are going to be the same? If it's just completely random, what would that be? It'd be one over 26: we grab one letter, so it's a random letter, and then the chance that the next one matches it would be one over 26. It's not exact, because we've taken one out of the string. We're going to do two things at once. We're going to go forward and backwards. Okay, so what we're going to use is a measure called the index of coincidence, which is the probability that two randomly chosen letters from the ciphertext are the same. I think what's messing us up is the fact that I created this random string. And so the idea here is that in English, this will follow the English distribution, right?
So your first randomly chosen character will follow the English distribution, and then the odds that you grab that same character again are also different. It's a fairly easy calculation. You guys can't see that. Did you see this before? Yeah, you did. Of course, yes, I wanted to walk through this. Okay, so f_i is just literally the number of times the character i occurs in the ciphertext. So it's a very easy calculation. You're literally just counting. For all characters, so you go A through Z, you calculate how many times they appear, times how many times they appear minus one. And then you divide by the size of the string times the size of the string minus one. The idea is this. Let's take a specific character, Z, so that'd be f_26, right? The odds of you choosing character Z are how many times Z occurs in the string divided by n: f_26 / n. That's simple probability. I definitely know that, right? Now we take that one out of the string. What are the odds that you pull another Z from the string? How many Zs are there left in the string? There's one less, f_26 - 1, and there's also one less letter left in the string, n - 1. So the product, f_26 (f_26 - 1) / (n (n - 1)), is the probability of pulling a Z twice from this given string. Now we want to calculate this for all possible characters, so we just sum it: i goes from 1 to 26, and we change the 26s to i's. And why can we pull out the n times n minus one? Because it doesn't depend on the summation variable, right? It's constant. So that gives us this one: IC = (1 / (n (n - 1))) * sum for i = 1 to 26 of f_i (f_i - 1).
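The counting formula just described translates almost directly into Python. This is a minimal sketch, assuming the text contains only the letters A through Z plus characters that should be ignored; the function name `index_of_coincidence` is chosen here for illustration.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters drawn from text, without replacement,
    are the same: sum of f_i * (f_i - 1) divided by n * (n - 1)."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0  # fewer than two letters: no pair to draw
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

print(index_of_coincidence("AABB"))  # (2*1 + 2*1) / (4*3) = 0.333...
```

A string of all-different letters scores 0 and a string of one repeated letter scores 1; uniformly random letters tend toward 1/26, and English-like text sits noticeably higher.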
So this gives us a number for the likelihood, across all the letters, that when we pick one letter, we'll pick that same letter again. And what's nice about this is that we can precalculate it, and there are tables for this, for different periods of the key. So in that last ciphertext, if you run this calculation, it's 0.043, and oh, I did not show that. So you can calculate the expected value for different key lengths, and you can see it's asymptotically getting closer and closer to one over 26, which is 0.038. So you can use this to double check that you actually got something that you think is reasonable. And what's very cool is, it's obviously statistics, and we have a smallish sample size, so it's not going to be 100% correct, but the fact that the table says a period slightly more than five, and we got an answer of six, kind of indicates we're probably in the correct ballpark. So we split the ciphertext into six different alphabets, and the very cool thing is we can run this same index of coincidence again on each of these strings. So what would we expect? Would the index of coincidence be the same, or similar, regardless of the size of the string? Yes, and what would we expect it to be? Slightly larger than five? 0.066. Each of those alphabets has a key length of one, so it should look like a period of one. Yes, we would expect it to be 0.066. Once we split it correctly, it should be the case that it's 0.066 or close to there. What's maybe a problem with that? How many letters do we have here? Not very many. I don't actually know the number, but yes, not very many letters here, so our statistics may not be correct, because the table is calculated off of a fairly large sample size. So for our six alphabets, we have 0.069, 0.078, 0.078, 0.056, 0.124, and 0.043. Good, are we happy with this or not happy? Do we go back and start from the drawing board? Try a different key length?
These are the ciphertexts, so I feel like this, but sure. Okay, this is not the way to do this, but. This is the ciphertext split up into six different alphabets. You count off: one, two, three, four, five, six, one, two, three, four, five, six, one, two, three, four, five, six. I mean, it's also how you split teams up evenly: you make everybody count off, and then you go with the people who have the same number as you. It's exactly the same idea here. See, look at that, nine, six, one, so, and then you split that up, so that every one of these alphabets here is, so this should be exactly A-I-K-H-O-I. These are all A-I-K-H-O-I, right? So that's every sixth character, and then this one is every sixth character plus one, and then that one's plus two. How are the two put together when we're doing this? Are we adding them? I guess if we have A with A, would the output be A? It would be A. Yeah, the key letter tells us the amount to shift, and we have been defining A to be zero, so A represents no shift, and Z represents minus one, or basically the previous character. Is there any position in that one? Yes. So, then, should we be happy with these results? Do we press forward in our analysis? What's worrying about this? Yeah? I don't know, like the whole second half. I don't know if you can say second half, because that implies that there's some relation between all of these, but yes. You definitely, I mean, the one, two, three, four, the fifth alphabet, right? The 0.124, that's a little bit weird. I mean, it could just be that it has a weird distribution of characters, right, randomly. And this is only, what, about 30 characters drawn from there? The nice thing is that the first couple have fairly high indices of coincidence. So this is more in line with what we would expect for a key of one.
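The counting-off step and the per-alphabet check can be sketched together. This is an illustrative sketch, assuming the ciphertext has already been stripped to letters only; `split_alphabets` and `ic` are names made up for this example, and the toy string is not the lecture's ciphertext.

```python
from collections import Counter

def split_alphabets(ciphertext, key_len):
    """Count off 1..key_len: alphabet j gets characters j, j+key_len, ..."""
    return [ciphertext[j::key_len] for j in range(key_len)]

def ic(text):
    """Index of coincidence: sum f_i (f_i - 1) / (n (n - 1))."""
    n = len(text)
    if n < 2:
        return 0.0
    return sum(f * (f - 1) for f in Counter(text).values()) / (n * (n - 1))

print(split_alphabets("ABCDEFGHIJKL", 3))  # ['ADGJ', 'BEHK', 'CFIL']

# Toy example: with the right period, each alphabet looks far less random
# than the whole string does.
toy = "ABABABAB"
print(round(ic(toy), 3))                        # 0.429
print([ic(a) for a in split_alphabets(toy, 2)]) # [1.0, 1.0]
```

On a real ciphertext the hope is that each alphabet lands near 0.066 for the right key length, with the caveat from the lecture that short alphabets give noisy values.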
Well, I mean, our central assumption is kind of flawed in the sense that we've built all this based on the assumption that English letters occur at the same frequency in actual English sentences. Yes. So some deviation from that might be, like if one of those words on there happened to be xylophone, like that would be, or maybe that's not right, but the hello has two Ls instead, so those are different. Oh, exactly, and this is part of the thing. And it's not, for something like this, it's not a lot of letters, right? So basically how I would approach this is say, well, it's a decent, at least where we think we are on a majority of them, about three or four. And the other ones are not quite what we would expect, but maybe that's just due to noise. So let's proceed forward, but cautiously. And so we know if we have to revert back, we always can, right? And we'll keep this in mind, but randomness, whatever. I mean, it's all kinds of things, the words that are being used, all that. We talked about this a little bit on Tuesday, is how to break this, is we then now have Caesar ciphers for every single alphabet. So we have six alphabets now, and they each are encoded with a different Caesar cipher. And so how, so, but we talked about also that one of the difficulties is a Caesar cipher when we're breaking it will know we get it correct because we will see English text up here, right? But here we may not know when we're getting it correct because doesn't, each of these are dependent, right? You're breaking every six characters, so it's hard to actually know whether you broke in one correctly. But what is the benefit that we do have? I can't remember if we talked about this on Tuesday, so let me just say it, and then we'll go on. Different maybe from a Caesar cipher. So each of these six alphabets are independent, but they're also not independent. How's that helpful? 
So the combination of six Caesar cipher keys, that when you apply them to each of those six strings resulting in distribution of characters modeled in English language, English language distribution, you might be close. Could that again, if you got the point that you could see that, it would be the same as selecting six keys, and just looking to see if there's also English words in there, because by that point, we were, sorry, I got something I guess, yeah. You might want to try a second, third, English is very likely that one. And why the third or fourth? Well, it's most likely to be the second letter, and then like, plus like, the third, and then like, you couldn't have like, the thumbs or whatever. You're basing that off of the first letter, basically, or first word, you're assuming that there's like a first word in this. That's the only one that you know. Right, and so let's say you do that, and let's say you're fairly confident about two out of the three alphabets. How do you perceive from that? So one thing we can do, we can actually do for this, a really simple type of statistical analysis. So rather than trying to, and actually, so what I like about this is these techniques also kind of can apply to a Caesar cipher. So the idea is rather than compare it necessarily to this very fine grain distribution of English letters, you kind of bucketize each letter into high, medium, or low frequency, and then you calculate that for your string and calculate that for English language, and that may tell you kind of how to shift them and make it some clues, so let's try that. So we can, yeah, so I can calculate, so here's the letter frequency A through Z of all of the characters in alphabet A, all characters in alphabet B, and the rest. And in English, we have this kind of a rough distribution where you have A is high, E is high, I actually don't know if this one's that correct. The lows, you have a string of low letters at the end. What does this tell you? 
Does any of them have a very clear way we should shift them or not shift them or zeros at the end? Even at M, I believe that's here. Yeah, oh, the T, I see the T is high, I think U is medium. Yeah, so actually, one kind of looks like it made the key there maybe A, right? So it may not actually be shifted at all. Yeah, so definitely, so if you look, I mean it's very clear on one, right? There's only this kind of big string of zeros here at the end, which is fairly similar to what we would expect. Three, I would say, like you said, we're going for four. Right, here zeros. The question though is that you have the two here, so if you shift like, so if you shifted this all the way to the right, if you shifted this all the way to the right, the two would be in the A spot, and then you have B, C, D, E, actually which does fit pretty well, the four there, and then F, G, H, I, yes, I would say that would be a safe shift to try, right? Shifting this, I don't know how many that is, but yeah, shift it all the way there. But what about four? Does that same logic apply to four? Also, if that is lining up, assuming it's lining up correctly, column wise, string three has four Q's and three U's, and we, is that seen, I see that right? Or no, it has, I think that's a T. Four, three? Oh yeah. I thought I saw four and three and I was going to say Q, U, generally, go initiative, but now I think three is under T. So think about that though, would you accept Q and U to be in the same alphabet here? Oh no, because these are split up. Exactly. So let's go back with four, right? So I guess the problem at least, I see with four is we don't necessarily, even this is more zeros, but this is still four and this is what? Six. So it's hard to say which one is necessarily should go there. We also don't, it could be similarly ish here. Yeah, so cool. So this, let's see. Yes, perfect. Okay, so yeah, so we just did this. I actually didn't look ahead, so I don't really know where this goes. 
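The table of per-alphabet letter counts, and the habit of looking for a long string of zeros, can be reproduced with a short sketch. This is a heuristic aid only, not a rule from the lecture: the helper names `letter_counts` and `longest_zero_run` are invented here, and treating the alphabet as circular is an assumption made so that a run can wrap from Z back to A after a shift.

```python
def letter_counts(column):
    """Return a list of 26 counts, index 0 for A through index 25 for Z."""
    counts = [0] * 26
    for c in column.upper():
        if "A" <= c <= "Z":
            counts[ord(c) - ord("A")] += 1
    return counts

def longest_zero_run(counts):
    """Return (start_index, length) of the longest run of zero counts,
    treating the alphabet as circular."""
    n = len(counts)
    best_start, best_len = 0, 0
    for start in range(n):
        length = 0
        while length < n and counts[(start + length) % n] == 0:
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    return best_start, best_len

counts = letter_counts("AABZ")
print(counts[0], counts[1], counts[25])  # 2 1 1
print(longest_zero_run(counts))          # (2, 23)
```

Where the zero run sits relative to the low-frequency tail of English suggests a shift to try, and as the discussion shows, with this few letters more than one placement can look plausible.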
So yeah, so we said the first one is likely not shifted at all. The third one, it's likely we said that I should go to A, right? So this should be shifted kind of back all of that. And yeah, I mean it's not quite as clear as necessarily. Yeah, I mean you could kind of use that same argument here on alphabet six, right? Where we have, here we have six zeros, but the only other place we have zeros are not in a large area like this. So maybe we can try shifting. If that was true, this V would go to A and we would shift it. I think it's six characters to the right. So we can try this and the important point. So, and what we want to do is we want to keep track of our guesses, right? Shift all the alphabets and then put them back in the ciphertext. So why would we want to do this? Oh, you guys can't see that. Because you don't know what works until you... And now what can we start doing at this point? Yeah, pretty much, right? I mean it's a, I think I was like a wheel of fortune, right? You're like, but it's easier in some sense because there are letters you don't know but they are all linked, right? So you can track shifting one letter to what you think it would be and that shifts that entire alphabet on a certain amount so you can try that out. So this is what I was going to, in that guess all of the alphabets are all six independent things but they're not actually independent because I'm here, this, they are, this is some kind of an English text and the letters from each of the alphabets are related. So then we have to do cool things and look for clues. So this is where it gets tricky and this is why this is annoying but that's a very good answer so we can just try random stuff. Maybe next time, of course next time is when it's not helpful for you but maybe I'll have this up so that we can actually do the decryption together. So the idea now though is now if we assume that the bold letters are probably correct, right? 
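Keeping track of guesses, shifting the alphabets, and putting them back into the ciphertext can be sketched as below. Assumptions for this illustration: the ciphertext is letters only, the key letter A means no shift (as defined in class), guessed positions are shown in uppercase plaintext, and positions whose key letter is still unknown are left as lowercase ciphertext in place of the bold/non-bold display on the slide. The name `partial_decrypt` and the HELLOWORLD example are made up.

```python
def partial_decrypt(ciphertext, key_guess):
    """Decrypt with a partly known key. key_guess is a list with one entry
    per alphabet: a key letter ('A' = no shift) or None if not guessed yet.
    Known positions come out as uppercase plaintext; unknown positions stay
    as lowercase ciphertext so they are easy to spot."""
    out = []
    for i, c in enumerate(ciphertext.upper()):
        k = key_guess[i % len(key_guess)]
        if k is None:
            out.append(c.lower())
        else:
            shift = ord(k.upper()) - ord("A")
            out.append(chr((ord(c) - ord("A") - shift) % 26 + ord("A")))
    return "".join(out)

# Toy ciphertext: HELLOWORLD encrypted with the key BCD.
cipher = "IGOMQZPTOE"
print(partial_decrypt(cipher, ["B", None, "D"]))  # HgLLqWOtLD
print(partial_decrypt(cipher, ["B", "C", "D"]))   # HELLOWORLD
```

Changing one entry of `key_guess` re-shifts that whole alphabet at once, which is why a single good guess about one letter fills in every sixth position.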
We can try to use our knowledge of English to come up with some way of thinking about these. The cheating way going forward is basically that we can look for, as we talked about, common two- or three-letter words. So if we saw anywhere in here a T, then something, and an E, we would probably try "the", right? It's an incredibly common three-letter word. We can also maybe look for something that's an A, something, and an E, and that could be what word? "Are", A-R-E. Sure, so let's try that randomly. Does that make sense? Does that look better?

So this is the thing, right? You're testing things. You're making a change and you're seeing how that changes the decrypted ciphertext, but you always want to check: did you actually make progress, or is it still just random? So is this a good substitution or should we go back? Everything except the R, is that true? "Pace", "Rick". What was that? Oh, there's "pace" up in there. There's "Rick". What else? "But", which is a very common word. Maybe the other one. Part of this is tricky because we don't have spaces, right? So spaces could be thrown anywhere in here. I would say that the fact that this decrypted to "but" is probably a good... "None". None? Where? The one, two, three, four, fifth row, in the second column. Oh, "none"? Yeah, the fact that that's a word would also be a very good sign that we're getting closer. Any other suggestions, things to try?

It's kind of like a puzzle, right? We're trying to fit a piece into the puzzle based on what we already think we know about it. And it is probably easier, I guess, if you're actually doing this, to delete or gray out the ciphertext; it's a little tricky just having it in bold. Where? We have two more off of this. Sorry, on the, well, I'm looking at the G-O-O. Where? Down, close to the start of it, yeah.
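The loop of making a change and looking at the result is easier to run if the guesses are kept in one place. Here is a minimal sketch of that bookkeeping, assuming a ciphertext made of letters only; `apply_guesses` is a made-up helper, and the toy ciphertext (the word ATTACK under a three-letter key) is hypothetical, not the text from the slides. Alphabets with no guess yet stay in lowercase ciphertext, which plays the role of the non-bold letters.

```python
def apply_guesses(ciphertext, shifts):
    """Decrypt with per-alphabet shift guesses; None means 'no guess yet'.

    shifts[i] is how far letters of alphabet i move forward (mod 26).
    Letters from unguessed alphabets are left as lowercase ciphertext so
    they stand out from the guessed (uppercase) letters.
    """
    out = []
    for position, ch in enumerate(ciphertext.upper()):
        shift = shifts[position % len(shifts)]
        if shift is None:
            out.append(ch.lower())
        else:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)

# Hypothetical toy ciphertext with three alphabets.
toy = "BVWBEN"
print(apply_guesses(toy, [25, None, 23]))  # second alphabet not guessed yet
print(apply_guesses(toy, [25, 24, 23]))    # all three alphabets guessed
```

Changing one entry in `shifts` and rerunning is the "try it, then check whether you made progress" step; undoing a bad guess is just setting that entry back to `None`.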
So I'm looking at the G-O-O, thinking that might be "good". If you go even in front of that, you could have "but R good". "But R good." So does that actually work? So if A goes to F, so reverse that, is that the same key? Actually do that, because I'm not... what's the shift there? So it would be F, so W5, H-I, yeah, something the same. Like, D, T-A-G, okay, so does that work? That works better. Yeah, so that would be F, T goes to F, so how about we shift this character?

We can actually try that, I'll do it right now, but one thing: we talked about using the first word, so maybe there's something we can learn about the first word. We also know that the last thing on here is some kind of word, which means that the very last letter here must be some kind of ending. So is there any common ending that ends with I-C-A and then a letter? I-C-A-N, so could it end with N, or some other possibilities? H, if it's an A, "Micah". "Micah" could be an H, with a name. Is the apostrophe in "America" before or after? Wait, what? So is the apostrophe in "America" before or after? Usually in these things we have no apostrophes, so it's not going to be an apostrophe. So how could you figure this out? Could be an L. Could be L-Y. Well, if you have R after R, you could have "comical". So "-ical" is like an ending; you'd have to use the O-C before it for "comical", C-O-M-I-C-A-L. Right, right, I'm just focusing on the last part. Yeah.

I mean, you know, back up to the F-V that we were talking about? Yeah. If you turn that into a T-H, right above it, it turns into "that is", two, interesting, so it makes two words by changing F-V to T-H. So you change F-V to T-H, as we were talking about, and then that would change this. What was it? So, "pace" of that. Yeah, that's actually a really good idea.
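The suggestion to turn F into T can be checked mechanically, because every other letter in the same alphabet has to move by the same amount. A small sketch (the helper names `shift_from_guess` and `shift_letter` are made up for illustration):

```python
def shift_from_guess(cipher_letter, plain_letter):
    """Shift implied by guessing that cipher_letter decrypts to plain_letter."""
    return (ord(plain_letter) - ord(cipher_letter)) % 26

def shift_letter(letter, shift):
    """Move one uppercase letter forward by shift positions, wrapping at Z."""
    return chr((ord(letter) - ord("A") + shift) % 26 + ord("A"))

shift = shift_from_guess("F", "T")
# X and O are the other letters of that alphabet mentioned in the lecture.
print(shift, shift_letter("X", shift), shift_letter("O", shift))  # 14 L C
```

One guessed letter fixes the whole alphabet, so a good guess should make several other positions in the text improve at once, and a bad one should make them worse.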
And if we did that, we'd find out that by changing that F to T, that would change this X to L, and it would also change this O to C. So we would know that we were on the right track. Apparently, I had no idea that this was a thing, that "-ical" is a common ending for an adjective. I guess if you were a crypto person, you would know this more. But yeah, so you could do this. That T-H is actually a much better way of doing this, so we'd see that T-V-E, and at this point we could even brute force it, or we could very clearly see what it is.

There's another good thing we can use about English that we touched on earlier. These are not all techniques necessary to break this one, because we just did, but in general they're approaches, right? We have Q and then something, so we assume that that goes to what? U, right, and then we'd try that, so that would be another thing to try. Yeah, there's two more. There's two more "-ical"s. Where? Oh, here, yeah, yeah, and you could look in here and see: does this O change to A, and is that the same there? So yeah, we would do that, and we would get our awesome cleartext. And you can see it's weird, these aren't like normal words, right? So it kind of makes sense that some of those distributions were weird. You can look at it later, it's not anything fancy. Doesn't matter, that's not a thing that matters.

Okay, cool. So basically we looked at multiple different types of substitution ciphers, where the basic idea is we substitute one value in for another.

Oh, the other thing I was going to say was that I did look up the Enigma machines. Apparently they were super cool in that they were changing substitution ciphers. So the way it worked was there would be a certain number of rollers, and there are all different versions of Enigma machines.
So each roller had 26 sides on it, one for each letter. So you would type, let's say, A, and that A would go through the rollers to produce the letter that was being output, but pressing A would also cause the rollers to shift based on how they were wired internally. And so that means the next time you hit A, it wouldn't map to Z or whatever it did before, it would map to something else. So yeah, super interesting stuff in there. That could be a really interesting homework assignment, so I'm talking about it.

So the idea here is, now that we've looked at substitutions, right, where we basically substitute one letter in for another, another way we can go about doing ciphers would be to rearrange the letters and scramble them. Is this better or worse? What's the benefit of doing this? If someone's looking for one of the other types of ciphers, it'll throw them off. Yeah, so we could easily tell a Caesar cipher because it doesn't follow the standard distribution of letters in English. But if we run that on ciphertext that's been moved around, so we move around all the letters, then how does that affect the letter frequency of the ciphertext compared to the plaintext? Same, it doesn't change it at all, right? It just moves the letters around. So it has a benefit where you can't do, let's say, one-gram analysis, because the distribution will be exactly the same as the plaintext.

But what is it going to destroy and change? It's not going to change how often single letters appear, but what is it going to change? The letters that follow each other are now likely to not follow each other, right? So two- and three-gram frequencies will be changed. So we can actually test for this: we can see whether the index of coincidence is around 0.066 and the distribution of letters is what we would expect in English.
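The index of coincidence test can be sketched in a few lines. This uses the usual definition (the chance that two letters drawn from the text without replacement are the same); the function name and the sample sentence are made up for illustration, and a sentence this short will not land exactly on 0.066.

```python
from collections import Counter
import random
import string

def index_of_coincidence(text):
    """Probability that two letters picked from text (without replacement) match."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

# Hypothetical sample text; any English passage would do.
sample = "the quick brown fox jumps over the lazy dog and then runs home again"
letters = [c for c in sample.upper() if c.isalpha()]
scrambled = random.sample(letters, len(letters))  # a random transposition

print(index_of_coincidence(sample))
print(index_of_coincidence("".join(scrambled)))  # same value: only the order changed
```

Because the calculation only looks at letter counts, any rearrangement of the same letters gives the same number, which is why a transposed English text still scores like English.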
If so, we think it's probably a transposition cipher and they're just moving stuff around. So the simplest transposition cipher you can think of: simply break the message into blocks of the key length, and the key is how you transpose each block. So the key would be here... okay, something is kind of automated. So the idea is your key is length four, right? So you break the plaintext up into blocks of four and then the key tells you how to move the letters within the block. So it means, in each block, the zeroth character is mapped to the third character, the first character is mapped to the zeroth character, the second character is mapped to the second character, and the third character is mapped to the first character. And that's how you mix it.

So, this is gonna drive me crazy. So if the message is "ASU is awesome" and this is the key, what's the first block going to be? A-S-U-I, and the key is three, zero, two, one. So then what's the ciphertext? Which is going to be the zeroth character of this block? S, so the second letter of the block, the one at index one, gets mapped to the zeroth character of the ciphertext. So then what's the next block? A, E, W, S.

So then how do we break this? Why is it dead to us now? So how would we try to go about attacking this? Well, with something this small, you could just try to create one word and then see if that pattern change would work for other words. So try to do what? Create one word, a common word; out of this you could do "some", or some of the easiest, most common words. Yeah, so it's a little bit trickier, right? Because we can do this because there just happened to be a word that was exactly the key length, right? And so maybe we could try moving it around, I don't know, maybe it's possible in English, it was this, or very less common, probably, yes. Yeah. Assuming we have to break it before we know the key length, we're going to have to determine the key length first.
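Here is a small sketch of the block transposition just described, under the convention stated in the lecture that `key[i]` is the ciphertext position that plaintext character `i` of each block moves to. The function names are made up, and requiring the message length to be a multiple of the key length is an assumption of this sketch, since padding was not discussed.

```python
def transpose_encrypt(plaintext, key):
    """Block transposition: plaintext character i of each block goes to position key[i]."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    k = len(key)
    if len(letters) % k != 0:
        # Assumption of this sketch: no padding scheme, so reject partial blocks.
        raise ValueError("message length must be a multiple of the key length")
    out = []
    for start in range(0, len(letters), k):
        block = letters[start:start + k]
        cipher_block = [None] * k
        for i, dest in enumerate(key):
            cipher_block[dest] = block[i]
        out.extend(cipher_block)
    return "".join(out)

def transpose_decrypt(ciphertext, key):
    """Undo transpose_encrypt: plaintext character i is read back from position key[i]."""
    k = len(key)
    out = []
    for start in range(0, len(ciphertext), k):
        block = ciphertext[start:start + k]
        out.extend(block[dest] for dest in key)
    return "".join(out)

key = [3, 0, 2, 1]
ciphertext = transpose_encrypt("ASU is awesome", key)
print(ciphertext)                          # SIUAAEWSOEMS
print(transpose_decrypt(ciphertext, key))  # ASUISAWESOME
```

The three blocks ASUI, SAWE, and SOME come out as SIUA, AEWS, and OEMS, which matches the S that lands in the zeroth position and the A, E, W, S worked out for the second block.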
Yeah, you have to determine the key length. Let's say we do know the key length. So the only question is, how do you break it from there? What's the first, simplest thing we should try? Brute force it, yes, just try everything, right? So for a key size of, like, two, how many combinations are there? One, well, two, I guess, but one doesn't encrypt anything, the other one just swaps. What about three? It is factorial, yeah. So if we have key length k, then the number of possible combinations is k factorial, which is nice from a defensive perspective, because it means we don't have to have a large key in order to get to very, very, very large numbers, right? So roughly, I mean, 13 would be about six billion, which is kind of a lot of combinations to try. But anything less than that is probably something you can brute force on a modern machine, even a laptop, pretty easily.

The other way to do this is to try to analyze. We know that the single-letter frequency is kept, but we also know about the bigram frequency, so what characters follow each other. So if you have character A, what character likely follows A most of the time? Or for instance, like we saw with Q, right? If we know that there's a Q in there, we know there's likely a U that has to follow that, so we could use that. So we could use the bigrams and even trigrams, the three letters that are likely to follow each other. And that would incorporate our knowledge, essentially, of T-H-E and A-R-E and these kinds of things.

The rail fence cipher: so this is actually a cipher that was used, I believe, by the Greeks, I want to say, but I should look that up before I make that point, so don't write that down. The idea is, if you think about fences, you want to put the text on here.
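Going back to the brute-force idea for the block transposition: with a known key length it is just a loop over all k! orderings. The sketch below assumes the ciphertext length is a multiple of the key length and uses a guessed word (a crib) as a crude test for "looks like English"; the helper names are made up, and the ciphertext is what the "ASU is awesome" example gives with key three, zero, two, one under the mapping convention described in the lecture.

```python
import math
from itertools import permutations

def decrypt_block_transposition(ciphertext, key):
    """Plaintext character i of each block is read from ciphertext position key[i]."""
    k = len(key)
    out = []
    for start in range(0, len(ciphertext), k):
        block = ciphertext[start:start + k]
        out.extend(block[dest] for dest in key)
    return "".join(out)

def brute_force(ciphertext, key_length, crib):
    """Try every key of the given length; keep decryptions that contain the crib."""
    hits = []
    for key in permutations(range(key_length)):
        guess = decrypt_block_transposition(ciphertext, key)
        if crib in guess:
            hits.append((key, guess))
    return hits

print(math.factorial(4), math.factorial(13))  # 24 6227020800
hits = brute_force("SIUAAEWSOEMS", 4, "AWESOME")
for key, guess in hits:
    print(key, guess)
```

With only 24 keys this finishes instantly. A short crib does not always single out one key: here (0, 2, 1, 3) also survives, giving SUIAAWESOMES, so a person still has to read the candidates.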
So if the plaintext is "hello world", we would do something like H, E, L, L, and so on, going back and forth between the rails, and then we would read it left to right, so we get H-L-O-O-L, E-L-W-R-D, and this would be our ciphertext. So what's the key? The height of the rail, right? The shape and structure that I'm writing these things on. You could get fancier: you can have a shape that looks more like this, and you could do something like H-E-L-L-O-W, this is one problem that I can't cross break, R-L-D, and then you could read this left to right, top to bottom, so the ciphertext would read H-O-L-E-L and so on. So the key would then be knowing how to place these letters back on the rails so that you could get the correct plaintext.

And the way this worked on the ancient devices was you would get some kind of cylinder device, and you would take basically a strip of paper, wrap that strip of paper around the cylinder, and write your message, I believe, top to bottom. Then when you take this piece of paper off the cylinder, because each of these letters is going to come at a different interval along the strip, all these letters will look jumbled up and mixed up, and the only way to read it is to have a device that has the exact same diameter as the original device. Then you just take the strip, you put it on there, and the message magically appears. But it's the same idea here.

We spent a lot of time breaking the other stuff, so we're not going to go into too much detail here. So yeah, we talked about how to detect this, right? You look and say, well, the one-letter frequencies follow English, but the bigrams and trigrams do not, so it's likely some kind of transposition. And what we do when we want to try to break this is we look at the ciphertext and say, okay, what are the most likely bigrams?
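The rail fence writing pattern is easy to sketch in code. This assumes the standard zigzag layout, where the text runs down the rails and back up again; that the fancier shape on the board is the three-rail version is an assumption, though it does match a ciphertext starting H-O-L-E-L. The function name is made up for illustration.

```python
def rail_fence_encrypt(plaintext, rails):
    """Write the letters in a zigzag over `rails` rows, then read row by row."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    if rails < 2:
        return "".join(letters)
    rows = [[] for _ in range(rails)]
    row, step = 0, 1
    for ch in letters:
        rows[row].append(ch)
        # Turn around at the top and bottom rails.
        if row == 0:
            step = 1
        elif row == rails - 1:
            step = -1
        row += step
    return "".join("".join(r) for r in rows)

print(rail_fence_encrypt("hello world", 2))  # HLOOLELWRD
print(rail_fence_encrypt("hello world", 3))  # HOLELWRDLO
```

The only secret here is the number of rails, so the key space is tiny compared with the k factorial of the block transposition.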
Let's say we choose H, and we say, okay, what's more likely: that O follows H, L follows H, E follows H, W follows H, or R follows H? So we try to think about how likely it is that one character follows another character, and then try rearranging based on that. And we try that on the rails, trying different possible combinations of rails.

So going on to the other one, we can see that if we look at the frequencies in English of bigrams that start with H, we would see that H-E is the most likely, whereas H-O occurs much, much less frequently. And the other ones occur almost never. Oh, yeah, we also want to look at bigrams that end with H, because maybe H is at the end, so it doesn't make sense to think of what comes after it. And here we'd see that almost nothing ends with H, so it's probably safe to say that E follows H, and then we could arrange them so they're adjacent, start writing it out, and then we could read off our cleartext.
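The "which letter most likely follows H" question can be sketched by counting bigrams in some reference English text and ranking the candidate letters. Everything named here is illustrative: the helper functions are made up, and the reference sentence is a tiny hypothetical sample, where a real attack would use counts from a large corpus.

```python
from collections import Counter

def bigram_counts(text):
    """Count adjacent letter pairs, ignoring spaces (the ciphertext has none either)."""
    letters = [c for c in text.upper() if c.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))

def rank_followers(first, candidates, counts):
    """Order the candidate letters by how often each follows `first`."""
    unique = sorted(set(candidates))
    return sorted(unique, key=lambda c: counts[first + c], reverse=True)

# Hypothetical reference text; far too small for real use.
reference = ("the weather here is better than the weather there "
             "when he holds the other house")
counts = bigram_counts(reference)

# Candidates are the letters of HLOOLELWRD other than the H itself.
ranking = rank_followers("H", "LOOLELWRD", counts)
print(ranking[:2], counts["HE"], counts["HO"])  # ['E', 'O'] 10 2
```

The same table can be read the other way, counting pairs that end in H, to check whether H is more plausibly the last letter of a pair than the first.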