So the Vigenère cipher was widely hailed as an unbreakable cipher, and on the whole it had a pretty good track record. But in many senses the birth of modern mathematical cryptography began in the 19th century with the breaking of the Vigenère cipher, and that emerged as follows. A German cryptographer by the name of Friedrich Kasiski discovered, or invented is the proper way to say it I suppose, a method of breaking the Vigenère-Bellaso cipher in 1863. The fundamental weakness of the Vigenère-Bellaso cipher is that once you know the key length, you can take the ciphertext and split it apart into a bunch of individual shift ciphers. What this means is that in large part the security of the cipher relies on the fact that in general we don't know the key length. As soon as that key length is known, we can break our ciphertext into a bunch of individual texts and apply statistical analysis or what-have-you to each of the individual segments. So keeping that key length secret is absolutely critical.

So Kasiski had the following insight. There are common bigrams and trigrams in the plaintext. If I'm writing something in English, for example, letter combinations like TH, MM, RE, and so on are fairly common, and from time to time two occurrences of a given bigram or trigram will be separated by an exact multiple of the key length. So I might have a TH some place in the text, and some number of letters later I'll have another TH, and these might be separated by an exact multiple of whatever the key length is. What that means is that the two occurrences of the bigram line up with the same key letters, so they will be encrypted in the same way, and I'll have repeated bigrams and trigrams in the ciphertext. And so Kasiski's attack is based on the following idea. What we'll do is we'll find the repeated bigrams and trigrams in the ciphertext.
We'll find the distance between them, the number of letters from one occurrence to the next, and these distances may be multiples of the key length. Now, I don't have a guarantee that that's the case, because some of the bigrams might coincidentally match even though they were encrypted from different plaintext. So I have no guarantees here, and that's part of the art of cryptography.

So let's consider a ciphertext. I have this mass of letters here, and what I'm going to do is look for repeated bigrams and trigrams in the ciphertext. This is the painful part in the 19th century. Nowadays we can just use search and find in any sort of word processing software. So for example, I notice that this trigram CDG appears in the first location and also in this location here, and if we count, that turns out to be location 133. This bigram DG actually appears in a number of locations as well, and we record those positions: 2, 17, 83, and 134. The bigram NN appears a bunch of times, 6183207, and we can find others. Here's another trigram, OAG. Note that because of the way the letters are split up into blocks of five, this OAG is actually split between two blocks. It still counts as the trigram OAG in any case. So we have these different positions of the various bigrams and trigrams, and there are others, but we'll start off with just these few.

And here's where we have to apply the art of the cryptographer. Some of these repeated bigrams and trigrams, but not necessarily all of them, are separated by multiples of the key length. So for example, the trigram CDG occurs in positions 1 and 133, and these are 132 spaces apart, so 132 might be a multiple of our key length.
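
The search for repeats and the counting of spaces can be done by machine. What follows is only a rough sketch in Python, not Kasiski's own procedure. The function names `repeated_ngrams` and `spacings` are made up for this illustration, and positions are counted from 1 as in the lecture. Spaces between the five-letter blocks are dropped first, which is why a trigram split across two blocks still counts.

```python
from collections import defaultdict
from itertools import combinations


def repeated_ngrams(ciphertext, n):
    """Map each n-gram that occurs more than once to its 1-based positions."""
    # Drop spaces and punctuation so blocks of five do not hide a repeat.
    text = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    positions = defaultdict(list)
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i + 1)
    return {gram: pos for gram, pos in positions.items() if len(pos) > 1}


def spacings(positions):
    """All pairwise distances between the positions of one n-gram."""
    return sorted(b - a for a, b in combinations(positions, 2))


# The positions recorded above for the bigram DG.
print(spacings([2, 17, 83, 134]))
```

Each spacing is only a candidate multiple of the key length, since some of the repeats may be coincidences.
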
Now the caution here is that it's possible the two occurrences of CDG come from entirely different sets of three letters, and the fact that they are encrypted the same way is mere coincidence. So it's possible that this 132 might not be a multiple of the key length, and that's where we have to apply this art of the cryptographer. Likewise for this bigram DG we recorded a bunch of positions, 2, 17, 83, 134, and the differences in position here are going to be 15 or 81 or 132, 66 and so on, and again some, all, or none of these might be multiples of our key length. And likewise the bigram NN appears in a couple of positions, and again the spacings here, the distances between the locations, might be multiples of K. OAG, likewise, has a bunch of different separations, and again these might be multiples of K.

What we have to do is the following. If I find every occurrence of every bigram and trigram, I generally find nothing useful. The problem is that I end up with so many numbers that in general I don't find any single number that these are all multiples of. What we find is that all of these numbers are multiples of one, which is kind of useless. The art of the approach is that we want to select some, but not all, of these separations as representing multiples of the key length. And the observation here is that most of these numbers are divisible by six, not all of them, but most of them, and that strongly suggests that the key length is in fact going to be equal to six. We'll note that we're ignoring these numbers here, and what that means is that these spacings, so the DG in space 83 and the DG in space 2, don't actually represent the same pair of letters. They are just coincidentally encrypted in the same way. So again, if I assume a key length of six, then every sixth letter is going to come from the same shift cipher.
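
The divisibility check can be roughed out in code as well. This is a sketch only, with a made-up function name `divisor_votes`, and it uses just the spacings read out above for CDG and DG, not the ones for NN and OAG. It counts, for each candidate key length, how many spacings that length divides. Keep in mind that anything a key length of six divides is also divisible by two and three, so small numbers always score at least as well as six does. The count narrows the field, and deciding which spacings to trust is still the judgement call.

```python
from collections import Counter


def divisor_votes(distances, max_len=20):
    """Count how many spacings each candidate key length divides evenly."""
    votes = Counter()
    for d in distances:
        for k in range(2, max_len + 1):
            if d % k == 0:
                votes[k] += 1
    return votes


# Spacings mentioned in the lecture: 132 for CDG; 15, 81, 132, 66 for DG.
distances = [132, 15, 81, 132, 66]
votes = divisor_votes(distances)
for k in sorted(votes):
    print(k, votes[k])

# Spacings that do not fit a key length of six are the ones being ignored.
print([d for d in distances if d % 6 != 0])
```
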
So what I'll do is I'll take every sixth letter. I'll take the first, 7th, 13th, and so on letters, and I get this set of letters. Now, I can't hope to make a word out of this grouping, because they come from every sixth letter of the actual plaintext, but I can still apply a statistical analysis. If I look carefully, what I see is that G is by far the most common letter in this set of ciphertext values. And so it's reasonable to imagine that this particular set of letters is produced by the shift that sends e to G. That's a shift of 2, which tells us that the first letter of the keyword is going to be c. Again, the way to remember this is that the key letter tells you what a gets sent to. If e gets sent to G, a is going to be sent to c.

Now let's take a look at the second, 8th, and so on letters. Here our decision is a little bit more complicated, because I don't have any single letter that is overwhelmingly more common. What I have is five a's, five f's, five j's, and five w's, and I can't really decide which of these might be e. So we might take a look at the frequency histograms, and in particular the thing to remember is that if we're working with a shift cipher, as soon as I decide what one letter is, I know what all the others are going to be. I know that a, f, j, and w are common letters in the ciphertext, so let's see what happens if e gets sent to one of these. So my first possibility: e might get sent to a. That has a peculiarity, because this common letter f is going to have to come from the letter j. I have lots of f's in the ciphertext, and so that suggests a plaintext message with lots of j's. Now it's possible, I might be writing about something from Jamaica, but it seems a little bit unusual for a plaintext, so I'll set that aside for a minute. Well, what if e gets sent to f?
Well, again, this common letter a here, this ciphertext letter a, must have come from a z, so I have many, many z's in my plaintext, and again, that's a little bit of a peculiar setup, so I might suspect that that's not the case. If e gets sent to j, then this common letter a is going to come from v, and I have a plaintext with many, many v's in it, again somewhat unusual. If e gets sent to w, then these common letters a, f, and j in the ciphertext must have come from the letters i, n, and r in the plaintext. That says I have a plaintext with lots of e's, i's, n's, and r's, and that's not unusual. So it seems reasonable to think that my second substitution sends e to w, which tells me that a is going to be sent to the letter s, so s is going to be the second letter of the keyword. And if I continue in this fashion, I'll be able to pick up the remaining letters of the keyword, which again is not necessarily a word, but is an easily remembered sequence of letters.
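
The trial-and-error on a, f, j, and w can be written out as a short Python sketch. The function names here are invented for the illustration and lowercase letters are assumed throughout. For each guess of what e gets sent to, it works out the key letter (what a gets sent to) and then shows which plaintext letters the other common ciphertext letters would have to come from.

```python
def every_kth(text, k, start):
    """Letters start, start+k, start+2k, ... (start counted from 0)."""
    return text[start::k]


def key_letter_if_e_maps_to(cipher_letter):
    """Key letter implied by assuming plaintext e became cipher_letter."""
    shift = (ord(cipher_letter) - ord("e")) % 26
    return chr(ord("a") + shift)


def decrypt_letter(cipher_letter, key_letter):
    """Undo the shift given by key_letter on a single letter."""
    shift = ord(key_letter) - ord("a")
    return chr((ord(cipher_letter) - ord("a") - shift) % 26 + ord("a"))


def try_hypotheses(common_letters):
    """For each guess 'e -> letter', list what the common letters decode to."""
    results = {}
    for guess in common_letters:
        key = key_letter_if_e_maps_to(guess)
        plain = [decrypt_letter(c, key) for c in common_letters]
        results[guess] = (key, plain)
    return results


# The four equally common letters in the second group.
for guess, (key, plain) in try_hypotheses("afjw").items():
    print("e ->", guess, "| key letter", key, "| a f j w come from", plain)
```

A guess that forces many j's, z's, or v's into the plaintext is the one to doubt. The guess that gives e, i, n, and r is the plausible one, and it yields the key letter s.
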