So welcome to the second invited talk, given by Dan Bernstein. Dan is one of the few members of our community who does not need an introduction, so it's a great honor and pleasure to have him talk here. Dan has a PhD from UC Berkeley from 1995. He's a professor at the University of Illinois at Chicago, and part-time also at TU Eindhoven. If I started reading his CV, I would take up most of his time, so I will not do that. Just a few things: you may know Dan from when, as a PhD student, he sued the US government about export controls. In 2004 he also found, together with the students in his class on the subject, dozens of security holes in deployed Unix software. He's the author of qmail and djbdns. He's known for his work on factoring and ECC, but I guess today we want to hear more about his symmetric cryptology. Dan is the designer of Poly1305, Salsa20, and ChaCha, and he also runs the eBACS benchmarking project together with Tanja Lange. So please join me in welcoming Dan Bernstein.

Alright, thanks for the introduction. Can I get a check? Is the microphone working? No, it has to be on the other side. Okay, let's try it again: is the microphone working? No. Okay, third time: is it working? Alright, great. So thank you for the introduction, and thanks to the organizers for the invitation. I've decided to start out with some quotes about the relationship between cryptography and information security. I see cryptography as a critical part of information security, but let's see what some other people think. For those of you who might not be able to read it from the back: on the left side of this XKCD, one bad guy says, "His laptop's encrypted. Let's build a million-dollar cluster to crack it." The other bad guy says, "No good. It's 4096-bit RSA." And the first bad guy says, "Blast! Our evil plan is foiled." This side is labeled "a crypto nerd's imagination."
Now the right side shows "what would actually happen". There the bad guy says, "Drug him and hit him with this $5 wrench until he tells us the password." The other bad guy says, "Got it." Then there's a maybe similar quote from Grigg and Gutmann from 2011, saying: "In the past 15 to 20 years, no one ever lost money to an attack on a properly designed cryptosystem." What is a properly designed cryptosystem? How does the user tell? Well, they give a definition: it's one that doesn't use homebrew crypto or toy keys, in the Internet or commercial world. And I have another quote, from Shamir: "Cryptography is usually bypassed. I am not aware of any major world-class security system employing cryptography in which the hackers penetrated the system by actually going through the cryptanalysis."

Now, I'm not quite sure what these quotes mean. First of all, it's clear from the XKCD that the bad guys are unable to break the crypto. But what about the Grigg and Gutmann quote, and what about the Shamir quote? Are they saying that the cryptography is actually infeasible to break? Or are they saying that, well, it's feasible to break real-world crypto, but the attackers have easier things to do? Certainly all of the quotes I've given have this theme of, yeah, there are easier things than going through the cryptography. Shamir says this very clearly: when the hackers penetrate the system, they're not actually going through the cryptanalysis. And, well, it's amazing that a comic strip is more clear than other people commenting on Internet security. Anyway, there's another possibility, which is that what these people mean, namely that cryptography is not the weak point, that the attackers are not breaking the cryptography, either because it's not feasible or because it's feasible but there are easier attacks, maybe they're just wrong about all of this.
Real-world cryptography is actually breakable, and it's actually being broken, along with lots of other things. And if we actually want to secure a system, then we have to fix the crypto and fix all the other things that are going wrong. So cryptography is not some strong point in security; it's actually one of many weak points in security that we have to fix.

Well, to resolve this question, let's look at some examples, starting with Flame. The target here is the Windows Update system. Flame is some malware that broke into a bunch of computers, targeted a bunch of computers, mostly in and around the Middle East. About a week after Flame was publicly discovered, Microsoft announced: "We recently became aware of a complex piece of targeted malware known as Flame and immediately began examining the issue." A bunch of computers are broken into; it's an "issue". "We have discovered through our analysis" (so clever they are) "that some components of the malware have been signed by certificates that allow software to appear as if it was produced by Microsoft." They broke our signature system. So Microsoft has a Windows Update system where any Windows machine will accept updates to Windows if they're signed by Microsoft. And one of the main ways that Flame broke into computers, well, the main way that Flame broke into computers, was by forging a signature which appeared to come from Microsoft. The targeted computers looked at that signature and said, oh yeah, that's valid. Now, this sounds like a public-key issue rather than an FSE issue, until you look at how they forged signatures: namely, a chosen-prefix collision attack against MD5. Marc Stevens has a nice tool where he will look at a colliding message, try exploring a bunch of plausible message differences, track each one through MD5, and see if there's an internal collision.
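To see why a hash collision breaks a signature system: the signature covers only the hash of a message, so any two messages with the same hash share the same valid signature. Here is a minimal sketch of that signature-transfer idea, using a deliberately weak toy hash (a 2-byte truncation of MD5, so a collision can be brute-forced in seconds) and HMAC standing in for the vendor's signing operation. All names here are illustrative; this is not Flame's actual mechanism, and real chosen-prefix attacks are far cleverer than brute force.

```python
import hashlib
import hmac
import itertools

def toy_hash(msg: bytes) -> bytes:
    # Deliberately weak: only the first 2 bytes of MD5 (16-bit output).
    return hashlib.md5(msg).digest()[:2]

SIGNING_KEY = b"vendor-private-key"  # stand-in for the vendor's signing key

def sign(msg: bytes) -> bytes:
    # The signature covers only toy_hash(msg), like RSA over an MD5 digest.
    return hmac.new(SIGNING_KEY, toy_hash(msg), hashlib.sha256).digest()

def verify(msg: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(msg), sig)

# Brute-force a second message colliding with the benign one; with a
# 16-bit hash, ~2^16 attempts suffice.
benign = b"harmless vendor update"
target_digest = toy_hash(benign)
malicious = None
for i in itertools.count():
    candidate = b"malicious payload #" + str(i).encode()
    if toy_hash(candidate) == target_digest:
        malicious = candidate
        break

sig = sign(benign)             # the vendor signs the harmless message
assert verify(benign, sig)     # verifies, as expected
assert verify(malicious, sig)  # the forged message verifies too
print("collision found:", malicious)
```

The point is that the signer never sees the malicious message; the collision alone makes the vendor's signature valid for it.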
And so he was able to look at the message Flame signed, the Flame signature, and say: yes, that is part of a chosen-prefix collision attack, and it's not his optimized collision attack. It's something different, something entirely new and unknown. Now the chronology here is a little interesting, because if you look at another report, from the CrySyS lab, which was one of the first labs to find Flame and analyze what it was doing: there are a bunch of machines around the Internet which have virus protection and so on, but in case something gets through that malware protection (and Flame of course got through lots of protections), these machines still keep logs of everything they're doing and send the logs off to other sites for later analysis. So it's possible to go back and see what was on those machines historically, and in particular one of those machines had, in its logs from 2007, one of the critical Flame files. The conclusion from this lab was that Flame was probably already active in 2007, which was before the real danger of MD5 (not just that there are collisions, but that you can use them for forging certificates) was established in the public cryptographic community; and they think that Flame was perhaps active for even as long as eight years. Now, if you go back to 2007: again, problems were known with MD5. People were saying in the mid-90s that MD5 was a bad function, don't use it, but people were still deploying MD5 in 2007; there was no rush to get rid of it in 2007. So MD5 was not homebrew crypto. It was standardized and widely used, and nevertheless the attackers broke it, and apparently they thought this was one of the easiest things for them to do; they didn't bother with some other attack which would have been harder. The crypto was the weak point in the system, or at least it was the point that the attackers decided to break.
If you compare this to what Grigg and Gutmann are saying, they say things like: cryptosystem failures are orders of magnitude below any other risk. This is obviously a little bit of an exaggeration, but nevertheless that's what they say: the crypto is fine, the crypto is strong, we're done, crypto is successful, mission accomplished, yay.

Let's look at another example. There was a nice talk about WEP yesterday; I have just two slides here saying a little bit about the history. I would give some credit to Borisov, Goldberg, and Wagner, whose 2001 paper started a flurry of attack papers a few years after WEP, Wired Equivalent Privacy, was standardized. WEP turns out to have, as you heard yesterday, a 24-bit message number: you have a long-term secret key, which is 40 bits or 104 bits, and then you put on a 24-bit message number to get your 64-bit or 128-bit RC4 key. That message number is never supposed to repeat, but of course a 24-bit number is going to repeat very frequently, and when it does, all sorts of things go wrong with the crypto: you lose all your confidentiality, you lose all your integrity. So it's already a complete disaster just from having a 24-bit message number combined with a long-term key; and then the user authentication fails too. Then Fluhrer, Mantin, and Shamir started off the types of attacks that you heard about in the talk yesterday, where, because the long-term key is concatenated with this message number, there are correlations between the long-term key bytes and the RC4 outputs, which you can see when you vary the message number; you end up seeing the key bytes from a bunch of RC4 outputs. Then lots and lots of papers followed, perhaps concluding with the paper from yesterday, which really optimizes this attack to the point of getting the WEP secret key from 120,000 packets. So, an incredibly weak cryptographic system.

Now some people look at this and say: okay, you're pointing to a bunch of academic papers, and sure, WEP is still used in the real world, and it's weak, the academic papers do say that, but come on, who's actually being attacked? Well, here's an example of somebody being attacked. Some attackers stole 45 million credit card numbers, and a bunch of other personally identifiable information, from an American company, TJ Maxx, by first of all breaking into the internal network, which was protected with, what else, WEP. Now, when a company in the US is losing money, there's a whole subculture of financial research companies which will look around and speculate about how much it's going to cost them, and Forrester Research said TJ Maxx was going to lose something on the scale of a billion dollars from this, which is maybe correct with all the consulting fees and cleanups. The only documented damage I'm aware of is that they paid 40.9 million dollars to settle a lawsuit about exactly this theft. So, "nobody loses any money"? Well, okay, here's an example of somebody losing money because of broken crypto.

Let's try another example: KeeLoq. This is one of these remote-entry systems for car doors and garage doors; Wikipedia lists a bunch of car companies which have deployed it. There's a paper, "How to Steal Cars: A Practical Attack on KeeLoq," by Indesteege, Keller, Dunkelman, Biham, and Preneel, and what it says is that the cipher used here, the KeeLoq cipher, doesn't give you the 64-bit, 2^64 security you might expect: you can break it using only 2^44.5 encryptions and 2^16 known plaintexts. Now, when I see numbers like this, I start envisioning: okay, what do 2^44 computations mean? I've got some idea how KeeLoq works. And how about the known plaintexts, 2^16 known plaintexts? Okay. For those of you who don't know, our session chair, Bart Preneel, is a parking valet at a very fancy restaurant. Rich people come to the restaurant and give him their cars and their car keys, temporarily. Of course, if he steals the car at that point, with the car key, then they know it's him, and he gets fired and gets in trouble. But if he can clone the key, give back the car, and sell the cloned key on the black market, and then a week or a month later the car gets stolen, nobody knows it was him. So he parks the car, and now he's got a little radio receiver which receives the KeeLoq signals, and he also has a second job as a professor, so he has a bunch of graduate students, and he gives the key to a graduate student and says: okay, graduate student, push this button as fast as you can. I actually tried this at a reasonable speed; extrapolating, it takes about three hours to get 2^16 pushes. A fancy restaurant takes hours for dinner, perfectly reasonable, and sometimes it takes a long time for them to give you the bill and for the valet to bring your car back. And then, by listening to the radio signals and doing a little bit of computation, he ends up recovering the remote's key and then being able to build a clone.

Then there was a paper the next year, from Eisenbarth, Kasper, Moradi, Paar, Salmasizadeh, and Shalmani, saying that they have a much faster attack, where they can intercept, from some distance, the radio signals of pushing the button once or twice, and then immediately build a clone of the key, having done a lot of initial computation to get the manufacturer keys out of these systems. It turns out that the receivers for these signals, inside cars, inside garage doors, from each of these companies, all share a manufacturer key, and the main work that was done here was actually extracting the manufacturer keys from interactions with those receiving devices. Now, some people say that the authors of this second paper were cheating, because this was a side-channel attack.
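The back-of-the-envelope numbers in the valet story can be checked quickly. The pushes-per-second rate and the offline encryption rate below are assumptions for illustration, not figures from the paper:

```python
import math

# Rough cost check for the 2^16 known-plaintext requirement.
pushes = 2 ** 16            # known plaintexts needed by the attack
pushes_per_second = 6       # assumed rate for a dedicated grad student
hours = pushes / pushes_per_second / 3600
print(f"{pushes} pushes at {pushes_per_second}/s is about {hours:.1f} hours")

# And the offline work: 2^44.5 KeeLoq encryptions.
work = 2 ** 44.5
rate = 2 ** 30              # assumed encryptions/second, commodity hardware
print(f"offline work: about {work / rate / 3600:.1f} hours at 2^30 enc/s")
```

At roughly six pushes per second the 2^16 plaintexts do indeed take about three hours, which is why a long dinner is enough.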
They weren't just looking at the legitimate radio signals that were designed to be outputs of the system; they were looking at the power consumed by the system. Now, maybe Shamir would say that this was bypassing the cryptography. Maybe it's not an issue that cryptographers should worry about, these side-channel attacks; there should be some other people protecting us against side-channel attacks. But there are interactions between the cryptography and the side channels. There's more and more evidence that there are different decisions we can make in designing cryptographic systems which make side-channel attacks harder or easier to carry out.

Another question I'll formulate about this attack: I don't have any examples here of press stories about people stealing cars using these attacks. So some people would look at this and say: hey, there are all these academic papers, yeah, it's theoretically weak, but no attacker would ever carry this out in practice; there's no evidence that anybody's actually breaking the KeeLoq system. Now, okay, suppose you have two academic teams, and they find an attack against KeeLoq. If an academic team finds a break against a widely deployed system like this, there is a 100% chance they will call the New York Times, or some equivalent organization. If a real attacker breaks a system, what's the chance they're going to call the New York Times? They don't want anybody to know they're breaking the system. So let's say there's a 0% chance the real attacker is going to call the New York Times. Now, the number of New York Times articles is two academic teams times 100%, plus n real attackers times 0%, which gives us a total of two articles. What does this tell us about n? Some people think this means there are no real attacks. I think this doesn't tell us anything.
If the system is weak, then it is presumably being broken by real attackers, even if we don't have news stories.

Let me give you another example: VMware View. This one is maybe not so well known. This is one of these corporate setups: you buy thousands of terminals from some company, say from Dell, and these terminals display desktops where the desktop environment the user is talking to is actually generated on a server. So you have some central image server, thousands of terminals, all using some protocol, VMware View, to send the graphics from the central server to the users, so the user can interact with the computer system. Now, if you look at the documentation from VMware, from Dell, they say you're supposed to switch from AES-128 to Salsa20-256, whatever that is, "for the best user experience". Okay, this is faster. "The best user experience" refers to some speed limit for the hardware here, something like five megabits per second of network graphics, which is what AES can successfully encrypt on that hardware; if you switch to something faster, then you get faster graphics, and the users are happier. So they recommend that. And normally, cryptographically, I would say this is also a good thing to do: it's increasing your key size, it's increasing your number of rounds of security margin; there are lots of reasons this is a reasonable thing to do for speed and security. But let's look at the documentation; I'll run through this slowly. "AES 128", according to the protocol documentation, is AES-128-GCM. "Salsa20 256", according to the documentation, is Salsa20-256 round 12, that is, Salsa20/12. Now, does anybody see the problem here? Authenticated encryption: AES-GCM is not just encryption, it's also authenticating your messages so they can't be forged.
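Why does dropping the authenticator matter? Any pure stream cipher is malleable: flipping a bit of ciphertext flips the same bit of plaintext, so an attacker who guesses the plaintext format can rewrite messages without knowing the key. A small sketch, using a counter-mode SHA-256 keystream as a stand-in for Salsa20 (the stdlib has neither Salsa20 nor AES; the malleability is identical for any keystream cipher), with an invented message format for illustration:

```python
import hashlib

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream standing in for Salsa20 (NOT the real cipher):
    # counter-mode SHA-256. For this demo all that matters is
    # ciphertext = plaintext XOR keystream.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "little")).digest()
        counter += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"k" * 32, b"n" * 8
plaintext = b"PAY $0000100 TO ALICE"
ciphertext = xor(plaintext, keystream(key, nonce, len(plaintext)))

# The attacker knows (or guesses) the plaintext format, never the key.
# XORing a chosen difference into the ciphertext XORs it into the
# plaintext: with no MAC, nothing detects the change.
wanted = b"PAY $9999999 TO MALRY"
forged = xor(ciphertext, xor(plaintext, wanted))

decrypted = xor(forged, keystream(key, nonce, len(forged)))
print(decrypted)  # the receiver sees the attacker's message
```

An authenticator (GCM's tag, or any MAC over the ciphertext) makes this forgery detectable; that is exactly what the Salsa20/12 configuration described above appears to lack.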
Now, this Salsa20-256 with 12 rounds, maybe it could include some authentication that they just aren't saying anything about, but I see no evidence that they are including any authentication. As far as I know (I haven't actually tried buying one of these things and carrying out an attack), if it's doing just what the documentation suggests, this is completely vulnerable to packet forgery, where AES-GCM is not. So, at least from all the available documentation about this, I would say: even though I'm generally happy with switching from AES to Salsa20, switching from AES with authentication to Salsa20 without authentication is a really bad idea. The user doesn't need just encryption; the user needs encryption and authentication. There are tons and tons of examples like this where the cryptography is failing because it's barely even being deployed. IPsec with null authentication is one of the most common ways to deploy virtual private networks: you have a protocol which could have authentication built in, but the user actually does no authentication, and then gets surprised when packets get forged.

Okay, let's try another example: SSL. Now, SSL has lots of stages to it; I'll focus on the secret-key part, as in this entire talk. What SSL should be doing is nothing at all like this, but let's start with CBC, normal AES-CBC encryption. You have a bunch of plaintext blocks, P0, P1, P2, say 16 bytes each, a 48-byte message. You encrypt each block by applying AES to it, where you first add something to each block: you add the previous ciphertext block C0 to P1, feed that into AES to get C1, add that to P2, feed it through AES to get C2, and you send along C0, C1, C2. Now, at the beginning, for the first block P0, there's no C-sub-minus-1.
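The CBC chaining just described can be sketched directly. Since the stdlib has no AES, a tiny 4-round Feistel network stands in for the block cipher; it is invertible but in no way secure, and only the CBC structure around it is the point:

```python
import hashlib
import os

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # 4-round Feistel network on 16-byte blocks: a stand-in for AES,
    # invertible but NOT secure. Only illustrates the CBC structure.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
    return left + right

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> list:
    # C_i = E(P_i XOR C_{i-1}), with C_{-1} = IV.
    blocks = [plaintext[i:i + 16] for i in range(0, len(plaintext), 16)]
    prev, out = iv, []
    for p in blocks:
        c = toy_block_encrypt(key, xor(p, prev))
        out.append(c)
        prev = c
    return out

key = b"k" * 16
iv = os.urandom(16)  # the crucial part: the IV must be unpredictable
msg = b"P0P0P0P0P0P0P0P0P1P1P1P1P1P1P1P1P2P2P2P2P2P2P2P2"  # 48 bytes
c0, c1, c2 = cbc_encrypt(key, iv, msg)
print(iv.hex(), c0.hex(), c1.hex(), c2.hex())
```

The `os.urandom` call is exactly the step the talk is about to contrast with SSL's choice: SSL reuses a predictable value instead of a fresh random IV.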
So what you're supposed to do is generate a random number, which could be, for example, the encryption of a nonce; somehow you generate a random number, somehow make sure the other side knows it (send it along, or have them implicitly know the nonce), and you add that random IV to P0. Okay, so that's standard AES-CBC encryption.

Now here's what SSL does. It does very much the same thing: take each ciphertext block, add it to the next plaintext block, and feed that through AES to get the next ciphertext block. Except at the very beginning: instead of making a random number, SSL takes the last ciphertext block from the previous packet. So it sends a packet, C-sub-minus-3, C-sub-minus-2, C-sub-minus-1; then it gets the next packet to send from the user, P0, P1, P2, and it adds that previous last block, C-sub-minus-1, which was already sent along in the previous packet, to P0, and encrypts that. Now the problem here (the order of operations is very important) is that because the attacker can see this last ciphertext block before choosing the next plaintext, the attacker can choose P0 as a function of C-sub-minus-1. In particular, he can choose P0 to be C-sub-minus-1 XOR something, where the something, for example, allows him to test a guess at some previous plaintext block. So this is none of the random collisions you were hearing about in the earlier talk; here the attacker can actively say: okay, I've got a guess for, say, P-sub-minus-3. I'm going to take that guess and XOR it with C-sub-minus-4; that's what was XORed in before encryption, so if the guess is correct, that's what was encrypted to get C-sub-minus-3. And then I'll XOR in C-sub-minus-1 and submit that as P0. So, if the attacker is trying to guess some previous plaintext and controls the next plaintext, he does this.
If the guess is correct, then the AES encryption of P0 XOR C-sub-minus-1, which is exactly what SSL sends through AES here, gives a C0 equal to the encryption of P-sub-minus-3 XOR C-sub-minus-4, which is C-sub-minus-3. So the attacker simply compares C0 to C-sub-minus-3, sees whether his guess is correct or not, and then tries more guesses. As long as there's not much entropy in this P-sub-minus-3, the attacker quickly figures out what P-sub-minus-3 is: a complete failure of the encryption to actually protect the data. Gregory Bard in 2006 said: okay, if you've got a browser running an applet, maybe the applet could carry out this attack; there are no clear obstacles to repeatedly carrying out guesses. And then this was actually implemented some years later by Duong and Rizzo. It's called the BEAST attack against browsers: Browser Exploit Against SSL/TLS. And this really does work, and it really does get a cookie out of the browser in something like two minutes.

So browser vendors looked at this attack and said: okay, what are we going to do? We get this P0, P1, P2 from the user, and before we encrypt it, before we add P0 to anything, we insert another packet, where that other packet doesn't have any data; it just has a message authenticator, which I'll get to in a moment. Every packet has a message authenticator. So the browser sends an empty piece of data with a message authenticator, then sends P0, P1, P2, which includes its own message authenticator. And because there's an extra message authenticator encrypted before the P0, that randomizes the ciphertext block that gets added to P0: the packet that gets sent is committed to before the corresponding C-sub-minus-1 is generated. And that's enough to make things secure. Except it doesn't quite work, because empty data actually breaks Internet Explorer.
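The guess-check at the heart of this attack can be sketched concretely. A deterministic SHA-256-based function stands in for AES encryption of one block (this demo never needs to decrypt, only to compare ciphertexts), and the record layout is a simplified illustration, not the exact TLS record format:

```python
import hashlib
import os

def E(key: bytes, block: bytes) -> bytes:
    # Deterministic stand-in for AES encryption of one 16-byte block
    # (not invertible, which is fine: this demo never decrypts).
    return hashlib.sha256(key + block).digest()[:16]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = os.urandom(16)

# A previous SSL record, CBC-encrypted. The attacker has observed
# C_{-4}, C_{-3}, and the final block C_{-1} on the wire; P_{-3} is a
# low-entropy secret (a cookie) he wants to confirm.
c_prev = os.urandom(16)                            # chaining value before C_{-4}
p_m4, p_m3 = os.urandom(16), b"secretcookie=42!"   # P_{-3}: 16-byte secret
c_m4 = E(key, xor(p_m4, c_prev))
c_m3 = E(key, xor(p_m3, c_m4))
c_m1 = os.urandom(16)                              # last block of the record

# BEAST-style check: inject P0 = guess XOR C_{-4} XOR C_{-1}. SSL then
# encrypts E(P0 XOR C_{-1}) = E(guess XOR C_{-4}); if the guess is
# right, that equals C_{-3}.
def attacker_checks(guess: bytes) -> bool:
    p0 = xor(xor(guess, c_m4), c_m1)
    c0 = E(key, xor(p0, c_m1))   # what SSL would send next
    return c0 == c_m3

print(attacker_checks(b"secretcookie=42!"))  # True: guess confirmed
print(attacker_checks(b"wrong guess here"))  # False
```

BEAST turns this yes/no test into full cookie recovery by aligning block boundaries so that only one unknown byte is guessed at a time, 256 tries per byte at most.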
What browsers do instead is take one byte of the legitimate packet, send that along, and then send all the remaining bytes. That's probably secure, and it's actually deployed today.

Now the attacker, instead of controlling the plaintext, could try controlling the ciphertext, sending along forged network packets. But that doesn't work, because, like I just said, each packet includes an authenticator. Each SSL packet has a MAC, which protects it against forgery. The way that works is: SSL takes the legitimate data and puts an authenticator on the end; then, because CBC needs a multiple of 16 bytes, a whole number of blocks, SSL takes the authenticated data, puts some padding on it, and then encrypts with CBC. The padding is "1", or "2 2", or "3 3 3": whatever you need to fill out a full multiple of the AES block size. Okay. This is a little complicated, but you can look at it and prove things: a paper by Krawczyk in 2001 says this is provably secure. Now, this is like the moment in an action movie when there's a fight going on and suddenly the camera is looking at some barrels full of oil, and you just know they're going to explode. So, provably secure in 2001. Then Vaudenay (I think this was presented at the Crypto rump session, right before Krawczyk's talk) said: this is completely broken if you have a padding oracle, that is, if you can tell the difference between a padding failure on decryption and a MAC failure. The receiver has to decrypt the CBC, check the padding, and then check the authenticator, and if there's a way to tell the difference, for example different error messages for these different failures, then in almost no time the attacker can figure out any plaintext block that he wants.
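Here is a sketch of that padding-oracle attack against the MAC-then-pad-then-encrypt construction, recovering one plaintext byte. It reuses a toy Feistel cipher in place of AES and a simplified padding scheme ("n bytes of value n"), so the details are illustrative rather than the exact TLS encoding; the mechanism, distinguishing PAD_ERROR from MAC_ERROR, is the real one:

```python
import hashlib
import hmac
import os

def _f(key, i, half):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def enc_block(key, b):
    # Toy 4-round Feistel stand-in for AES (invertible, NOT secure).
    l, r = b[:8], b[8:]
    for i in range(4):
        l, r = r, bytes(a ^ c for a, c in zip(l, _f(key, i, r)))
    return l + r

def dec_block(key, b):
    l, r = b[:8], b[8:]
    for i in reversed(range(4)):
        l, r = bytes(a ^ c for a, c in zip(r, _f(key, i, l))), l
    return l + r

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

KEY, MAC_KEY = b"k" * 16, b"m" * 16

def server_decrypt(iv, blocks):
    # MAC-then-pad-then-encrypt receiver. The two failure modes are
    # distinguishable: that distinction IS the padding oracle.
    plain, prev = b"", iv
    for c in blocks:
        plain += xor(dec_block(KEY, c), prev)
        prev = c
    n = plain[-1]
    if n < 1 or n > 16 or plain[-n:] != bytes([n]) * n:
        return "PAD_ERROR"
    body, tag = plain[:-n][:-32], plain[:-n][-32:]
    if hmac.new(MAC_KEY, body, hashlib.sha256).digest() != tag:
        return "MAC_ERROR"
    return "OK"

# Honest sender: secret || MAC || padding, CBC-encrypted.
secret = b"top secret value"              # one 16-byte block
tag = hmac.new(MAC_KEY, secret, hashlib.sha256).digest()
padded = secret + tag + bytes([16]) * 16  # 16 + 32 + 16 bytes
iv = os.urandom(16)
blocks, prev = [], iv
for i in range(0, len(padded), 16):
    c = enc_block(KEY, xor(padded[i:i + 16], prev))
    blocks.append(c)
    prev = c

# Attacker: recover the last byte of the secret block by submitting the
# block with a crafted IV and watching which error comes back.
c0, c1 = iv, blocks[0]
recovered = None
for g in range(256):
    r = bytearray(16)
    r[-1] = c0[-1] ^ g ^ 1          # aim for a valid 1-byte padding
    if server_decrypt(bytes(r), [c1]) == "MAC_ERROR":
        r[-2] ^= 0xFF               # rule out longer valid paddings
        if server_decrypt(bytes(r), [c1]) == "MAC_ERROR":
            recovered = g
            break
print("last secret byte:", bytes([recovered]))
```

A MAC_ERROR means the padding check passed, which leaks that the final decrypted byte was 1, and that equation hands the attacker the plaintext byte; repeating the trick block by block recovers everything, which is Vaudenay's attack.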
Now this "if" in the statement was, as far as I know, first demonstrated to be correct by Canvel in 2003, who said: there is one of these padding oracles. Let's watch the time the server takes receiving this data. If the padding is incorrect, it's fast to reject the packet; if the padding is correct, the server has to compute the MAC, and that takes longer. So by watching the timing of the receiver of this SSL-encrypted data, you get exactly the padding oracle you need for Vaudenay's attack to work. So what do implementations do? They say: okay, okay, let's always compute the MAC, so that there's no longer this timing difference between having a padding failure and having an authenticator failure. And then, just a month ago, AlFardan and Paterson, in the Lucky 13 attack, said: okay, we can still break it, because there are still some very, very small timing differences between the padding failing and the authenticator failing.

Okay, so what does SSL do about this? There are the obvious reactions, like really trying to control the timing, and this is hundreds and hundreds of lines of code, all sorts of bugs, just to deal with the timing issues here: an incredibly fragile system. But SSL/TLS has an alternative to this, which is called cryptographic agility. Now, cryptographic agility is a marketing stunt which has two parts. To take the second part first: cryptographic algorithm agility means you have some button which says "press in case of emergency", and then you will switch to different crypto. Nobody's ever tried the button; whether it works doesn't matter, because it's a marketing stunt. The other part of cryptographic algorithm agility is that because you have that button saying "push if the crypto is failing, emergency", you don't bother having good crypto. You don't care whether your crypto is actually working, whether it's secure; you just say, oh, if there's a problem, we'll push the button.
So everything's just fine. Now, as an example of the button not working: there is AES-GCM, which has all of its own timing problems, but at least it would get rid of the basic problems with CBC. SSL can in theory switch; TLS 1.2 can switch to AES-GCM. But it doesn't work, because if you try turning it on, you find that 90% of web servers and basically 100% of web browsers don't actually understand it. Okay, so what do you do instead? Well, there is one alternative to AES-CBC and the other CBC modes which is supported by all the clients and all the servers out there, which you can just turn on, and which you really do have as the emergency backup plan. In case of failure of AES-CBC, you switch to RC4. And more and more sites are in fact doing this: it's now more than 50% of SSL connections on the Internet, and lots and lots of people are recommending switching to it, because it's obviously much less fragile than AES-CBC. There are even statements from Rivest, which people keep quoting, saying that RC4 is okay in SSL. It's not as bad as WEP: WEP takes a long-term key, puts on a sort of nonce, and then uses that in RC4. SSL does not do that. SSL takes a whole public-key setup, does some reasonable hashing to generate a one-time RC4 key, and when it's done with that key, it never generates any related keys. Good, sensible; there's no reason to ask for related-key attacks. And Rivest says the attacks against WEP don't apply to SSL with RC4; designers of applications using RC4 should not be concerned. However, some of the other problems with RC4 are a reason for concern: there are all these biases in RC4 output bytes. At this point I'd like to advertise something I've been doing very recently with AlFardan, Paterson, Poettering, and Schuldt, which is called "On the Security of RC4 in TLS". At the highest level it says: you can do very much what BEAST does, but instead of targeting AES-CBC encrypting a cookie, you target RC4 encrypting a cookie.
You have the same cookie or password or whatever sent through a lot of RC4 sessions, the same way BEAST does, and then you use the biases in RC4 to figure out what the cookie is from all of these ciphertext bytes. Now, what are these biases? Those of you who were paying attention yesterday heard about some of what I'll say here. First of all, the best known one is that the second byte of RC4 output is biased towards zero: Z2 has probability 2/256 of being zero, instead of the 1/256 you would expect. That one's not an issue here. Then Mironov looked at Z1 and found that Z1 is biased away from zero, away from one, towards two; it's a totally weird distribution of the first output byte of RC4. Much more recently, Maitra, Paul, and Sen Gupta observed that Z3, Z4, and so on through Z255 all have more than a 1/256 chance of being zero, which is contrary to a claim by Mantin and Shamir that all those other Z values were exactly balanced. And then there's the key-length-dependent bias, for people who were paying attention yesterday: Z16, the 16th output byte, is biased towards minus 16, assuming your key length is 16 bytes, 128 bits. Okay, and that's not nearly the end of it. Here's what the conclusions are, with my co-authors: there are almost 256-squared biases in the first 256 bytes of RC4 output. We computed, for every position 1 through 256, and for every possibility for the i-th output byte, the chance that the i-th output byte of RC4 is 0, is 1, is 2, et cetera, and basically none of those probabilities are 1/256. As a result, you can use all of these biases, doing sensible statistics, to attack SSL. The paper from yesterday was certainly independent, and I think the results were found a little earlier than ours; they're using some of the bigger biases, which I've listed here.
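The single-byte biases described above are easy to observe directly: implement RC4, generate keystreams under many random 128-bit keys, and count. This sketch measures the Mantin-Shamir Z2 bias; the trial count is chosen just to make the effect clearly visible in a few seconds:

```python
import os

def rc4_keystream(key: bytes, n: int) -> bytes:
    # Standard RC4: key scheduling (KSA), then n bytes of output (PRGA).
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out = bytearray()
    i = j = 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

# Measure the Mantin-Shamir bias: P(Z2 = 0) is ~2/256, not 1/256.
trials = 20000
z2_zero = sum(rc4_keystream(os.urandom(16), 2)[1] == 0 for _ in range(trials))
print(f"P(Z2=0) ~ {z2_zero / trials:.5f}  (uniform would be {1/256:.5f})")
```

Running the same loop over all positions and all byte values is exactly how one maps out the full 256-by-256 table of biases, just with vastly more samples.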
I'm going to show you some pictures of what the actual probabilities look like. First, here's the first output byte, which, again, from Mironov, was already known to have a very strange distribution. What the graph means: taking, for example, 129 here, the value of the graph is 0.993. That means the probability of the first output byte from RC4 being 129 is 0.993 divided by 256. It should be 1/256; it should be a flat line. But no, RC4 is biased against 129, against 0, and towards some bytes here, away from other bytes. And you see there's some fuzziness here; these are actual biases, plus and minus. Every 16 values of the byte value x, there really is a spike up or down, as you see in this graph. Also, if you look at Mironov's data about this, for some reason the very negative spike here didn't appear. These initial output bytes of RC4 are not too useful in attacking SSL, because the first few bytes are used to encrypt basically random data. But let's look at some later bytes. Well, Z2 is still not in the useful range, but just to show the graph: the Mantin-Shamir bias means the probability of Z2 being 0 is up at 2/256, so way off the graph. But there are also all these other spikes up and down that you see, and lots of little fuzziness, and a general upward tilt. Then, moving on to Z3: the big things you notice are a bias towards zero, a bias towards three (that's one of the things reported in the paper from yesterday), and this bias towards 131. If you look closer, you can also see some little bumps up and down; even without zooming in on the PDF here you can see, for instance, maybe this bump down. If we move along some bytes, you can see that bump go up and then down and up and down, just that little byte there, along with some biases towards other positions.
Well, it goes on for a while. Eventually you see the key-length-dependent biases, like here's this bias towards minus 16. And this is starting to get into the range where you can use this to attack SSL. We keep going like this. So, for instance, here's Z33. Oh, I should have emphasized that at Z31 there's this funny fuzziness to it, what a weird cycle. Z33, as an example: there's these two big biases, towards zero and towards 33. Now, suppose you're looking at a bunch of ciphertexts which are encrypting the same plaintext byte at that position. And you see that the most common ciphertext byte is, say, one. Now, that could be that the plaintext byte was one, XORed with zero. Or it could be that it was one XORed with 33, which would be 32. How do you tell the difference between those? Well, if all you know is those two spikes, then that's the best you can do. But if you see the whole distribution, then you think: what is this whole distribution, XORed with 33? And that totally changes the shape of this curve; it makes a huge difference if you XOR it with something. So you can get a lot more information from seeing the entire graph than just what you get from seeing the big spikes in the graph. And it goes on like this for a while. As you keep going, maybe you can see things like, under the mouse here, this little bump which is jumping along by 16. But enough admiring graphs; you can go look at the slides online afterwards. Eventually it seems like the biases kind of smooth out as you get to byte 150, 200. So eventually, like at byte 221, there's a bias towards zero, a bias towards 221, but it's more a bias towards 222, 223, and so on. And this tilt upward is very, very visible. And then eventually you get to byte 255, and you start feeling like, at this point, it's almost reasonable. You've got this bias towards zero, maybe we can deal with that somehow. It's maybe got a slight tilt, but it almost looks kind of flat.
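The point about using the whole distribution rather than just the two big spikes can be sketched as a maximum-likelihood computation. This is my own toy illustration, not the paper's actual statistics: the keystream distribution here is made up, flat plus big spikes at 0 and 33 (loosely like the Z33 graph), and each candidate plaintext byte is scored by the log-likelihood of the observed ciphertext bytes.

```python
# Toy maximum-likelihood recovery of one plaintext byte from many
# ciphertext bytes at the same position, given the keystream distribution.
import math
import random

def recover_byte(cipher_bytes, dist):
    # dist[v] = Pr[keystream byte = v].  A candidate plaintext byte m
    # predicts keystream z = c XOR m for each ciphertext byte c; pick
    # the m whose predicted keystream bytes are most likely under dist.
    best_m, best_score = 0, -math.inf
    for m in range(256):
        score = sum(math.log(dist[c ^ m]) for c in cipher_bytes)
        if score > best_score:
            best_m, best_score = m, score
    return best_m

def demo(m=0x61, n=4000, seed=1):
    rng = random.Random(seed)
    # Toy biased keystream distribution: flat, plus spikes at 0 and 33.
    weights = [1.0] * 256
    weights[0] += 0.30 * 256
    weights[33] += 0.20 * 256
    total = sum(weights)
    dist = [w / total for w in weights]
    keystream = rng.choices(range(256), weights=weights, k=n)
    cipher_bytes = [m ^ z for z in keystream]  # same byte, many "sessions"
    return recover_byte(cipher_bytes, dist)
```

Because the two spikes have different heights, XORing the whole distribution by 33 produces a measurably different shape, which is exactly what the likelihood score exploits.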
And then you get to the next byte, and it's tilted. Okay. So we applied this to actually attacking the SSL situation of having a lot of plaintext being repeated in a lot of ciphertexts. And here's the probabilities we get, for example, from two-to-the-24 ciphertexts with arbitrary plaintext data. There was a similar graph drawn yesterday where, instead of, say, the 30% probability you see at the beginning bytes here, it was about a 10% probability. So there's a big advantage to taking all the biases into account and doing the statistics properly. And then if you have more ciphertexts, say two-to-the-28 ciphertexts, then in the interesting region for SSL you get complete plaintext recovery. So why do we have all of these breakable cryptographic examples? Well, you could say it's because the people who are implementing cryptographic protocols haven't gotten the memo. In many of these cases, they were doing the implementations years before AES, so maybe they shouldn't be blamed, but at this point we just have to educate them, right? We just have to explain to them that there's AES, and AES does not have these problems. Well, maybe side-channel problems, but we also know ways to defend AES against side-channel attacks. We have AES-GCM, and that's got the authentication. Maybe the words "AES-GCM" don't express to your average user that this includes authentication, and you need authentication, but somehow we can educate them that AES-GCM is what they need: authenticated encryption. And then, again, we can protect the whole of AES-GCM, everything you need in secret-key crypto, against side-channel attacks. So we just have to explain this to people doing crypto. Except maybe AES-GCM is not actually what they need. Maybe AES-GCM is meeting some of the user requirements, but not meeting others.
And the most obvious requirement that it's not meeting, especially with side-channel protection, but sometimes even without, is performance. The users have some performance requirements. Here are some quotes, random examples. Rivest, back in 2001: the heart of RC4 is its exceptionally simple and extremely efficient pseudo-random generator; RC4 is likely to remain the algorithm of choice for many applications and embedded systems. And that's correct: RC4 has remained the algorithm of choice for many applications and embedded systems, in part because of the speed. You can find Adam Langley from Google online, only a few years ago, something like two years ago, saying, here's the advantages of RC4, here's why we prefer RC4 at Google. Number one, it's fast. Number two, CBC has a bad history. And it goes on to other reasons, but: it's fast. Everybody knows that RC4 on a lot of platforms is fast. On a typical ARM chip, it's twice as fast as AES. And so it's something which people like because of the speed. If people need that kind of speed, AES is not an option. Another example: OpenSSL's AES, for extra speed, is not doing what it should be doing for side-channel protection. So OpenSSL is leaking a bunch of key bits to side-channel attacks. For instance, in this Financial Cryptography paper from 2012, from Weiß, Heinz, and Stumpf, OpenSSL is leaking something like half of the AES key bits to a very straightforward side-channel attack, because they're not doing what they should be doing in the implementation. Why aren't they doing it? Well, you look through the thousands of lines of comments in the AES implementation in OpenSSL, and they say it's because of the speed. They want better speed than they can get from a side-channel-protected implementation. Another example here, different from speed, is the size.
If you have a small chip, a very low-cost chip, or a chip which has to do more than just the crypto and fit into low cost, for instance RFID applications, you really need a small cipher: lightweight crypto, having a cipher that fits into a small area, and then maybe it also still needs to be fast. It's an incredible challenge. And the users can't just throw away these requirements. They really want the crypto to be fast. They need the crypto to be fast. And how do we give it to them? Well, there's all sorts of work on trying to make this happen. And I think this is one of the critical directions of continued research into secret-key crypto: trying to do things which are better than what we have now, in particular better than, say, AES-GCM, where you don't have to reduce the security level to improve performance. AES has this huge 8-bit S-box, where it's taking 8 bits of data and then scrambling them, scrambling them, scrambling them, taking up a huge amount of hardware, a huge amount of energy, just to totally mangle 8 bits of data. And that's overkill. There's lots and lots of ciphers that do better than AES in all sorts of performance measures by not doing that overkill, by having smaller S-boxes than AES does. There's all sorts of fundamental constraints on what the users want. Like, they want the power to be smaller; there's some limit on how much power they can feed into the cryptographic circuits, and they can't do an incredible amount in parallel. There's often limits on area. There's often limits on latency: you have a time that you can spend, maybe microseconds instead of seconds, but some limit on the time until you get your encrypted, authenticated data. Maybe what attracts the most attention is the throughput, the speed, the number of bytes per second, which you really should turn around into area times the time per byte, because otherwise somebody can always increase the bytes per second by doing two parallel encryption units.
And then they get twice the area and half the seconds per byte. Well, to not trivially allow the throughput to go through the roof, you should be multiplying the area by the time. And then similarly, you can optimize the energy, the battery life, of your cryptographic algorithms, which is basically optimizing the number of bit operations you need to do crypto. Except if you're doing lots of big random accesses to big tables, that consumes a huge amount of energy; it also makes the area-time a bit bigger. There's lots and lots of platforms for doing that. And of course, just to briefly summarize a lot of the research, and this would be a whole hour talk of all the different ways that people have been improving the performance of authentication and encryption, improving the performance of secret-key crypto: there's, of course, lots of different application environments, where maybe you have long plaintexts, maybe you have short plaintexts, maybe you actually want to encrypt five-byte plaintexts all the time, and you save time by only having five bytes of encryption instead of 16 bytes of encryption. Tons and tons of optimization work and design work to make secret-key crypto meet the users' performance requirements. One of the questions that I find most interesting is whether there's a way for one design to meet lots and lots of application requirements, to be good in particular for hardware and for software. I think a lot of people say, oh, hardware optimization of ciphers looks like this, software optimization looks like that, and you really can't do both at the same time. But there's some great examples of ciphers which are optimized for hardware but are still pretty good in software: Trivium, Keccak. These are, well, Keccak used in, say, an authenticated-encryption mode like duplex. These are ciphers which are designed for hardware, designed to be really, really good in hardware. And they are really, really good in hardware.
But they actually also provide pretty good software performance. So you can try to start from ideas like that, see what makes them work well in both of these environments, and try to do even better. Or you can say, let's start from the software designs and see what's making them not so good, maybe, in hardware. So one suggestion here is to replace ARX, add-rotate-XOR, with ORX, OR-rotate-XOR. Now, if you remember the Skein MIX from earlier today, you can't do that with OR, because OR doesn't give you the invertibility for a Skein MIX. But you can do, say, the basic operation in Salsa20, where you take two state variables and, instead of adding them together, you OR them together, you rotate by some fixed amount, and then you XOR that into another state variable. And that's invertible. And, well, it's not as much diffusion across bits as addition, but it's still okay. This is enough to build whatever sort of cipher you want; you can build any function out of this if you want. You probably need a few more rounds for good diffusion than what you get from an ARX design. But hardware people are gonna be much, much happier with ORs than they are with additions. Additions are these huge carry chains that hardware people are constantly complaining about; an ORX design should be much, much better for hardware. I say that without having actually tried it. Something else you can do, which I've seen lots of papers on, and I think it's one of the most important directions: it's not necessarily motivated by current problems in cryptography, but we're clearly gonna have more problems along these lines in the future if our security is not good. It's one thing to say, okay, we've got something secure, like AES-GCM, but it's not fast enough; let's make something at that security level that's fast. But how about when people get to the security level of AES-GCM and say, that's not as secure as we need?
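The ORX step itself is tiny. Here's a minimal sketch, assuming Salsa20's step shape (take two state words, combine, rotate by a fixed amount, XOR into a third word); the names are mine. The only change from ARX is swapping the 32-bit addition for an OR. Either way, the step XORs into b a value that does not depend on b, so applying the same step again undoes it, which is all the invertibility the round structure needs.

```python
# ARX vs ORX step on 32-bit words, as in a Salsa20-style round.
MASK = 0xFFFFFFFF  # work on 32-bit words

def rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK

def arx_step(a, b, c, r):
    # Salsa20-style: add, rotate, XOR into b.
    return b ^ rotl32((a + c) & MASK, r)

def orx_step(a, b, c, r):
    # Variant: OR, rotate, XOR -- no carry chain for hardware.
    return b ^ rotl32(a | c, r)
```

Less diffusion per step than addition, as noted above, so an ORX cipher would presumably need a few more rounds.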
For instance, the 128-bit block size starts breaking down horribly once you've encrypted two-to-the-64 blocks. Even at two-to-the-60 you start getting failures; in some modes, at two-to-the-40 you start getting failures. That's not that many blocks to encrypt. So maybe we should have some of these beyond-birthday-bound modes, where we don't worry about two-to-the-64, we can encrypt much more. Or maybe, instead of 128-bit blocks, we should have 192-bit or 256-bit blocks. Now, another direction of smallness in our current authenticated ciphers, like AES-GCM, is the authentication. I guess there'll be a talk, well, the very next talk today is exactly about what security we get out of the authentication in AES-GCM. And there's a 128-bit pipe inside GCM, which is really starting to feel uncomfortable. And some people would say, for these kinds of issues, oh, the problem with having a 128-bit block with, say, counter mode, which breaks down at the birthday bound, some people would say the problem is counter mode: use a better mode. Some people would say: use a bigger block. Same thing here: with a 128-bit pipe, you're gonna have some people saying, well, your mode is not making the most effective use of that pipe, you can change your mode so that it's safer. And some people would say: your pipe should be bigger. Can you do that efficiently? For instance, something I haven't seen people looking at: there's all these optimizations of 128-bit universal hashes, polynomial hashes. Has anybody looked at 192-bit, 256-bit? I don't think 256-bit should be twice as slow. I think it should be maybe 20%, 30% slower. But I haven't seen anybody try it, and I think it would be a really cool thing to have for bigger, more secure designs. Maybe the extra-security issue that's attracted the most attention is misuse resistance.
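Those block-count numbers are just the birthday bound. A back-of-the-envelope sketch, my own, using the standard q(q-1)/2^(b+1) approximation for the chance that q block-cipher outputs collide, shows why two-to-the-64 blocks is a hard wall for a 128-bit block and why a 256-bit block pushes it far out of reach:

```python
# Birthday-bound collision estimate for a b-bit block cipher.
def birthday_collision_prob(q, block_bits=128):
    # Approximate chance that q blocks contain a colliding pair,
    # q*(q-1)/2^(b+1), capped at 1.  Such collisions leak plaintext
    # relations in modes like CBC.
    return min(1.0, q * (q - 1) / 2.0 ** (block_bits + 1))
```

At 2^40 blocks the estimate is about 2^-49; at 2^60 about 2^-9, already a non-negligible failure chance; at 2^64 about one half. With a 256-bit block, even 2^64 blocks leave it around 2^-129.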
So instead of insisting that the message number be a nonce, that somebody use that number only once, how about allowing the message number to be repeated? Now, of course, if somebody has two different messages, M and M-prime, and they encrypt them with the same message number, they're going to get different ciphertexts if they were different messages, and if they were the same message, they'll get the same ciphertext. So there is a leak to the attacker if the message number is repeated: message-number repetition leaks message repetition. But users get surprised if the message-number repetition is making the authentication fail, for example, or leaking the XORs of messages, or leaking much more information. So one proposal for dealing with this is to do authenticate-then-encrypt. And if you're careful about the details, this is the SIV mode from Rogaway and Shrimpton. What they say is, you take your whole message number and message, you authenticate that with something, it has to be a little stronger than just an authenticator, but do some appropriate thing that looks through the whole nonce and message, well, message number and message, and then use that result as if it were a nonce. So that's something which would be a fresh nonce if N changed or if M changed. And then use that nonce to encrypt, in counter mode, the whole message. Now, you can prove, if you do the details right, that this will protect against all of these message-number-repetition attacks, except for leaking whether the message repeats. But it also comes at some performance cost, which I think is maybe not primarily the cost of looking through the message twice. What worries me the most is, if you're in a denial-of-service situation, somebody's giving you lots of forged messages, then can you throw those forgeries away? If you have encrypt-then-authenticate, then if you get a forgery, you just check the authenticator and throw the message away.
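The authenticate-then-encrypt shape can be sketched in a few lines. This is a hedged toy illustration of the idea, not Rogaway and Shrimpton's actual construction and not RFC 5297: HMAC-SHA256 stands in for the strong PRF that derives the synthetic IV, and a hash-based keystream stands in for AES counter mode. All names here are mine.

```python
# Toy SIV-shaped authenticated encryption: IV = PRF(K1, nonce||msg),
# then counter-mode encryption keyed by K2 and driven by that IV.
import hmac
import hashlib

def _keystream(k2, iv, n):
    # Hash-based counter-mode keystream (stand-in for AES-CTR).
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(k2 + iv + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def siv_encrypt(k1, k2, nonce, msg):
    iv = hmac.new(k1, nonce + msg, hashlib.sha256).digest()[:16]
    ct = bytes(m ^ s for m, s in zip(msg, _keystream(k2, iv, len(msg))))
    return iv + ct

def siv_decrypt(k1, k2, nonce, blob):
    iv, ct = blob[:16], blob[16:]
    msg = bytes(c ^ s for c, s in zip(ct, _keystream(k2, iv, len(ct))))
    tag = hmac.new(k1, nonce + msg, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(iv, tag):
        return None  # forgery -- but only detected AFTER decrypting
    return msg
```

Note that siv_decrypt has to run the full counter-mode pass before it can reject a forgery, which is exactly the denial-of-service concern raised here; encrypt-then-authenticate can reject with the cheap check alone.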
And that's cheaper than doing an encryption, a decryption, of the message. Can you do something like this where you get protection against the message number being reused, and the forgeries are still fast to throw away? Another direction people have looked at is integrated authentication, like OCB mode, or Helix, or Phelix. These designs have some state: a plaintext block comes in, the state encrypts the plaintext block into ciphertext in an invertible way, and that plaintext block modifies the state for the next block. Now, can you do that in a way that's still secure if your message numbers are repeated? Obviously the data flow has to change, but can you do this efficiently, and have some very fast, lightweight, good-performance integrated authenticated-encryption mechanism, whether it's some mode like OCB or something designed all at once like Helix or Phelix? Can you do that in a way that's not broken by the message number being repeated? Can you do it in a way that allows forgeries to be rejected quickly? I don't know. Maybe there's a good way to do this. One possibility, and this is not really a satisfactory answer, but something you could try, is to have a block cipher built as a four-round Feistel, HFFH. HFFH means you do a lightweight hash at the top, a very efficient, say, polynomial hash at the top, a polynomial hash at the bottom, and put all your strength, all your defense, in the middle two rounds, the FF rounds. The only job of the H's is to make sure that there are no internal collisions in the input to the second round and, working backwards, in the input to the third round. Now, if you have an HFFH block cipher, then that top H looks very much like what people want to do for authentication.
For authentication, people do some very simple transformation of each block, and then, say, add up the results, and then encrypt that one sum to get an authenticator. And so the top level of a sensible encryption mechanism looks very much like what you want to do for an authenticator, or for what you want to do for this SIV mode. You can also use the last stage, the bottom H, for authentication. And I think you can get both of these requirements if you want your authenticator to be, say, 256 bits instead of 128 bits; but that's quite a bandwidth limitation. So I don't think this is a really satisfactory solution. But this is just one of lots of directions; people have been trying things like this, and people have been making lots of obvious improvements in performance and security over AES-GCM. So, just to summarize: AES-GCM is obviously not satisfactory for all the users. It's something where the performance could be better, and the security could be better. And if we don't worry about the security now, we're going to get people who eventually get something with OK security and good-enough performance, and then they're going to get broken by the next round of attacks; we've already seen previews of those attacks. If you like building modes and you want to do something better than AES-GCM, you can certainly make a better mode than AES-GCM, even if you still use AES and still use GHASH. You can also make a better MAC than AES-GCM's, something that's much faster in software, easier to protect against side-channel attacks, or even faster in hardware. You can make better ciphers. AES, again, is certainly not the best that we can do. It was okay for the 90s, but hey, we know a lot more now about cipher design. Or you can put everything together: make a big integrated, preferably small integrated, lightweight, authenticated-encryption system.
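The HFFH shape described above can be made concrete with a toy sketch, entirely my own construction for illustration: a four-round Feistel on two 64-bit halves whose outer rounds use a cheap multiply-based keyed hash (standing in for a polynomial hash) and whose middle two rounds use a SHA-256-based PRF (standing in for the strong F rounds). None of the concrete function choices come from the talk.

```python
# Toy HFFH block cipher: four Feistel rounds, cheap H on the outside,
# strong F in the middle.  64-bit halves, illustrative only.
import hashlib

M64 = (1 << 64) - 1

def H(k, x):
    # Cheap "universal-hash-like" round: one multiply and add.
    return (k * x + 1) & M64

def F(k, x):
    # Strong round: hash-based PRF stand-in.
    d = hashlib.sha256(k.to_bytes(8, "big") + x.to_bytes(8, "big")).digest()
    return int.from_bytes(d[:8], "big")

def feistel_encrypt(keys, L, R):
    # Round order H, F, F, H with keys[0..3].
    for f, k in zip((H, F, F, H), keys):
        L, R = R, L ^ f(k, R)
    return L, R

def feistel_decrypt(keys, L, R):
    # Undo rounds in reverse; the H,F,F,H sequence is its own reverse.
    for f, k in zip((H, F, F, H), reversed(keys)):
        L, R = R ^ f(k, L), L
    return L, R
```

The Feistel structure only ever XORs a round-function output into one half, so H need not be invertible, which is what lets the outer rounds be as cheap as a polynomial hash.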
If you're interested in any of these things, then, as some of you have heard, and in case any of you have not heard: there is a CAESAR competition coming up. This stands for Competition for Authenticated Encryption: Security, Applicability, and Robustness. The main webpage for this is competitions.cr.yp.to, and there you see on the left side things like CAESAR, frequently asked questions, and stuff like that. There's a mailing list for it. This is the main place for public discussion of what the requirements are and what the decisions are. If anybody from industry has some ideas of, okay, here's what I would like to see authenticated encryption doing better for us, then this is the right place to tell people about it and say: please, please give us a better cipher; we're not happy with AES-GCM, and here's why you can do better for us. NIST has burned themselves out on competitions for the moment, but they have been nice enough to provide some funding. So there's a cryptographic-competitions grant, which is making sure that things like workshops and software benchmarking and such are all going to be handled. Of course, this is not an infinite amount of money, so if people want to scrape up more funding to keep this running nicely, then that's, of course, appreciated, and I'm happy to chat with people about how to get funding for authenticated encryption generally and for this competition in particular. Let me finish off by just briefly showing you, well, I'll skip past a review of the whole schedule and just mention the most critical date for the moment if you're interested in designing a new authenticated cipher, an authenticated-encryption system. The deadline for submissions will be the middle of January 2014, and then a month after that you should have the software done. And then the round-two candidates will be at the end of 2014, and round three at the end of 2015, and so on through the end of 2017. Last slide that I have: workshops.
We've already had a 2012 ECRYPT-funded workshop on directions in authenticated ciphers. And, thanks to this NIST grant, there are going to be forthcoming workshops. Unfortunately, NIST money can only be used in the U.S., but hey, if you can scare up more money, you can have workshops outside of the U.S. as well. DIAC 2013: we're looking at Chicago, either right before CRYPTO and CHES or right after CRYPTO and CHES, and if people have any scheduling requirements, now's a great moment to say that. And then there's already been some volunteering for running DIAC 2014 in California, close to CRYPTO, and then '15, '16, '17, well, there's the funding, we just don't know yet where those will be going. All right, so if you're interested in authenticated encryption, then CAESAR is a great moment for you to show off what you can do, and then starting in January 2014, that'll be a great moment for you to start breaking everybody else's submissions to this. That's it, thank you for your attention. Quick questions for Dan? Yeah, so the comment was about the Z1 distribution, which I will get back to any moment now; there's also some fun things happening here. Yeah, so the Z1 distribution: if you look at that Journal of Cryptology paper from Sen Gupta and Maitra and Paul and Sarkar, you will see a theoretical curve which goes down and up and up and down and is sort of like the actual curve shown here. It does not explain the exact positioning of the curve. I mean, it does explain the down and up; it does explain the up and down. It doesn't explain how far down it's going, how far up it's going. It doesn't explain this spike; it doesn't explain any of these little spikes. So it's certainly a big step forward in understanding where some of these effects are coming from. But there's also a huge number of RC4 biases which are totally unexplained by the theory so far.
And yeah, for people who are interested in RC4: RC4 has, of course, been studied in more papers than AES. Some people would say this means it's more secure than AES. If you like doing that kind of stuff, then there's understanding what's going on in this graph, and in this graph, and in this graph. I'm sure there's some students who need to understand why RC4 is producing these distributions. Okay, thank you very much for the very pleasant talk. Thank you.