Okay, so welcome to session five. My name is Yu, I'm from NTT. This session is the stream ciphers session, specifically on the RC4 stream cipher. The session consists of two talks, so let's start the first talk. The title is "Smashing WEP in a Passive Attack", and Pouyan will give the talk. So, I'm very happy to present this work today, because it is the result of very hard work over the last two years. We did the theoretical analysis at Eurocrypt 2011, so you might already have heard about it: we built a theoretical framework for analyzing attacks on WEP and WPA by optimizing all the existing attacks, we found many new ones, we tuned the attacks for optimality, and so on, but at that time we did not implement the attacks. So the main contribution of this work is a patch on the Aircrack-ng engine, the software used for cracking WEP in real life, which behaves much, much faster and better than plain Aircrack-ng. I am going to describe it now. This is joint work with Petr Sušil and Martin Vuagnoux. Okay, so if you go to Eurocrypt, everyone will tell you that WEP is not used anymore, so why keep analyzing it? But if you look around you, at least in Singapore, at hotels, restaurants, or even airports, you will see that almost 20% of the wireless networks, or even more, are still using WEP. And this is not a problem of Singapore alone; it is much the same across most of Asia and the Middle East. So WEP should have been deprecated already, but it is still widely used in several countries. That is what motivates us: our work is still useful, and you can still mount attacks on WEP that behave extremely fast.
For instance, I was very surprised to see that even Changi airport is still using WEP, and not even the 128-bit version: it's the 40-bit version, for which you don't need any clever attack, you can just brute-force the 40 bits. So, here is the outline of my talk. First I will give you a reminder of the RC4 stream cipher and how it is adapted for use in WEP. Then I will present the attack we have on WEP, which we call the Tornado attack; I will explain why we call it Tornado. And finally, the challenges we faced in adapting our theory to practice. Okay, RC4. RC4 consists of two distinct algorithms: the first one is called the key scheduling algorithm (KSA), the second one is called the PRGA. You give an initial key as input, and the role of the KSA is to scramble the state of RC4. At the end we get a state we call S_{N-1}, where N here is 256. So we get a state called S_255. Why do we call it S_255? Because we update the state 256 times: initially it is S_{-1}, and after 256 updates it is called S_255. You feed this scrambled state into the PRGA algorithm, and it gives you the keystream. Each keystream output is a byte, so any time you need a new byte, you run the PRGA once. Let me tell you exactly how it works. You have a 256-byte state, which starts as the identity permutation: 0, 1, up to 255. Then you have two registers called i and j. The value of i, if you look here, is incremented each time, so for every state update i is completely predictable. But the value of j depends on the value of the key, so it is pseudo-random. In every state update you swap the values S[i] and S[j]: 7 comes here, 0 comes here. Then in the next state update you increment i by one, and j, depending on the value of the key, goes for instance to index 12, and you swap those two values.
So 12 comes here, 1 comes here, and you repeat this process 256 times. At the end, what you get is a scrambled state that looks like this. The difference between the PRGA and the KSA is that in the PRGA the index i starts from 1, and j depends on a value of the state. You again do the swapping, so 3 and 7 are swapped. Then, to compute the keystream, we compute the function S[S[i] + S[j]]. In this case S[i] is 7 and S[j] is 3, so it's S[10], and S[10] is 189, so you output 189. Any time you need a new keystream byte, you run the PRGA once more, keeping i and j in the loop. That is simply how RC4 works. RC4 is used in different applications. For instance, one application is WEP; it is used in WPA, it is used in SSL/TLS, and several other applications. In the next talk you will probably hear about it in broadcast encryption mode. So how is it used in WEP? First of all, we are attacking WEP in the version with a 104-bit secret key, 128 bits including the IV. The key is divided into 16 bytes, K0 to K15, and the keystream is produced one byte at a time. Since there is no IV embedded in RC4 itself, the designers of WEP had to embed an IV somehow, and what they decided to do was to use the first three bytes of the key as the IV of the packet. So these values IV0, IV1 and IV2 are given in clear text over the channel. And the mistake they made, or maybe they didn't know it was a mistake because they assumed RC4 is perfectly secure, is that the 13 remaining bytes are completely fixed for the subsequent packet encryptions: these values do not change over the consecutive packets. Because of the biases and correlations that exist in RC4, this makes WEP vulnerable: you can run statistical attacks on the 13 bytes and find the key pretty straightforwardly.
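The KSA and PRGA described above can be sketched in a few lines of Python. This is a toy illustration of plain RC4 with WEP-style keying, not the patched Aircrack-ng code; the IV and secret key byte values below are made-up examples.

```python
def ksa(key):
    """Key-scheduling algorithm: scramble the identity permutation
    using the key bytes (this produces the state the talk calls S_255)."""
    S = list(range(256))
    j = 0
    for i in range(256):              # 256 state updates
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]       # swap S[i] and S[j]
    return S

def prga(S):
    """Pseudo-random generation algorithm: each iteration outputs one
    keystream byte z = S[S[i] + S[j]]."""
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

# WEP-style keying: a 3-byte public IV is prepended to the fixed
# 13-byte secret key (all byte values below are made-up examples).
iv = [0x01, 0x02, 0x03]
secret_key = list(range(13))
keystream_gen = prga(ksa(iv + secret_key))
packet_keystream = [next(keystream_gen) for _ in range(16)]
```

Each packet re-runs the KSA with a fresh IV but the same 13 secret bytes, which is exactly the structural weakness the statistical attack exploits.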
Okay, so let's talk about the attack and how it works. As with any other stream cipher, from the key we produce the keystream. So what are the correlations we are trying to exploit here? Imagine that we have access to the keystream, and our goal is to reverse this process and find the key. You might ask: okay, yes, in theory, in crypto we assume we have the keystream, but what happens in practice, where we only have the ciphertext? Due to the protocol specification of IEEE 802.11, in both WEP and WPA some of the initial bytes of the keystream correspond to known headers in the encrypted packet. So they are already known: for instance, up to the 32nd byte of the keystream is known, and you just XOR the ciphertext with the known plaintext to get the keystream. So the main goal is to reverse this process using correlations that give predictions, or votes, for the value of the key. What do these biases look like? We call them conditional biases. They are triples of a predicate, a function f_j, and a probability p_j. What we do is capture the keystream from the communication channel and check whether the predicate holds; it involves z and some clue value that we assume is known. If the predicate holds, then you say that the value of K̄_i is equal to a specific function of the keystream, and this relation holds with some probability. Why do I say K̄_i, and what's the difference between K̄_i and K_i? K̄_i is the sum of the bytes of the key from 0 to i: K_0 + K_1 + ... + K_i, modulo 256. This is for some technical reasons, because it increases our success probability; you can look at the paper to find out exactly why. So here is one of the biases that we have in reality: this is the famous KoreK bias A_u15, and these are its predicates.
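The known-plaintext keystream recovery mentioned above can be sketched like this. The header and ciphertext bytes are made-up placeholders, not the real 802.11/SNAP header values:

```python
# The first plaintext bytes of a WEP packet are fixed protocol headers,
# so XORing them against the captured ciphertext reveals the first
# keystream bytes (all byte values here are illustrative placeholders).
known_header = bytes([0xAA, 0xAA, 0x03, 0x00])  # hypothetical known plaintext
captured     = bytes([0x5C, 0x13, 0x77, 0x42])  # hypothetical ciphertext

keystream = bytes(c ^ p for c, p in zip(captured, known_header))
```

The same XOR trick is what makes the passive attack possible at all: no decryption is needed to learn the first keystream bytes of every captured packet.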
You look at the value of z_2 in reality: if it is zero, and if the value of S_t[i] is zero, where t is a known index, meaning the state after the t-th iteration of the KSA at index i, then you vote for the value of the key. You say that the value of K̄_i is equal to 2 − σ_i, with some probability, where σ_i is a value that we know and can compute. For RC4 in the WEP application we have 22 such biases, and each of them has different probabilities, different conditions, different key-recovery relations, and so on. What we did at Eurocrypt was to build a framework for analyzing all these biases, with their different conditions and so on, in an optimized manner, so we can get the most out of all those biases and drop the number of packets we need to break WEP. Okay, so let me show you the trend of the attacks against WEP. In 1994, RC4 was disclosed. Then came the first attack, the Roos correlation, in 1995, and then the famous attack of Fluhrer, Mantin and Shamir in 2001, which broke WEP. Then you have the KoreK attacks, which appeared in 2004, and software such as Aircrack-ng, which could break wireless communications, also appeared in 2004. Then you have the PTW attack, the famous attack breaking WEP in less than 60 seconds, and then everything up to now, which is 2013. What I want to focus on in this picture is this: with the initial attack of Fluhrer, Mantin and Shamir, you needed 5,500,000 packets to break WEP. This dropped to 100,000 with the KoreK attacks. And right now, what we have is that we need something like 20,000 packets to break WEP, and we think that beating this could be extremely hard. Getting below this with the biases currently known for RC4 is hard, but if you find more correlations, you can probably drop this number of packets further. So where is this 19,800 coming from?
It is what you get if we restrict the running time of the attack to only five seconds on a normal PC. If you want to run it for, I don't know, one hour, this 19,800 drops much lower; it holds only when the attack runs for around five seconds on a PC. Okay, so how does the attack work? I will just briefly tell you how it works without going into the details. Due to a relation that exists in RC4, it is always good to recover the value of K̄_15 first, because we can use more of the correlations we have on RC4 if we have the value of K̄_15. So you have a loop here that computes the value of K̄_15. How is this computation done? We look at all the biases we have for RC4 that vote for the value of K̄_15. We look at the keystream, and if the predicate holds, we increment the counter corresponding to the voted value in the table for K̄_15, whose indices are 0 to 255, by the corresponding weight. Then at the end we sort the table and pick the top value: this will be the best vote for K̄_15. Using this value of K̄_15, we then vote for the value of K̄_3 similarly. Then we update the state to S_3, because once we have the value of K̄_3 we can update the state to S_3, and then we recover K̄_4. Then we update the state to S_4, and we proceed recursively, as you see in this loop, until we recover K̄_14. Then you have the values of K̄_3 to K̄_15, and the values of K_0, K_1 and K_2 are already known from the IV. You check whether the key is correct. If it is correct, you're happy. If it is not correct, you run the recursive attack again and again, up to some depth. For instance, one of the differences that we saw between practice and theory is that here you have a recursive loop.
Having a recursive loop in reality, when you are implementing the attack, is extremely expensive, because if you make a wrong guess at any of these steps, in theory everything afterwards would be completely wrong. But what we saw in our experiments was that even if you have, for instance, a wrong guess for the value of K̄_3, in many, many cases we can still recover the value of K̄_4 with high probability. This is because if the value of K̄_3 is guessed incorrectly, it only modifies 16 consecutive swaps out of the 256 possible swaps, so guessing K̄_3 incorrectly does not have that much effect. So in the end, in practice, we decided not to do this recursively: we did it in a non-recursive manner, and we obtained almost the same results, which is much faster in reality than doing it in a recursive manner. What I want to focus on here, because I am going to talk about it later, is a value that we call Y(x): the counter for x. What is the meaning of the counter for x? As I said, we have a table with indices 0, 1, up to 255, and you are voting for those values. Every time we increase the counter corresponding to a value, that specific counter is what we call Y(x). And what is the meaning of the rank R(x)? After you sort this table, imagine that the best element in the list is ranked 0, the second best element has rank 1, and so on, and the worst one is 255. So R(x) just means the rank of x. All the parameters you see in this attack are completely optimized, and we checked in practice that they really work. So, what are the challenges we faced when we tried to implement the attack, compared to what we did at Eurocrypt? Okay.
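The counter table Y(x) and the rank R(x) can be sketched as follows. The single "bias" here is a synthetic stand-in for the 22 real conditional biases (its predicate and vote function are invented for illustration); only the count/sort/rank mechanics mirror the attack.

```python
import random

# Toy stand-in for the conditional biases: one (predicate, vote, weight)
# rule whose vote is biased toward a planted "correct" byte value.
def toy_bias(correct):
    pred = lambda z, clue: (z[2] ^ clue) % 4 == 0   # fires on ~1/4 of packets
    vote = lambda z, clue: correct if (z[0] + clue) % 3 else random.randrange(256)
    return [(pred, vote, 1.0)]

def vote_and_rank(packets, biases):
    """Accumulate weighted votes in the 256-entry counter table Y(x),
    then sort; position 0 of the ranking is the candidate with R(x) = 0."""
    Y = [0.0] * 256
    for z, clue in packets:
        for pred, vote, weight in biases:
            if pred(z, clue):
                Y[vote(z, clue) % 256] += weight
    ranking = sorted(range(256), key=lambda x: -Y[x])
    return Y, ranking

random.seed(1)
correct = 42
packets = [([random.randrange(256) for _ in range(3)], random.randrange(256))
           for _ in range(5000)]
Y, ranking = vote_and_rank(packets, toy_bias(correct))
# With enough packets, the planted value floats up to rank 0.
```

The non-recursive variant of the attack simply runs one such vote-and-rank pass per key byte, instead of restarting everything after a wrong guess.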
In our Eurocrypt paper, we made the heuristic assumption that the variance of the good counter is equal to the variance of the bad counters, Y_good and Y_bad. Y_good means the counter corresponding to the correct value of the key, and the bad counters are all the other counters we have. Then we implemented it, and the results did not match what we predicted. We started testing what was going on, and we found out by experiment that those variances do not match. This can easily be solved: we just change the formula. The second issue was that we also made the heuristic assumption that the random variables Y_good − Y_i, for all bad i, are independent. That is something you always do in theory to simplify your analysis. Then we tested it in practice: they are not independent. This can again be solved; we just have a bunch of integrals, and it can be handled. But then the real problem came here. We assumed that the correct counter is normally distributed; in reality it was not. Then we assumed, okay, it should be a Poisson distribution; but then the expected value and the variance were not the same. So the next question was: what is the distribution of the rank? The only way that remained for us was to measure the distribution on real captured packets and see what it looks like. So this is how the distribution looks. Here you have the rank from 0 to 50. R_3 means the rank of the correct counter for recovering the value of K̄_3. If you have rank 0, it means your correct key is the best element in the list, and if the rank is, for instance, 50, it means the correct value of the key is the 50th element in the list. Of course rank 0 has the highest probability, and so on. This might look like an exponential distribution, but it is not.
So we started matching all the distributions that are out there against these curves, to see which of them looks like this distribution, and only one option was found. There is a distribution called the Pólya distribution, a generalization of the negative binomial distribution, which, when I plotted it in MATLAB, exactly matches what you see in practice. There is a red curve and a blue curve: one of them is the practical experiment, and the other one is the exact definition of the probability distribution, and you see they lie exactly on top of each other. Now, when you have two distributions exactly on top of each other, there should be a justification for that. So we started looking at where the Pólya distribution and the negative binomial distribution are used, and we found that the main applications are in the analysis of the frequency of hail occurrences and in tornado analysis. That is where the name Tornado attack comes from. So we started reading some papers about tornadoes and hail. These papers are from the 1960s; some of them were typed on a typewriter, and so on. For instance, here you can see it says: "hail occurrence, being a comparatively rare event, is fit well by the Poisson distribution, providing the hailstorms are independent. When this condition is not met, hail occurrence follows the negative binomial distribution." And there is another paper called "Tornado probabilities", which you see here, from 1963; this is another sentence from that paper. If I want to summarize the contributions of those two papers: tornadoes and hailstorms are rare events, so we normally model them with a Poisson distribution. But in areas where you have a high frequency of tornadoes or hail, if one hailstorm or tornado happens, it affects the probability that a second one happens. So you have multiple Poisson distributions that are dependent.
The contribution of those papers was to say that if you have multiple Poisson distributions that are dependent, the joint distribution of all of them follows a negative binomial distribution. So we started looking at whether we have the same situation here, and it is exactly the same situation. What is the meaning of the probability that our rank is zero? It means that the value of the good counter is larger than the first bad counter, larger than the second bad counter, and so on, up to the 255th bad counter. So you have random variables Y_good − Y_bad, each following a Poisson distribution, 255 of them, and as I said, in practice they are dependent. So the joint distribution of all of them follows a negative binomial distribution. That was a good justification of why our rank distribution follows the Pólya distribution. Solving this problem, and also the other two problems, we got something that works completely in practice. One more thing I want to focus on among the challenges of taking the theory to practice. These are the structures of the ARP packets in the 802.11 wireless protocol and of the TCP/IP packets that we send over the channel. When we are talking about ARP packets, we are talking about the active attack on WEP, because you are doing ARP injection into the network. The problem with active attacks is that some of the drivers out there for your network card are not able to inject packets into the traffic, so you need some extra patches for the driver to make it able to inject data into the traffic. For some network cards, if you go to the Aircrack-ng forum, there are many, many pages on how to make your network card able to inject packets, and for some network cards nothing is available. So there is always the hope that we can reduce the number of packets needed for a passive attack.
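The overdispersion argument above (dependent rare events lead to a negative binomial rather than a Poisson) can be checked numerically: a Poisson whose rate itself fluctuates, a Gamma-mixed Poisson, is one classical route to the negative binomial, and its variance sits well above its mean, whereas a pure Poisson has variance equal to its mean. This is a generic statistical sketch, not the paper's analysis; the rate parameters are arbitrary.

```python
import random

def poisson(lam, rng):
    """Sample a Poisson variate with Knuth's multiplication method."""
    L, k, p = 2.718281828459045 ** -lam, 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(0)
N = 20000

# Pure Poisson: for independent rare events, variance equals the mean.
pure = [poisson(5.0, rng) for _ in range(N)]

# Gamma-mixed Poisson: the event rate itself fluctuates (the "dependent
# tornadoes" situation).  Marginally this is negative binomial, so the
# variance exceeds the mean (overdispersion).
mixed = [poisson(rng.gammavariate(2.0, 2.5), rng) for _ in range(N)]

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

m_pure, v_pure = mean_var(pure)      # variance close to the mean
m_mixed, v_mixed = mean_var(mixed)   # variance clearly above the mean
```

This mirrors the hail papers' observation: once the rate varies (events are not independent), the Poisson fit breaks and the negative binomial takes over.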
In a passive attack we don't need to do any kind of injection into the system: we just capture the packets, and we still have a successful attack. This uses the TCP/IP packets: you just listen to the communication, capture the packets, and you succeed. But the problem with the passive attack using only TCP/IP packets is these values that you see here. The values in dark gray are the ones you cannot guess; they are unknown. It means, for instance, that you cannot recover the keystream corresponding to these two bytes, and also these two bytes, and these two bytes. The ones in light gray are the ones that you need to guess: you might guess them correctly, or you might not, but those values cannot be recovered directly. And the ones in white you can recover very easily; they are known. So you see that, comparing TCP/IP packets with ARP packets, there are many values that are unknown in TCP/IP packets while you already know them in ARP packets. That is why the active attacks on WEP work much better than the passive attacks: you cannot recover these specific bytes. These are things we had to take into account in the theory to be able to apply it in practice. So this is the result that we have. We are going to put this patch online in a week; we are writing some manuals about it before putting it online. I am going to put it on the LASEC page of EPFL, so it will be publicly available and you can download it. This is a comparison between the current Aircrack-ng engine that is out there and our attack. Here is the Aircrack-ng engine in active attack mode, where you need to inject packets into the traffic, and this is our passive attack mode. You see that this attack significantly outperforms Aircrack-ng in active attack mode.
So we have a passive attack that is much faster than Aircrack-ng in active attack mode, and this is our attack in active attack mode. For instance, to have a success rate of 50%, we need something like 22,500 packets to break WEP; but with 22,500 packets, you have only something like a 3% success rate if you use the current version of the Aircrack-ng engine. Okay, do I have two minutes or something? Yes? Oh, okay, so I have two minutes left, so I'm going to just... The point is that in active attacks you are using ARP packets, and in the passive attack you are using TCP/IP packets. Here, with ARP packets, we can recover all of the keystream bytes: for instance, in these parts you can recover this value, this value, and this value. This is not possible for TCP/IP packets, because these headers are not guessable, and those parts of the TCP/IP packets are encrypted, so you cannot guess those specific values. That is why the passive attack behaves more slowly than the active one. Okay, so let's thank the speaker again.