Good morning everyone and welcome to the first session this morning. The first talk of this session will be "Analyzing Multi-Key Security Degradation" by Atul Luykx and his co-authors; the co-authors cannot be here in person, so Atul will give the talk.

Do I need a microphone? Can everyone hear me? Thanks for the introduction. The type of research we did, analyzing conventional symmetric-key algorithms, might not be as sexy as post-quantum cryptography or computing on encrypted data, but what I am going to try to argue is that there is real motivation to look at these things, and that you can get interesting research out of them.

This is briefly the outline of the talk. First I will explain why it is interesting to look at these schemes and these models: in short, you can have a direct impact on algorithms that are being used today. Then I will introduce the multi-key setting properly and show that there are some obvious questions about it that are still unanswered. Finally, looking at the analysis, it turns out to be rather counterintuitive both to find attacks against schemes in this multi-key setting and, at the same time, to prove that schemes achieve some level of security.

Jumping right into the motivation: a few months ago, Reuters published an article titled "Distrustful U.S. allies force spy agency to back down in encryption fight". The article describes how the NSA has been trying to standardize its block ciphers Simon and Speck in ISO. Simon and Speck are designed to be lightweight block ciphers; there is a whole range of them, going from small block sizes of 32 bits all the way up to 128 bits, with corresponding key sizes. Setting the political issues aside, one of the things the article brings up is that, through the discussions and arguments in the ISO process, the NSA ultimately decided to drop all the block sizes below 128 bits, so right now they are only trying to standardize the 128-bit versions of Simon and Speck. The reason is that these smaller block sizes impose quite stringent restrictions on how much data you can process. In fact, the paper that ultimately convinced them to drop everything else was the Sweet32 attack, published last year. Sweet32 is an attack against TLS, and other protocols as well; in the case of TLS, the authors were able to recover secure cookies, and the attack worked against Triple-DES, which has a block size of 64 bits. It required about 785 gigabytes of traffic, so it is not exactly a practical, high-priority threat. What is more important is that this is a birthday-bound attack. These attacks have been known to cryptographers for over 20 years; they are often considered impractical because they require known plaintext and reveal little information, but Sweet32 gave a first proof of concept of this type of attack in practice. That actually convinced a lot of people at these ISO meetings that maybe there is a problem with standardizing the smaller block sizes. And it might not be the last such attack.
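As a rough back-of-the-envelope illustration of why a few hundred gigabytes is the right ballpark for a 64-bit block cipher (my own toy calculation, not a description of the actual Sweet32 procedure), here is the birthday-bound arithmetic:

```python
import math

BLOCK_BITS = 64               # block size of Triple-DES and Blowfish
BLOCK_BYTES = BLOCK_BITS // 8

def expected_collisions(traffic_bytes: float) -> float:
    """Expected number of colliding ciphertext-block pairs among the observed blocks."""
    n = traffic_bytes / BLOCK_BYTES
    return n * (n - 1) / 2 / 2 ** BLOCK_BITS

def collision_probability(traffic_bytes: float) -> float:
    """Probability of at least one collision (Poisson approximation to the birthday bound)."""
    return 1 - math.exp(-expected_collisions(traffic_bytes))

if __name__ == "__main__":
    for gigabytes in (1, 32, 785):
        traffic = gigabytes * 10 ** 9
        print(f"{gigabytes:>4} GB: ~{expected_collisions(traffic):.2f} expected collisions, "
              f"Pr[collision] ~ {collision_probability(traffic):.3f}")
```

In CBC mode, each pair of colliding ciphertext blocks leaks the XOR of two plaintext blocks, which is the kind of information Sweet32 exploits; the only point here is that at hundreds of gigabytes a 64-bit block cipher is essentially guaranteed to produce collisions.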
Going a step further with these birthday-bound attacks: these are attacks against the cryptographic algorithms themselves, which means they are independent of the implementation, independent of the standard, independent of the block cipher. The Sweet32 attack works equally well against OpenVPN with long-lived Blowfish connections as it does against Triple-DES in TLS. As I mentioned, these attacks have been known to cryptographers for over 20 years, and of course we have developed a lot of theory to understand when they apply and when they do not. We have a very succinct way of describing how long we can keep using these algorithms, via security bounds. Such a bound summarizes the insecurity of the algorithm you are considering as a sum of two terms: a term describing the security of the mode of operation, plus a term describing the security of the block cipher. The block-cipher term is where most of the cryptanalysis knowledge goes, and the mode term usually comes with some kind of proof.

Now, how can you make sure you avoid these birthday-bound attacks? You take your parameters and plug them into the bound. Say you have a certain sensitivity and you do not want the adversary's advantage in attacking your scheme to exceed some threshold; then you make sure the mode term stays small enough. That term typically depends on, for example, how many queries the adversary can make and the length of those queries in blocks, relative to the block size. If the block size is too small, as with the smaller versions of Simon and Speck, you cannot allow many queries, or very long ones. So that is the context in which we did our research.

Now, moving on to the multi-key setting. A lot of the previous analysis of these symmetric-key algorithms and modes of operation has been done in the so-called single-key setting, in which an adversary interacts with just one keyed instance of the algorithm. But in practice the multi-key setting matters, because algorithms are never used in isolation with just one key; they are used with millions of keys. In the multi-key setting the adversary has access to many independently keyed copies of the algorithm. One example is AES-GCM in TLS: hundreds of millions of different keys, used by many different users. Another example is the Internet of Things, where you have a whole bunch of devices on which it might even be difficult to change keys.

So the question is: we have these security bounds from the single-key analysis, what can we say about the multi-key setting? Can any catastrophes happen? Well, there is a folklore result which says that the success probability of an adversary attacking the multi-key setting is bounded above by the single-key success probability times the number of keys present. This is valid in all kinds of settings, via a fairly straightforward extension of the single-key analysis to the multi-key setting. So this is great: it means there are no catastrophes, and if we use an algorithm with independently chosen keys, we are still fine.
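In symbols (my notation, simply restating the two statements above), the single-key bound shape and the folklore multi-key bound look roughly like this:

```latex
% q = number of queries, L = max query length in blocks,
% t = computation, \mu = number of keys (notation mine)
\mathrm{Adv}^{\mathrm{single\text{-}key}}_{\mathrm{scheme}}(q, L, t)
  \le \underbrace{\epsilon_{\mathrm{mode}}(q, L)}_{\text{proof for the mode}}
    + \underbrace{\epsilon_{\mathrm{cipher}}(qL, t)}_{\text{block-cipher cryptanalysis}}
\qquad
\mathrm{Adv}^{\mathrm{multi\text{-}key}}_{\mathrm{scheme}}(\mu; q, L, t)
  \le \mu \cdot \mathrm{Adv}^{\mathrm{single\text{-}key}}_{\mathrm{scheme}}(q, L, t)
```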
But once we actually take into account how large that number of keys can be, it can reach hundreds of millions, this multiplication starts to tighten the restrictions on the amount of data we can process per key. Kenny Paterson and I looked at this in 2016: we simply computed these bounds for TLS to see how much data you could actually process with these algorithms. We looked at GCM and ChaCha20-Poly1305, using the state-of-the-art security bounds at the time. That work actually led to the inclusion of the key update feature in TLS 1.3, which allows you to change keys, and in particular because of this multi-key setting. The people working on TLS are rather conservative about what they want an adversary to be able to do: they want the adversary's success probability to be at most 2^-60, which already puts significant limits on the amount of data you can process. And when you start multiplying by the number of users, all the keys used with TLS throughout the world (that might be a slight exaggeration), the data limits get even more stringent. But then of course we started questioning this: there are these stringent data limits, yet we do not know of any attacks against these schemes. That is how this research started, and if there really are no attacks, that key update feature might not actually be necessary.

Going deeper into the multi-key setting, let us look at an example with block ciphers. Here I have depicted an adversary interacting with an algorithm. We measure the adversary along two axes: the computation it performs, which we call computational complexity, and the queries it makes to the oracle, which we call data complexity. These are the two measures by which we will judge attacks. What you can do for block ciphers, and what we have done here for AES-128 key recovery, is map the data complexity needed to perform key recovery against the computational complexity. In the top left corner you see, for example, brute-force key search, which requires one plaintext-ciphertext pair and 2^128 computation. If you go through the literature and collect the best known attacks against AES-128, you see that no matter how much data complexity you give the adversary, it cannot reduce its computational complexity at all, or only by a very small amount; I think the best is a factor of two or four improvement over brute force.

Now contrast this with the multi-key setting, in which the adversary has access to, say, 2^30 different keyed instances of AES: the picture changes completely. Whereas before data complexity made no difference, with additional data complexity you can now reduce your computational complexity quite significantly, up to a factor of the number of users. So in some settings the multi-key setting does introduce a qualitative difference. What is this attack that lets you reduce the computational complexity? You simply pre-compute: you take one plaintext, and you pre-compute the encryption of that plaintext under a whole bunch of keys that you choose in advance. Then you take that same plaintext and query it to your online oracles, which hold different, unknown keys. Then you compare the ciphertexts you pre-computed with the ciphertexts you get back from the oracles, and the moment you have a collision, you know with very high probability that the key you chose in the pre-computation equals the key used by that oracle, so you have recovered one of the online keys. This is essentially a birthday bound in the key size of the block cipher. The attack was published by Biham in 2002 and extended to a time-memory-data trade-off at SAC 2005.
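Here is a toy sketch of that key-collision idea with deliberately tiny parameters. The "block cipher" is just a keyed hash used as a stand-in, and the key and table sizes are invented so the script finishes instantly; it only illustrates the bookkeeping of the attack, not realistic costs:

```python
import hashlib
import secrets

KEY_BITS = 20                    # toy key size so the demo runs instantly
FIXED_PLAINTEXT = b"\x00" * 8    # the single plaintext everyone is asked to encrypt

def toy_encrypt(key: int, plaintext: bytes) -> bytes:
    """Keyed-hash stand-in for a block cipher; NOT a real cipher, just for bookkeeping."""
    return hashlib.sha256(key.to_bytes(4, "big") + plaintext).digest()[:8]

def precompute(num_keys: int) -> dict:
    """Offline phase: encrypt the fixed plaintext under keys we pick ourselves."""
    return {toy_encrypt(k, FIXED_PLAINTEXT): k for k in range(num_keys)}

def online_phase(num_users: int, table: dict) -> list:
    """Online phase: each user encrypts the same plaintext under its own secret key;
    any ciphertext already in the table reveals that user's key."""
    recovered = []
    for user in range(num_users):
        secret_key = secrets.randbelow(2 ** KEY_BITS)   # unknown to the attacker
        ciphertext = toy_encrypt(secret_key, FIXED_PLAINTEXT)
        if ciphertext in table:
            recovered.append((user, table[ciphertext]))
    return recovered

if __name__ == "__main__":
    table = precompute(2 ** 12)          # 2^12 offline encryptions
    hits = online_phase(2 ** 10, table)  # 2^10 users; expect ~ 2^12 * 2^10 / 2^20 = 4 hits
    print(f"recovered {len(hits)} user keys, e.g. {hits[:3]}")
```

With realistic sizes, an offline table of 2^t encryptions recovers one of mu online keys once 2^t times mu approaches 2^k (the key space), which is why the adversary can trade extra data, in the form of more users, for less computation.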
So that is the state of the art for block ciphers in the multi-key setting: the bound is matched by an attack. Now, what happens for modes of operation, and in particular for GCM, which is the algorithm we were interested in because of TLS? The details of GCM are not so important here; the only thing that matters is that GCM makes block cipher calls, which are the white boxes in the diagram. Whereas in practice you would instantiate those calls with AES under a key, when you analyze the security of the mode you replace each block cipher call by a uniform random permutation. And once you analyze this idealized mode, Biham's attack, which was the best known attack for block ciphers, is no longer applicable: there are no keys for which you can pre-compute ciphertexts, or at least the descriptions of these random permutations are far too long and would not give you any information. So the only thing we have for GCM is the single-key security bound, together with the folklore result, which gives a multi-key security bound.

What happens when you map that out in a graph? For modes of operation, computational complexity does not play a role, because the analysis is done against information-theoretic adversaries; the only thing left is data complexity. So we can plot data complexity against the success probability of attacking the scheme. This line is the single-key bound: if an adversary can make 2^46 queries, it has a success probability of roughly 2^-64. Now give the adversary access to 2^30 users, and all of a sudden the line shifts up; that is the best known multi-key security bound for GCM. So if you want to reach the same level of security, say 2^-64 as in the single-key setting, the amount of data you can process per key suddenly drops significantly. And yes, of course, this is a log-log plot.
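To make that shift concrete, here is a rough calculation in the same spirit as the TLS analysis, but using a generic birthday-type stand-in bound of sigma^2 / 2^n rather than the actual GCM bound, so the exact numbers are only illustrative:

```python
import math

BLOCK_BITS = 128        # AES block size
TARGET_ADV = 2.0 ** -60 # the conservative success probability target mentioned for TLS

def stand_in_single_key_bound(blocks: float) -> float:
    """Generic birthday-type bound sigma^2 / 2^n (a stand-in, NOT the real GCM bound)."""
    return blocks ** 2 / 2 ** BLOCK_BITS

def max_blocks_per_key(num_keys: float) -> float:
    """Largest sigma with num_keys * stand_in_single_key_bound(sigma) <= TARGET_ADV,
    i.e. the folklore multi-key bound applied to the stand-in bound."""
    return math.sqrt(TARGET_ADV * 2 ** BLOCK_BITS / num_keys)

if __name__ == "__main__":
    for num_keys in (1, 2 ** 30):
        sigma = max_blocks_per_key(num_keys)
        print(f"keys = 2^{math.log2(num_keys):2.0f}: "
              f"about 2^{math.log2(sigma):.1f} blocks per key (~{sigma * 16:.3g} bytes)")
```

The real GCM bound has a different shape (it also involves the maximum record length and a forgery term), so the concrete limits in the talk and in TLS 1.3 differ; the point is only how multiplying by the number of keys shrinks the per-key data limit.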
So, putting that side by side: for block ciphers there is indeed a qualitative difference, you can reduce the computational complexity, and it is matched by an attack, so we know exactly what happens. In the case of GCM, we have the single-key bound and this folklore multi-key result, but we do not know of any attacks. We have no clue: we cannot apply Biham's attack, and we do not know anything else, basically. So the question of our work was to understand this gap. Initially we tried to characterize exactly the necessary and sufficient conditions for this degradation to occur. What we came up with is a sufficient condition, meaning that if a mode of operation satisfies the condition, then the degradation does not occur. We proved this in a fairly abstract setting. We introduced some definitions and then applied the result to GCM.

Now I will try to give you an intuition for what this sufficient condition is. It helps to think of Pachinko. The Japanese people out there know it very well: Pachinko is basically a kind of slot machine in Japan. You put little balls in at the top, and you need to get them into certain spots to win; you win more balls, so you can play even more Pachinko. We are going to draw an analogy between Pachinko and attacking cryptographic algorithms. The player is the adversary, and its data complexity is measured in balls. Its goal of breaking the algorithm corresponds to winning the game, getting a jackpot. For the gambler, data complexity is money, or balls, and success is a jackpot. What I have depicted here is one machine, which is the single-key setting, but if you have ever been to Tokyo and walked down the street, you never see one of these machines by itself; you see a whole row of them, with a whole bunch of people playing. And these people often collaborate, saying this machine is better, or that one is worse. So, to describe the multi-key setting, say you have a player with a 500-ball budget, a 500-query budget, and access to 100 Pachinko machines. The single-key setting means the player has to spend all 500 balls on just one machine, because it only has access to one of them; in the multi-key setting it has access to all 100 machines and somehow distributes those 500 balls over them.

The folklore result says that using 500 balls on 100 Pachinko machines gives at most a factor of 100 higher success probability than using 500 balls on one machine. In other words, the bound leaves room for the adversary to come up with some special strategy that distributes those same 500 balls across the 100 machines in such a way that its success probability really does improve by a factor of 100. This is of course very counterintuitive: I mean, what would you do? Is this ever possible? Well, in fact, yes, it is. Assume that a machine is lucky with some small probability. With one machine you are stuck with whatever you got, lucky or not, and you have to spend all 500 balls on it. But if you have 100 machines, and say you can collaborate with a bunch of people, you can try to find the lucky machine and then spend your entire remaining budget on that one. For example, with 100 machines, you spend one ball on each machine to determine which one is lucky, and once you find the lucky one you spend the remaining 400 balls on it.
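Here is a tiny simulation of that find-the-lucky-machine strategy. All the probabilities are invented purely for illustration; a lucky machine is modelled as one with a much higher per-ball jackpot probability, and scouting a machine with one ball is assumed to reveal whether it is lucky, analogous to identifying a weak key with a single query:

```python
import random

NUM_MACHINES = 100       # machines available in the multi-key analogue
BUDGET = 500             # total balls (queries)
P_LUCKY = 0.01           # probability that a given machine is "lucky"
P_JACKPOT_LUCKY = 0.05   # per-ball jackpot probability on a lucky machine
P_JACKPOT_NORMAL = 1e-5  # per-ball jackpot probability on a normal machine

def fresh_machine() -> float:
    """Sample a machine: returns its per-ball jackpot probability."""
    return P_JACKPOT_LUCKY if random.random() < P_LUCKY else P_JACKPOT_NORMAL

def play(p_jackpot: float, balls: int) -> bool:
    """Throw `balls` balls into one machine; True if any of them hits the jackpot."""
    return any(random.random() < p_jackpot for _ in range(balls))

def single_machine() -> bool:
    """Single-key analogue: the whole budget goes into one machine, lucky or not."""
    return play(fresh_machine(), BUDGET)

def scout_then_commit() -> bool:
    """Multi-key analogue: one ball per machine to spot a lucky one, then the
    remaining budget on a lucky machine if any was found. Jackpots during the
    scouting phase are ignored, which only under-counts this strategy."""
    machines = [fresh_machine() for _ in range(NUM_MACHINES)]
    remaining = BUDGET - NUM_MACHINES
    lucky = [p for p in machines if p == P_JACKPOT_LUCKY]
    return play(lucky[0] if lucky else machines[0], remaining)

if __name__ == "__main__":
    trials = 10_000
    for name, strategy in (("single machine", single_machine),
                           ("scout then commit", scout_then_commit)):
        wins = sum(strategy() for _ in range(trials))
        print(f"{name:>18}: success rate ~ {wins / trials:.4f}")
```

With these made-up numbers the scouting strategy succeeds roughly forty times more often on the same 500-ball budget, which is exactly the kind of multiplicative gap the folklore bound allows for.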
And this is not just pathological: there are actual designs that are lucky in this sense, namely block ciphers with weak keys. Take Midori64 as an example. It has been shown to have 2^32 weak keys out of 2^128, and a weak key is identifiable with one query. Once you have identified a weak key, you can perform a weak-key recovery that requires only about 2^16 computation and very little data, significantly less than for a normal key.

So, in the single-key setting, with one user, either you can perform this weak-key attack or you have to do brute-force search. Either you are stuck with a strong key, which means you pretty much have to do brute-force search, or you have a weak key, in which case your computational cost is significantly reduced, your data cost is only a couple of queries, and the success probability is basically the probability of having been given one of these weak keys. In the multi-key setting, on the other hand, you can make one query to each of, say, 2^16 different oracles, and then perform the weak-key attack as soon as one of those oracles turns out to have a weak key; this improves your success probability to about 2^-80.
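To sanity-check those numbers (my arithmetic, using only the parameters just quoted): a random Midori64 key is weak with probability 2^32 / 2^128 = 2^-96, so across 2^16 independently keyed oracles the chance of hitting at least one weak key is

```latex
\Pr[\text{at least one of } 2^{16} \text{ keys is weak}]
  \;\approx\; 2^{16} \cdot \frac{2^{32}}{2^{128}}
  \;=\; 2^{-80},
```

at a total cost of roughly 2^16 identification queries plus one weak-key recovery of about 2^16 computation, rather than anything close to a 2^128 brute-force search.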
So what we have now is that you can perform this key-collision recovery attack if the key size is not too big, and you can exploit weak instances if they are present. But then the question is: what else can happen? Are there other kinds of attacks, or can we perhaps prove that in certain settings these schemes remain secure? That is exactly what we did in the case of GCM, and what we found is that if an algorithm satisfies a particular condition, then it does not suffer from any of these weak-instance problems.

To describe the condition we need to describe two settings. In the first, the adversary has access to one Pachinko machine and, for example, a 300-ball budget, and a friend has already played on that same machine with fewer than 200 balls, so the adversary gets a history of fewer than 200 balls of what has happened on that machine. The second setting is exactly the same, except that the friend has played on the machine with exactly 200 balls. So the only difference is that the history in the first setting is shorter than the history in the second. The condition says: if, for all histories below some cost, the jackpot probability in the first setting is at most the jackpot probability in the second, then adversaries gain no advantage from interacting with multiple Pachinko machines. In other words, by analyzing only the setting in which the adversary interacts with a single machine, we can conclude something about the setting in which it interacts with many machines.

Under this condition, lucky machines are excluded. Take setting one, in which the adversary just interacts with one Pachinko machine, and setting two, in which the adversary interacts with one machine and the friend's history tells you that the machine is not lucky, meaning it is a strong machine. Now the adversary's probability of winning in setting one might actually be higher than in setting two, because in setting two the adversary is stuck with a very strong machine, while in setting one it still has a chance of getting a lucky machine. That would violate the condition, and hence any algorithm that satisfies our condition cannot suffer from this kind of lucky-machine problem. Using this condition, we were then able to show that this is indeed the case for GCM: there is no degradation, regardless of the number of users the adversary interacts with.

So, briefly, to conclude. Understanding the limits that attacks impose on modes of operation is important because of all these practical applications. Multi-key security bounds are not necessarily matched by attacks, and finding such attacks is not obvious at all. At the same time, proving the absence of multi-key security degradation is not easy despite the lack of attacks; even applying our condition is not very easy. There is also work by Hoang and Tessaro, who came up with other conditions, in other settings, for proving the absence of multi-key security degradation. Thank you for your attention.

Audience member: So, GCM satisfies that condition; is it the only one, or is it something that many modes of operation might satisfy?
Speaker: When you say satisfy, you mean this multi-key security degradation?
Audience member: I mean your criterion.
Speaker: My criterion? The only things we have been able to prove it for are GCM and polynomial-based hash functions. Like I said, it is not very easy to apply.
Session chair: Okay, thank you. Let's thank the speaker again.