 to the fourth talk. The title is Unknown-Input Attacks in the Parallel Setting: Improving the Security of the CHES 2012 Leakage-Resilient PRF. And the authors are Marcel Medwed, François-Xavier Standaert, Ventzislav Nikov, and Martin Feldhofer. And Marcel will give the talk. OK. Thanks for the introduction. So today, I want to speak about how to improve on the CHES 2012 leakage-resilient PRF, and not only how to improve the security, but also how to lower the implementation requirements. With our construction, it is possible to use a plain parallel AES implementation, which was not possible before, while keeping the same performance. OK, I will start by recapping the countermeasure landscape and the cost of countermeasures, then motivate why PRFs, or leakage-resilient PRFs, are useful, then go through the different constructions, ending with our construction, then give some results on the analysis we've done, and conclude. So we've already heard before that if we put a cryptographic implementation, like a block cipher, on a device, we have several sources of side-channel information, like the timing, which we need to keep constant, but also instantaneous leakage, which might come from power or EM. And as we just heard, one of the things you want to do as an implementer is to keep the signal-to-noise ratio of these side-channel traces low, or apply masking. The power consumption depends on the data processed in the device, and this data depends on the key. With masking, we want to make the statistical moments of the leakage distribution independent of the key, at least the lower ones. Another thing we can do is to limit the number of measurements an adversary can make, or in other words, limit the key usage. And then there are other problems we have to take care of, but we are not so concerned with those in this particular talk. So I will focus on side-channel leakage.
And for this, there are well-studied countermeasures, for instance masking, whose costs are quadratic in the number of shares in the general case, for an arbitrary circuit, while the security is exponential in the number of shares. Then we have time randomization, where costs and security are both roughly linear. And then we also need fault protection. I won't discuss this in this talk, but our construction also gives advantages there. Also linear, and a real device must be protected against all of them. So you can basically multiply the costs of all of them, and that might be too much for a low-cost device, or it might be high in general. Also, combining those countermeasures is not even straightforward, because sometimes the goals contradict each other: adding redundancy might give more information in the trace, which is not what you want for side-channel protection. So one thing to do, as I said, is to use key updates to limit the number of measurements. For instance, you can implement this leakage-resilient PRG, or stream cipher. You can see that you start from a key, and then you just go to the right. In every stage, you update the key, and you also produce output, like a keystream. And in every stage, the key is only used twice. So if you have bounded leakage for the block cipher implementation, then you also have bounded leakage for these two calls. And for some constructions, you have a proof that you can run this basically as long as it is still secure in the black-box model, and you will be fine: you will have bounded leakage for the whole construction. The problem is you can run this only once, because for a stream cipher, you need to update the seed every time. So the question is, how can we update this seed in a bounded-leakage way? And, at least in the proofs, you often need a leak-free gadget for the initialization.
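The stream-cipher-style construction just described can be sketched in a few lines. This is a minimal illustration under assumptions of my own: truncated SHA-256 stands in for the 128-bit block cipher (the talk assumes something like AES), and the two fixed plaintexts are chosen arbitrarily.

```python
import hashlib

def enc(key, pt):
    # Stand-in for a 128-bit block cipher such as AES; illustration only.
    return hashlib.sha256(key + pt).digest()[:16]

def lr_prg(seed, n_blocks):
    """Leakage-resilient PRG / stream cipher: every stage uses its key for
    exactly two encryptions -- one to produce a keystream block, one to
    derive the next key -- so the leakage per key stays bounded."""
    P0, P1 = bytes(16), bytes([1] * 16)  # two fixed, distinct plaintexts
    key = seed
    stream = []
    for _ in range(n_blocks):
        stream.append(enc(key, P0))  # output call (keystream block)
        key = enc(key, P1)           # key-update call
    return b"".join(stream)
```

Each key appears in exactly two encryptions, which is the bounded-leakage argument; but the whole chain can only be run once per seed, which is exactly the seed-update problem the talk goes on to address.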
One thing you could do is to have a highly protected implementation, like masking. Then, overall, you might get better performance than always using a masked implementation, but you still have no bounded leakage. You can improve on the performance by using rekeying. There you usually have a special dedicated function for the seed update, which is linear, so that masking also has only linear costs. So you can mask to higher orders, but still you have no bounded leakage. And that's one of the motivations for actually instantiating a leakage-resilient PRF, which would allow you to initialize here with bounded leakage. In this work, we attempt such a construction. We have no proof for it, because if you allow, for instance, an adaptive leakage function, you cannot really show anything for hardware. But if we fix some assumptions, like Hamming-weight leakage and so on, we can at least conduct experiments, and they suggest very positive results on the bounded-leakage side. Our construction works in a parallel setting, and therefore I will first talk about DPA attacks against a parallel implementation of AES. That means we have a plaintext input and a ciphertext output, and inside, everything happens at the same time. So, for instance, in the first S-box layer, all 16 S-boxes are evaluated at the same time. In an attack, we would collect traces for many different plaintexts. We know these values, the plaintext bytes; for P1 we would make a key guess, and this would allow us to predict the S-box output here. Applying a leakage model, we can therefore predict the leakage of the device under a certain guessed key. Then, having measured the actual leakage from the device, we can make a comparison and decide which guess is the correct key. Of course, in a parallel implementation, we don't only leak about this intermediate value, but also about the others. Fortunately, our plaintext bytes are independent.
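The attack just outlined can be simulated in a short script. This sketch is hypothetical in its details: it uses a random toy S-box rather than the AES S-box, noise-free Hamming-weight leakage with all 16 parallel S-boxes summed together, and a correlation distinguisher as a simple stand-in for the templates used in the talk.

```python
import random

random.seed(1)
# Toy substitute for the AES S-box (any fixed nonlinear table works here).
SBOX = list(range(256))
random.shuffle(SBOX)

def hw(x):
    return bin(x).count("1")

def leak(pt, key):
    # Noise-free Hamming-weight leakage of a fully parallel S-box layer:
    # all 16 S-boxes switch at once, so only the sum is observable.
    return sum(hw(SBOX[p ^ k]) for p, k in zip(pt, key))

key = [random.randrange(256) for _ in range(16)]
pts = [[random.randrange(256) for _ in range(16)] for _ in range(2000)]
traces = [(pt, leak(pt, key)) for pt in pts]

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Attack key byte 0 alone: the other 15 S-boxes act as independent
# algorithmic noise, so the correct guess still wins with enough traces.
meas = [l for _, l in traces]
scores = [(corr([hw(SBOX[pt[0] ^ g]) for pt, _ in traces], meas), g)
          for g in range(256)]
best_guess = max(scores)[1]
```

This is the divide-and-conquer structure: repeating the loop for each byte position yields the 16 score vectors mentioned next, and more algorithmic noise simply means more traces until `best_guess` matches.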
Because the plaintext bytes are independent, we can marginalize the distributions of the other S-boxes, and the noise they produce becomes independent algorithmic noise, which we can easily model and take into account during the attack. So with more algorithmic noise, we will need more traces, but we can always recover the key given enough traces. And eventually we get score vectors for every S-box, because we attack the S-boxes one after the other. So we get 16 score vectors, suggesting the 16 key bytes. Here is a plot of attack simulations. In our simulations, we always use noise-free Hamming-weight leakage and perfectly computed templates. You can see that parallelism increases the algorithmic noise, and therefore we need more traces. On the y-axis, you have the guessing entropy of a single key byte in log scale, and on the x-axis, you have the number of traces, also in log scale. The purple curve refers to a fully parallel AES implementation; for the blue one, there is no algorithmic noise at all. Apart from the fact that algorithmic noise increases the number of needed traces, security decreases exponentially with the number of traces you have. But what's positive is that if you were able to fix the number of plaintexts to two, and so the number of traces an adversary can get, then we see we have bounded leakage. This is not sufficient in the case of no algorithmic noise, but with enough algorithmic noise, it is already something we can work with. So the question is, how can we enforce this in a practical scenario to make it useful? And the answer is quite old: we can use the GGM PRF construction, which works as follows. We have a public input x, which we write as a bit string, we have a key, and we fix two plaintexts, in this case the all-0 and the all-1 plaintexts. We start with the key, and if the first bit of x is a 0, we encrypt the all-0 plaintext.
Otherwise, we encrypt the all-1 plaintext. The output we get is our new key for the next iteration, and we process the whole input x like this. The nice thing here is we only use two plaintexts, so we get only two traces per key. Of course, in practice, you can measure as often as you want, but since we are in a noise-free scenario, having one trace per plaintext is good enough, so it's already the optimum. The problem is we need 128 encryptions, which might be too much in practice. So the question is, can you improve this in terms of performance? A naive way would be to make this tree wider and use more plaintexts in every stage. But as we saw before, security degrades exponentially, so this is not an option. So can we do better in terms of security? At CHES 2012, the proposal was to use carefully chosen plaintexts. Imagine you take one byte of your input x and replicate it 16 times; those 256 values form your plaintexts. So all plaintext byte inputs to the S-boxes are equal. If you then mount your side-channel attack, you get two interesting effects. First, the plaintext bytes are not independent anymore, and therefore this noise becomes key-dependent. Second, you can't help but attack all S-boxes at the same time. So your attack doesn't give you 16 score vectors, but only one, and the 16 subkeys are hidden somewhere in there. So we get the speedup, divide-and-conquer doesn't work anymore, and we have key-dependent noise. And even if, after the attack, all 16 correct key bytes are ranked first, we still don't know their order. That's not too bad; the problem is that for plain AES, 16 factorial is not big enough, so we need a different block cipher. Also, the property holds only for the first round, because later on the state bytes are not equal anymore. And the 16 S-boxes need the same leakage function, so we have this equal-leakage assumption.
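The GGM tree walk just described can be sketched as follows; as before, `enc` is a hashed stand-in for the AES call, not the real cipher, and is my own illustrative choice.

```python
import hashlib

def enc(key, pt):
    # Stand-in for a 128-bit block cipher such as AES; illustration only.
    return hashlib.sha256(key + pt).digest()[:16]

def ggm_prf(key, x):
    """GGM PRF: walk the 128 bits of the 16-byte input x; at each step,
    encrypt one of the two fixed plaintexts under the current key and
    take the ciphertext as the next key.  Every key meets only two
    plaintexts, so an adversary gets at most two traces per key -- at
    the cost of 128 sequential encryptions."""
    P0, P1 = bytes(16), bytes([0xFF] * 16)  # all-0 and all-1 plaintexts
    for byte in x:
        for i in range(8):
            bit = (byte >> (7 - i)) & 1
            key = enc(key, P1 if bit else P0)
    return key
```

Widening the tree naively would mean more plaintexts per key and hence more traces per key, which is the exponential security loss the talk rules out.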
Because if the leakage function for every S-box is random, then it's the same as having random plaintexts. So can we do better than this? This is now our contribution: we propose to use unknown plaintexts. We need a precomputation step in which we first run the leakage-resilient PRG and produce 256 secret plaintexts, for instance, plus one updated key, which we use as the root for our new tree. The way we run it is we just take one byte of x to index this table of secret plaintexts and then encrypt the selected secret plaintext. Of course, we need to take care that they are all distinct, but that's not a problem, and it's also easy to show how to do this in a leakage-free way. What is interesting to see, or not surprising, is that for two plaintexts we are already better than the GGM construction, because now we have three secret portions instead of one. But the interesting thing is that it stays like this: we lose a bit, but not much. What happens here is that all these bytes are now secret, and in an attack, normal DPA doesn't work anymore, because we don't know anything about the plaintext. So we need a template attack, but the templates need to evaluate information about the plaintext and about this intermediate value. So it's a second-order plaintext template attack. And again, we can't target a single S-box anymore in a parallel setting; we always target all of them at the same time, so we again get only a single score vector. So we need profiled attacks. And this key-dependent noise, which we still have, now affects a two-dimensional distribution, so it has much more impact and disturbs the attack much more. Also, the key-dependent-noise property now holds in the entire algorithm, not only in the first round. OK. Then we had these attack traces.
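The proposed construction can be sketched like this. Everything below is schematic and under my own assumptions: `enc` is again the hashed stand-in cipher, and the way the PRG output is split into secret plaintexts and a root key is simplified compared to the paper.

```python
import hashlib

def enc(key, pt):
    # Stand-in for a 128-bit block cipher such as AES; illustration only.
    return hashlib.sha256(key + pt).digest()[:16]

def precompute(seed, m=8):
    """Precomputation step: run the leakage-resilient PRG to derive 2**m
    secret plaintexts (kept distinct) plus one fresh root key."""
    P0, P1 = bytes(16), bytes([1] * 16)
    pts, key = [], seed
    while len(pts) < 2 ** m:
        out = enc(key, P0)        # output branch of the PRG
        key = enc(key, P1)        # key-update branch
        if out not in pts:        # the plaintexts must all be distinct
            pts.append(out)
    return pts, key               # key is the root of the new tree

def unknown_input_prf(seed, x, m=8):
    """Consume the 16-byte input x one byte (m=8 bits) at a time: each
    byte indexes the table of secret plaintexts, giving an m-times
    speedup over bit-by-bit GGM at the cost of 2**m secret blocks of
    memory."""
    pts, key = precompute(seed, m)
    for byte in x:
        key = enc(key, pts[byte])
    return key
```

Since the indexed plaintexts are secret, an attacker no longer knows the cipher inputs, which is what forces the second-order plaintext template attack described above.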
To see what's happening, we can build models of the distributions for the different subkeys, assuming, since we don't know the key, just uniform noise caused by uniform inputs, and compare these to the distribution we get from the device. If we do this for carefully chosen plaintexts, we see that most of the time, the correct key bytes have the closest distance to the device distribution, and that's why they are almost always ranked first. Compared to this, with unknown plaintexts we have a much better scenario: the distances look much more scattered. This is a single experiment; we repeated it several times to get the distribution of the ranks. Again, for the carefully chosen plaintexts, this is the distribution of the subkey ranks in general, and you can see that they have very low ranks: most of the time, the 16 bytes are ranked first, with some outliers. Out of the 16, the best-ranked one is always ranked first, and this can be used in an advanced attack. And to find the worst-ranked of the 16, you don't have to go very far; usually rank 20 is sufficient. On the other hand, if you look at the distribution for the unknown plaintexts, it is very close to uniform, which would be a straight line. The median of the general rank distribution is 102, instead of 128 for a uniform distribution. The median of the best-ranked subkey distribution is 6 instead of 10, and for the worst-ranked, 240 instead of 245. We did this 40,000 times, and the best rank observed for the worst-ranked key was 110. So an intuitive complexity calculation would be: you need to go up to rank 110 to find all your subkey bytes, and then you can estimate the complexity like this. This is very much simplified, but OK, it's still above 2 to the 100, with probability smaller than one. So OK.
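The intuitive complexity estimate at the end can be checked in one line: if the correct subkey bytes always sit within the top 110 candidates, an attacker enumerating that far for each of the 16 byte positions faces on the order of 110^16 combinations (a deliberately simplified count, as the speaker notes).

```python
from math import log2

# Enumerating up to rank 110 for each of the 16 subkey bytes:
combinations = 110 ** 16
print(round(log2(combinations), 1))  # prints 108.5, i.e. above 2**100
```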
So I presented a leakage-resilient PRF with bounded leakage under practical assumptions, and with assumptions that are much fewer than for previous constructions: we have no equal-leakage assumption, and we don't need randomness for this countermeasure, which is also nice. And it works with a plain parallel AES implementation. The speedup depends very much on the memory, and the tradeoff is not very nice, because for an m-times speedup, we need 2 to the m memory. But for m equal to 4 or 8, it's still OK. We did many more experiments in the paper: we played with the leakage models to see if that has a bad impact, with implementation flaws, and so on, and we got positive results for all of them. But of course, it's a new approach and it needs much more analysis; masking, for comparison, has been studied now for more than 10 years. We also found that the weakest point is the generation of the secret plaintexts, so we need to take care of that. OK, I think I'm overtime already. So thanks. Let's thank the speaker. Are there any questions or comments? Can you go back to your contribution, the slide that you explained? You mean the drawing? Yeah, exactly. The plaintexts are unknown now, but they are selected once and they stay unchanged. Yes, they depend on the key. So here we attacked the key. But since both are unknown, you can also turn the game around and go to the last iteration here: you can fix the plaintext and just vary the key. In this case, for the key, you only have 256 traces, whereas if you mount an attack on a secret plaintext, you can have as many traces as you want. That's also why we extended this so much. Also, the rank distribution analysis was done using the full distribution; that is as if you had all the traces at hand that exist. But the point is, if you do this, then you only get one plaintext. You have no way to verify it.
So you need several of these to actually mount an attack to get the key back. And the remaining complexity, because you don't recover the full plaintext, multiplies with the number of plaintexts you need. Compared to the previous one, where the plaintexts are carefully selected, do you have more secrecy here? Well, I mean, we always start from the same entropy. The point is that if we use the key with known plaintexts, we have only two fixed ones, always, independent of the speedup. And then, yes, for the later stages, there is less known about the internal values. Any more questions or comments? If not, let's thank the speaker again. Let's thank all our speakers of this session. Enjoy the coffee break.