Good afternoon, everybody. I'm Sayandeep Saha, and I'm going to present our talk, "Divided We Stand, United We Fall: Security Analysis of Some SCA+SIFA Countermeasures Against SCA-Enhanced Fault Template Attacks." This is joint work with Arnab Bag, Dirmanto Jap, Debdeep Mukhopadhyay and Shivam Bhasin. Arnab, Debdeep Mukhopadhyay and I are from the Indian Institute of Technology, Kharagpur. Dirmanto Jap and Shivam Bhasin are from Nanyang Technological University, Singapore. That cipher implementations in real life leak sensitive information is pretty well known by now. Among the several sources of information leakage, side channels are one of the most prominent: the adversary passively observes some physical signal, such as the electromagnetic radiation or the power consumption of the chip, and tries to derive the secret from it. In contrast, a fault attack adversary deliberately perturbs the computation of the chip and tries to derive the secret from the faulty system responses. Countermeasures exist for both of these attacks, and their designs are different and quite orthogonal to each other. One thing is common, however: designing countermeasures for either attack is very tricky, and there are several failure stories. Now, we address the question of what happens when both kinds of countermeasures are present. Can the adversaries join hands, that is, can both be present together? Given modern experimental setups this is quite feasible, and we have also shown it practically in this work. However, this area is relatively less explored. Our major concern is this: when both countermeasures are present, and both adversaries are present simultaneously or are assisting each other, what is the picture? What happens to the countermeasures? That is what we address in this work. Now, talking about fault attack countermeasures, the main theme is redundancy. In simpler words, the computation is performed several times.
And only if the results agree with each other is the ciphertext output; otherwise it is muted or randomized. Redundancy can also be implemented at a finer grain, for example for every round or for every S-box operation, but the main idea is the same everywhere. In contrast, side channel countermeasures try to randomize the computation in some way so that the side channel signals become uncorrelated with the secret. One of the most popular is masking, which requires fresh randomness for every execution of the cipher. The main idea is that we split a secret or sensitive variable into multiple random variables, which we call shares, such that the XOR sum of the shares gives you back the actual value. The function that works over the shares is also split into several component functions, such that if you combine their outputs you get the exact result. Now, splitting a function in this way is fairly trivial for linear functions, thanks to linearity. For nonlinear functions, however, it is very tricky and error prone, and over the years people have figured out several ways of doing a proper sharing, lightweight and secure, for nonlinear functions. There are other tricks such as hiding and shuffling, where the program sequence is randomly altered without changing the end result. But in this talk we will mainly focus on masking; without doubt, masking is the most prominent side channel countermeasure. Now, since 2018 there have been some breakthrough results in the context of fault attacks, which basically make all the existing countermeasures vulnerable. More precisely, we are talking about ineffective faults. So what is an ineffective fault? Let's take an example. Say I'm injecting a fault in a four-bit state, more precisely a bit stuck-at fault, a stuck-at-zero fault at the first bit of the four-bit state.
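The four-bit stuck-at-zero example can be enumerated directly. The following is only a minimal sketch: it assumes that "the first bit" means the most significant bit, and the names `FAULT_BIT` and `stuck_at_zero` are illustrative, not taken from the talk.

```python
N_BITS = 4
FAULT_BIT = 3  # assumption: "first bit" is taken to be the most significant bit


def stuck_at_zero(state, bit):
    """Return the state with the given bit forced to 0."""
    return state & ~(1 << bit)


# A fault is ineffective when the faulted state equals the original state.
all_states = range(1 << N_BITS)
ineffective = [s for s in all_states if stuck_at_zero(s, FAULT_BIT) == s]
effective = [s for s in all_states if stuck_at_zero(s, FAULT_BIT) != s]

print("ineffective for:", [f"{s:04b}" for s in ineffective])
print("effective for:  ", [f"{s:04b}" for s in effective])
```

Restricting attention to the runs where the output is still correct restricts the state to the `ineffective` list, which is the kind of bias an ineffective-fault attack relies on.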
Now, if I consider all possible valuations, we see that for the first eight valuations the fault has no impact on the output. So even though you are injecting the fault, you get a correct ciphertext. One of the most prominent attacks, statistical ineffective fault analysis, or SIFA, exploits this fact, and it can extract the key from the correct ciphertexts. The point is that if you consider only the correct ciphertexts, the intermediate where you are injecting the fault gets biased: it only takes seven or eight values among all possible valuations. This bias helps in recovering the key. In contrast, the FTA attack, which was published at Eurocrypt 2020, is slightly different. First of all, it is the first case of a profiled attack in the context of fault analysis. And it exploits the fact that fault propagation through logic gates is data dependent. That is, whether the fault propagates to the output of a gate or not depends on the inputs of the gate. This fact actually helps to recover the secret without ciphertext access; you just need to know whether the ciphertext is faulty or not. Now, one thing is common to both of these attacks: they bypass all existing fault attack countermeasures, as well as countermeasures that combine fault attack countermeasures with side channel countermeasures. Fortunately, since 2020 we have been observing significant research in countermeasure development to prevent these attacks, especially SIFA. The first proposals consider fine-grained error correction, I mean at the share level, along with masking. There is another solution, which implements the masked gates using Toffoli gates, so that some advantages are gained. I will discuss these countermeasures in more detail later in this talk. But here we want to stress that most of the SIFA countermeasures actually involve masking, because masking has some advantages against SIFA, and they also involve some error correction or detection operation.
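Since masking underlies most of the countermeasures mentioned here, a minimal sketch of Boolean masking may help. It assumes 4-bit values and plain XOR sharing; the helper names `share`, `unshare` and `xor_shared` are hypothetical and only illustrate that the shares XOR back to the value and that a linear operation can be applied share by share.

```python
import secrets
from functools import reduce
from operator import xor


def share(value, n_shares, bits=4):
    """Split value into n_shares random shares whose XOR equals value."""
    shares = [secrets.randbits(bits) for _ in range(n_shares - 1)]
    # The last share is chosen so that the XOR of all shares is the value.
    shares.append(reduce(xor, shares, value))
    return shares


def unshare(shares):
    """Recombine shares by XOR-ing them together."""
    return reduce(xor, shares, 0)


def xor_shared(a_shares, b_shares):
    """A linear function (XOR) is computed independently on each share."""
    return [a ^ b for a, b in zip(a_shares, b_shares)]


a = share(0b1011, 3)
b = share(0b0110, 3)
print("shares of a:", a)
print("recombined a XOR b:", format(unshare(xor_shared(a, b)), "04b"))
```

Nonlinear functions such as AND cannot be handled share by share in this way, which is where the dedicated schemes discussed in the talk come in.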
So there is a complex interaction here between an SCA countermeasure, masking, and the standard fault attack countermeasures. Since both countermeasures are present, our aim is to see whether this stands against a combined adversary or whether we need to do something extra. In this regard, our finding is that we still have to do something more to prevent all these attacks. More precisely, we propose the SCA-FTA attack, which actually breaks certain existing SIFA countermeasures. Moreover, it exposes a few intricacies associated with masking when realizing SIFA countermeasures. Like the FTA attack, it enables middle-round attacks without ciphertext access or direct access to the plaintext. So let's see what SCA-FTA is. It is a template attack, a profiled technique, and it has similarities with the fault template attack as well as with side channel template attacks. The adversary is assumed to have access to a device for which he knows the key and the randomness. He injects faults at different locations and simultaneously measures the side channel signal. Combining the side channel signals for several fault locations, he creates a template that indicates what the intermediate state is. Upon getting a new device where the key is unknown, he deliberately perturbs those locations, which he has decided on previously, one at a time of course, measures the side channel signal, and from there extracts the secret. Now, the model is similar to FTA, or we can also call it similar to standard, classical side channel template attacks. But the gain over FTA is this: FTA only exploits whether the ciphertext is correct or faulty, typically one bit of information. Here, thanks to the side channel, you get something more, and that becomes crucial for breaking SIFA countermeasures. Now, how do we build this template? We exploit the interaction between fault propagation and error detection. As I mentioned, fault propagation and activation through gates are data dependent.
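The data dependence of fault propagation through single gates can be tabulated in a few lines. This is a sketch under the assumption of a stuck-at-zero fault on the first gate input; `propagates` is an illustrative helper name.

```python
from itertools import product


def xor_gate(a, b):
    return a ^ b


def and_gate(a, b):
    return a & b


def propagates(gate, a, b):
    """True if forcing input a to 0 changes the output of the gate."""
    return gate(a, b) != gate(0, b)


for name, gate in (("XOR", xor_gate), ("AND", and_gate)):
    for a, b in product((0, 1), repeat=2):
        print(f"{name} a={a} b={b} fault visible at output: {propagates(gate, a, b)}")
```

For the AND gate the fault is visible for exactly one input pair, so observing a propagation pins down both inputs.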
If you consider an XOR gate and a stuck-at-zero fault, the output is faulty only if the fault location contains the value one. For an AND gate, even if the value is one, the fault may not propagate, because it depends on the other input. Only if the other input is also one do you see a fault propagation. So if you see a fault propagating to the output of this gate, you know for sure that the inputs are one and one. Now, how can we see the fault propagation? If there is some circuitry that reacts to the fault propagation, then it effectively detects whether the fault is propagating or not, and that leads to exploitable side channel information. And this is exactly what the error correction or detection module of a fault attack countermeasure does. Let's take a more prominent example. Here we consider a three-bit S-box, as used in Keccak, Xoodoo, etc., and we inject a fault at x0. We see that the fault mandatorily propagates to y0 every time, but the propagation to y1 and y2 is conditional on the input value. Given this, if we observe the fault propagation, the fault differential, at the output of this S-box, it exposes the input. The main point is that we cannot directly observe such fault propagation in practical cases where countermeasures are present, right? But if we use a side channel, we can still mount the attack. The side channel abstracts the information as a Hamming weight, so it loses some information, but there is still some entropy loss in this intermediate state, the input of the S-box, and that can be exploited. Let's see a more concrete example. Here we consider the PRESENT S-box. It is a four-bit S-box, and we excite each of the bits one at a time with faults and measure the side channel signal from the error detection circuit. This is quite reasonable because PRESENT has a bit permutation.
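The two S-box examples above can be reproduced with a small noise-free sketch. Several things here are assumptions rather than details from the talk: the three-bit S-box is modelled as the chi map with a bit flip at x0, the PRESENT faults are modelled as stuck-at-zero on one input bit at a time, and the error detection logic is assumed to leak the Hamming weight of the S-box output differential. All function names are illustrative.

```python
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(v):
    return bin(v).count("1")


def chi3(x):
    """3-bit chi map: y_i = x_i XOR (NOT x_{i+1} AND x_{i+2}), indices mod 3."""
    b = [(x >> i) & 1 for i in range(3)]
    y = [b[i] ^ ((b[(i + 1) % 3] ^ 1) & b[(i + 2) % 3]) for i in range(3)]
    return y[0] | (y[1] << 1) | (y[2] << 2)


def chi_fault_differential(x):
    """Output difference of chi3 when input bit x0 is flipped."""
    return chi3(x) ^ chi3(x ^ 1)


def present_observation(x):
    """Hamming weight of the output differential for a stuck-at-0 on each bit."""
    obs = []
    for bit in range(4):
        faulty = x & ~(1 << bit)
        obs.append(hamming_weight(PRESENT_SBOX[x] ^ PRESENT_SBOX[faulty]))
    return tuple(obs)


# Profiling phase on known inputs: map each observation vector to its input.
template = {present_observation(x): x for x in range(16)}

for x in range(8):
    print(f"chi input {x:03b} -> differential {chi_fault_differential(x):03b}")
for obs, x in sorted(template.items(), key=lambda kv: kv[1]):
    print(f"PRESENT input {x:04b} -> HW per faulted bit {obs}")
```

In this idealized model every PRESENT input produces a distinct observation vector, so a lookup in `template` recovers the S-box input; with real measurements the same matching would have to be done statistically over many noisy traces.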
Now, observing them and measuring the side channel leakage, we can construct a template of this kind. Here we have abstracted the side channel signals using the Hamming weight, which is noise free. The actual experiment was performed considering both fault injection noise and side channel noise, and we see that the intermediate state of PRESENT can be exposed with roughly 11,000 traces and fault injections. Now, this was a side channel unprotected version of PRESENT; only fault attack protection was present. Let's see what happens when masking is present. The scenario becomes quite interesting, because now you can also mask the error detection logic. So you are not supposed to get any information directly from the error detection logic, but we still get some, at least for some specific masking schemes. Here we consider domain-oriented masking, DOM. Let's see what happens if you inject a fault at a0. A fault at a0 exposes information about b, because it only propagates to q0 when b is one. The interesting fact is that it only propagates to q0 and not to the other share, q1, of the DOM gate. Now, when considering an SCA attack, we always have to think about the order of the leakage, because what we are considering here is a first-order implementation. A second-order SCA attack on it would not be efficient, right? But fortunately, due to the fault propagation, if we just probe the error detection logic of q0, we are good. We still expose the information about b, and that is a first-order leakage. So the attack is efficient, and with just a fault and a first-order side channel leakage we can extract information from this kind of construction. Similarly, this also applies to modern DOM-based or ISW-based schemes like PINI.
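The DOM observation can be checked exhaustively on a first-order DOM AND gate. This sketch assumes the standard two-share DOM-indep equations and models the fault as a bit flip on share a0 (the talk's stuck-at fault behaves the same way whenever it is effective); the function names are illustrative.

```python
from itertools import product


def dom_and(a0, a1, b0, b1, z):
    """First-order DOM AND gate with fresh random bit z; returns (q0, q1)."""
    q0 = (a0 & b0) ^ ((a0 & b1) ^ z)
    q1 = (a1 & b1) ^ ((a1 & b0) ^ z)
    return q0, q1


def fault_differential(a0, a1, b0, b1, z):
    """Per-share output difference when share a0 is flipped."""
    q0, q1 = dom_and(a0, a1, b0, b1, z)
    f0, f1 = dom_and(a0 ^ 1, a1, b0, b1, z)
    return q0 ^ f0, q1 ^ f1


for a0, a1, b0, b1, z in product((0, 1), repeat=5):
    d0, d1 = fault_differential(a0, a1, b0, b1, z)
    # The fault reaches q0 exactly when the unmasked b is 1, and never reaches q1.
    assert d0 == b0 ^ b1
    assert d1 == 0

print("fault on a0 reaches q0 iff b = b0 XOR b1 = 1, and never reaches q1")
```

Because the whole dependence on the unmasked b shows up in the differential of a single output share, an error detection circuit attached to q0 alone is enough, which is why a first-order observation suffices.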
And it is not limited to the first-order case: even for any higher order you can still mount a first-order attack, because the fault propagates in the same way. So this is a clear vulnerability. Now, let's consider a situation where there is error correction on the shared values in a bit-level manner. Why are we considering this? Because it has been crucial in developing some recent SIFA countermeasures. So here we consider share-level error correction, where for each share there is some error correction. If you consider this kind of error correction logic, then when the encoding is perfect, that is, there is no error, some of the intermediate words are nonzero. But whenever there is an error, such that this bit becomes zero or this bit becomes one, these intermediate words go to zero. So there is a clear difference between the faulty and the non-faulty situation, and that leaks side channel information, which leads to an attack. Now let's see what happens for a SIFA countermeasure. We consider the SIFA countermeasure published in TCHES 2020, which uses Toffoli gates. The exciting feature of this countermeasure is that whenever you inject a fault, within the S-box operation or outside it, the fault mandatorily propagates to the output. But the tricky part is that the mandatory fault propagation holds only for one output wire, shown here in red. For the rest of the wires, the propagation can still be ineffective. So if we consider an error detection circuit at the end, whether it works on the shared values or on the unshared value, it can still leak some side channel information. Considering the case where error detection happens on the shares, we see that due to the DOM-based construction of this S-box, the error detection logic on S0 still leaks information about a, because the fault propagates through both a0 and a1 simultaneously. And that completely breaks this countermeasure.
Now, if we consider error correction on shares, the logic we discussed previously, it still leaks the information. This kind of proposal was made in a countermeasure published in TIFS 2020, which proposes that masking plus error correction on each share is good enough for SIFA protection. We found that this may be true, but not for every masking scheme; in particular, if you have a DOM-based masking scheme, you are not safe. Now, does this apply only to DOM and ISW variants? The answer is no. We see that for higher-order TI the issue still persists. Here the main problem is the share compression stage, which is always applied to reduce the blow-up in the share count. Due to share compression, we can still mount the attack. For example, consider this Simon S-box, which is second-order secure: if you inject the fault at share a4 and probe the error detection logic of e3 and e4, it leaks information about b. Now, this is a second-order implementation and we are doing a second-order attack, and the attack is still efficient. This means that even for higher-order TI we can apply our attack efficiently. We practically validated this attack on an ATmega board with an open source SIFA-protected Keccak implementation, available at this link. This is the Toffoli-gate-based implementation. We performed laser fault injection with a diode laser, and power measurements were taken for the SCA. We tested both unshared and shared detection operations in this experiment. One interesting thing we observed in this experiment is that we do not need to perform the laser injection and measure the side channel in the same clock cycle. For software implementations, the injection point and the side channel measurement point are several clock cycles apart. What is the advantage of that? We do not have to deal with the noise that the laser injection induces on the side channel.
The side channel traces were fairly clean, and we see that with around 200 to 300 observations, in both cases, we can recover each bit. So for each bit we need this many traces and fault injections, but eventually we can recover the entire secret. In this experiment, we do a simulated evaluation of a hardware setup. It is worth mentioning that the setup we discussed previously can also be tuned for hardware targets, at least in some cases. But in this experiment specifically we simulated the faults and the power trace leakage, and the main focus was on whether, with leakage from adjacent gates, we can still recover the side channel signal we are looking for. The answer is yes, and we see that with roughly 1000 traces, considering all the kinds of noise present in the side channel, we can still recover the secret. Now, there is still a ray of hope. We found that the countermeasure instantiation made in the masking plus error correction based proposal is still secure. And here the non-completeness property of TI plays a crucial role. More precisely, if we do error detection and correction over non-complete parts, then we can still maintain security. Here is an example for the first-order TI of PRESENT. We have considered just a part of the PRESENT S-box implementation. Now, if we inject a fault here at x20, in principle we could still recover x2. But if error detection and correction is done on the shares, then we as attackers have a problem, because the error always propagates to both f10 and f30. So unless we mount a second-order attack, we cannot combine the information, and we cannot extract x2 in this manner. So against a fault plus first-order side channel adversary, which should be the model here because we are considering a first-order side channel secure implementation, the design is safe. The attack does not go through.
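The role of non-completeness can be illustrated on a toy gate. The sketch below uses the textbook three-share threshold implementation of a single AND gate, not the PRESENT S-box sharing from the slides, and models the fault as a bit flip on share a1; the share and function names are hypothetical.

```python
from itertools import product


def ti_and(a, b):
    """Three-share TI of AND; each output share omits one input share index."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1 = (a2 & b2) ^ (a2 & b3) ^ (a3 & b2)  # independent of share 1
    c2 = (a3 & b3) ^ (a1 & b3) ^ (a3 & b1)  # independent of share 2
    c3 = (a1 & b1) ^ (a1 & b2) ^ (a2 & b1)  # independent of share 3
    return c1, c2, c3


def fault_differential(a, b):
    """Per-share output difference when share a1 is flipped."""
    clean = ti_and(a, b)
    faulty = ti_and((a[0] ^ 1, a[1], a[2]), b)
    return tuple(c ^ f for c, f in zip(clean, faulty))


for bits in product((0, 1), repeat=6):
    a, b = bits[:3], bits[3:]
    d1, d2, d3 = fault_differential(a, b)
    # Each single differential depends only on a proper subset of b's shares.
    assert (d1, d2, d3) == (0, b[2], b[0] ^ b[1])
    # Only the combination of two output shares reveals the unmasked b.
    assert d2 ^ d3 == b[0] ^ b[1] ^ b[2]

print("single-share differentials are masked; combining two of them reveals b")
```

In this toy case a per-share error detection signal, taken alone, reveals only a partial XOR of b's shares, which carries no information about b as long as the shares are uniformly random; an adversary would need to combine two such signals, which corresponds to a second-order attack.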
And in this line of work, another proposal, which is pretty similar, is called NINA. It also performs error correction on shares, and it is also found to be secure. Another class of work, such as CAPA, is found to be secure against this combined SCA-FTA attack. It performs multi-party computation based masking, which is fairly different from the Boolean masking schemes we have considered so far, and we found that it is still secure. So, to conclude: combined attacks are practical and should be considered for implementations having both fault attack and side channel countermeasures. We found that having both countermeasures does not mean that you are secure against a combined attack, and we show a very potent combined adversary for this scenario. It is worth mentioning that we have only considered the SIFA countermeasure cases, but SCA-FTA might apply to other scenarios, such as public key implementations or other kinds of implementations such as modes of operation. Surprisingly, it breaks SIFA countermeasures, and it exposes certain critical properties of the masking schemes that are used to build SIFA countermeasures. As potential future work, we would like to analyze multi-party computation based schemes like CAPA and some other schemes, because it is not yet clear how this attack applies to those schemes, but there is no sign that it would not apply as well. Also, finding lightweight countermeasures is an important research direction, because all the countermeasures we have seen so far require a lot of error detection and correction operations. If we can reduce that overhead, that will be a real gain. So with this message I would like to conclude my talk. I'll be happy to answer your questions. Thank you.