Jakub Breier from the HP-NTU Lab in Singapore is going to present. Thank you. Thank you for the introduction. So, from HP-NTU, this work was done with my former colleagues from NTU, who are currently at NUS Singapore and at the Max Planck Institute in Germany.

Let me first motivate this work. We are looking at the data flow graph of an unrolled software implementation of AES. Even if we zoom in, you can see there are plenty of nodes; there are over 2,000 data nodes. So if we wanted to analyze them manually, it would not be very convenient, and it would take a lot of time. To do the analysis, we developed an automated method which takes an assembly implementation of a block cipher, identifies the spots vulnerable to differential fault analysis (DFA) by bit flips, and verifies with an SMT solver whether these spots are really exploitable. This type of method is sound: if it marks a spot as exploitable, it is provably exploitable. Along with the method, we developed a prototype tool which is capable of outputting the attack that was identified. Based on these results, we extended the method to check how many rounds should actually be protected by a countermeasure if we want to prevent DFA on those vulnerable spots.

I will explain this method on the prototype tool, which we call the Tool for Automated DFA on Assembly, or TADA in short. The tool takes the assembly file and generates a custom data flow graph which preserves all the parameters we need to be able to do the DFA. The static analysis analyzes this graph and identifies the vulnerable spots; after this analysis, the SMT constraints are fed to the solver, which verifies whether the attack exists, and the attack can then be used for key recovery. This is a high-level overview of TADA. I will explain each of the steps in more detail; for now, I will just give a general overview.
As input, we have the assembly code and also the number of round keys we want to recover, since in some ciphers you don't have all the master key information in just one round key. We create the customized data flow graph and we analyze the known nodes. The known nodes are those we have information about: at the very beginning, only the ciphertext nodes are known to us, and as we go through the graph and perform the attacks, we uncover more and more nodes. Then we continue with finding instructions vulnerable to DFA. Once we find an instruction, we create DFA equations, and from these equations we create constraints for the SMT solver. These we feed into the solver, and the solver tells us whether the instruction is really exploitable or not. After this, we output the attack details and, since we uncovered some nodes, we update the DFG to mark these known nodes. Then we ask whether we have recovered all the round keys we actually wanted to recover. If not, we continue the analysis on the updated data flow graph; if yes, we finish with success.

The first part is the construction of the data flow graph. I will explain it on a very simple, small cipher. Here we have just one non-linear instruction, which is AND, and we have one key addition. First of all, we have the ciphertext nodes, which are known to us and stored in memory. From there, we can also identify the corresponding registers from which the ciphertext was stored to memory. We also have the information on where the key is stored, so that we know which nodes we actually aim to recover. We need some properties in order to be able to do the DFA. First of all, when we are doing DFA, we are exploiting the non-linearity of the cipher, so we need to identify which edges that correspond to instructions are linear and which are non-linear.
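As a sketch of how such a classification might look in code (the mnemonics and their grouping are my illustration, not TADA's actual tables; the talk later mentions AVR implementations as the targets):

```python
# Illustrative linearity table for a few AVR-style mnemonics.
# Operations that are affine over GF(2), such as XOR, moves,
# loads/stores, and shifts, give linear data-flow edges; AND/OR and
# table (S-box) lookups give non-linear edges, which DFA exploits.

NONLINEAR = {"and", "or", "lpm"}   # lpm ~ program-memory table lookup
LINEAR = {"eor", "mov", "ld", "st", "lsl", "lsr", "ror"}

def is_nonlinear(mnemonic):
    """Label the data-flow edge created by one instruction."""
    m = mnemonic.lower()
    if m in NONLINEAR:
        return True
    if m in LINEAR:
        return False
    raise ValueError(f"unclassified mnemonic: {mnemonic}")

print(is_nonlinear("and"))  # True: this edge is non-linear
print(is_nonlinear("eor"))  # False: XOR is linear
```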
In this case, only the edges which correspond to the AND instruction are non-linear; all the other edges are linear. Then we capture, of course, the data flow propagation: if we have an arrow going from node A to node B, we say that node A affects node B. Together with the non-linearity, we have a parameter called distance. Every time we have a non-linear edge, the distance between the two nodes is one, and when we have a linear edge, the distance is zero. By this, we know the distance between every two nodes which are connected by a path.

Once we have these parameters, we go on to check for vulnerable instructions. If we have a vulnerable instruction, which is a non-linear instruction, each of its input nodes that is not known yet can be either a vulnerable node or a target node. We do the attack in such a way that we inject the fault into the vulnerable node to get information about the data in the target node. For each pair of these nodes, TADA creates a subgraph which can be further analyzed. When we go back to our example, the non-linear instruction is AND, and we create the subgraph which goes from the input nodes of this instruction all the way to the closest known nodes. At the same time, we capture the relationship between the key nodes and this instruction.

After this step, for each of the subgraphs, we create DFA equations and the SMT constraints, and then we call the solver. I won't be going into details on this step; if you're interested, we have all the details on the algorithms and SMT constraints in our paper. What we need to know now is that the solver will tell us whether the instruction is exploitable or not. In this case, for the small example, the AND instruction is exploitable because the distance between the node we were attacking and the known node was just one. So we take the data flow graph from the beginning, we do the attack, and we update which nodes are known to us.
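The attack on this AND instruction, including the swapped second step described next, can be simulated concretely. This is a toy model, not TADA's output: an 8-bit "cipher" consisting of one AND followed by one key XOR, attacked with single bit-flip faults. Flipping bit i of one AND input changes the ciphertext bit i exactly when the other input's bit i is 1, and the key addition cancels in the difference:

```python
# Toy one-instruction "cipher": ct = (r0 AND r1) XOR k, with 8-bit
# registers. The values and the observe() interface are illustrative.

def encrypt(r0, r1, k):
    return (r0 & r1) ^ k

def recover_other_input(observe, width=8):
    """Recover the non-faulted AND input from single bit-flip faults.

    observe(f) returns the ciphertext when the faulted input is XORed
    with f. The key addition cancels in the difference, so
    ct ^ ct_faulted == f & other_input."""
    recovered = 0
    correct = observe(0)                  # fault-free reference run
    for i in range(width):
        faulty = observe(1 << i)          # flip bit i of the faulted input
        if correct ^ faulty:              # output changed => other bit is 1
            recovered |= 1 << i
    return recovered

r0, r1, k = 0b10110010, 0b01101011, 0b11001100   # secret values

# Step 1: fault R0 -> recover R1.
got_r1 = recover_other_input(lambda f: encrypt(r0 ^ f, r1, k))
# Step 2: swap roles, fault R1 -> recover R0.
got_r0 = recover_other_input(lambda f: encrypt(r0, r1 ^ f, k))
# With both AND inputs known, the round-key portion follows directly.
got_k = encrypt(r0, r1, k) ^ (got_r0 & got_r1)

print(got_r0 == r0, got_r1 == r1, got_k == k)    # True True True
```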
In this case, we were attacking the node R0 going into the AND instruction, and we uncovered the data which was in R1. At the same time, since there was an XOR with one portion of the round key, we also uncover this part of the round key. Then we ask whether we have recovered all the round keys we wanted to recover. Not yet, because there is still one portion of the round key. So now we exchange the target node and the vulnerable node: in this case, we will be attacking R1 going into the AND instruction, and we uncover the data stored in R0. After this step, we have all the nodes within this simple program recovered, so at this point we can finish with success, because we recovered everything we wanted to.

Now we move from a simple example to real ciphers. These were taken from public repositories; we analyzed SIMON, SPECK, AES, and PRIDE. For the lightweight ciphers SIMON, SPECK, and PRIDE, the analysis took just a few minutes; for AES, it took a couple of hours. In the case of SIMON and AES, the tool found an attack on the assembly which was actually published before. In the case of SPECK and PRIDE, we found new attacks. The attack found on SPECK was especially interesting because it exploited optimizations done in the implementation. These optimizations are not visible from the cipher design, so this is an important point: we should always check an implementation that has optimizations, so that there are no new vulnerabilities.

And now the question is how we can use this information to make the implementation of countermeasures more efficient. Let's take a simple duplication or triplication countermeasure, which is actually very popular in industrial applications. In this case, we have either area or time redundancy for the cipher: we execute the cipher several times and then we compare the ciphertexts.
In case the ciphertexts are not equal, we assume there was probably some kind of fault attack attempt and we don't output the ciphertext, so that the attacker cannot do the analysis. Of course, this kind of countermeasure is relatively expensive, so it is always better if we can save some resources. Instead of duplicating the entire cipher, maybe we can duplicate only the last several rounds, so that the attacker cannot use the information after the attack. So we take the information from the previous method. In this case, the pair of target and vulnerable nodes changes into a target node and an exploitable node, because we know this node is really exploitable, and we are now trying to find the earliest node which affects the target node in such a way that there are no collisions. If we are attacking some node and there is a propagation through the target node to the node we are observing, and there is another propagation through a different path, it means we might not be able to recover the information about this node, because there is a collision and we don't really know which path the change is coming from. After this analysis, we find the node which has no collisions, and the earliest such node tells us how many rounds we should protect at least.

For example, in the case of AES, we found out we can go all the way to the seventh round, and we actually found the most efficient attack to date, which is in the eighth round. In this case, if we attack any of the 16 nodes, it will give us all the information on the 16 key nodes in the last round. This is called the diagonal fault attack; it exploits the properties of the diffusion layer of AES, which propagates the fault into the entire state within these three rounds. Now, if we look at the other ciphers, we can see that lightweight ciphers, which normally have many rounds, let us save a significant amount of resources if we take this information into account.
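The partial duplication idea can be sketched as follows. Everything here is illustrative (a made-up round function and a single bit-flip fault model): only the last protected_rounds rounds run twice, and the ciphertext is suppressed when the two copies disagree.

```python
# Hedged sketch of a partial (last-rounds-only) duplication
# countermeasure. The round function is a stand-in, not a real cipher.

def round_fn(state, rk):
    # Toy round: key addition plus a 32-bit rotation (illustrative).
    state = (state ^ rk) & 0xFFFFFFFF
    return ((state << 3) | (state >> 29)) & 0xFFFFFFFF

def encrypt(pt, round_keys, protected_rounds, fault_round=None, fault=0):
    n = len(round_keys)
    state = pt
    for r in range(n - protected_rounds):       # unprotected early rounds
        state = round_fn(state, round_keys[r])
        if r == fault_round:
            state ^= fault                      # injected bit flips
    a = b = state
    for r in range(n - protected_rounds, n):    # redundant tail rounds
        a = round_fn(a, round_keys[r])
        b = round_fn(b, round_keys[r])
        if r == fault_round:
            a ^= fault                          # fault hits only one copy
    return a if a == b else None                # suppress on mismatch

keys = list(range(1, 11))                       # 10 toy round keys
assert encrypt(0xDEADBEEF, keys, 3) is not None
# A fault inside the protected tail is detected (no output):
assert encrypt(0xDEADBEEF, keys, 3, fault_round=8, fault=1) is None
# A fault before the protected region goes undetected, which is fine
# only if no exploitable DFA spot exists that early:
assert encrypt(0xDEADBEEF, keys, 3, fault_round=2, fault=1) is not None
```

The analysis above tells us how large protected_rounds must be so that every provably exploitable spot falls inside the duplicated tail.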
For example, for SIMON, it is enough to protect three rounds in order to avoid attacks on the vulnerable nodes.

To conclude my talk: we found a way to automate differential fault analysis on block cipher implementations. The analysis works on a modified data flow graph which captures all the parameters necessary for the DFA, and we check every vulnerability with an SMT solver to know whether it is really exploitable. Later, we use this information to be able to implement countermeasures more efficiently. As possible future work, it would be good to extend it to other fault models; for now, we were dealing with solvers which are not really capable of handling anything else except bit flips, because it would be too expensive. We would also like to extend it to other fault analysis techniques, for example, CFA. That would be all from me. Also, in case you are interested in automated methods in fault analysis, we have collected recent significant works into a book. It was released a few months ago; you can access it on SpringerLink. So thank you for your interest. If there are any questions, I will try to answer.

Q: I have a question. If I have understood correctly, for your fault model the DFG doesn't change. That means you are injecting faults on the data, but the case where the fault changes the instruction is not covered at the moment. Is that correct?
A: Yes, we are considering faults in the data for now.
Q: Okay. And the numbers that you reported were on which CPU?
A: It was actually a laptop computer, so it was kind of average: maybe three or four years old, an i7 with 8 GB of RAM.
Q: That was the time to perform the analysis. But which code did you consider as the target?
A: It was the AVR implementations.
Any other questions? Let's thank the speaker again. Thank you. Thank you.