My name is Fan Zhang, from Zhejiang University, China. This is joint work with many institutions, including DAS, NTU, AOAA, and SJTU. Let's start with an introduction. The fault attack was first proposed by Dan Boneh in 1996. It is an active attack and has two stages: the online fault injection stage and the offline fault analysis stage. The pair of correct and faulty ciphertexts can be used to recover the secret key. The adversary needs some equipment to perform non-invasive, semi-invasive, or invasive injections. Possible injection methods include clock glitch, voltage glitch, EMFI, and laser FI. Most of the fault injections that have been studied are actually non-invasive attacks. When we talk about fault attacks, we need to address the so-called fault model, such as the fault width and the fault type. Also, we need to address the fault location and the timing. Fault location means in which byte or nibble the fault is injected, and timing means in which round and which operation the fault is injected. Those of you who have experience with real fault attack experiments will know that they require very precise timing control. Most faults are transient, meaning they last only a short time. At CHES 2018, we proposed a new type called the persistent fault, which can last for a couple of rounds or encryptions. The persistent fault attack, PFA for short, solves this synchronization problem. Since the fault persists across encryptions, it can be injected before the encryption starts, and it doesn't require precise timing. A typical persistent fault for AES can be injected by modifying its S-box, and then the key can be easily recovered from the statistics of the ciphertexts. In that paper, faults were injected into the S-box of an AES software implementation with the Rowhammer technique. However, at that moment, we didn't know whether such analysis could be applied in a traditional fault attack scenario, such as laser injection.
In this paper, we conducted a persistent fault attack on an ATmega microcontroller, a very typical target of classical fault attacks. To make the attack more practical and much easier, we improved our method in several ways. First, we utilized maximum likelihood estimation to reduce the number of ciphertexts required. Second, we proposed a method to verify whether the desired persistent fault has been injected. Third, we improved the analysis technique to handle the case where the fault in the S-box is unknown. Along with these improvements, we extended PFA to the lightweight block cipher PRESENT. And finally, our attack is verified through laser-based fault injection. So let's go over PFA on AES first. We re-emphasize the fault model. First, we assume the adversary can inject faults before the encryption of a block cipher: the environment is prepared with the injected fault, and then the victim is asked to start the encryption. Second, the injected fault is persistent over multiple encryptions. Third, the adversary is capable of collecting multiple ciphertext outputs. A watchdog counter is not considered in this case. The core idea of the persistent fault attack is that the previous tightly coupled fault injection stage is now partitioned into two stages: the loosely coupled fault injection stage and the subsequent encryption stage. In total, we have three stages now. The interesting part of PFA is that the fault is persistent in the S-box. However, the faulty element of the S-box, for example 61, marked in red in this figure, may not be accessed. So some ciphertexts are correct and some are incorrect. So how do we do the simple persistent fault analysis? Suppose the correct element is v. After the fault injection, the value becomes v*, which is erroneous. For a specific ciphertext byte, v* will appear twice as often and you will never see the original v. The distribution of the values of the ciphertext byte now becomes biased.
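The bias just described can be reproduced with a small simulation. This is only a sketch under stated assumptions: a random permutation stands in for the AES S-box, only the last round is modeled (ciphertext byte = faulty S-box output XOR last-round key byte), and the S-box inputs are taken as uniformly random. The function names are illustrative and do not come from the paper.

```python
import random
from collections import Counter

def make_sbox(rng):
    """Random 8-bit permutation used as a stand-in for the AES S-box (assumption)."""
    sbox = list(range(256))
    rng.shuffle(sbox)
    return sbox

def inject_fault(sbox, index, f):
    """Persistent fault: element `index` changes from v to v* = v XOR f (f != 0)."""
    faulty = list(sbox)
    faulty[index] ^= f
    return faulty

def last_round_bytes(faulty_sbox, key_byte, n, rng):
    """Model one ciphertext byte of the last round: c = S'[x] XOR k, x uniform."""
    return [faulty_sbox[rng.randrange(256)] ^ key_byte for _ in range(n)]

def recover_key_byte(samples, v):
    """If exactly one value t_min never appears, then t_min = v XOR k, so k = t_min XOR v."""
    counts = Counter(samples)
    missing = [t for t in range(256) if counts[t] == 0]
    if len(missing) != 1:
        return None  # not enough ciphertexts to single out the missing value
    return missing[0] ^ v
```

With a few thousand samples the missing value is unique and the key byte follows directly; with only a few hundred, several values are still missing and the function returns `None`.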
v* has a probability of 2/256 and v has a probability of 0. The adversary can exploit three cases for the key recovery. For the zero probability and for the maximum probability, he can directly know the key. For other probabilities, he can still get some impossible values of the key and reduce the key search space. By the way, the values of v and v* are known to the adversary. Here is an illustration of the analysis result. It is quite straightforward. Like the figures in DPA and CPA, we plot the probability of each value of a ciphertext byte against the number of samples. If you see the red curve, which is all zero, you know the value that corresponds to v. If you see the blue curve, which converges to 2/256, you know the value that corresponds to v*. That is the core idea of the original PFA. With a large enough number of ciphertexts, the red curve can be easily detected. In this paper, we want to reduce the total number of ciphertexts required to find the red curve. The problem is that when the number of ciphertexts is not enough, for example, as shown in the green box, there will be multiple values that match the red curve with zero probability. That means the red curve will have multiple candidates. To find the best candidate among them, we need a metric to compare these candidates. The metric is based on the fact that the red curve is bound to a fixed blue curve, given the fault in the S-box. Since the blue curve should have a higher frequency, the candidate whose corresponding blue curve has the maximum probability will be chosen as the estimate of the red curve. Actually, what we did is equivalent to a statistical method called maximum likelihood estimation, MLE for short. MLE is a method of estimating parameters by maximizing the probability of a given observation. In our context, the parameter to be estimated is the red curve, and the given observation is our collected ciphertexts.
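The candidate selection described here can be sketched in a few lines. The sketch assumes that v and the fault value f are known and that the samples are the values of one ciphertext byte. Under the last-round model, a key guess whose red-curve value has been observed has likelihood zero, and among the remaining guesses the likelihood grows with the count of the paired blue-curve value, so picking the largest paired count is the maximum likelihood choice. The function name is illustrative.

```python
from collections import Counter

def mle_key_byte(samples, v, f):
    """Pick the zero-count value t whose partner t XOR f is most frequent.

    t plays the role of the red curve (v XOR k) and t XOR f the role of the
    blue curve (v* XOR k). Returns the key byte estimate k = t XOR v.
    """
    counts = Counter(samples)
    candidates = [t for t in range(256) if counts[t] == 0]
    if not candidates:
        return None  # every value was observed, so no fault is visible here
    best = max(candidates, key=lambda t: counts[t ^ f])
    return best ^ v
```

For example, with `v = 0x20`, `f = 0x01` and samples `[0x31, 0x31, 0x31, 0x41]`, both `0x30` and `0x40` are missing, but `0x30` is paired with the more frequent value `0x31`, so the estimate is `0x30 ^ 0x20 = 0x10`.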
And the estimation of the red curve can be performed with the equations at the bottom of this slide. The result shows that, with the MLE technique, the total number of ciphertexts needed for the red curve can be reduced to 72% of that in the original 2018 paper, that is, 28% fewer. Next, we will explain the practical problems of PFA. For a real attack, we need to solve two main problems in practice. One is that the adversary may need to try several times to inject a desired fault into the device, so he needs to know whether the fault has really been injected or not. The other problem is that it is difficult for the adversary to know which element of the S-box is corrupted and how it is changed. Therefore, before conducting a practical PFA, he has to recover these unknown parameters first, including the fault location and the fault value. So how do we identify the effective injections? For AES, the S-box is accessed 160 times during the encryption. With a faulty S-box, the probability for a ciphertext to stay correct is calculated to be about 53%. That means, if our injection is effective, about half of the ciphertexts will become faulty. However, the principle by which we judge an injection to be effective is whether both correct and faulty ciphertexts exist. This is the key point. We do not require the ratio to be 53%. There could be some misjudgment, and we find that with 20 ciphertexts, this misjudgment rate becomes very, very small. And how do we find the fault value f? For the recovery of the fault value f, the basic idea is quite straightforward. With sufficient ciphertexts, f can be recovered by XORing the blue curve and the red curve in this figure. With insufficient ciphertexts, these two curves are not that distinguishable, so we utilize MLE to recover f in this case. Next, how do we recover the fault index i? This is sort of complicated, so please read our paper. But the key point here is that if the fault index is unknown, the adversary has to either do an exhaustive search or find a trick.
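The numbers behind the injection check can be worked out with a short calculation. This is a sketch under assumptions that the talk does not spell out: 160 S-box lookups per AES-128 encryption (16 bytes times 10 rounds, key schedule excluded), one faulty element, and independent, uniformly random S-box inputs. The reference ciphertexts in the last function are assumed to be the correct outputs for the same plaintexts, collected before the injection.

```python
def p_correct(lookups=160, sbox_size=256):
    """Chance that one encryption never touches the single faulty S-box element."""
    return (1 - 1 / sbox_size) ** lookups

def misjudgment_rate(n, lookups=160):
    """Chance that n ciphertexts are all correct or all faulty,
    so that an effective injection fails the 'both kinds exist' test."""
    p = p_correct(lookups)
    return p ** n + (1 - p) ** n

def injection_is_effective(ciphertexts, references):
    """Judge the injection effective if both correct and faulty ciphertexts occur."""
    faulty_flags = [c != r for c, r in zip(ciphertexts, references)]
    return any(faulty_flags) and not all(faulty_flags)
```

Under these assumptions a ciphertext stays correct with a probability near 53%, so roughly half of the ciphertexts are faulty, and `misjudgment_rate(20)` is already below one in a hundred thousand.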
To find the correct fault index among the 256 candidates, the trick is that for the correct i, the original value of the faulty element will never appear in the output of the S-box, and this property holds for all the S-boxes in all rounds. With this metric, the fault index i can be easily recovered with hundreds of ciphertexts, and the misjudgment rate is still very low. In the next part, we will show PFA on the PRESENT cipher. PRESENT is a lightweight SPN block cipher. It uses a 4-bit S-box. It has a key size of 80 bits and a block size of 64 bits. To adapt PFA to PRESENT, we need to solve two additional problems. First, it uses a 4-bit S-box, so if the S-box is corrupted, all the ciphertexts will be wrong. There will not be a case where half are correct and half are faulty. So we need a new method for identifying effective injections for PRESENT. Second, the master key is longer than the round key, and PFA can only recover the last round key. So additional analysis is required to recover the remaining key bits. Since the S-box has only 16 output values and one of them will disappear if the injection is effective, we can collect some ciphertexts to see if there are only 15 output values. If yes, we believe that the injection is effective. Quite simple. Similarly, there could be some misjudgment, but the rate is proved to be very low. The key recovery, including the additional 16 key bits, can be done in two steps. In the first step, we analyze the last round to recover the last round key, the fault value f, and the fault index i. And in the second step, we trace back one more round to recover the full key. The attack on the last round is basically the same as that on AES. The number of samples required is quite small, due to the lightweight nature of PRESENT. Roughly 25 and 98 ciphertexts are enough for recovering f and t_min respectively.
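The 15-value check for PRESENT can be sketched as follows. The S-box table and the bit permutation follow the PRESENT specification; everything else is an illustrative assumption, including the function names and a last-round model with uniformly random S-box inputs. Because the bit permutation is linear, undoing it on a ciphertext gives the S-box outputs XORed with a fixed unknown constant, so each nibble position still shows at most 15 distinct values when one S-box element is faulty.

```python
import random

# PRESENT 4-bit S-box from the cipher specification.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def p_index(i):
    """PRESENT bit permutation: bit i moves to 16*i mod 63, bit 63 stays."""
    return 63 if i == 63 else (16 * i) % 63

def p_layer(state):
    out = 0
    for i in range(64):
        out |= ((state >> i) & 1) << p_index(i)
    return out

def inv_p_layer(state):
    out = 0
    for i in range(64):
        out |= ((state >> p_index(i)) & 1) << i
    return out

def last_round(sbox, round_key, n, rng):
    """Model of the last round only: S-box layer, bit permutation, key XOR."""
    ciphertexts = []
    for _ in range(n):
        x = rng.getrandbits(64)  # assumed uniform S-box inputs
        s = 0
        for j in range(16):
            s |= sbox[(x >> (4 * j)) & 0xF] << (4 * j)
        ciphertexts.append(p_layer(s) ^ round_key)
    return ciphertexts

def injection_looks_effective(ciphertexts):
    """True if every nibble position shows at most 15 distinct values."""
    seen = [set() for _ in range(16)]
    for c in ciphertexts:
        s = inv_p_layer(c)
        for j in range(16):
            seen[j].add((s >> (4 * j)) & 0xF)
    return all(len(values) <= 15 for values in seen)
```

With too few ciphertexts a correct S-box can also show 15 or fewer values per position, which is where the misjudgment comes from; the check only becomes reliable once enough ciphertexts have been collected.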
Due to the time limit, I will skip this slide, which basically explains how to attack the last-but-one round. And the final result shows that for the full attack on PRESENT, only about one or two ciphertexts are required. Next, we will verify our analysis through a practical laser fault attack. And we found some interesting observations for laser injections on SRAM. The victim device is an ATmega microcontroller. It is decapped on the backside to allow the laser injection. And we use a high-resolution camera to take a photo of the details inside. The main components are labeled in this figure, and we can see the IO, the flash, and the logic. One of the major contributions of this paper is that we identify that the target of PFA could be the SRAM. And we put this into practice. We set up the whole system and shoot the laser at the S-box, which is loaded from the flash when the microcontroller powers on. We automatically scan the SRAM area with laser pulses. To make the work reproducible, we mark the reference point, the start point, and the end point. The step size is about 5 micrometers in both the x and y directions. Honestly, this is a very tedious and time-consuming part. We injected a total of 55,000 laser pulses and found 4,000 bit flips. As shown in this figure, the S-box and its inverse occupy half of the SRAM. And we also deployed the PRESENT S-box, which is only 16 bytes and is attached at the end of the AES S-box. With this experimental data, we can find the mapping between the logical address and the physical location. This mapping shows that in this SRAM, each row contains 8 bytes of data, and these 8 bytes are arranged by bits rather than by bytes. For example, the 8 MSBs of the different bytes are grouped together in a row, instead of the 8 bits of one byte being put together. Finally, the conclusion. In this paper, we improved PFA to make it more efficient and practical. It requires 28% fewer ciphertexts than the original PFA, and it can utilize an unknown fault in the S-box.
To demonstrate that our attack is quite general, we extended PFA to the PRESENT cipher. Finally, we tested our attacks with practical laser injections, and we found some interesting characteristics of the SRAM. So that's it. Thank you very much.