So our next presenter will be presenting "SIDH on ARM: Faster Modular Multiplications for Faster Post-Quantum Supersingular Isogeny..." Oh, that's the wrong one. Sorry, wrong page. So this is actually "FPGAhammer: Remote Voltage Fault Attacks on Shared FPGAs, Suitable for DFA on AES", presented by Jonas Krautter.

Thank you very much. It was good enough. Good morning, everyone, and welcome to my talk on remote voltage fault attacks on shared FPGAs. Before I go into detail, I first want to motivate why we are considering shared FPGAs at all.

With the increasing amount of resources we have per FPGA chip, FPGAs are increasingly considered for use in multi-user environments. A lot of providers are introducing them to cloud computing. We see a lot of system-on-chip variants where FPGAs are coupled very tightly with a hard processor, and the Linux kernel has already introduced support for partial reconfiguration of FPGAs. This gives us multi-tenant FPGAs, where the accelerators of each user can be deployed to partitions on the same chip.

This opens up a range of new attack scenarios, two of which have already been considered in previous works: on-chip side channels and denial-of-service attacks. For this work we wanted to consider fault attacks, and as a proof of concept we successfully deployed a differential fault attack on AES.

The threat model we are considering is a single shared FPGA fabric, which also includes a shared power distribution network for both adversary and victim. The designs on the FPGA are in logically isolated partitions, but there is some kind of public interface in the victim process, which runs on a CPU and which can be accessed by the adversary to make the victim use his cryptographic implementation on the FPGA. In the case of AES, this gives us a chosen-plaintext attack scenario, where the adversary can just issue requests with plaintexts to the victim
process and get the ciphertexts back.

In the further course of this talk, I will give you some background information on the mechanics behind these attacks. I will talk about how we specifically designed the fault injection and analysis, and I will present details on the hardware we used in the experimental setup. I will then present the results, discuss them, give some perspective on future work, and finally conclude the talk.

For the background information, we first need to talk about power distribution networks, which basically include all the interconnections from the voltage regulator on the board down to each logic element in the chip. They are usually modeled as a mesh of resistive, inductive, and capacitive elements. The influence of the inductive and resistive elements on the power supply voltage is reflected in the law of inductance, and we can see that a high current variation can cause a power supply voltage variation. A lower supply voltage can eventually cause timing faults in critical parts of a design on an FPGA, for example.

The logic elements we use to cause high current variation are ring oscillators. They have already been used in previous works on denial-of-service attacks to crash FPGAs. We use not only a single one but an entire grid of ring oscillators to have a high impact on the power supply voltage. The principle is that the fast oscillation of these oscillators, that is, their gate switching, causes a high current variation and eventually a voltage drop that injects a fault into another design on the FPGA.

We found that it is not enough to just switch the grid on and let it oscillate; the ring oscillator grid must be toggled in a very specific way. We identified three parameters that have an effect on the success of the fault injection: the frequency, the duty cycle, and the initial delay of this toggle signal. For example, in this diagram you see the externally measured supply voltage of the FPGA while
decreasing the toggle frequency in the area between the red bars.

The fault analysis we used is a very well-known one by Piret from 2003. The original scheme intends to inject single-byte faults before the 8th round of the AES encryption, which leads to all output bytes being faulty, so they can all be attacked simultaneously. But since we need very high precision to inject before the 8th round, we decided to inject before the 9th round instead and attack only four bytes at a time. Since the propagation of a single-byte fault before the 9th round results in a specific set of four bytes in the output ciphertext being affected, this allows us to verify the successful injection from the ciphertext. So we can basically filter out a whole lot of faults injected at the wrong time.

We developed a kind of calibration for the attacker, where he first issues an encryption request with a plaintext X to get the correct ciphertext and then continuously issues encryption requests with the same plaintext while activating the ring oscillator grid with very specific parameters. He can then, as previously explained, verify the successful injection from the output ciphertexts, adapt these parameters accordingly, and reissue requests to the victim until he has found parameters with which he can successfully inject faults at the right round of AES, and then continue with the actual attack. This calibration needs to be done only once for a specific board, and then we can just continuously perform new attacks on the same board.

The hardware we used is these two boards from Terasic, the DE1-SoC and the DE0-Nano-SoC. We used three boards of the same type and two different boards in total, to show the generality of this attack and how the calibration can adapt to different boards. All of these boards are based on the Cyclone V FPGA together with an ARM Cortex-A9 on a single chip, and we have a Linux environment running
on this ARM core. This essentially gives us the entire threat model on one SoC: attacker and victim run software on the ARM core, and they have their respective IP cores on the FPGA fabric. We only did the fault injection part on this system on chip and collected faulty ciphertexts; the key recovery was done afterwards on a PC.

For the results, we first evaluated the general fault injection rates for 1 million requests with respect to the number of ring oscillators used by the attacker. We performed these experiments first on the DE1-SoC board, where the AES design of the victim was fully constrained, so no potential timing violations were reported by the development tools. We distinguish between usable faults and the total amount of faults. The total amount of faults counts any kind of fault appearing in the output ciphertext, while only the usable faults, where the correct four bytes in the output ciphertext are affected, can be used for key recovery.

In this diagram, where the blue line is the total amount of faults and the red line the usable faults for DFA, we see that the injection rate in general increases with the number of ring oscillators. But at some point, for this board after 44 percent resource usage by the attacker, the curve actually decreases, so the amount of usable faults decreases again, simply because the ring oscillator grid has too much effect on the victim design and the calibration cannot find any parameters anymore to adapt to this new situation.

We extended this evaluation of fault injection rates to three different DE1-SoC boards. We see that all of them are vulnerable and that, in general, the calibration can find the appropriate parameters to inject the faults with the needed precision. We also see that, due to process variation, there is a different optimal number of ring oscillators for the attack on each board, but you can
simply find an amount that works on all the boards by looking at the overlap of the different evaluations.

We also evaluated the actual key recovery for five thousand random keys. These experiments we also did on the DE1-SoC, with the best fault injection configuration for each specific board. We see that the majority of keys could be recovered: about 90 percent of the keys can be recovered for each board. We have a couple of keys with only a few candidates remaining, two or four candidates, which can easily be finalized with a brute-force search. But we also have some keys which cannot be recovered, because the verification by looking at the bytes of the output ciphertext cannot distinguish between some multi-byte faults and single-byte faults injected before the ninth round. So some multi-byte faults are still collected but cannot be used for key recovery, which leads to these few non-recoverable keys.

So we showed this attack on a fully constrained design on the DE1-SoC board with less than 50% of the resources used by the attacker. On the smaller DE0-Nano-SoC, which has only half the amount of resources, the fully constrained design was not vulnerable. So we see that not all devices are equally vulnerable. In this case the power supply was the same on both boards, but with only half the amount of resources the attacker could not attack a fully constrained design with less than 50% of these reduced resources.

There may also be alternatives to using ring oscillators, that is, other malicious logic that causes a high current variation and eventually voltage drops. We also thought about extending this attack to other devices connected to the same power distribution network, such as, in this case, the ARM core on the SoC.

We also discussed some possible mitigations, for example using internal sensors, probably TDC-based, to detect an attack, or using bitstream checking to identify malicious logic in an attacker design. And finally, what we can
also do, but which requires hardware modifications, is to put different designs on different voltage islands on the FPGA, sacrificing some flexibility, of course.

In conclusion, we showed that high-precision fault injection on shared FPGAs is possible and that logical isolation between designs is not enough to prevent manipulation. We showed that this threat model must be considered if we want to use FPGAs in multi-user environments, and that mitigation may even need some modifications to hardware or new hardware architectures. With that I want to conclude my talk, and you're welcome to ask questions. Thank you.

Any questions?

Thank you for your nice talk. I want to ask: are you assuming the ring oscillator to be already present in the FPGA?

No, the ring oscillator is introduced by the attacker. Basically, we have some architecture where multiple users can put their accelerators on the same FPGA through partial reconfiguration, and the attacker can simply deploy a grid of ring oscillators to attack another user who has probably put his AES module on the same chip.

Hello. So the attack model, as it seems to me, is that somebody has already reconfigured the FPGA and put in some ring oscillator, and now you are an attacker and you are going to use that?

Yes. I mean, you do not necessarily use an existing structure for the attack, but deploy your own ring oscillator grid onto another partition of the same FPGA.

Okay, got it. Thank you.

Okay, so did you somehow have to modify the power supply of your victim boards?

No, it wasn't modified. It's just stock.
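
The calibration step described in the talk (get a reference ciphertext, repeat the same request while toggling the attacker's ring oscillator grid, keep the parameters that yield usable faults) might be sketched roughly as follows. This is only an illustration under assumptions: the names `ToggleParams`, `request`, `is_usable`, and `calibrate` are hypothetical, and the plain search over a candidate list stands in for whatever adaptation strategy was actually used, which the talk does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ToggleParams:
    """Hypothetical container for the three toggle-signal parameters."""
    frequency_hz: float
    duty_cycle: float
    initial_delay_cycles: int


def calibrate(
    request: Callable[[bytes, Optional[ToggleParams]], bytes],
    is_usable: Callable[[bytes, bytes], bool],
    plaintext: bytes,
    candidates: Iterable[ToggleParams],
    trials: int = 100,
) -> Tuple[Optional[ToggleParams], float]:
    """Return the candidate with the highest usable-fault rate.

    request(plaintext, params) is assumed to ask the victim for an
    encryption; params=None means the ring oscillator grid stays off.
    is_usable(reference, ciphertext) is a caller-supplied check that the
    ciphertext shows a fault in the expected byte positions.
    """
    # Reference ciphertext, obtained without any fault injection.
    reference = request(plaintext, None)

    best: Optional[ToggleParams] = None
    best_rate = 0.0
    for params in candidates:
        # Re-issue the same plaintext while the grid is toggled.
        usable = sum(
            1 for _ in range(trials)
            if is_usable(reference, request(plaintext, params))
        )
        rate = usable / trials
        if rate > best_rate:
            best, best_rate = params, rate
    return best, best_rate
```

If no candidate produces a usable fault, the function returns `(None, 0.0)`, which corresponds to the situation mentioned in the talk where the calibration cannot find suitable parameters.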
Thank you. About the different fault rates you were getting, that is, the success rates on the different SoCs you were testing: do you have any intuition why you were seeing those differences?

Yeah, I mean, I think the oscillation of the ring oscillators is also highly dependent on process variation, so maybe the attacker design was simply more effective on some of these FPGAs because, due to process variation, the oscillation had a higher frequency or something.

So we still have time if anyone has any questions. Oh, there's someone waiting at the back.

Hello. Did you test on the Amazon cloud service [unclear]?

No, we did not test it yet.

But does it work like that? I don't know if the Amazon cloud service allows one partition to access the inputs and outputs of the other partition.

I mean, we don't necessarily need direct access to the inputs of the other partition. The idea was that the victim has his software process running, and this software process uses the AES module, or whatever the victim put on the FPGA. The attacker can access the software part through some kind of public interface, which makes the victim use his AES module. So it could also be a replay attack or something like that. There is no direct logical connection between the designs on the FPGA, but the attacker can just issue requests to some kind of software interface provided by the victim. Does that answer your question?

Yeah, but I mean, technically, on a cloud service I would suspect there will be some software isolation between processes.

Sure, but I mean, the victim put his AES module or whatever on the FPGA to use it for something, right? And so in some scenarios the attacker may be able to make the victim encrypt the same plaintext twice, which is all he needs to perform a DFA. Or there may be some other attacks on other ciphers. This was just a proof of concept.
So in a real-world scenario it might be different. Yeah.

Does this model of FPGA support partial reconfiguration, or did you use the FPGA [unclear]?

I mean, because it was simpler, we used just a single design and put it on the FPGA, but it supports partial reconfiguration, yeah.

Okay, great. So, thanks to the speaker again.
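
The ciphertext check mentioned in the talk, where a usable fault must affect one specific set of four output bytes, can be illustrated with a short Python sketch. It assumes the standard AES byte ordering (ciphertext byte `i` sits in row `i % 4`, column `i // 4` of the state) and a single-byte fault injected between the MixColumns steps of rounds 8 and 9; the names `USABLE_PATTERNS` and `classify_fault` are hypothetical and not taken from the presented work.

```python
# A single-byte fault before round 9 spreads to one full column in the
# round-9 MixColumns. The final-round ShiftRows then moves the byte in
# row r of column c to column (c - r) mod 4, so exactly four ciphertext
# bytes differ, in one of four fixed position sets.
USABLE_PATTERNS = frozenset(
    frozenset(row + 4 * ((col - row) % 4) for row in range(4))
    for col in range(4)
)


def classify_fault(correct: bytes, faulty: bytes) -> str:
    """Classify a ciphertext pair as 'none', 'usable', or 'unusable'."""
    if len(correct) != 16 or len(faulty) != 16:
        raise ValueError("AES ciphertext blocks must be 16 bytes long")
    # Positions where the two ciphertexts disagree.
    diff = frozenset(i for i in range(16) if correct[i] != faulty[i])
    if not diff:
        return "none"
    if diff in USABLE_PATTERNS:
        return "usable"
    return "unusable"
```

As noted in the talk, a check of this kind cannot tell a single-byte fault from a multi-byte fault that happens to disturb the same four output bytes, so some ciphertexts classified as usable here may still fail during key recovery.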