Hi, my name is Jonas Krautter, and I'm looking forward to presenting to you today our work, CPAmap: On the Complexity of Secure FPGA Virtualization, Multi-Tenancy, and Physical Design. This work has been authored by me, Dennis Gnad, and Mehdi Tahoori at the Karlsruhe Institute of Technology in Germany, and has been submitted and accepted for publication at the virtual CHES conference 2020.
Before talking about FPGA virtualization, I want to quickly talk about FPGAs in the cloud in general. Amazon was probably among the first companies to offer FPGAs as generic accelerators in their cloud computing services, but other companies were quick to join, and especially with emerging technologies such as artificial intelligence in mind, they started offering FPGAs as accelerators in their clouds. Now, with the increasing amount of resources available per FPGA chip, virtualization and multi-tenancy of FPGAs is a logical next step to further optimize the utilization and efficiency of these accelerators.
Many researchers have demonstrated in recent publications, however, how new vulnerabilities arise from FPGA multi-tenancy. More specifically, it is possible to perform fault and side-channel attacks between designs that are logically isolated on the same chip. In this scenario, the victim design in one partition of the chip creates voltage fluctuations that are visible on the entire shared power supply network. The attacker, from their own partition, can craft sensors out of FPGA primitives and measure these voltage fluctuations, resulting in a side-channel attack and, for example, a secret key recovery. For fault attacks, the attacker can cause voltage fluctuations strong enough to inject faults into a victim design. However, while fault attacks can at least be detected by the hypervisor, side-channel attacks are hard to detect, and the sensors can be hidden very well in seemingly benign designs. Therefore, the goals of this work were as follows.
We wanted to characterize and understand the mechanisms of chip-internal side-channel attacks, and we wanted to investigate the impact of device and design parameters on the side-channel vulnerability. Ideally, this would eventually lead to perspectives and consequences for countermeasures. Our results were that placement, routing, and process variation indeed have a very big impact on the side-channel vulnerability of a multi-tenant FPGA setup. This impact can in fact exceed the effect of some countermeasures and can also significantly weaken countermeasures. However, it might be used as a zero-overhead countermeasure by itself: the hypervisor can select a secure mapping for trusted and untrusted modules in different partitions of the FPGA after evaluating the side-channel vulnerability of each region.
In the further course of this talk, I will first explain some preliminaries. Then I will outline our experiment design, show the devices and setups we used, and present our results. Finally, I will conclude the talk and give some perspective for future research.
First of all, I want to explain the mechanisms behind chip-internal voltage side-channel attacks. We see here in the schematic that different designs are placed on the same chip, and while there is no signal connection between them, they can still cause voltage fluctuations. Because of the shared power supply up to the voltage regulator, but especially because of the shared PDN, the power delivery network on the chip, these voltage fluctuations are visible on the entire power delivery network, and the fluctuations caused by a victim design in A can be sensed in B. This makes side-channel attacks possible despite the logical separation of the different designs. There are different ways of measuring the voltage fluctuations using FPGA primitives.
In our case, a clock signal is propagated into a long delay line, and because the transistor delay depends on the supply voltage, we can estimate the voltage fluctuations by capturing how far the clock signal propagates into this delay line. What we can see is that we have an initial delay as well as an observable part of the delay line, where the propagation is captured in registers. The initial delay has to be adjusted such that the fluctuations of the clock signal propagation are visible within the observable part of the delay line, and this calibration, as we will explain in detail later, can be done automatically.
To assess the side-channel vulnerability, we make use of the classical correlation power analysis (CPA) attack from 2004, applied to the Advanced Encryption Standard (AES). This attack is based on correlating power measurements, or in our case sensor measurements, with a key hypothesis model, where the key can be attacked bytewise. In our case, we use a bitwise correlation model attacking the last round of the AES encryption, and among the 256 possible values for the key byte guess, the correct key shows the highest absolute correlation. When plotting the correlation over the amount of collected traces, we can see that the correct key byte can be visually distinguished from the incorrect ones after a certain amount of traces. This amount of traces is what we are going to use as our measure for side-channel vulnerability.
Therefore, we define that a CPA attack is successful if, for any bit b, where b refers to the bit in our previous correlation model, the correlation with the correct key is 1.5 times larger than the second-highest correlation value or 1.5 times lower than the second-smallest correlation value. With this definition, we can use the minimum amount of traces for a successful attack as a security measure. So in the above example, the amount of traces would be 55,800.
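The success criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' evaluation code: the array layout (one row per model bit, one column per key-byte guess), the function names, and the choice to report the first checkpoint at which the criterion holds are all hypothetical.

```python
import numpy as np


def attack_successful(corr, correct_key, ratio=1.5):
    """Apply the 1.5x criterion to one snapshot of correlation values.

    corr: array of shape (n_bits, 256), one correlation per model bit and
          key-byte guess (assumed layout, not taken from the talk).
    """
    corr = np.asarray(corr, dtype=float)
    for row in corr:
        correct = row[correct_key]
        others = np.delete(row, correct_key)
        # Positive peak: correct key exceeds the best wrong guess by the ratio.
        if correct > 0 and correct > ratio * others.max():
            return True
        # Negative peak: correct key lies below the lowest wrong guess by the ratio.
        if correct < 0 and correct < ratio * others.min():
            return True
    return False


def min_traces_for_success(trace_counts, corr_snapshots, correct_key, ratio=1.5):
    """Return the first trace count whose snapshot meets the criterion, else None."""
    for n, corr in zip(trace_counts, corr_snapshots):
        if attack_successful(corr, correct_key, ratio):
            return n
    return None


# Synthetic snapshots, purely for illustration (no measured data).
weak = np.full((8, 256), 0.010)
weak[3, 0x2A] = 0.012      # correct key barely stands out: criterion not met
strong = np.full((8, 256), 0.010)
strong[3, 0x2A] = 0.020    # correct key is 2x the runner-up: criterion met

print(min_traces_for_success([1000, 2000, 3000], [weak, strong, strong], 0x2A))
```

With these synthetic values the sketch reports 2000, the first checkpoint at which one bit's correlation for the correct key clears the 1.5x margin.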
Our experiments were designed to follow a line of questions that we wanted to answer with our results. First of all, we wanted to know how the general placement of the attacker and victim partitions on the FPGA affects the voltage fluctuations. In terms of the sensor, how does the placement affect the sensitivity in the attacker design? In terms of the victim design, how does the switching activity in each region of the FPGA impact the supply voltage fluctuations? Then, of course, we wanted to know how much the actual side-channel vulnerability is affected by design parameters such as placement and routing, and how much it is affected by process variation, both inside the chip and between different chips. Then we wanted to know how much these physical design parameters impact side-channel countermeasures, in our case how much a simple hiding countermeasure is affected by the parameter choice. We also wanted to know which of the design parameters are the most critical for the vulnerability of the overall setup. Eventually, we wanted to know whether we can select parameters so that the overall setup becomes less vulnerable. That would also be the final goal for a hypervisor in a multi-tenant FPGA scenario: to select parameters for untrusted and trusted modules such that the overall design is more secure against side-channel attacks.
For some of our experiments, we made use of noise generation modules that were based on ring oscillators or toggling flip-flops to cause voltage fluctuations in specific regions of the FPGA. These noise generation modules can be used to investigate the general impact of the placement of the victim design, which is represented by such a simple noise generation module. They can also be used as a simple hiding countermeasure, when ring oscillators are randomly activated around an AES encryption module to introduce additional noise into the system.
To answer the first of our questions and investigate the general dependency on the global placement of attacker and victim design, we defined a noise impact measure to assess the impact that a noise-generating module in one region of the FPGA has on sensors in other regions of the FPGA. In this experiment, we placed noise generators and sensors in different locations on the chip, and we measured the impact of the noise generators on the sensor average value as well as on the sensor variance. We defined the impact on the sensor average by taking a measurement trace of length 1024, where the noise generation is activated after 512 samples. The impact is then simply the absolute difference between the average of the first half and the average of the second half of the measurement trace. The impact on the sensor variance is defined accordingly.
To get a complete mapping of the sensor sensitivity as well as of the impact of switching activity in different regions of the FPGA, we placed noise generators and sensors in 50 different locations on the chip, and we evaluated the sensitivity of each sensor to all the noise generators as well as the impact of each noise generator on all sensors. As the amount of sensors that we use in these initial experiments does not allow for manual calibration, we perform an automated calibration of the initial delay for each sensor. For that purpose, we employ both coarse-grained and fine-grained delay elements, which are composed of either lookup tables or carry slices. Within these initial delay elements, we select the entry point of the clock signal to keep the fluctuations of the clock signal propagation within the observable part of the delay line, where they can be captured by the registers.
To evaluate the impact on side-channel vulnerability, we define a parameter space for correlation power analysis.
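As a brief aside on the noise impact measure from the placement experiment, the following Python sketch computes both impact values for a single sensor trace. It is a minimal illustration, not the authors' code: the function name is hypothetical, and taking the variance impact as the absolute difference of the two half-trace variances is an assumption based on the remark that it is "defined accordingly".

```python
import numpy as np


def noise_impact(trace, switch_index=512):
    """Return (impact on sensor average, impact on sensor variance).

    trace: sensor samples, with the noise generator activated at
           switch_index (1024 samples and index 512 in the talk).
    """
    trace = np.asarray(trace, dtype=float)
    before, after = trace[:switch_index], trace[switch_index:]
    impact_mean = abs(after.mean() - before.mean())
    # Assumption: the variance impact mirrors the average impact.
    impact_var = abs(after.var() - before.var())
    return impact_mean, impact_var


# Synthetic trace for illustration: a quiet first half, then a lower and
# noisier sensor reading once the noise generator is switched on.
trace = [10] * 512 + [8, 6] * 256
impact_mean, impact_var = noise_impact(trace)
print(impact_mean, impact_var)
```

Repeating this for every sensor and noise generator pair would give the kind of sensitivity and impact maps discussed in the talk.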
First of all, we select four different boards of the same type to investigate the effect of inter-chip process variation. Then we select a subset of four locations from the previous experiments, because we are unable to perform correlation power analysis for all 50 locations on the chip. The locations are selected such that the subset includes both adjacent and non-adjacent locations, and we select locations that either show a high sensitivity to voltage fluctuations or where switching activity has a high impact on the voltage fluctuations on the chip. Moreover, we chose four different strategies for placing FPGA primitives inside the AES encryption core; here we simply choose four different optimization settings in the design software to place the FPGA primitives within the AES partition. In total, this leads to 256 experiments, where for each experiment we collect up to 100,000 traces for the correlation power analysis attack. This amounts to more than 25 million traces that need to be evaluated on our main evaluation platform alone. For each experiment, we use an identical set of random plaintexts, and we always attack the first byte of the last AES encryption round, where the secret key is identical for each experiment, to ensure consistency and to make the results comparable.
In order to evaluate the effect of design parameters on side-channel countermeasures, we implement a simple hiding countermeasure based on the previously mentioned noise generation modules, which are placed as a fence around the AES encryption core and randomly activated. This decreases the signal-to-noise ratio of the attacker's measurements and makes the attack more difficult, requiring more traces to recover the secret key. In the end, we wanted to answer the question of how the design parameters affect the effectiveness of the countermeasure.
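The experiment count quoted above can be reproduced with a short enumeration. The split into four dimensions (board, AES location, sensor location, placement strategy) is an assumption inferred from the dimensions named later in the talk; the index ranges and variable names are placeholders, not identifiers from the original work.

```python
from itertools import product

# Assumed dimensions of the parameter space, four choices each.
BOARDS = range(4)                 # four boards of the same type
AES_LOCATIONS = range(4)          # victim (AES) partition locations
SENSOR_LOCATIONS = range(4)       # attacker (sensor) partition locations
PLACEMENT_STRATEGIES = range(4)   # optimization settings in the design software

MAX_TRACES_PER_EXPERIMENT = 100_000

experiments = list(product(BOARDS, AES_LOCATIONS, SENSOR_LOCATIONS, PLACEMENT_STRATEGIES))
total_traces = len(experiments) * MAX_TRACES_PER_EXPERIMENT

print(len(experiments))   # number of experiments
print(total_traces)       # upper bound on collected traces
```

Under this assumed split, the enumeration yields 256 experiments and an upper bound of 25.6 million traces, which matches the figures given in the talk.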
Regarding our devices and setups, our main evaluation platform was the PYNQ-Z1 board, which is based around the Xilinx Zynq-7000 FPGA SoC, which has an integrated dual-core ARM Cortex-A9 along with the FPGA fabric. The Linux system running on the ARM core allows an easy mass collection of traces onto the board's SD card, and the unprotected AES on that board can be attacked with very few traces; sometimes only a few hundred were required to recover the secret AES key. So on this main evaluation platform, we were able to evaluate the vulnerability across many different design parameters.
To confirm our findings on a data-center-scale FPGA, we also performed experiments on a subset of parameters on the Xilinx Virtex-7 FPGA, which was placed on a board with a PCI Express interface. Here, the collection of traces was done through the PCIe port from a host computer, and the unprotected AES on that board was much harder to attack. This is why we only performed experiments on a subset of parameters; in addition, due to the much larger size of this FPGA, the parameter space would be much larger than for the Zynq-7000. Nevertheless, we confirmed our findings on a data-center-scale FPGA using this second platform.
Presenting our results, I first show the effects of general placement on the sensor sensitivity as well as the impact of noise generators in each of the 50 regions on the FPGA chip.
What we found here is that the sensor sensitivity is not consistent across different boards or experiments, so we are unable to define attacker regions that are less or more sensitive to voltage fluctuations across different FPGAs. But we are able to find regions in the FPGA, for example the lower left corner here, where the switching activity of the noise generator has a much higher impact on the sensor value than the noise generators in, for example, the upper right regions here. These results for the noise generators are consistent across different FPGAs.
When evaluating the impact of parameters and process variation on the side-channel vulnerability of the AES cores, we find that the difference in the amount of traces required for a successful correlation power analysis attack reaches up to 500x, so we can say that the impact of these design parameters and process variation is significant. In more detail, we find that on the Zynq-7000 the amount of traces required for a successful attack varies between 200 and more than 100,000, whereas on the Virtex-7, where we collected more than 10 million traces, the amount required varies between 90,000 and more than 10 million.
When investigating the impact of design parameters on our simple hiding countermeasure, which we explained previously, we collected up to 500,000 traces. Depending on the parameter choice, we found that in some cases even more traces were required with the countermeasure disabled than with it enabled. To answer the question of how big the impact on the countermeasure is: we found that the differences in the required amount of traces for a successful attack were still up to 20x, and in some cases the countermeasure could be entirely ineffective. Moreover, the impact of the mapping can supersede the effect of the countermeasure, increasing the amount of traces required for a successful attack more efficiently than the countermeasure itself. On the Zynq-7000, we found that the amount of traces varies between 25,000 and more than 500,000.
Our last two questions were whether we can identify single critical parameters, where the choice of the parameter would lead to an inherently more secure setup, and whether the hypervisor can select such parameters to make the setup more secure. Here the main takeaway was that only a combination of different parameters can make the setup more secure. This can be seen when computing the average across different dimensions, which is presented in the table here, where the values correspond to the amount of traces required for a successful attack, in thousands. We see that if we consider any single dimension only, we cannot identify one parameter whose choice would lead to a more secure design. But in fact, we find that the AES location in combination with the local placement strategy can lead to a more secure design, even across different chips and independent of the sensor location, so independent of the attacker location.
Finally, I want to conclude this talk and give some perspective on future research. First of all, we found that the differences in the amount of traces required for a CPA were significant, depending on the choice of design parameters and on process variation between different chips. We confirmed these differences in a different setup, namely a data-center-scale Virtex-7 FPGA. This impact is well within the range of simple countermeasures such as the presented hiding countermeasure, and these countermeasures can also be significantly weakened depending on the choice of parameters. For a hypervisor to ensure more secure multi-tenant access, the victim module can be placed in a way that makes it more secure, and we found that the effect is also consistent across different chips if both the global location of the victim partition and the primitive placement within it are considered. The location of the attacker partition, that is, of the sensor, on the other hand, is less relevant and not consistent across different chips.
In future work, we would like to determine the root causes of the observed effects. For instance, we were able to identify one of the boards that was significantly more vulnerable to the attacks, which was also the oldest board. However, for many of the effects we are unable to find direct explanations. Eventually, we would like to develop zero-overhead countermeasures based on these results: a hypervisor would evaluate the entire FPGA once and then apply a secure mapping for trusted and untrusted modules in different partitions of the chip. However, a full correlation power analysis attack on all possible locations is way too costly, especially when considering data-center-scale FPGAs, so we need to find a surrogate model to perform the evaluation and then find a secure mapping of the modules onto the chip. It is also important to respect these effects of physical design parameters and process variation when applying classical countermeasures against side-channel attacks. As we have seen with our simple ring-oscillator-based hiding countermeasure, a countermeasure can be significantly weakened depending on the choice of parameters, and this is important when developing new countermeasures against side-channel attacks.
With this, I have reached the end of my talk. I want to thank you for watching this presentation and for your interest in our work. If you have any questions, you can always write us an email; the addresses are presented on the slide. And with this, I hope to see you all at the virtual CHES conference 2020.