Hello everyone, I'm Tuan from the University of Manchester. Today, I'm going to present our study on denial of service on FPGA-based cloud infrastructure. In this presentation, we'll go through our attack experiments and also the defense mechanism.

First, let's briefly talk about FPGA technology. An FPGA consists of configurable logic blocks connected by routing channels and switch matrices. This low-level programmability allows implementing malicious circuits like the ring oscillator in the figure on the right. In this example, we have a lookup table configured as an inverter, with enable pin A, input pin B, and inverted output pin F. The routing is done by configuring the multiplexers of the nearby switch matrix. And that's how we design ring oscillators. The speed of a ring oscillator depends purely on the propagation delay of the loop. A ring oscillator like this could normally run at up to 6 GHz, and that eventually draws a lot of dynamic power. Also, a small change in delay caused by temperature or interference could vary the speed of the ring oscillator dramatically. That is why we can exploit it to sense the state of nearby circuitry.

Here are common attacks and defenses on FPGAs. For client-based use, the main motivation is to steal the design loaded on the FPGA. This can be done through bitstream manipulation, power analysis, or electromagnetic analysis. To protect it, we could use bitstream encryption or authentication. For cloud-based use, the target is the data being processed by the board. This can be done either actively, using fault injection, or passively, using side-channel attacks. Currently, the countermeasures are design rule checks, bitstream scanners, and design improvements.

In this study, we explore attacks on availability. Consider a cloud service provider hosting multiple users reliably. We aim to create excessive waste power to crash or damage an FPGA instance. Here we choose AWS F1 instances.
We also aim to trigger resonant effects in the power regulator circuits to maximize damage, as FPGA boards are likely to run on the same power rail. So the potential attack channel could be that hackers use a fake ID or credit card to get access to the instance. Or, for a simpler way, think of an FPGA design like a software application: there is also a marketplace for sharing accelerators, similar to a mobile app store. Imagine that a Trojan is implanted in a popular accelerator, such as a Bitcoin miner, and put on the marketplace for free. Then it's not hard to scale up the attack and cause severe disruption.

So how severe could power-hammering attacks be? To give you some numbers, only 10% of the lookup table resources could draw up to 350 W. That translates to kilowatts of power-hammering potential. Interestingly, many of our circuits are not supported by the vendor tools, and some of them can be deployed on AWS F1 instances. To give a comparison, the sun emits 6.3 kilowatts per square centimeter. Our number is not that far away.

We are now going to study the AWS F1 security architecture. Currently there are four security fences. We are going to explore them in the next few slides.

The first fence is design inspection. Its purpose is to catch malicious designs when users upload them. Here a couple of FPGA vendor tool checks are used, such as DRC, unrouted net checks, design integrity checks, and placement checks. It should be noted that the DRC cannot detect all oscillator circuits, and if a design passes this fence, it can be deployed directly on FPGA cloud instances. This table shows some self-oscillating designs that could pass this fence and whether they are suitable for several types of attacks. Some designs are not suitable for DoS because the power gain is not enough to break the upper power limit.

After the design inspection, AWS generates the bitstream from the user design, and the generated bitstreams are not exposed to users.
Therefore, users cannot manipulate the bitstream; on top of that, the bitstream format is not publicly available.

After the bitstream is generated, because users don't have access to it, programming and debugging can only be done through a custom API. This step loads the design onto the FPGA fabric.

After the design is loaded onto an FPGA, power and temperature are closely monitored. If power consumption gets high, a warning is given. If power continues rising, the clock will no longer be supplied, and finally, if the temperature limit is reached for some reason, the shutdown sequence is automatically triggered. It looks comprehensive at first, but we will show how to bypass this fence later.

We are now going to deploy the denial-of-service attack on the AWS F1 service. There are many deployable self-oscillating circuits in the literature, which we have tested. For the PUF, we are going to use the latch-based ring oscillator on the left. A latch here is configured to be nothing but a normal wire that passes the signal from D to Q. For the power waster, we have found that the carry primitive is a good candidate. That is because we could create up to 8 ring oscillators out of one primitive, and it gives a decent and controllable power potential. You can see how one ring oscillator is created using the muxes inside the carry primitive in the figure on the right. It behaves exactly like the lookup table configured as an inverter.

In order to identify the instance that we are attacking, we use a simple PUF here, just for the purpose of fingerprinting the FPGA fabric. The right figure shows an example of PUF responses. From this, we can clearly see the effect of power or temperature on the speed of the PUF. Therefore, we can tell if the fabric has been used recently. Or, to put it another way, we can also get some idea about the scheduling policy for FPGAs in the data center.

Here is how it looks on an FPGA layout.
From left to right, first we have three PUFs, spread one per die just out of curiosity, since those FPGAs are stacked from three individual dies. Next we have our power wasters and controllers. Then we have the AWS shell region. You can also see that we left one die almost unused, and the design is still more than enough to draw substantial power.

Now let's talk about our power-hammering design. We have shown previously in one of the security fences that if the power limit is exceeded, the clock will no longer be supplied. That is why we need our own independent clock source, and it is generated from nothing other than a ring oscillator. With a prescaler, we can adjust the frequency to suit our needs. The purpose of the long chain here is to ramp up power over time by enabling the ring oscillators one by one. Together with the 32-bit counters, we can know exactly how many ring oscillators have been activated. As a result, we can determine roughly the power of an FPGA even when we lose the connection to it because all system clocks are gated. In some experiments, we could even observe the instance terminal freezing when power reached a critical point. It shows that the shutdown sequence does not only power off the FPGA board but also affects the host machine.

With everything in place, here is the attack flow on the left. The first step is to get an instance. Then we fingerprint it by using the PUF design to get a PUF response. Step 3 is to crash the instance with the power-hammering design. After that, we restart the instance and continue the attack sequence. The result is that, without attacking in step 3, we are likely to get the next instance within 5 minutes. However, when crashing the instance, we clearly see that the time to get the next instance increases dramatically, up to more than 12 hours. It definitely shows that the attack we deployed has an effect on the reliability of the F1 service.
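The ramp-up estimate described for the power-hammering design can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the ring-oscillator clock frequency, the prescaler value, the per-oscillator power, and the baseline power are all hypothetical placeholders, not measurements or code from the talk.

```python
# Hypothetical model of the enable chain: every prescaled tick of the
# self-generated ring-oscillator clock enables one more power waster.

def active_oscillators(elapsed_s, ro_clock_hz, prescaler, total_ros):
    """Estimate how many ring oscillators have been enabled so far."""
    step_hz = ro_clock_hz / prescaler        # enable steps per second
    steps = int(elapsed_s * step_hz)         # value a step counter would hold
    steps = min(steps, 2**32 - 1)            # a 32-bit counter cannot exceed this
    return min(steps, total_ros)             # the chain has a finite length


def estimated_power_w(active, watts_per_ro, baseline_w):
    """Rough power estimate: baseline plus a fixed cost per active oscillator."""
    return baseline_w + active * watts_per_ro


# Example with made-up numbers (assumptions, not results from the study)
n = active_oscillators(elapsed_s=2.0, ro_clock_hz=100e6,
                       prescaler=1e6, total_ros=1000)
print(n, estimated_power_w(n, watts_per_ro=0.01, baseline_w=20.0))
```

The point of the sketch is that the estimate needs only the elapsed time and the chain parameters, so it still works after the connection to the instance is lost.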
Moreover, when examining the fingerprints, we saw that the minimum time to get the same fabric reallocated is about 52 minutes, and the chance of getting the same fabric in two consecutive experiments is about 1% over 100 experiments. With substantial downtime, we estimate that the cloud service provider's loss is about 10 times the attacker's cost if the attacker uses a fake identity. And if the malicious design were distributed as a Trojan, it would definitely cause large disruption.

So there are a few mitigation strategies we could use here. First, let's talk about the FPGA scanner. We extended our previous work on a bitstream scanner to take the user design as input instead of the bitstream, for ease of use and adoption, since cloud service providers like AWS don't accept bitstreams directly. The scanner then runs through a set of virus signatures to check for malicious designs, for example combinatorial loops, high-glitch nets, high-fanout nets, and illegal port use. Finally, it reports, and it is up to the configuration manager to decide if the design is safe enough to be loaded onto the fabric.

This table shows a comparison of current mitigation methods. Hardware enhancement could likely prevent all kinds of attacks to some extent without sacrificing FPGA resources, but the cost is very high. Runtime monitoring is good, with medium cost, but it consumes FPGA resources, and we also believe it should be a secondary measure because by the time of detection the attack has already happened. For design inspection, unfortunately, DRC currently cannot detect or prevent any kind of attack. However, our proposed scanner could effectively prevent most kinds of attacks without using any FPGA resources. We are now working on detecting glitch-based power hammering to complete our engine, so that it detects all kinds of known attacks on availability.

So, that's the end of my presentation. I'd like to say thanks to all the people and organizations that support our work.
If you want to see our project, please visit the link here. Thank you.
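As a rough illustration of the combinatorial-loop signature mentioned in the scanner discussion, the check can be viewed as cycle detection on a netlist graph. The sketch below is a minimal example under assumptions, not the scanner from the talk: the netlist representation (a dictionary from cell name to fan-out cells), the function name, and the cell names are hypothetical. It treats edge-triggered flip-flops as loop breakers and deliberately does not treat latches as loop breakers, since the talk notes that a latch can be configured as a plain wire from D to Q.

```python
# Hypothetical netlist model: fanout[cell] lists the cells driven by `cell`.
# `clocked` holds edge-triggered registers, which break combinatorial paths.

def find_combinational_loop(fanout, clocked):
    """Return one loop as a list of cell names, or [] if none is found."""
    state = {}   # cell -> "visiting" or "done"
    path = []    # cells on the current depth-first search path

    def visit(cell):
        if cell in clocked or state.get(cell) == "done":
            return []
        if state.get(cell) == "visiting":
            # Reached a cell that is already on the path: this is a loop.
            return path[path.index(cell):] + [cell]
        state[cell] = "visiting"
        path.append(cell)
        for nxt in fanout.get(cell, []):
            loop = visit(nxt)
            if loop:
                return loop
        path.pop()
        state[cell] = "done"
        return []

    for cell in fanout:
        loop = visit(cell)
        if loop:
            return loop
    return []


# A latch-based ring oscillator next to a harmless registered feedback path
netlist = {
    "lut_inv": ["latch"],
    "latch": ["lut_inv"],   # transparent latch closes the loop
    "ff": ["lut2"],
    "lut2": ["ff"],         # feedback through a flip-flop is fine
}
print(find_combinational_loop(netlist, clocked={"ff"}))
# -> ['lut_inv', 'latch', 'lut_inv']
```

A real scanner would work on a much larger design, so an iterative search would be preferable to recursion there; the recursive form is kept here only for readability.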