Hello, I'm Giovanni Camurati, and I will present "Understanding Screaming Channels: From a Detailed Analysis to Improved Attacks". I would like to thank my supervisors, Aurélien Francillon and François-Xavier Standaert. I am a PhD student at EURECOM, where I mostly work on the interaction between radio transceivers and computing devices from a security point of view. I also like computer architectures and embedded systems in general, and I have worked on firmware re-hosting and dynamic firmware analysis. But why are radios and computing devices close together, and which security issues can arise in this case? Well, modern connected devices need both computing and communication capabilities, and for many reasons, including power consumption and cost, it is very convenient to place the CPU and one or more radios on the same chip. This is called a mixed-signal architecture, because the CPU is digital, whereas the radio uses analog radio-frequency components. It's a very popular architecture used for many protocols and applications. However, it also has some issues. The biggest issue is that the digital part is very noisy, whereas the radio is very sensitive to noise, and since they are on the same silicon die, propagation between the two is easy, and there could be some performance issues. Additionally, in previous work that we called Screaming Channels, we also discovered a security problem. The conventional side-channel leakage coming from cryptographic operations on the digital block propagates to the transmitter, where it is basically up-converted, amplified, and broadcast at a potentially large distance. Here is an example in practice, from a distance of 2 meters from a chip with a Cortex-M4 processor and a Bluetooth transmitter. If we look at the behavior of frequency over time on the spectrogram plot, we can see not only the packets, but also the firmware itself executing. We can see the waiting loop, and we can see AES.
In the time domain, we can distinguish the key schedule and the 10 rounds. Well, Screaming Channels introduce a new threat: now, from a large distance, we can observe what is executed on a smart device which is communicating with another smart device through an intended radio transmission protected by cryptography. Unlike conventional side-channel attacks, we do not need physical proximity. Physical proximity is a strong requirement, and it is also the reason why conventional side channels are often not considered in the threat model of these communicating devices. So with this idea, the root cause, and some initial attacks in a controlled environment, but already at 10 meters in an anechoic chamber, we published a first paper at CCS 2018, and you can look at the videos for more detail. But then we wanted to look more in depth into the channel, understand its peculiar characteristics in a systematic way, and improve the attacks. In particular, we wanted to make the attacks more realistic. These are the results of the CHES paper that I am presenting right now. Before continuing, let me show you two attacks which are related to Screaming Channels. The first one is Leaky Noise. In this case, the coupling between the CPU and the analog-to-digital converter in a mixed-signal chip makes it possible to run side-channel attacks from the ADC to the CPU. The second one is Soft-TEMPEST. Soft-TEMPEST attacks consist of generating and modulating electromagnetic leakages to exfiltrate data from a device which doesn't have any other connectivity capability. Second-order Soft-TEMPEST attacks consist of exploiting this, plus some cascaded intermodulation effects which could be similar to Screaming Channels. Second-order Soft-TEMPEST attacks could use Screaming Channels, or even other intentional effects introduced by, for example, a hardware Trojan. An example of a second-order Soft-TEMPEST attack is modulating the Wi-Fi packets in order to exfiltrate data from a device.
Let us now get back to Screaming Channels and the open questions that we want to answer. The first one is: are Screaming Channels really different from conventional side channels, or are they just amplified at a different frequency? Well, let's have a look at the victim. The victim is the CPU, and usually we connect to it through a near-field probe used to measure the near-field electromagnetic emissions produced by cryptographic algorithms. But Screaming Channels use a completely different path. First, the leakage goes from the CPU to the transmitter through the coupling path on the chip, and this can change the signal-to-noise ratio. It can also change the shape of the leakage itself. Additionally, the transmitter broadcasts the leakage over the air, so this is yet another channel which can again change the SNR or the distortion, and here even the distance and the setup change. But most importantly, the leakage is not sent alone on the channel. It is sent together with the data, and the data are not sent all the time, but only at discrete instants of time, in the form of packets. Finally, for Bluetooth Low Energy, the frequency is not fixed: frequency hopping means that the channel changes following a pseudo-random sequence to avoid interference, as a form of spread spectrum. Before we can start any analysis, we need to be able to extract Screaming Channel traces. So I will explain how we do that for the specific case of the Bluetooth Low Energy device that we attack, but there could be different types of Screaming Channels on different devices. In our case, Bluetooth Low Energy uses frequency modulation while the leakage is modulated in amplitude, so they are orthogonal and it's very easy to extract a clean trace. We cannot trigger using a signal from the device, because we don't have physical access, but we can use a peculiar frequency component of the trace itself as a trigger.
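To make the extraction idea concrete, here is a minimal Python sketch, not the actual tooling used for the attacks. It assumes complex baseband samples from a software-defined radio and uses a synthetic signal: the data sits in the phase (a crude stand-in for the Bluetooth frequency modulation), while a hypothetical `leak` signal modulates the amplitude. The function name, the smoothing window, and all signal parameters are illustrative assumptions.

```python
import numpy as np

def amplitude_demodulate(iq, window=8):
    """Return the smoothed envelope |iq| of complex baseband samples."""
    envelope = np.abs(iq)                      # discards the phase, hence the FM data
    kernel = np.ones(window) / window          # simple moving-average low-pass
    return np.convolve(envelope, kernel, mode="same")

# Synthetic example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 4096
t = np.arange(n)
leak = 0.1 * np.sin(2 * np.pi * t / 512)                    # stand-in for the leakage
data_bits = rng.integers(0, 2, n // 64).repeat(64) * 2 - 1  # +/-1 data symbols
phase = np.cumsum(0.2 * data_bits)                          # data carried by the phase
iq = (1.0 + leak) * np.exp(1j * phase)                      # leakage carried by the amplitude

trace = amplitude_demodulate(iq)
```

Because the data only affects the phase, taking the magnitude removes it and leaves the amplitude-modulated component, which is why the two are easy to separate in this simplified setting.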
For now we will fix the channel, and we will consider frequency hopping later, for real attacks. It's not important for the analysis part. Since the packets arrive only at discrete instants of time and not all the time, we often have to repeat encryptions many times before we can reconstruct a full trace, and this could be interpreted as a form of time diversity to take into account the deep fade conditions between packets. We have also added z-score normalization. We were inspired by previous work on profile reuse, but we do this normalization per trace, because this way we completely remove the effect of the channel. So even if the frequency, the setup, or the distance changes between traces, the effect of the channel is completely removed and the traces are comparable. We now want to understand the leakage, and in particular the leakage model, in more detail. As we always do for side-channel analysis and attacks, we choose a leakage variable y, for example the output of the S-box, and a leakage model, for example the Hamming weight, and we compare this leakage model with the actual measured leakage. But the Hamming weight is a very strong assumption. So we actually estimated the model, by estimating the value of the model for each possible value of y using a profiling set of traces. Then we estimated the linear correlation between this model and the leakage on a test set. This is the profiled correlation test, and as you can see, it is able to capture a non-linear leakage model, any leakage model, but only a linear correlation between the model and the leakage itself. We used it to study the leakage on conventional side-channel traces measured on our target device, and we observed that there is no big improvement compared to using a Hamming weight model, which means that in this case the Hamming weight model is a good assumption.
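The per-trace normalization and the profiled correlation test described above can be sketched in a few lines of Python. This is only an illustration on synthetic data, under stated assumptions: the leakage table `TABLE`, the trace shape, the noise levels, and the per-trace gain and offset that play the role of the channel are all invented for the example, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical leakage: one arbitrary (non-linear) value per byte value of y.
TABLE = rng.normal(size=256)
HW = np.array([bin(v).count("1") for v in range(256)])  # Hamming weight model

def simulate(n, n_samples=64, leak_at=32):
    """Synthetic traces with a random per-trace gain and offset (the 'channel')."""
    y = rng.integers(0, 256, n)
    shape = np.sin(np.linspace(0, 6 * np.pi, n_samples))
    traces = shape + 0.1 * rng.normal(size=(n, n_samples))
    traces[:, leak_at] += TABLE[y]
    gain = rng.uniform(0.5, 2.0, size=(n, 1))
    offset = rng.uniform(-1.0, 1.0, size=(n, 1))
    return gain * traces + offset, y

def zscore_per_trace(traces):
    """Normalize each trace to zero mean and unit standard deviation."""
    mean = traces.mean(axis=1, keepdims=True)
    std = traces.std(axis=1, keepdims=True)
    return (traces - mean) / std

def estimate_profile(leak, y):
    """Profiling: mean leakage at one time sample for each value of y."""
    return np.array([leak[y == v].mean() for v in range(256)])

def correlation(model_values, leak):
    """Linear (Pearson) correlation between a model and the measured leakage."""
    return np.corrcoef(model_values, leak)[0, 1]

prof_traces, prof_y = simulate(20000)   # profiling set
test_traces, test_y = simulate(5000)    # test set
prof_leak = zscore_per_trace(prof_traces)[:, 32]
test_leak = zscore_per_trace(test_traces)[:, 32]

profile = estimate_profile(prof_leak, prof_y)
rho_profiled = correlation(profile[test_y], test_leak)
rho_hamming = correlation(HW[test_y], test_leak)
print(f"profiled model: {rho_profiled:.3f}, Hamming weight: {rho_hamming:.3f}")
```

In this toy setting the leakage is deliberately unrelated to the Hamming weight, so the estimated profile correlates much better than the Hamming weight model. On real traces the gap depends on the device and the setup.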
However, in the Screaming Channel case, on the same device, we observed that the Hamming weight model leads to a much lower correlation, which means the Hamming weight model is not good and the leakage model is distorted. So, to sum up, the amount of correlation, the amount of noise, and the signal-to-noise ratio are comparable in the two cases, but in the Screaming Channel case the leakage is distorted. We wanted to understand this distortion more in depth, so we also built a linear model of the bits of the output of the S-box, and we estimated the weights of the bits using linear regression. When we did this for the conventional case, we could observe that the leakage model, in red, is a very good fit for the actual leakage, in green. Instead, when we did that for Screaming Channels, we observed that this is not the case at all. So we concluded that Screaming Channel leakages are distorted in a non-linear way, because a leakage model estimated with a linear regression on a linear model does not match the actual leakage estimated for each possible value of the leakage variable. As I said before, the profiled correlation test can only capture a linear correlation between the model and the leakage. Templates, instead, can capture a second-order relation between the two. However, when we ran attacks, we observed that template attacks are not considerably better than profiled correlation attacks, so we concluded that, at least for our sample size, the leakage is a first-order leakage. So, to conclude: when we take one device, and on this very same device we measure the leakage from the same cryptographic algorithm with a conventional setup or with the Screaming Channel setup, we observe comparable levels of signal, but in the case of Screaming Channels the leakage model is distorted in a non-linear way, and the leakage in both cases is a first-order leakage.
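The comparison between a linear model of the bits and the per-value profile can be illustrated with a small Python sketch. This is a sketch under assumptions, not the analysis code from the paper: the two profiles below are synthetic, one exactly linear in the bits and one distorted by an arbitrary non-linear function chosen only for illustration, and the helper names are hypothetical.

```python
import numpy as np

def bits_of(values, n_bits=8):
    """Return an (n, n_bits) array with the bits of each value, LSB first."""
    return (values[:, None] >> np.arange(n_bits)) & 1

def fit_linear_bit_model(profile):
    """Least-squares fit: profile[v] ~ w0 + sum_i w_i * bit_i(v)."""
    values = np.arange(len(profile))
    X = np.hstack([np.ones((len(profile), 1)), bits_of(values)])
    weights, *_ = np.linalg.lstsq(X, profile, rcond=None)
    return weights, X @ weights

def r_squared(profile, fitted):
    """Fraction of the profile's variance explained by the fitted model."""
    ss_res = np.sum((profile - fitted) ** 2)
    ss_tot = np.sum((profile - profile.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

values = np.arange(256)
hamming_weight = bits_of(values).sum(axis=1)

# Hypothetical profiles: one linear in the bits, one distorted non-linearly.
linear_profile = 0.5 + bits_of(values) @ np.linspace(0.8, 1.2, 8)
distorted_profile = (hamming_weight - 4.0) ** 2

_, fit_linear = fit_linear_bit_model(linear_profile)
_, fit_distorted = fit_linear_bit_model(distorted_profile)
r2_linear = r_squared(linear_profile, fit_linear)
r2_distorted = r_squared(distorted_profile, fit_distorted)
print(f"R^2 linear: {r2_linear:.3f}, R^2 distorted: {r2_distorted:.3f}")
```

A regression on the bits fits the first profile almost perfectly and explains almost none of the second one, which is the kind of mismatch that indicates a non-linear distortion.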
So profiled correlation attacks, which during the profiling phase can capture a non-linear model, and during the correlation phase can capture a linear correlation between the leakage model and the leakage, are a very good tool to run attacks against Screaming Channel traces. The second question, which is really important for Screaming Channels, is: can we reuse the profiles? We are interested in profiling a device in a very simple setup and then reusing this profile to attack a different instance at a very large distance, and we don't have access to this instance. So we want to compare the profiles taken with different conditions and different parameters. Let's imagine that we have a profile and an attack set taken at distance one on a device, and then at distance two on another device. In both cases, the number of traces that we need for full key recovery is, according to previous work, related to the correlation. So the higher the correlation at one, the lower the number of traces, and the higher the correlation at two, the lower the number of traces. But can we reuse profile one at position two? Well, intuitively, we need profile one to be similar to profile two, and indeed previous work says that the higher the correlation between profile one and profile two, the higher the correlation between profile one and attack set two, and so the lower the number of traces that we need to attack at position two using the profile built at position one. We used this method to compare different parameters for Screaming Channels, and we started with distance, which is the most characteristic feature of Screaming Channels. But actually, distance doesn't have a big impact. Yes, the trace signal has a quadratic loss in power with distance, but the actual leakage signal doesn't really change, and the distortion doesn't change.
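The reasoning about profile reuse can be illustrated with a short Python sketch on synthetic data. Everything here is an assumption made for the example: `SBOX` is a random permutation standing in for the AES S-box, the two profiles are invented (profile two is profile one with a different gain, offset, and a small perturbation), and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

SBOX = rng.permutation(256)        # stand-in for the AES S-box (assumption)
profile_1 = rng.normal(size=256)   # hypothetical profile built at position one
# Position two: same leakage shape, different gain and offset, small perturbation.
profile_2 = 0.3 * profile_1 + 5.0 + 0.05 * rng.normal(size=256)

def profile_similarity(a, b):
    """Linear correlation between two profiles (one value per byte value of y)."""
    return np.corrcoef(a, b)[0, 1]

def attack_key_byte(profile, plaintexts, leak):
    """Profiled correlation attack on one key byte: pick the best-correlating guess."""
    scores = np.empty(256)
    for guess in range(256):
        y = SBOX[plaintexts ^ guess]               # leakage variable for this guess
        scores[guess] = np.corrcoef(profile[y], leak)[0, 1]
    return int(np.argmax(scores)), scores

# Attack set measured at position two, with noise (values are illustrative).
true_key = 0x2B
plaintexts = rng.integers(0, 256, 2000)
leak_2 = profile_2[SBOX[plaintexts ^ true_key]] + 0.3 * rng.normal(size=2000)

similarity = profile_similarity(profile_1, profile_2)
best_guess, scores = attack_key_byte(profile_1, plaintexts, leak_2)
print(f"profile similarity: {similarity:.3f}, recovered byte: {best_guess:#04x}")
```

Since correlation is insensitive to gain and offset, a profile from position one still ranks the correct key byte first on the attack set from position two, as long as the two profiles are strongly correlated.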
So with a good setup we can easily obtain a very high correlation at a large distance, and profiles built at different distances are similar to each other. This is different from conventional traces, where previous work observed huge differences in the leakage model when moving the probe from near field to far field. What really matters is the quality of the setup and the noise in the environment, and we even saw that at 10 meters with a good SDR we had better results than at 10 centimeters in a noisy environment with a lower-quality radio. The device instance doesn't have a significant impact on the amount of signal or on the distortion of the leakage model. So the big advantage is that we can profile one instance at a short distance, without much noise, in good conditions, and then attack a different instance in a challenging setup, with a large distance and a lot of noise. Here is an example with distance. You can see that at 10 centimeters or at 10 meters the correlation is high, and the correlation between the profiles is high. In this case we also optimized the setup at 10 meters to remove the problems of distance.
So, can we attack more challenging targets, now that we know more about the channel itself? The answer is yes. We first tried a more complex environment than the anechoic chamber, a real-world environment with obstacles, and we used spatial diversity to combine signals coming from different paths. Then we tried to increase the distance. In an office environment we reached 10 meters and then 15 meters, but I have to say that at 15 meters measuring an attack set is really challenging and requires fine-grained tuning of the setup. Fortunately, we don't even have to try profiling the device at this distance: we can reuse a profile built long before on another device, connected via cable in very convenient conditions. At 34 meters we can still see a leakage signal with the t-test, and at 60 meters we can still extract AES traces. We also tried to attack the hardware AES block. It's very interesting because it's used in many applications, including the link-layer encryption. Unfortunately, its power consumption is very low and we can't observe any leakage from the internal computations, but we can observe some memory transfers, both in firmware (memcpy) and in hardware (the DMA transfer to the peripheral). Only simple power analysis attacks are possible in this case, and unfortunately we didn't succeed in fully recovering the key or the plaintext. We then moved to attacking a real system, so we looked into Google Eddystone beacons. They are small devices that transmit an identifier and a URL, for example with some information about the museum close to a monument, and also some telemetry, possibly encrypted, and the identifier can be ephemeral. They were really used for many IoT applications, like the Physical Web and proximity marketing, in market chains and so on. We need to configure them, and for that the owner can connect and authenticate at the GATT layer using a pre-shared AES key. As you can see, these beacons, unlike other beacon protocols, were
designed with security and privacy in mind. Indeed, telemetry can be encrypted, the IDs can be ephemeral to avoid tracking, and configuration is protected by an AES-based protocol. Let's look at this protocol in a bit more detail. We have a beacon and we have an owner, and they both know a pre-shared key. The owner wants to connect to the beacon, so the owner reads the unlock characteristic and the beacon replies with a random challenge p, which is the plaintext. Now both the beacon and the owner compute the AES encryption of this plaintext using the pre-shared key, and then the owner writes the result to the unlock characteristic. Now the beacon has both ciphertexts and can compare them: if they are the same, this is the proof that the owner knows the pre-shared key. But let's now take the point of view of the attacker. The attacker can trigger AES encryptions with the unknown key and a known random plaintext by simply reading the unlock characteristic. This is the part shown in red. Since we are talking about Screaming Channels, the attacker can then listen to the radio channel to observe the traces corresponding to the encryptions that the attacker triggered. Unfortunately, this is a real-world example: we have frequency hopping enabled, with the frequency changing following a pseudo-random sequence over 37 channels. This is really hard to follow, because we have to know the sequence, and we have to have a radio receiver which is fast enough to switch channels, or which has a large enough bandwidth to cover all the channels at the same time. Fortunately, we had the idea of using a known standard feature of the Bluetooth protocol to block most of the channels. This feature is the channel map, and it is available when the owner or the attacker connects to the device. By reducing the number of channels to two, the attacker can now just tune into one of them and have a high probability of observing packets with encryptions inside. So the threat
model is that the beacon is not protected from side-channel attacks, because there is no physical access to it, but the attacker can connect to it over the air and then, over the air, measure the traces of the encryptions that the attacker can trigger, without having any knowledge of the authentication key. We used a realistic demo from the Nordic SDK, with optimized code, hopping enabled but reduced using the channel map, and a TinyAES software implementation. I would like to highlight that the newest versions of the SDK use the hardware block, so they are not vulnerable, as of now, to Screaming Channel attacks. We have a proof-of-concept attack. We attack the device connecting to it via cable, to simplify the setup and reduce noise, and we still need key enumeration, which is possible with the protocol that I have shown, but very slow. So this is just a proof of concept, but it shows that Screaming Channels can be applied, and can be a problem, in real-world applications, protocols, and IoT devices. If Screaming Channels are a problem, we need countermeasures. We are not talking about secure devices with a lot of resources; we are talking about very constrained devices. Hardware and software countermeasures which are expensive in terms of design or license cost are out of scope here. So it's more interesting to look at countermeasures which are specific to Screaming Channels and not expensive at all, like turning off the transmission during sensitive computations, or forcing the use of hardware encryption, at least for now. From the hardware point of view, during design we could consider the coupling between the parts and the security implications that it can have, but again this can increase the design cost and the testing complexity. So, to conclude, we have shown a general problem: side-channel attacks become possible at a distance if radio devices are close to sensitive devices. We have seen that Screaming Channel attacks are different from conventional side channels,
and they are peculiar. On the one hand, they are easier to exploit because the leakage is amplified. On the other hand, they are harder to exploit because the leakage model is distorted and the leakage coexists with the intended data on a noisy channel. We have shown that attacks can be more and more realistic, and more devices could be vulnerable, so we need clever and specific countermeasures. We would really like to attack Wi-Fi with the same method, but in this case the coexistence between leakage and data is even more problematic, because they are not orthogonal in modulation. We would also like to complete the attack on the hardware AES, possibly exploiting these memory transfers. We have published our code, data, and instructions online. Don't hesitate to check the website: if you already knew the previous version, it has had a big update, and every result in the paper can be replicated. Don't hesitate to contact me if you have any questions or if you want any help reproducing our results, and please come to the live session as well if you want. Thank you very much.