All right. Thank you all for coming. My name is Pieter Robyns, and today I'll talk about LoRa reverse engineering and AES EM side-channel attacks using SDR. If this sounds like two subjects mixed together and presented in one talk, that's exactly what happened: I submitted two talks, and the organization kindly allowed me to combine them. I'm really happy about that, so thank you very much.

First, something about myself. I've been a PhD student at Hasselt University since 2014, and I'm mainly an infosec guy. I research wireless security: my job is to find vulnerabilities in wireless protocols. I also did some work on location tracking and fingerprinting. Lately I've added some machine learning and side-channel analysis to the mix, because that's what interests me. If you have a question or want to visit my website, here is the link, and we also have some time for questions at the end of this presentation.

My research on LoRa started in 2016. At that time LoRa was a relatively new protocol, and my co-advisor introduced me to it. I don't know how many people have heard about LoRa before? Okay, a lot of people. That's very good. For those of you who don't know it, it's basically a low-power, long-range, low-data-rate protocol designed for the Internet of Things. Because it was so new, there were a lot of opportunities to learn new things. For example, there was no working software-based decoder available, only some simulations, and they didn't really work with actual hardware. A second problem was that the description of the PHY layer was pretty limited. There was one patent which explains a lot about the modulation, and then some scattered blog posts, but to build a fully fledged decoder you would have to do some reverse engineering as well.
Then there is the aspect of fingerprinting and tracking devices over long ranges, which is an interesting problem we can tackle with LoRa. And finally, side-channel attacks on IoT protocols are interesting because if you deploy, for example, a temperature sensor somewhere, it is obviously more exposed to attackers who can get physically close to it.

So let's move on to part one, which is about unlocking the PHY. By this I mean gaining access to the PHY layer instead of just the MAC layer. If you buy a LoRa device off the shelf, the only interface you get is basically USB, over which you can program it to send some MAC-layer messages, and that's all. So we need to synchronize to the packets on the PHY layer ourselves.

Where do we start? GNU Radio to the rescue. This is how you would usually start when reverse engineering a protocol: take your USRP and dump the signal to a file so you can analyze what it looks like. Then you get something like this. Here you can see an example LoRa frame. At the beginning you see the standard stuff, like the preamble, which is all zeros. This one little sawtooth wave is one symbol, and it repeats a number of times. Then you have the frame sync and frequency sync, whose function is to make sure that the receiver is synchronized in time and frequency to the transmitter. Then we have a header, which encodes some information such as the coding rate used and whether there is a CRC in the payload, and then we have the payload itself. In the case of LoRa there can actually be some payload symbols in the header as well. This frame structure can easily be derived from the patent, and here's the link if you're interested. It also contains information on how exactly they encode data into this format.
That is, the modulation and interleaving. Some other information is located in datasheets. For example, whitening and coding are not covered in the patent, but they are somewhat discussed in the white papers.

With that, let's start building a receiver. Usually the first step is to detect a signal. Detection is a pretty standard problem in signal processing, so you can choose any algorithm that works for you. I chose an algorithm that exploits the autocorrelation of a repeating preamble. You take the autocorrelation between these two symbols, for example, and as the symbols get more similar you see the metric rising until it approaches one.

But now we still need to find out where a single symbol starts. I drew these red lines on the graph myself, so we still need to find exactly where this position is. If you simply thresholded this metric, that would be a bad approach, because you would end up somewhere around here or around here, which is not ideal. So thresholding is bad.

To synchronize, you again have multiple possibilities. I saw some people demodulate the preamble symbol, which is supposed to be zero. If you then see an offset from zero, that indicates a time shift, which is the basic principle of LoRa modulation. But in that case you have some ambiguity, because a frequency shift will also cause an offset from zero. What I chose to do instead is cross-correlate on the instantaneous frequency, and then you get a metric that looks like this. You locally generate a preamble yourself and cross-correlate the signal with it, and the result will be close to one at the start of the symbol. With that you can synchronize to a single symbol.

Okay, the next question is: how do we demodulate a single symbol? The modulation of LoRa is based on CSS, so it uses chirps.
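As a rough illustration of the detection and synchronization steps described above, here is a minimal Python sketch. It is not the decoder from the talk: the function names, the 128 samples per symbol, the noise level and the idea of sampling exactly at the chirp bandwidth are all assumptions made for the example. Detection compares two adjacent symbol-length windows; synchronization correlates the instantaneous frequency of the capture with that of a locally generated preamble. The synchronization step is run on a noise-free capture to keep the demo deterministic, since noise near the phase wrap-around can move the peak by a sample.

```python
import cmath
import math
import random


def base_chirp(n):
    """One up-chirp sweeping from -0.5 to +0.5 cycles/sample over n samples."""
    return [cmath.exp(1j * math.pi * (k * k / n - k)) for k in range(n)]


def autocorr_metric(x, start, n):
    """Normalized correlation of two adjacent n-sample windows (1.0 = identical)."""
    a, b = x[start:start + n], x[start + n:start + 2 * n]
    num = abs(sum(p * q.conjugate() for p, q in zip(a, b)))
    den = math.sqrt(sum(abs(p) ** 2 for p in a) * sum(abs(q) ** 2 for q in b))
    return num / den if den else 0.0


def inst_freq(x):
    """Phase step between consecutive samples, in radians per sample."""
    return [cmath.phase(x[k + 1] * x[k].conjugate()) for k in range(len(x) - 1)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def find_symbol_start(x, n, search):
    """Lag in [0, search) where the capture best matches a local preamble."""
    template = inst_freq(base_chirp(n) * 2)[:n]  # one full sawtooth period
    freq = inst_freq(x)
    scores = [pearson(freq[d:d + n], template) for d in range(search)]
    return max(range(search), key=scores.__getitem__)


N, OFFSET = 128, 37          # hypothetical symbol length and start position
rng = random.Random(0)


def add_noise(x, sigma):
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in x]


# Detection: the metric stays low on noise and approaches one on a preamble.
metric_noise = autocorr_metric(add_noise([0j] * (2 * N), 0.05), 0, N)
metric_preamble = autocorr_metric(add_noise(base_chirp(N) * 4, 0.05), 0, N)

# Synchronization on a clean capture whose preamble starts at OFFSET.
clean = [0j] * OFFSET + base_chirp(N) * 4
symbol_start = find_symbol_start(clean, N, N)
```

The detection metric only says that a preamble is present somewhere; the cross-correlation peak is what pins down the symbol boundary, which is why the two steps are kept separate.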
A chirp is basically a signal that linearly increases in frequency over time. It starts at a negative frequency, then you can see it increasing until it reaches zero, here, and then it keeps increasing. To modulate a value i onto the chirp, you cyclically shift it. In this example you can see the signal is shifted to the left, which results in something that looks like this. For this example the value is 20, but normally you wouldn't know that, because the patent is very secretive about how this shift is mapped to an actual value.

So we have seen how to modulate, but how do we demodulate? When you receive a LoRa signal, you receive chirps that are already shifted, so we need to do the reverse operation. That can be done by multiplying by the conjugate base chirp, which is the base chirp with the imaginary parts negated, and then resampling at the chirp rate. The details are not important, but it boils down to this: when you do that and take the FFT, you get a peak in the frequency domain at the modulated value. For example, preamble symbols are always zero, so the peak you get is at index zero. For the shifted symbol there's something strange going on, because you can see that the peak is not exactly at 20, it's at 24. That's because this index is what's called Gray coded.

Okay, so after the demodulation you basically have a list of demodulated symbol values, and now we need to do some other things to get to our final data. The interleaving algorithm is described in the LoRa patent as well, and it depends on two configuration parameters. The first is the spreading factor, which determines the number of bits per symbol value. Here you can see the spreading factor in this direction.
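The modulation and demodulation steps just described can be sketched in a few lines of Python. This is only an illustration under assumptions: 128 samples per symbol (spreading factor 7, sampled at the bandwidth), a left cyclic shift, a plain DFT instead of an FFT, and a transmitter that Gray-decodes the value before shifting. That last assumption is chosen because it reproduces the numbers from the talk (value 20, peak at index 24, Gray encoding at the receiver); the function names are made up for the example.

```python
import cmath
import math


def base_chirp(n):
    """One up-chirp sweeping from -0.5 to +0.5 cycles/sample over n samples."""
    return [cmath.exp(1j * math.pi * (k * k / n - k)) for k in range(n)]


def modulate(shift, n):
    """Cyclically shift the base chirp to the left by `shift` samples."""
    chirp = base_chirp(n)
    return chirp[shift:] + chirp[:shift]


def demodulate(symbol):
    """Multiply by the conjugate base chirp and return the strongest DFT bin."""
    n = len(symbol)
    dechirped = [s * r.conjugate() for s, r in zip(symbol, base_chirp(n))]
    magnitudes = []
    for b in range(n):
        acc = sum(d * cmath.exp(-2j * math.pi * b * k / n)
                  for k, d in enumerate(dechirped))
        magnitudes.append(abs(acc))
    return max(range(n), key=magnitudes.__getitem__)


def gray_encode(v):
    return v ^ (v >> 1)


def gray_decode(g):
    v = 0
    while g:
        v ^= g
        g >>= 1
    return v


N = 128                                        # 2**7 samples for spreading factor 7
peak = demodulate(modulate(gray_decode(20), N))  # peak index seen by the receiver: 24
value = gray_encode(peak)                        # Gray encoding recovers the value: 20
```

A cyclic shift of the chirp becomes a constant frequency offset after dechirping, which is why a single DFT peak is enough to read off the shift.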
In this case the spreading factor is seven. Then there is the coding rate, which determines how many symbols there are in the interleave matrix. In this case it's eight, since we use 4/8 coding. Now if you look at this matrix, what LoRa does is essentially transpose it and then walk through it diagonally. For the last codeword, for example, you would start here and move upwards along a diagonal, and when you reach the top you just continue from the bottom. This is pretty easy to implement in software, so you can do this for the entire matrix and you end up with a list of the deinterleaved codewords.

As you may notice, the advantage of this technique is that if, for example, this symbol right here gets entirely corrupted, only the third-to-last column of bits will be corrupted, that is, one bit in each codeword. That can be corrected, since we use Hamming coding, so we can just correct the codewords.

Now what's left to be done? In the previous slides I told you that the index is Gray coded, but how would you reverse engineer that? The patent mentions Gray coding, but it's not clear which index corresponds to which integer value. Also, at which stage of the decoding is whitening performed, and how? That's not known either. There is more to it, for example the structure of the header that's used in LoRa, the clock drift correction, and then some weird stuff: LoRa appears to swap nibbles, which is something that GSM also does if I'm correct, and there is some weird CRC. We won't have the time to discuss those in this presentation.
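To make the benefit of diagonal interleaving concrete, here is a small Python sketch. The exact bit ordering LoRa uses is not given in the talk, so the mapping below (bit j of codeword i goes to symbol j at bit position (i + j) mod SF) is an assumption that only illustrates the principle; the function names and example codeword values are hypothetical too.

```python
def interleave(codewords, sf, cw_len):
    """Spread SF codewords of cw_len bits over cw_len symbols of SF bits."""
    symbols = [0] * cw_len
    for i, cw in enumerate(codewords):
        for j in range(cw_len):
            bit = (cw >> j) & 1
            symbols[j] |= bit << ((i + j) % sf)   # diagonal placement
    return symbols


def deinterleave(symbols, sf, cw_len):
    """Inverse of interleave(): collect one bit per symbol for each codeword."""
    codewords = [0] * sf
    for j, sym in enumerate(symbols):
        for i in range(sf):
            bit = (sym >> ((i + j) % sf)) & 1
            codewords[i] |= bit << j
    return codewords


SF, CW_LEN = 7, 8                                  # spreading factor 7, 4/8 coding
codewords = [0x00, 0x1B, 0x2D, 0x36, 0x4E, 0x55, 0x63]  # arbitrary example values
symbols = interleave(codewords, SF, CW_LEN)
restored = deinterleave(symbols, SF, CW_LEN)

# Flip every bit of one received symbol, then count the damage per codeword.
corrupted = list(symbols)
corrupted[2] ^= (1 << SF) - 1
damaged = deinterleave(corrupted, SF, CW_LEN)
bit_errors = [bin(a ^ b).count("1") for a, b in zip(codewords, damaged)]
```

In this sketch a completely corrupted symbol costs exactly one bit in each codeword, which is the kind of error a Hamming code with four parity bits can correct.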
Anyway, on to the relation between the symbol and the integer value. Here we have our example again with i equal to 20 and a certain shift, and there are multiple ways to interpret it. You could interpret the x-axis as is, so this is equal to 24, but then the question is whether we use Gray encoding or Gray decoding to get the real value of i. Similarly, we can interpret the x-axis as inverted, in which case this would be 103, or the Gray code of 103.

To check for correctness, we can implement the decoder up to the interleaving stage and then start looking for patterns. If there is some pattern that doesn't match up with other payloads, you know that there is something wrong with your assumption. Let's take a look at an example. In the top-left quadrant I have printed out all header symbols when using right-to-left indexing, so 127 to zero, and using Gray encoding to get the value of the symbol. I'm going to give you some time to look at this; let's see if you can figure out where the length is encoded. This column here is something I added myself: it is the length of the packet that I sent. So we should see some pattern that stays the same here and then increases here, and again stays the same here and then increases in the lower nibble. If you look closely, this pattern appears to emerge here: this looks like zero, then one, two, three, and here you have one, and again zero, one, two, and the ones and the twos even correspond. But there's something strange going on, because I said before that the header is not whitened, and yet in this case a two, for example, is not equal to two, it's just all zeros. So there's something really weird going on. Also, if you take a two from the highest nibble and compare it with a two in the lower nibble, you can see that they don't match. So that's not correct. Then let's go to Gray decoding. In that case it's even weirder, because in
this case you have some bits that are added or removed, which indicates that there's something really wrong with the previous layers of your decoder. So we can discard this whole column right here. The right solution turns out to be left-to-right indexing and Gray encoding. In this case you get what you would expect: a zero is effectively zero as a Hamming code, and it's consistent between the highest and lowest nibble. The only weird thing is that the order changes: first we had the length here, and now it appears in these two columns.

All right, now that we have reversed the interleaving and the other aspects, there are two more aspects to reverse. The first is the coding. Since 4/5 coding and 4/8 coding are among the options, that's already a strong indication that Hamming coding is probably used. But those of you who know about Hamming coding might notice something strange, because the indices of the bits don't really match up with standard Hamming coding. It also turns out that the payload is whitened, which means it is XORed with a random string, basically, to make the data more uniform. They mention the algorithm used for that in a datasheet, but it doesn't appear to work in practice. So you need to answer another question, namely at what stage the data is whitened.

The fast solution for me was to just brute force all possibilities for the whitening sequence. What you can do is send a payload with all zeros, because if you XOR something with all zeros you just get the value itself. So when our payload is all zeros, we simply end up with the whitening sequence, which we can store in an array and then use for future packets. If anyone has an idea of how to do this more algebraically, I would be happy to learn it, because there is a library somewhere that seems to have an algorithm for generating it algebraically,
but its whitening sequence doesn't really match up with mine, and if you try to decode with it you sometimes get weird values.

Then there is the Hamming code. As I said before, the bit indices are not equal to those of standard Hamming coding: they are permuted. So we can just go over all possible permutations of the bits of a single byte, which is not that hard to calculate. It's only this number of possibilities, so it was pretty easy to brute force.

All right, now we have all of the required components to build a fully functional decoder. We have discussed preamble detection, sync, demodulation, deinterleaving, dewhitening and decoding, and then we have our raw data. I have implemented a decoder for this, and it's open source on GitHub, so if you want to check it out, feel free to do so. I made a comparison with real hardware for various numbers of payloads, and as you can see it's still quite a lot worse, so there is still some work to be done. For example, real LoRa hardware is capable of decoding packets even below the noise floor, whereas mine is not, and that's because of the methods I chose for preamble detection and synchronization. An advantage of my approach is that it's virtually unaffected by frequency errors, because of the gradient decoding that I implemented. Essentially this allows you to decode multiple channels at the same time, and that was useful for my research.

Special thanks also to my student William for implementing some optimizations. There are some other decoders out there which you should also check out: there's LoRa-SDR, and then Bastille Research's gr-lora, so maybe they will work better for your use case.

Then let's move on to doing something with the decoder. Our first application will be fingerprinting LoRa devices using neural networks. You might wonder why you would want to fingerprint devices. People usually give the defensive use cases: for example, an extra layer of defense for critical infrastructure. Suppose you have some kind of temperature sensor that sends highly important data, and you don't want an attacker to spoof temperature readings and make it too hot in your house, for example. Then you could use fingerprinting to check whether the fingerprint matches. You can also counter relay attacks: if somebody wants to clone your car key signal and forward it somewhere else, you could detect that using a fingerprinting algorithm. Or you could just measure the degree of privacy provided by your device, so essentially how unique your radio signal is.

The offensive use case is exactly the opposite. You could use it to link anonymous transmissions. People have used this before in the case of MAC address randomization: even though the MAC address is randomized, you can still go back to the physical layer, fingerprint devices at that level and de-anonymize them based on the physical-layer signal. Then there is tracking the location of sensors, or mimicking radio signatures, which is the opposite of providing this layer of defense. As an attacker, if you have knowledge of the algorithm that's used to fingerprint, you can probably exploit this and craft your own signal that more closely mimics the signature of the radio device you're trying to spoof. So it's really a cat-and-mouse game between attacker and defender.

Now, the theory of physical-layer fingerprinting says essentially that no
two radios can be perfectly identical. When a radio chip leaves the factory, there will be some variance in, for example, the crystal oscillator frequency and other components, and these small errors manifest as transmission errors. For example, a faster or slower crystal oscillator affects the frequency offset of the device. Datasheets usually define some tolerance for these values; in the case of LoRa I believe you can have 12 kilohertz of offset. A larger tolerance also implies more entropy, of course: the more you allow your devices to differ, the higher the probability that you can fingerprint them. The challenge here is to distinguish these particular errors caused by the radio hardware from noise from the channel.

Traditionally, people have used what I like to call expert features. People would think about which features would be useful to fingerprint the device. For example, they would ask what the properties of Bluetooth are, and whether the preamble transient of Bluetooth signals is interesting. Then they would do some human selection on these features and take some statistical measures, for example the mean or variance, or use an SVM to train on this data. What I suggest in my approach is that we simply use a machine learning algorithm to train on the raw radio signal. This line of thought is similar to the computer vision world, where similar techniques are applied to face recognition and image classification. You can see an image as a 2D radio signal, essentially. What they do there is take the raw pixels and use convolution filters and the like to recognize the face, and that is similar to the objective of fingerprinting radios.
To simplify this comparison a little bit: here we have the raw radio signal, and this is what I would call the human filtering. You have somebody who decides, okay, we're going to take the FFT and then extract the mean of this signal, or the variance. What we can also do, however, is interpret the radio signal as the features themselves. So instead of these hand-picked features we just use the raw samples and forward propagate them through a neural network. The idea is that the machine learning algorithm will automatically filter the unimportant features away through the weight values.

How would it work in practice? We label each transmission with a certain LoRa device. We feed the data through, and based on the weights and the biases the network calculates a certain value, for example here. This is the predicted value, and we evaluate how accurate that result is compared to the true distribution. In this case the true distribution is that Y3 is the real LoRa device, while the machine learning algorithm predicts Y2, so we have to update the weights and biases and repeat the step in order to come closer to the real distribution.

I've done this in an experiment where I fingerprinted 22 LoRa devices from three different vendors. On the next slide I will show you the results for a simple MLP, a multi-layer perceptron like this one. I used 100,000 symbols of training data and 1,000 symbols of test data, and this turns out to give 95% accuracy. However, there is a strong nuance here: this strongly depends on the atmospheric conditions in the room and also on the channel. You are inadvertently also fingerprinting the channel a little bit, which means that if you fingerprint a device in one room and then move to another one, it won't work as reliably anymore. So yes, noise is a problem. And here are the results. These are all the RN devices.
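The training loop described above (feed the samples forward, compare the prediction with the true device, update weights and biases, repeat) can be illustrated with a deliberately tiny Python sketch. This is not the MLP from the experiment: it is a single dense layer with a softmax output, the four-value input vector and the three "devices" are made-up toy data, and the learning rate is arbitrary.

```python
import math


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def forward(weights, biases, x):
    """One dense layer followed by softmax: one probability per device."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + b
                    for row, b in zip(weights, biases)])


def train_step(weights, biases, x, label, lr):
    """One gradient-descent update; returns the cross-entropy before the update."""
    probs = forward(weights, biases, x)
    for c, row in enumerate(weights):
        grad = probs[c] - (1.0 if c == label else 0.0)  # d(loss)/d(logit c)
        for i, xi in enumerate(x):
            row[i] -= lr * grad * xi
        biases[c] -= lr * grad
    return -math.log(probs[label])


# Hypothetical toy input: four "raw samples" sent by device index 2 out of 3.
x, label = [0.2, -0.4, 0.9, 0.1], 2
weights = [[0.0] * len(x) for _ in range(3)]
biases = [0.0] * 3

losses = [train_step(weights, biases, x, label, lr=0.5) for _ in range(20)]
probs = forward(weights, biases, x)
prediction = max(range(3), key=probs.__getitem__)
```

With zero-initialized weights the first prediction is uniform over the three devices, and each update moves probability mass toward the labeled one. A real fingerprinting network would have hidden layers and train on many labeled symbols per device.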
These are the two RFM devices, and this is the SX device. You can see that between vendors there appears to be even more of a difference than between devices of a single vendor, which was quite interesting for me to see. By the way, what this visualization shows is the output features projected onto a 2D plane, so you can easily see what it looks like.

All right, so now we've seen how to fingerprint a device, and you could potentially track it down. Let's suppose that you want to track down your neighbor's temperature sensor again, and you not only want to track it, but you also want to know: how hot is my neighbor, actually? So you want to decrypt the messages that are sent out by the temperature sensor, and for that we can use side-channel attacks.

A side-channel attack is basically an attack that exploits the fact that while your processor is doing some kind of computation, it leaks some information to the outside world through a side channel. An attacker can take advantage of this leaked information. There are numerous types. For example, you have timing: some algorithms, or some parts of algorithms, take more or less time. You also have acoustic, power consumption, temperature and cache side channels, and those might even be correlated, so you can use multiple side channels at the same time if you want to.

We're going to apply this to AES, because it's used in LoRa, and also in Wi-Fi, TLS, IPsec and a lot of apps, so there's definitely a lot of potential. The attack techniques that I'm about to present are nothing new by any means. The thing is that they often require some expensive equipment, and I would like to show you how you can do it with a cheap SDR, for example.

So how does the general attack methodology work? As an attacker you would first communicate with, for example, some kind of web service.
You send some data for the LoRa device to encrypt. The gateway forwards it to the LoRa device, and the LoRa device has to respond, so it has to do an encryption. By doing that, it leaks some information through the electromagnetic channel. The attacker can capture this information and then perform some analysis.

For this analysis we need a model. A model is basically a way to mathematically predict how a certain input to a chip will result in a certain electromagnetic radiation, so the accuracy of the model is crucial for a successful attack. Some observations that previous works have already made are that the amplitude of the EM radiation is proportional to the power, and that you need power to change the state of the circuit: if a bit changes from zero to one or from one to zero, that consumes some power. That in turn means that state changes cause variations in the electromagnetic radiation emitted by the device.

Now what would happen if we AM-demodulated AES encryptions? What I'm about to show you is an example where I did this for an ATmega328P. In this case it was a device provided to me by a company called Riscure. They held a kind of competition to perform a side-channel attack on a black box, so the key was not known to me at all and you had to find it with a side-channel attack. Looking at previous works, we can learn that lower frequencies must be favored, since apparently there is more information there, and that the harmonics of the CPU clock frequency also contain useful information.

So let's take a look using a USRP, an amplifier and an EM probe. With this setup I took about 18,000 samples, and then you get something like this. As you can imagine, I was quite disappointed when I first saw this graph, because it just looks like random noise, right? It is really noisy.
But after applying a low-pass filter, there's something more interesting going on. Here you can see some higher amplitudes, then it goes back to a lower amplitude and back to a higher amplitude. It almost seems like there's some power being drawn here, and that's interesting. Also, if you look at the individual colors of the graph, you can see that there is a repeating pattern. So maybe if we take our trick from LoRa synchronization and apply it here, we can get a better result. That turns out to be true, because if you align the traces you can clearly see that there's something else going on. Here you can even see the different rounds of AES being executed: round one, round two, round three and so forth. This strongly suggests that there is a 10-round AES going on, which means a 128-bit key.

The next question is: where in this graph is the key actually used? Because that's what we want to attack, right? We want to know the key of the AES encryption. For that we simply look at the specification of AES. We have AddRoundKey here, and what it does is simply take a plaintext byte and a key byte, XOR them, and that's the output. Now what is a round key? It turns out that in the first round, that's this sentence over here, the round key is simply equal to the key of the user, so the cipher key. And that's something you can exploit. After the AddRoundKey phase it moves on to the SubBytes stage, where the output byte of the previous stage is substituted with another one based on the S-box and stored again. Assume for now that the output of this stage is the vulnerable point that we want to attack.

Now the question is what happens inside the chip. The initial state is unknown; let's call it R. After the AddRoundKey and SubBytes stages the state will be D, which is equal to the S-box of P XOR K, so of the plaintext XORed with the key.
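The intermediate state D = S-box(P XOR K) is easy to compute for a single byte. The Python sketch below builds the AES S-box from its definition (multiplicative inverse in GF(2^8) followed by the affine transform) rather than hard-coding the table; the function names are made up for the example, and the plaintext and key bytes are the first bytes of the worked example in the AES specification.

```python
def build_sbox():
    """AES S-box: inverse in GF(2^8) followed by the affine transform."""
    def gmul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            if a & 0x100:
                a ^= 0x11B          # reduce modulo x^8 + x^4 + x^3 + x + 1
            b >>= 1
        return r

    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        out = inv
        for s in (1, 2, 3, 4):      # XOR with four left rotations of the inverse
            out ^= ((inv << s) | (inv >> (8 - s))) & 0xFF
        sbox.append(out ^ 0x63)
    return sbox


SBOX = build_sbox()


def first_round_state(plaintext_byte, key_byte):
    """State D after AddRoundKey and SubBytes for one byte: S-box(P xor K)."""
    return SBOX[plaintext_byte ^ key_byte]


# First plaintext and key byte of the worked example in the AES specification.
d = first_round_state(0x32, 0x2B)   # 0x32 ^ 0x2B = 0x19, and S-box(0x19) = 0xD4
```

Because the attacker knows P and can enumerate all 256 values of K, D can be predicted for every key guess, which is what makes this point in the first round a convenient target.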
We also know that in order to change from one state to another we need to consume power. Therefore we can estimate this power by taking the Hamming distance between R and D. In this case we have one, two, three, four differing bits, so a Hamming distance of four. And it turns out that in practice the Hamming weight also works, which is the same thing if R is equal to zero.

Now that we have that, we can build our model for each possible key byte. Let's suppose that our key byte is zero. We calculate our model, and we do this for every plaintext that we transmitted. Then we repeat this process for every possible key byte.

Next we want to measure the reality. We have our model already, so now it's time to measure. For that we need to look at the first part of the trace, because the key is used in the first round. Then we can do correlation. Pearson correlation is a way to find a linear correlation between two variables. For example, we have our model here and our measurements here, and we can correlate them in order to find the best model. So we want to find out which key guess fits best. And here's the result.

If you want to implement this attack yourself and found this a little bit too fast, I'm sorry for that, but we don't have a lot of time. What you can also do is just use ChipWhisperer, which is open source software that implements these attacks. It's intended to be used with the commercial ChipWhisperer hardware, but I wrote a plugin for it so that you can use your own SDR with ChipWhisperer. Feel free to check it out. It's still in beta, which is why I haven't uploaded it to GitHub yet, because I still have to clean up the code, but it works quite well, and I obtained all these results with this plugin. Here we can see that there is a clear favoring of one particular byte of the key.
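For readers who want to try the correlation step without any hardware, here is a self-contained Python simulation. It is a sketch under assumptions, not ChipWhisperer or the plugin mentioned above: the "measurements" are synthetic (Hamming weight of the S-box output plus Gaussian noise, so R is taken to be zero), each trace is reduced to a single sample, and the key byte, trace count and noise level are arbitrary.

```python
import math
import random


def build_sbox():
    """AES S-box: inverse in GF(2^8) followed by the affine transform."""
    def gmul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            if a & 0x100:
                a ^= 0x11B
            b >>= 1
        return r

    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        out = inv
        for s in (1, 2, 3, 4):
            out ^= ((inv << s) | (inv >> (8 - s))) & 0xFF
        sbox.append(out ^ 0x63)
    return sbox


SBOX = build_sbox()
HW = [bin(v).count("1") for v in range(256)]   # Hamming weight lookup table


def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def recover_key_byte(plaintexts, measurements):
    """Correlate the Hamming-weight model of every key guess with the data."""
    scores = []
    for guess in range(256):
        model = [HW[SBOX[p ^ guess]] for p in plaintexts]
        scores.append(abs(pearson(model, measurements)))
    return max(range(256), key=scores.__getitem__), scores


rng = random.Random(1)
SECRET = 0x2B                                   # hypothetical key byte
plaintexts = [rng.randrange(256) for _ in range(500)]
# Synthetic leakage: one noisy sample per encryption.
measurements = [HW[SBOX[p ^ SECRET]] + rng.gauss(0, 1.0) for p in plaintexts]

best_guess, scores = recover_key_byte(plaintexts, measurements)
```

On real captures the same correlation is computed for every sample position in the first round, and many more traces are usually needed because the leakage is far weaker than in this simulation.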
And this is another visualization of that. In the beginning all of our models have a similar correlation. After including more and more traces, you filter more of the noise away, and you eventually end up with only one key byte that jumps out, which is the real key byte used by the user.

Unfortunately, ChipWhisperer is written completely in Python and is therefore single-core, if I remember correctly. That's why I started implementing my own framework for it, called Emma. With this framework you can use multiple cores and run it on multiple machines as well. As you can see the result is exactly the same, but it takes only 60 seconds to find a single byte of the key. I hope to improve this further and include some other techniques and algorithms in order to get the key faster or with fewer traces.

To wrap this up: all of my finished research is open source, so feel free to check it out. There's the decoder, which can be downloaded here. There is a VM with all of my fingerprinting experiments, where you can run them again. All the data was captured to a MongoDB database, so you can play with it if you want. And then there is the ChipWhisperer plugin, which you can also download from this location.

Some of the ideas I have for my current research directions are to use machine learning to gain some kind of advantage here. Maybe we can do something that increases the correlation faster, for example with fewer traces. Or maybe we can increase the range of these EM attacks, for example to go through walls or something like that. If anyone wants to collaborate with me on that, feel free to contact me and maybe we can work something out. Here are some other related papers about fingerprinting and side-channel attacks which I found interesting, so feel free to check them out as well.
And then there are some nice examples that I found online from a university in Israel. They can fully extract decryption keys just by holding a pita close to a laptop, so that was kind of cool. And the same for iOS devices, for example.

I had prepared a demo, and I tested it in the hallway where it worked perfectly. But of course the room here is different, and because the fingerprinting is pretty sensitive to temperature, the demo didn't really work anymore, so I had to change the machine learning algorithm again. Maybe by the time we have the last session I can show it to you, or you can just see me in the hallway and we can try it over there. And if there are any questions, then we have plenty of time for that now, I guess. Thank you.

Sorry, I didn't understand your question.

I saw the slide that you had with the t-SNE, where you, you know, put it in the chart with all the technical trouble.

This one, or which one? Yeah, okay, the t-SNE. Actually I didn't explain that, so it's a very good question. The question is what these points are: they are individual symbols of a LoRa transmission. I did the fingerprinting on a single symbol. What you're seeing here is that the color corresponds to the device that sent the symbol. If you have a fully colored red one, that means the prediction was correct. If the outline is different, it means that the network thought another device was sending the symbol.

Okay, sure. No, it doesn't mean that the encryption is broken. This is not related to the encryption at all, because what this does is just look at the raw radio signal, select a single symbol, and try to identify which LoRa device was the transmitter. Even encryption wouldn't help you there, because the signature would still be the same: it's transmitted by the same hardware.
Yeah, it's like meeting someone in person and looking at their face, for example. It's just how humans interact and identify each other. There is no encryption involved in that.

Yes?

So how does the accuracy improve with more symbols fed into it?

Well, the more data you have, the less variance you will have, essentially. In machine learning more data is always better: the more data you can feed to your algorithm, the better it will work. For example, when I train this model in my laboratory at the university and then move it somewhere else, it will probably not classify correctly anymore, because the channel changed and maybe even the temperature changed. You have to measure very fine-grained errors, namely manufacturing errors, so even temperature has an effect on these models. That means that to get a good model, maybe you shouldn't look only at the frequency, for example, but incorporate other features as well. So it's really a complex task to get the right data for your model and then to get a good model for predicting which device is transmitting. That's actually an active field of research right now, where people are trying to figure out new ways to fingerprint devices.

Yes, at the back?

Okay, so what was the sampling rate used for measuring this?

I don't remember exactly. I would have to check my notes, but I think it was around four megasamples per second. The USRP will do that for you, so it's just a baseband signal that I analyzed. I had to set the frequency at about 64 megahertz, I believe.

The synchronization you did for the AES...

About what? Oh yeah. So, about the synchronization of the AES signal, this one right here. What I did here is take one trace as the reference, one filtered trace, so with the low-pass filter already applied.
Then I did a cross-correlation and cut off the signals at the right positions, so that they would overlap in this region.

Sorry, I don't understand. Yeah, I guess. It's the same algorithm as for the LoRa synchronization, you mean? Yeah, essentially, yes, but instead of using a locally generated preamble, I used one signal that was already available in the captures.

For visualizing this, or the probe? Yeah, I don't remember the model number. Yeah, it's a thing from Vietnam. But I don't remember. Do you know the model number? Yeah, I don't know.

Thank you very much.

You're welcome.

We're going to do a quick sampling.
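The trace alignment described in the answers above (take one filtered trace as the reference, cross-correlate, cut the other traces where they overlap) could look roughly like the Python sketch below. The traces here are short made-up lists of amplitudes rather than real captures, and the function names are hypothetical.

```python
import math


def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def align(reference, trace):
    """Find where `reference` fits best in `trace` and cut out that window."""
    n = len(reference)
    offsets = range(len(trace) - n + 1)
    best = max(offsets, key=lambda o: pearson(reference, trace[o:o + n]))
    return best, trace[best:best + n]


# Hypothetical amplitude pattern standing in for one filtered reference trace.
reference = [0.0, 1.0, 3.0, 7.0, 3.0, 1.0, 0.0, 2.0, 5.0, 2.0, 0.0]

# Two captures of the same pattern that start at different sample positions.
trace_a = [0.0] * 4 + reference + [0.0] * 6
trace_b = [0.0] * 9 + reference + [0.0] * 1

offset_a, aligned_a = align(reference, trace_a)
offset_b, aligned_b = align(reference, trace_b)
```

After this step every trace covers the same part of the encryption, so the per-sample correlation across traces becomes meaningful.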