My name is Matt Knight. I'm a software engineer with Bastille Networks, and today I'm going to be talking about this new wireless protocol called LoRa. We'll get into that in a minute. I love electronics and wireless. I came from an EE background and work mostly with SDR these days. This is my passion, so I'm excited to be here. I also want to give a shout-out to the Wireless Village people. I've absolutely haunted this place for the last four or five DEF CONs, and I'm really excited to be here and present for you guys. Okay, before we get started: this is going to be more of a PHY and radio talk than a security talk. No zero days, no exploits, but we are going to look at a very low level into a brand new, cutting-edge protocol. So I just want to gauge the room. Hands up if you've used an SDR before. Okay, good. If you've used a spectrum analyzer. All right, cool. And who knows what an FFT is, and who knows what a symbol is in the wireless context? All right, cool. So why is this relevant? Why do we want to look at a low level at these different radio PHYs? Cisco projects that by 2020 there are going to be 50 billion devices connected to the internet in some way, and fewer and fewer of those are going to be connected with wires every year. In terms of security research and having insight into what they're really doing, we need to develop tools to be able to stay ahead and really be curious about these things. Wireshark exists because somebody made it, and monitor-mode-capable Wi-Fi cards weren't always a thing. So we need to make the tools so that we can do this low-level introspection and stay ahead of the game. I'm going to introduce this new class of network called LPWAN. We're going to review some technical concepts, and then we're going to get really low into the LoRa PHY and talk about how it works.
Finally, I'm going to introduce an open source tool that's coming out soon, and we'll hopefully get this into more hands. Before we get started: everybody's favorite marketing buzzword right now is IoT, the Internet of Things. I kind of resent the term, because really it's just fancy speak for connected embedded devices. Okay, how's that? All right. Embedded can mean a number of things, but a common theme is that there are hardware constraints involved. You often have low-end CPUs, and they're battery powered, so you can't do complex operations on them. They get installed in hard-to-reach places. They need to last for a long time. They might not get patched. And they often require installation and provisioning: you have to have somebody go put them somewhere and change some settings to get them to connect. Given this, an ideal interface, if we could design it, would be wireless, so you don't have to run cable to it. Easy on the battery, so it can last a while. Capable of being installed anywhere, whether it's in the middle of a field or in a building in an urban environment. No configuration, easy to provision: you don't want to have to type in an SSID and a password to get it on Wi-Fi, or pair it with a coordinator, or need a gateway on-site or something like that. And finally, it would be inexpensive. Now let's talk about what's not required. Oftentimes when you're connecting devices like this, high throughput and persistent, always-on connections are things that we don't really need. If you have a sensor that's just trying to report some data off of GPS every 10 minutes or so, it doesn't really require an LTE pipe to do that. So as it currently stands, IoT devices are grossly over-served by some of the common interfaces that are out there. When we talk about IoT interfaces, we're often talking about things like 802.15.4 and all of its friends. ZigBee is a layer-three-and-up flavor that people confuse with it.
But 802.11, Wi-Fi, gets used a lot. We have Bluetooth, Bluetooth Low Energy, etc. So what's wrong with all of these? We have these standards. Why don't we use them? All of these protocols that we talked about require some local provisioning. Whether it's ZigBee or 802.15.4, you need to connect to a coordinator. If it's Wi-Fi, you need to get on an AP, etc. In the case of 802.11, it's not too power efficient. So what's ideal then? What if you want an application where, say, your device is moving around? You're monitoring a vehicle fleet that drives throughout a city or across the country? Or you want to monitor fuel tanks: you have heating oil tanks in New England and you want to remotely report back on their level and how full they are. You don't want your installers having to go and get on the home Wi-Fi of everybody there. How about cellular networks? They work everywhere and they're easy to install, pretty much guaranteed. One big problem with them is that they're very power-intensive. A second problem is that some of these PHYs are going away, 2G in particular. AT&T is scheduled to sunset it at the end of this year, so on January 1st, 2017, they're turning off their GPRS and EDGE PHYs. Other major carriers are going to follow. These networks are really popular for a lot of these IoT applications because they're battery conscious, you get service everywhere, and they're somewhat inexpensive. So if you are a company that is deploying a device on a GPRS or EDGE network now, where are you going to move to? You can move to 3G, which is a little more expensive and has some harder power requirements, or you can wait for this thing called LTE-M, Release 13. It's part of the LTE standard, and basically it's going to allow devices to use narrower bandwidths and lower data rates to get better performance for long-lived embedded applications.
However, the roadmap as it currently stands, at least from what I've gleaned publicly, is that it's not going to be deployed until the end of 2017 or early 2018. So until then, there is a big hole in the market for these embedded devices to connect. That brings us to the topic that we're going to discuss today, which is these low-power wide area networks, LPWANs. The way to think of these things is that they're just like cellular, but optimized for IoT and M2M applications. You have a network of base stations that are deployed over some area of coverage. It could be local, it could be a city, it could be the entire world. And then devices connect up to them as they would in a star network. So you have the central base stations, and then devices and nodes connect directly. There's no meshing, no routing, just a wireless link from the device to the base station. They can uplink and downlink traffic in many cases, and they have a typical range of miles. So again, really think of this as being just like cellular, but for low data rate applications. There are a whole bunch of standards that are popping up, but the most popular ones, the ones that have the most momentum, are LoRa and Sigfox. They've raised a ton of money in the last 18 months or so. Sigfox raised $115 million last year, and the Wall Street Journal reports that they're going to IPO. So they're growing like crazy. Raise your hand if you've heard of them. Not too many hands. And finally, two companies that are big LoRa backers raised $51 million last year. So there's a ton of investment. Investors see the relevance here, so expect to see these protocols being more and more relevant going forward. Let's talk about the stacks and what makes them cool. As I mentioned, they're optimized for IoT applications, meaning they're very battery conscious. That's probably the big thing. Sigfox, one of those standards I just mentioned, gets 10 years on a single AA battery. That's what they advertise.
That's pretty crazy, to be sending data over miles for 10 years on a single AA. You cannot do that on LTE. And LoRa advertises 13 miles of range between the end node and the base station. So pretty crazy coverage. Compare that with 2G, which has a limit in the standard of 22 miles. All these other things are really local and short range, and they are not great on battery. So this is a huge step forward for embedded devices. How can they do this? They embrace the fact that embedded and IoT applications can accept some level of compromise. They duty cycle the messages heavily, so you're not sending that much traffic. You're often not listening all the time, so you might have scheduled windows where the end device will wake up, keep the radio on, and look for a downlink from the base station. Really small payloads; we'll talk about that in a minute. And they're very highly rate limited. Some examples: Sigfox limits devices to 140 12-byte datagrams per day. Think about that. I mean, what is that? That's like a single UDP packet. Weightless-N is uplink only, so it can only send messages up to a base station. It cannot accept messages down. And finally, LoRa Class A devices can only receive a downlink message from the base station after sending an uplink message. So they're not promiscuously listening. So this is quite different, and there are some really unique features that contribute to this. For the rest of the talk, we're going to spend our time on one of these LPWANs in really great depth. We're going to talk about this PHY called LoRa. LoRa is an LPWAN PHY that was developed by a semiconductor company called Semtech. It was actually developed by a startup that was acquired by Semtech, but Semtech is the company that's evangelizing it and pushing it worldwide.
The PHY was patented in 2014, and the first definition of the MAC and network stack that they're pushing products on came out last year. So these things are brand new. They really haven't been battle tested and are just starting to become available. Just to clear something up before we continue: oftentimes you'll hear the term LoRa used, and sometimes you'll hear about LoRaWAN. LoRa refers only to the PHY layer, and LoRaWAN is the higher layers built on top of it. So think of it as being like 802.15.4 versus ZigBee, different layers of the stack that often get used together. LoRaWAN defines a whole bunch of security features. They've done a pretty good job of thinking about these things ahead of time. One interesting feature is that the MAC stack is controlled by an IP-based network server that does all the intelligent coordination in place of the base stations. I'll probably do some more on that later if there's time. They also define a security architecture that enables a unique key to be used per device, but that's left up to the application to implement, of course. And they have two different keys. You have the application key, which is end to end between your company's application server and the end node, and then there's also a network key that protects everything going from the end node to the network server. So if you're a developer, you can use these application keys to make it such that the carrier never sees your content in plain text. That's all we're going to say about LoRaWAN. We're going to talk about the LoRa PHY from here on out. One of the really unique things about this PHY is that it operates in unlicensed spectrum. When we say unlicensed, we're talking about the ISM bands, in the US primarily 900 megahertz and 2.4 gigahertz. In these bands you have devices like Wi-Fi, 802.15.4, cordless phones, etc.
You don't need a special license to operate on those bands, so long as you abide by certain rules and regulations that the FCC has put out. If you go to Best Buy and you buy a router, you don't need to petition the FCC to install it in your house; you can just do it. That's because Netgear, or whoever you're getting your router from, plays by the FCC's rules. They get it certified, and anybody can use it, mass market. Contrast this with cellular. Cellular operates on private and protected spectrum that is very expensive. The FCC auctions the air rights to the spectrum for billions of dollars, which restricts building infrastructure to the biggest companies, Verizon, AT&T. You can't legally put up your own 4G base station in your house if you wanted to. You'll get a phone call. I don't know if you can see the numbers here, but this is a price list from the FCC's reverse auction of some very high-value wireless, or rather TV white space, spectrum that was recently auctioned. I circled one number there; you probably can't see it. That is $900 million for WCBS-TV in New York. So if you want to start a cell network, maybe you can pass the hat and buy some spectrum at the end of the con. There are a couple of companies that are building up cellular-type networks on this LoRaWAN technology. The two big ones are Senet, which actually started with that heating oil monitoring scenario that I mentioned earlier and now is doing a commercial data network at scale, and The Things Network, which is another one that's really interesting. That's a totally crowd-sourced network. They own a couple of base stations here and there, but really what they do is provide the network back-end architecture, open on the internet, so that anybody can buy their own LoRa hardware, set it up wherever they want, and then send the data back to their network servers.
So think about how radical this is. You don't need a spectrum license to stand up a base station that can cover miles and miles of range. One day you might be able to go to Best Buy, buy a LoRaWAN gateway, and stand up a network that either could be private or could be hooked up to something like The Things Network, and that could cover all of Manhattan. That's really cool, and it's a pretty radical shift for connected devices and IoT applications going forward. Okay, so we're going to run through this next section. I figured, since I'm the first technical talk of the session here at the Wireless Village at DEF CON, we might have varying levels of experience. Let's just even the playing field a little bit. This is going to be obscenely short, so bear with me. We're talking about the PHY layer for the rest of the talk, which is the lowest level of the OSI stack, and that refers to how bits of data get mapped into the physical electrical characteristics that are used to carry them. So we're talking about voltage, current, actually moving electrons. That's where we're going to be from here on. We're talking about a wireless protocol, which travels over radio frequency, and that's just electromagnetic waves and energy moving through the air. You can manipulate RF by using a radio, which can either be hardware defined, like a chip that speaks one protocol really well, or software defined, where you have really flexible hardware at the front end and then you implement the protocol-specific stuff in software. That's really flexible and allows you to iterate and prototype things really easily. When we talk about radio PHYs, one of the most important components is this thing called the modulation, and that defines how the digital data values get mapped to RF energy: how you take the bits and convert them into signals moving through the air.
When you're modulating, there are three parameters that you can play with. You can play with amplitude, frequency, and phase, or you can put them together and use some combination. Modulations can be either analog or digital, and with digital modulation we have this notion of something called a symbol. A symbol is an RF energy state that represents some quantity of information. It is discretely sampled. We're going to get into this more, but just think about this concept of a symbol being a state that represents data. We're going to talk about it more going forward, so just remember that. To illustrate some symbols, here we have two different common IoT PHYs. On the top we have frequency shift keying. On the bottom we have on-off keying, which is an ASK type of thing. In the FSK example, a symbol representing a single bit of information is the power being in one frequency versus the other at some instant in time. And with the OOK example, it's the presence or absence of the signal. So this is how different modulations represent digital information over the air. There are some more complicated PHYs that are used as well. On the right there we have an 802.15.4 packet. What 802.15.4 does is spread its information across more spectrum to increase resiliency and get some low-power features in there too. It makes it more resilient to RF noise and interference. Some other more complicated PHYs are Bluetooth and Bluetooth Low Energy, which do some frequency hopping and things like that too. We mentioned hardware and software defined radios earlier. We're going to use one of each in this talk. On the top we have a hardware defined LoRa module. That's what I used to generate the traffic that we're going to take apart.
And on the bottom we have a really popular Ettus B210, which is a commodity software defined radio that we're going to use to prototype our receiver and iterate quickly on this protocol. The last thing I want to talk about is the Fast Fourier Transform. If you feed it some samples, it will decompose the signal into the component frequencies that comprise it, and we can visualize this using a spectrogram. Here we have a whole bunch of FFTs that we've run over a signal in time. We have time on the y-axis, frequency increasing along the x-axis, and finally power on the z-axis: how intense the signal is there. We're going to be seeing some more of these. Okay, cool, that's enough of the crash course. Back to the main event. LoRa implements a proprietary PHY that's built on a modulation called Chirp Spread Spectrum, CSS. Now what is a chirp? A chirp is a signal that is continuously increasing or decreasing in frequency. It's like a sweep tone. We have some visuals here. An up-chirp is on top, and here you can see the frequency is linearly increasing until it hits the edge of the band, and then it wraps around to the bottom of the band and continues increasing. The frequency can change instantaneously, but the rate at which it is changing is constant. The first derivative of the frequency is some constant value. We can have the complex conjugate of that too, which is the down-chirp, with the frequency decreasing. It just keeps going and then wraps when it hits the edge of the band. Why would you use CSS to carry data? It's very resilient to interference and gives you great link budget and multipath performance for real-world deployments. So it provides some benefits. It's pretty interesting: where else would you see something like this? The answer is radar. CSS's features are derivative of some RF elements that you often see in radar systems. Just some examples here: marine radars will use chirps for ranging applications.
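The chirp described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not LoRa's actual transmitter: it assumes the sample rate equals the chirp bandwidth, and the helper names `make_chirp` and `instantaneous_freq` are made up for this example. It builds an up-chirp from the integral of a linearly rising frequency, forms the down-chirp as its complex conjugate, and checks the direction of the sweep from the sample-to-sample phase steps.

```python
import cmath
import math

def make_chirp(sf, down=False):
    """Return one baseband chirp of 2**sf complex samples.

    Assumes the sample rate equals the chirp bandwidth, so the sweep
    covers roughly -0.5 to +0.5 cycles per sample over one chirp.
    """
    n = 2 ** sf
    samples = []
    for k in range(n):
        # Phase is the running integral of a linearly rising frequency.
        phase = 2 * math.pi * (k * k / (2 * n) - k / 2)
        samples.append(cmath.exp(1j * phase))
    if down:
        # The down-chirp is the complex conjugate of the up-chirp.
        samples = [s.conjugate() for s in samples]
    return samples

def instantaneous_freq(samples):
    """Frequency in cycles per sample, from sample-to-sample phase steps."""
    return [cmath.phase(b * a.conjugate()) / (2 * math.pi)
            for a, b in zip(samples, samples[1:])]

up = instantaneous_freq(make_chirp(7))
down = instantaneous_freq(make_chirp(7, down=True))
print(round(up[0], 3), round(up[-1], 3))      # starts near -0.5, ends near +0.5
print(round(down[0], 3), round(down[-1], 3))  # the mirror image
```

The constant step between consecutive frequency estimates is the "first derivative of the frequency is constant" property from the talk.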
Chirps are also used in some scientific and atmospheric radars. There's an open source project that's pretty interesting, with some really cool visuals online if you want to check it out, of chirps being received after they had been reflected off of the ionosphere, to measure space weather and geomagnetic activity. So that's pretty cool. I first heard of this LoRa protocol in December of last year. I thought it was pretty cool, so I went to look for it. I got my SDR and went on a little fox hunt. I couldn't find LoRa in Boston, Atlanta, San Francisco, or New York, all pretty big engineering and urban centers. However, a few weeks after I heard of it, I ran into this company, Senet, at a meetup in Cambridge. They were talking about their company and their network, and one of the things they mentioned was that they were growing pretty rapidly. So I was watching one of their marketing videos, and they popped up this pretty slick graphic. It looks like a coverage map, right? And I look a little closer here, and where is that? That's Portsmouth, New Hampshire. At the time I was living in Cambridge, Mass., and Portsmouth isn't too far away, so I grabbed my radio and went for a little drive. Here I am sitting in my car, USRP on the dash, and the result of this little field trip was some LoRa found in the wild. So luckily that graphic they showed was real live data on their deployment. The base stations that were in that picture were real. So there you go. Let's take a closer look at this random signal that we captured in the parking lot in Portsmouth. This is a part of the PHY frame that we pulled out. Some features we can immediately identify: up at the top we have some continuous, repeated up-chirps. So the signal is just increasing continuously. We can think of that as being like a preamble.
Then halfway down we see the chirp direction change. We have two down-chirps, which look like a start-of-frame delimiter, some sort of synchronization element. Finally, after that we have these choppy up-chirps of varying length. Again, notice the frequency is always increasing, although it might change instantaneously. So the rate of change of the frequency is always constant in that PHY data unit, but the frequency might jump within the band. And that's how we modulate data onto this waveform. So yeah, we just talked about that a little bit; I jumped the gun on the slide. One way to think about the way the data is encoded is that it's kind of like frequency modulated chirps. The chirps are the signal that you modulate, and you modulate it by changing the instantaneous position within the band. So let's take this thing apart and try to get the data out of it. LoRa is a closed PHY. It's proprietary, and the spec is not published. The LoRaWAN spec that defines the MAC and network layer is open; you can look that up. But the actual modulation is closed. That's because Semtech makes ICs, and they don't want other people knowing how it works. But there are some documents out there that leak a little bit of information that we can use to start guiding our exploration. They have a European patent application, which talks about some very high-level concepts. They have that MAC and network spec, which talks about PHY elements without necessarily going into too much detail. We have some application notes, which talk about some radio-specific considerations: if you have one of their LoRa ICs, some of the features that they implement. And finally, there's a little bit of prior art out there. There was an open source project called rtl-sdrangelove that had an attempted LoRa decoder. I never quite got it working, but there were some good hints there.
And finally, some pretty high-level observations on a wiki page. From all this documentation we can start to pull out some definitions of features that are actually occurring in the PHY. First, we have the bandwidth, which is the width of the chirp: how much frequency the chirp traverses if just allowed to run continuously. Secondly, we have this notion of the spreading factor, and this is a very important concept that we'll talk about. The spreading factor represents the number of bits encoded per symbol. This is a spread spectrum modulation, meaning that we have multiple bits being encoded into each symbol. So the spreading factor is the number of bits in each RF energy state that we're going to measure. And finally, we talked about the chirp rate; that's the first derivative of the frequency. All of these things are mathematically defined, and from these documents we can start to figure out how they relate to each other. The bandwidth in the US is 125, 250, or 500 kilohertz. The spreading factor ranges from 7 to 12 bits per symbol, and finally the chirp rate is a function of the first two. So if we know the modulation parameters, we can figure out what the chirp rate is. So what's a symbol in this case? We mentioned it being a frequency modulated chirp. What we're going to be trying to do when we demodulate this is measure the changes in frequency throughout the band as that chirp jumps. All right, so here's our little checklist. When writing software to find and decode this, there are a few steps that we need to do. Pretty much all digital radio systems have a preamble and a start-of-frame delimiter for training the receiver to pick up an incoming message. So the first thing we're going to do is find the preamble, which is those repeated up-chirps at the beginning of the frame.
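To make the relationship between these parameters concrete, here is a small sketch. The talk only says that the chirp rate is a function of bandwidth and spreading factor; the specific formula used below, symbol duration = 2^SF / bandwidth, is an assumption taken from Semtech's public documentation rather than from this talk, and the function name `lora_params` is made up for the example.

```python
def lora_params(sf, bw_hz):
    """Derive chirp parameters from spreading factor and bandwidth.

    Assumes one symbol is a single sweep across the whole bandwidth
    lasting 2**sf / bw_hz seconds (not stated in the talk itself).
    """
    n_symbols = 2 ** sf             # number of distinct symbol values
    t_sym = n_symbols / bw_hz       # seconds per symbol
    chirp_rate = bw_hz / t_sym      # Hz per second, the slope of the sweep
    raw_bit_rate = sf / t_sym       # bits per second before any coding overhead
    return {
        "n_symbols": n_symbols,
        "t_sym": t_sym,
        "chirp_rate": chirp_rate,
        "raw_bit_rate": raw_bit_rate,
    }

# Every combination of the US bandwidths and spreading factors named above.
for bw in (125_000, 250_000, 500_000):
    for sf in range(7, 13):
        p = lora_params(sf, bw)
        print(f"BW={bw} Hz SF={sf}: {p['t_sym'] * 1e3:.3f} ms/symbol, "
              f"{p['chirp_rate']:.3e} Hz/s")
```

Because there are only 18 combinations, a receiver that does not know the parameters in advance can simply try each resulting chirp rate in turn.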
Then, once we have that and we think we're about to receive a packet, we're going to start looking for those two down-chirps, which we're going to use to synchronize on the start of the message. Finally, once we have that, we're going to extract the data from the frequency transitions by doing some math. So how can we do this? One technique that I found that was pretty cool: if we do a little transformation on the signal, it puts it into a form that makes it pretty easy to work with. We can de-chirp the signal by generating a local copy of each of those two chirps and multiplying it against the original baseband. And we know what the chirp rate is, because there are a finite number of modulation parameters. We can just guess them until we get the correct chirp rate, which is a function of those two features, spreading factor and bandwidth. So we do that and we start to see something interesting. This is just an IQ signal being run through a GNU Radio flowgraph that does this. There are some repeated elements here. I don't know if you can see it. We'll move quickly through this and look at a better picture in a minute. But here we can start to see that those chirps, which were previously diagonal, moving throughout the spectrogram, have been rotated and are now vertical. And that looks like something we can start to play with. Going back to symbols: we know that the symbol is the energy state representing however many bits of information there are. And since LoRa is spread spectrum, there are multiple bits per symbol. We can derive from this the number of possible symbols that there can be, and that's just two to the spreading factor, because each bit can either be a one or a zero, and that's just how it works out. So we can leverage this to come up with a method for extracting symbols from this de-chirped signal. We can do that using a fast Fourier transform.
If we set the width of the FFT, that is, the number of component frequencies that we're looking at, to the number of possible symbols and run the de-chirped signal through it, then each bin will represent a possible symbol, and the symbol which is present is just the bin that has the most energy in it. We'll illustrate this in a minute. So here's our original signal next to the de-chirped signal. Again, a chirp can be either positive or negative, up-chirp or down-chirp. So when we do the multiplication, we do it against both, and we get out these two different IQ streams. On the left is the de-chirped up-chirp and on the right is the de-chirped down-chirp. You can see that the preamble and the data, which are always up-chirps, come out when you de-chirp the up-chirps, and the SFD, those two down-chirps, you can see as a pretty intense little red spot right there on the right. Going back to our flowgraph, we need to start working through this process and actually writing software that does this. The beginning of the packet is signified by this preamble, and the preamble in the case of LoRa is the same symbol being transmitted over and over again. Remember, it's just a continuous up-chirp. So when we de-chirp it, all that energy winds up being in the same FFT bin. We can just look for a number of consecutive symbols having the same value, and that says, okay, we're probably about to receive a packet. Once we've found that, we need to start looking for that synchronization word that we're going to use to lock on to the start of the packet, and we do that by looking at the opposite de-chirped signal. Again, remember we found the preamble in the de-chirped up-chirp signal. So we're now going to look at the de-chirped down-chirp signal and look for that SFD.
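The de-chirp, FFT, and preamble-search steps above can be sketched as follows. This is a toy model under stated assumptions, not the speaker's tool: it assumes one sample per FFT bin (sample rate equal to bandwidth), perfectly aligned symbol windows, and no noise; it uses a naive DFT in place of an FFT to stay within the standard library; and the names `upchirp`, `demod_symbol`, and `find_preamble` are invented for the example.

```python
import cmath
import math

def upchirp(sf, symbol=0):
    """Up-chirp of 2**sf samples whose start frequency is shifted by `symbol` bins."""
    n = 2 ** sf
    return [cmath.exp(2j * math.pi * (k * k / (2 * n) - k / 2 + symbol * k / n))
            for k in range(n)]

def demod_symbol(window, sf):
    """De-chirp one symbol-length window and return the strongest DFT bin."""
    n = 2 ** sf
    ref = upchirp(sf)
    # Multiplying by the conjugate (a down-chirp) cancels the sweep,
    # leaving a constant tone whose frequency encodes the symbol.
    tone = [s * r.conjugate() for s, r in zip(window, ref)]
    # Naive DFT for clarity; a real receiver would use an FFT.
    mags = [abs(sum(tone[k] * cmath.exp(-2j * math.pi * b * k / n)
                    for k in range(n)))
            for b in range(n)]
    return max(range(n), key=mags.__getitem__)

def find_preamble(symbols, run_length=4):
    """Index where a run of `run_length` identical symbols starts, or None."""
    run = 1
    for i in range(1, len(symbols)):
        run = run + 1 if symbols[i] == symbols[i - 1] else 1
        if run >= run_length:
            return i - run_length + 1
    return None

sf = 7
n = 2 ** sf
sent = [0, 0, 0, 0, 0, 0, 17, 90]          # six preamble chirps, then two data symbols
samples = [x for sym in sent for x in upchirp(sf, sym)]
received = [demod_symbol(samples[i:i + n], sf) for i in range(0, len(samples), n)]
print(received, find_preamble(received))
```

Looking for the SFD works the same way with the roles swapped: multiply by an up-chirp instead, so that down-chirps collapse into a single bin.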
In the LoRa PHY I found it to be two symbols, so we just look for two symbols that have the same value, and then we can use that to synchronize. We're missing something important here, and that's that SFD detection is essential for having accurate synchronization. If we have a bad sync, what can happen is that we wind up spreading energy between multiple FFT bins, and that can lead to reading the data out incorrectly. Here's an illustration. If you look at these bins that I pulled out, we have element 39 and element 50, and you can see that there are basically two peaks in each of those FFTs. Ideally we should see just one, because we're reading out one symbol per FFT. So basically what's happening here is that we have multiple symbols leaking into each FFT. You can think of it as if you have a buffer, and each symbol is two-to-the-spreading-factor samples long. In this case that would be 256, because this is an SF8, spreading factor 8, signal. You can think of it as 128 samples in that buffer being from symbol n and the other 128 samples in that buffer being from symbol n plus one. So we need to realign that and get it so that all 256 samples in that FFT are from one symbol. One way we can do that is to increase our FFT resolution in time by overlapping them. Basically, if you have your FFT buffer, you walk it through and process each sample multiple times as it traverses the buffer, and this has the effect of getting you better resolution in time. So we overlap the buffers and we get this nice picture on the bottom here. On the top we have a non-overlapped FFT, the same picture we saw earlier with the collisions, and on the bottom you can see those features start to become much more defined.
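The effect of a misaligned buffer, and of overlapping the FFT windows, can be reproduced with a small experiment. This is an illustrative sketch only: it assumes a noiseless signal sampled at the chirp bandwidth, uses a naive DFT instead of an FFT, and the helper names `upchirp`, `peak_of_window`, and `sliding_peaks` are invented here. It slides a symbol-length window across three data symbols in quarter-symbol hops and reports, for each position, the strongest bin and the fraction of the window's energy-equivalent amplitude that landed in it.

```python
import cmath
import math

def upchirp(sf, symbol=0):
    """Up-chirp of 2**sf samples whose start frequency is shifted by `symbol` bins."""
    n = 2 ** sf
    return [cmath.exp(2j * math.pi * (k * k / (2 * n) - k / 2 + symbol * k / n))
            for k in range(n)]

def peak_of_window(window, ref):
    """De-chirp a window; return (strongest bin, peak magnitude / window length)."""
    n = len(ref)
    tone = [s * r.conjugate() for s, r in zip(window, ref)]
    mags = [abs(sum(tone[k] * cmath.exp(-2j * math.pi * b * k / n)
                    for k in range(n)))
            for b in range(n)]
    best = max(range(n), key=mags.__getitem__)
    return best, mags[best] / n

def sliding_peaks(samples, sf, hop):
    """Overlapped analysis: advance the window by `hop` samples, not a full symbol."""
    n = 2 ** sf
    ref = upchirp(sf)
    return [(off,) + peak_of_window(samples[off:off + n], ref)
            for off in range(0, len(samples) - n + 1, hop)]

sf = 7
n = 2 ** sf
samples = [x for sym in (10, 70, 33) for x in upchirp(sf, sym)]
for offset, peak_bin, fraction in sliding_peaks(samples, sf, hop=n // 4):
    print(offset, peak_bin, round(fraction, 2))
```

Window positions that sit exactly on a symbol boundary put essentially all of their energy in one bin, while positions that straddle two symbols split it, which is the two-peak picture described above. A receiver can use the sharper positions found by the overlapped pass to pick its alignment, then go back to non-overlapped windows.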
Oh yeah, sorry, this is the overlapped FFT, zoomed in a little bit. Here it is; it looks much cleaner. So if we overlap these FFTs when we're looking for the sync word, we can very precisely calibrate our receiver to it and read the data correctly. So here are all three: on the left we have the non-overlapped FFTs, in the middle we have the overlapped FFTs, and on the right we have the synchronized, non-overlapped FFTs that we compute after we find that sync. We'll zoom in for just a minute here. Look at line 39 there, 30 and 39; look at how much better that looks. I know there's some noise in there, but you can see where the maximum is. It's much cleaner, much more defined. So there it is: we went from the unsynchronized to the synchronized, and we get much better precision out of that. Finally, step three: we need to extract the data from the payload section, which we'll get to in a minute, and then we normalize it about the preamble. If we take whatever bin the preamble occurred in as being data element zero, then we can just rotate the data values relative to that, normalize them, and we're good to go. That's it, right? Hardly. We're just getting started. That's the demodulation, and the data that's being sent over the air is encoded. What is encoding? Basically, the data gets transformed before it gets sent to the modulator, to increase the over-the-air resiliency of the signal. Why would you want to do this? Why would you want to add the complexity of this encoding onto your signal? Well, a smart guy once shared some information with me. Is he here? You should talk to him. He knows a lot. I didn't quite understand this at the time, but I think what he was getting at is the fact that radio frequency is a really brutal environment to operate in.
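The normalization step described above, treating the preamble's bin as data element zero and rotating everything else relative to it, is a one-line modular subtraction. A minimal sketch, with a made-up function name and made-up example values:

```python
def normalize_to_preamble(symbols, preamble_bin, sf):
    """Rotate raw FFT bin indices so the preamble's bin becomes symbol 0."""
    n = 2 ** sf
    # Bins wrap around the band edge, so the subtraction is modulo 2**sf.
    return [(s - preamble_bin) % n for s in symbols]

# Hypothetical SF8 capture whose preamble landed in bin 250.
print(normalize_to_preamble([250, 3, 130], preamble_bin=250, sf=8))  # [0, 9, 136]
```

The wrap-around matters: a raw bin of 3 sits nine bins above a preamble at 250, not 247 bins below it.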
All systems can see interference from the environment or from other devices, and then add in the fact that this PHY is designed to operate in unlicensed spectrum, where you're guaranteed to have lots of other traffic, right? You have to contend with Wi-Fi. You have to contend with Bluetooth. If somebody is microwaving their leftovers, that's going to produce some emissions in 2.4 GHz and that's going to interfere with you, right? So encoding scrambles and replicates the data within the frame to increase resiliency. So what do we think we have here? Let's take a closer look. We have some clues from those documents that we looked at earlier. Specifically, the patent suggests that there are four stages of encoding being applied here. First we have this thing called Gray indexing, which adds error tolerance for off-by-one symbol errors. So if, when you do that FFT on your dechirped signal, the power accidentally registers as being off by plus or minus one bin, the Gray coding limits that to a single bit error, which can then be corrected. Secondly we have data whitening, which induces randomness, lots of little transitions for your receiver to synchronize on. Third we have interleaving, which scrambles the bits within the frame. It just takes all the bits that are there and moves them around. We'll talk about that in a minute. And finally forward error correction, which adds correcting parity bits. So from our open source intelligence we have these four distinct operations to reverse. The order that's being presented here is the order that the receiver is going to process them in. So the receiver would start with step one and go through two, three, and four. The transmitter does the inverse, so the steps go four through one when the transmitter is preparing this data. But again, we're reversing this, so we're thinking about this from the receiver perspective.
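The off-by-one tolerance attributed to Gray indexing above can be checked with the standard binary-reflected Gray code. This is a generic sketch, not the PHY's confirmed mapping (the talk goes on to find that the direction is reversed from what the patent suggests), and the function names are illustrative. Gray coding does not repair anything by itself; it guarantees that neighbouring symbol values differ in exactly one bit, so a plus-or-minus-one bin error costs a single bit that the forward error correction can then fix.

```python
def gray_encode(value):
    """Binary-reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_decode(gray):
    """Inverse of gray_encode: fold the shifted value back in with XOR."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bit_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


SF = 8  # 2**SF possible symbol values, matching the SF8 example in the talk

for symbol in (126, 127, 128, 129):
    print(f"{symbol:3d} -> binary {symbol:08b}   gray {gray_encode(symbol):08b}")

# Plain binary: 127 -> 128 flips every bit. Gray coded: exactly one bit flips.
print("binary distance 127/128:", bit_distance(127, 128))
print("gray distance 127/128:  ", bit_distance(gray_encode(127), gray_encode(128)))
```

Because the talk found the transmitter applying the inverse operation, a receiver following that finding would apply `gray_encode` where the patent implies `gray_decode`; which one is correct has to be settled against captured data.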
Now I'm going to go a little bit out of order here, and I'll explain why. I'm going to start by explaining the forward error correction. The reason is that the forward error correction is essential for reversing the other steps. We're going to exploit some properties there that make the other steps a little easier to understand, if we can understand what the FEC does to the data before it gets sent. So, forward error correction: you can think of it as being like parity bits on steroids. They are parity bits that, in addition to detecting errors, can repair them. A common FEC scheme is the Hamming scheme, where, depending on how many bits you have, there are some little rules that define how many parity bits get you a certain amount of tolerance, so you can repair errors. Let me explain these numbers here. We have a pair of numbers in each of these different modes. The first number is the total number of bits in the encoded code word, and the second number is the number of data bits that get input. So the difference of the two is the number of error-correcting bits that get added. If you have one or two error-correcting bits for four data bits, you can compute parity, just like having a parity bit in RS-232 or something like that. But if you have three, then you can go a step further and correct a bit error. And then if you have four parity bits, in addition to correcting that bit error you can detect if there were two. You can't correct two, but you can at least know. Okay, so I'm missing a slide here. No, I'm not. Okay. We're going to come back to this later, but just remember that when we're talking about FEC we're adding parity bits to the data bits. You're increasing your data size as you apply FEC. Okay, so going back to that flow graph, this is step one, the Gray indexing.
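As a concrete instance of the three-parity-bit case described above, here is a textbook Hamming(7,4) encoder and single-error corrector. The bit ordering is the classic one, with parity bits at positions 1, 2, and 4; it is an assumption for illustration and not necessarily the ordering the LoRa PHY uses (the talk later reports that two parity bits turned out to be swapped relative to expectation).

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (list of bits, position 1 first)."""
    data = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions with bit 0 set
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions with bit 1 set
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions with bit 2 set
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, syndrome); a non-zero syndrome is the corrected position."""
    code = [0] + list(bits)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position        # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1             # flip the single bit the syndrome points at
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, syndrome


codeword = hamming74_encode(0xA)
damaged = list(codeword)
damaged[4] ^= 1                         # corrupt position 5 in transit

print("sent:     ", codeword)
print("received: ", damaged)
print("decoded:  ", hamming74_decode(damaged))
```

Adding a fourth, overall parity bit gives the (8,4) variant mentioned in the talk, which can additionally detect, but not correct, a double bit error.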
The patent suggested that Gray coding was applied to the signal before it was sent. In fact, experimentally it was determined that the Gray encoding was actually the reverse: the payloads that were sent were de-Grayed before they went over the air. We were able to determine that experimentally while working on the whitening. Again, the whitening is step two, which induces the randomness into the data. The way the whitening works is the transmitter will XOR the data against a pseudo-random sequence that is known to both the transmitter and the receiver. So this basically takes the data, applies it to some mask, and then sends it. The receiver will then XOR the received data against that same pseudo-random sequence, and that will return the original frame, because XOR is its own inverse. Why would you want to do this? Randomizing the data adds some features to it that make it easier for the receiver to lock on to. You can think of it as being like line coding, like Manchester encoding. However, whitening has an advantage in that it doesn't reduce the effective bit rate that your signal can use. Manchester imposes a penalty: your effective bit rate becomes half of what it would be if you weren't applying Manchester encoding. Whereas whitening, since it's just an XOR, has no penalty, no overhead. So we have to find this whitening sequence to know what to XOR the received data against. There are a couple of different whitening algorithms defined in one of those reference designs, one of those application notes I'd mentioned. Also a few that were published in rtl-sdrangelove, that open source project. However, none of them worked. I implemented them all, I tried them against the received data, and it just didn't make any sense. However, we can again go back to our friend the XOR operator and start to do something clever to back this out.
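A minimal sketch of whitening as described above: XOR against a shared pseudo-random sequence on the way out, and the same XOR on the way in. The sequence generator here is a made-up 8-bit LFSR chosen only to produce something random-looking; it is explicitly not the LoRa whitening sequence, which the talk says did not match any documented algorithm.

```python
def placeholder_sequence(length, state=0xFF):
    """Hypothetical 8-bit LFSR byte stream; the taps are illustrative only."""
    out = []
    for _ in range(length):
        out.append(state)
        feedback = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | feedback) & 0xFF
    return bytes(out)


def whiten(data, sequence):
    """XOR data with the sequence. The same call also removes the whitening."""
    return bytes(d ^ s for d, s in zip(data, sequence))


payload = b"\x00\x00\x00\x00hello\xff\xff\xff\xff"
sequence = placeholder_sequence(len(payload))

on_air = whiten(payload, sequence)      # transmitter side
recovered = whiten(on_air, sequence)    # receiver side, identical operation

print("payload:  ", payload.hex())
print("on air:   ", on_air.hex())
print("recovered:", recovered.hex())
print("overhead: ", len(on_air) - len(payload), "bytes")
```

The long runs of `00` and `ff` in the payload come out broken up on the air, and the frame is the same length before and after, which is the no-overhead property contrasted with Manchester encoding.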
One awesome property of XOR is that if you XOR data against a stream of zero bits, you get the original data out. So if we transmit a frame of all zeros, from the transmitter's perspective, the receiver is going to get the whitening sequence out of that. So that's kind of cool, right? So we're going to do that. And there are some things that help us get this right. As we mentioned, there's the Hamming error correction. With Hamming(8,4), all of the code words contain four one bits and four zero bits, except for the code words for data 0 and data F, right? And the code word for data 0 is just all zeros. So the FEC isn't adding any set bits here, and it doesn't mess with our received whitening sequence. And also, if the interleaving is non-additive, then there are no bits being added. So basically, by sending all zeros, these two stages fall away. We don't have to worry about them, and the only other thing that we need to control for is the Gray indexing. There are just three different states that we could try: Graying, no Graying, and de-Graying, and then when we get the zeros out we know we got it right. So that's how we're able to solve for these first two steps, the Gray indexing and the whitening, kind of in one go. All right, step three, finding the interleaver. There was an interleaver defined in the Semtech European patent application that I mentioned, and it suggests a diagonal interleaver abiding by that formula there. Guess what, that also didn't work. Fantastic. So we're one for four on documented features right now. The Gray indexing was actually de-Gray, the inverse, before it got sent. The whitening algorithms that they gave us did not work. I still have no idea what those algorithms are applied to. They're in the documentation, but they're not what was actually implemented in the PHY.
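The all-zeros trick above can be sketched with a toy transmitter. Everything here is a stand-in: `transmit` models only FEC followed by whitening (no interleaver, no Gray step), `SECRET_WHITENING` is random bits playing the role of the unknown sequence, and the Hamming(8,4) layout is the textbook extended code rather than a confirmed LoRa bit order. The sketch also checks the codeword-weight property the talk relies on.

```python
import random
from collections import Counter


def hamming84_encode(nibble):
    """Textbook extended Hamming(8,4): Hamming(7,4) plus an overall parity bit."""
    d0, d1, d2, d3 = ((nibble >> i) & 1 for i in range(4))
    p1 = d0 ^ d1 ^ d3
    p2 = d0 ^ d2 ^ d3
    p4 = d1 ^ d2 ^ d3
    bits = [p1, p2, d0, p4, d1, d2, d3]
    bits.append(sum(bits) & 1)          # overall parity over the first seven bits
    return bits


# Stand-in for the unknown whitening sequence inside the radio (8 codewords long).
_rng = random.Random(1234)
SECRET_WHITENING = [_rng.getrandbits(1) for _ in range(64)]


def fec_bits(nibbles):
    """Concatenate the codewords for a list of nibbles."""
    return [bit for nibble in nibbles for bit in hamming84_encode(nibble)]


def transmit(nibbles):
    """Toy black-box transmitter: FEC, then XOR with the secret sequence."""
    return [b ^ w for b, w in zip(fec_bits(nibbles), SECRET_WHITENING)]


# Codeword weights: one all-zero word, one all-ones word, the rest weight four.
weights = Counter(sum(hamming84_encode(n)) for n in range(16))
print("codeword weights:", dict(weights))

# Sending all zeros makes the FEC contribute nothing, exposing the sequence.
recovered_whitening = transmit([0] * 8)
print("sequence recovered:", recovered_whitening == SECRET_WHITENING)

# The recovered sequence then strips the whitening from any other frame.
payload = [0xD, 0xE, 0xA, 0xD, 0xB, 0xE, 0xE, 0xF]
dewhitened = [b ^ w for b, w in zip(transmit(payload), recovered_whitening)]
print("dewhitened matches FEC output:", dewhitened == fec_bits(payload))
```

With a real radio the received bits would also pass through the interleaver and the Gray step, which is why the talk tries the three Gray options and leans on the interleaver adding no bits.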
And now the interleaver that's in the patent is not the one that's there. So we're going to have to figure this out experimentally. Okay, so deducing the interleaver. This was hard. This is the hardest part of all this. Bear with me. I have some graphics that I think will make it a little easier to understand. But just like we did with the whitening, we're going to exploit properties of the Hamming forward error correction to get the interleaver to reveal some patterns about itself. So remember that most Hamming code words contain four set bits, except for 0 and F, right? Zero contains no set bits; with F, all the bits are set. So if we craft transmissions that are all zeros except for one set of F's, and we walk where the F is through the packet, we can start to reveal some properties of the interleaver here. I'll make it easy for you. Does anybody see what's happening here? I'll draw your attention to the bottom row, second from the right. That's the payload for 0F and then the rest zeros, and you can see that it's almost perfectly diagonal, right? And if you look at the other ones, you can see that they're similarly diagonal with an offset. One other thing that I'll call your attention to is that the most significant bits in each interleaver block are flipped. So, looking at the data, it didn't make any sense until I constructed these payloads to try to back this out. And this was the first step to start seeing the patterns here. Okay, so with this we've mapped out which diagonals correspond to each of the bit positions within the transmitted payload. With that in hand, we need to think about aligning each of those code words within each diagonal, right? So we can map the diagonals to the code word, but the bits that we get out of each code word are scrambled around.
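The probing technique above can be illustrated against a generic diagonal interleaver. The mapping in `interleave` is a hypothetical stand-in, not the mapping recovered in the talk (which also had flipped most significant bits and reordered code word bits); the point is only to show how an all-zero block with a single all-ones code word makes the diagonal visible.

```python
PPM = 8   # bits per symbol, equal to the spreading factor in the SF8 example
RDD = 8   # bits per code word, as with Hamming(8,4)


def interleave(codewords):
    """Hypothetical diagonal interleaver: PPM code words of RDD bits in,
    RDD symbols of PPM bits out. Bit i of code word k lands in symbol i."""
    return [[codewords[(i + j) % PPM][i] for j in range(PPM)]
            for i in range(RDD)]


def deinterleave(symbols):
    """Inverse of interleave: rebuild the PPM code words from RDD symbols."""
    codewords = [[0] * RDD for _ in range(PPM)]
    for i in range(RDD):
        for j in range(PPM):
            codewords[(i + j) % PPM][i] = symbols[i][j]
    return codewords


def probe(k):
    """Send all zeros except code word k set to all ones (the encoding of 0xF),
    and report which (symbol, bit) positions light up."""
    block = [[0] * RDD for _ in range(PPM)]
    block[k] = [1] * RDD
    symbols = interleave(block)
    return {(i, j) for i in range(RDD) for j in range(PPM) if symbols[i][j]}


# Walk the all-ones code word through the block and draw what comes out.
for k in (0, 3):
    lit = probe(k)
    print(f"code word {k} set to all ones:")
    for i in range(RDD):
        print("   ", "".join("#" if (i, j) in lit else "." for j in range(PPM)))
```

Each probe lights exactly one position per symbol, stepping by one bit from symbol to symbol, and moving the all-ones code word shifts the whole diagonal by an offset, which is the pattern the slides show.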
So we need to start looking a little more granularly into each of those diagonals to figure out what's actually happening there. Our solution again, just like before, is to transmit some known words and look for the Hamming FEC encoded signals within each diagonal. So we'll take a well-loved known payload here and we will map out the diagonals into a table. You guys see what's going on here? Basically, we know the diagonal position from that exercise that we did before, so we're just reading each position that we know from where the F was into this table here. We populate this little matrix, and here it is next to the data that we're expecting on the left, right? We have the unencoded data in the middle column there, and finally the parity bits that we expect to be applied to it on the right. Does anybody see anything here right off the bat? This might be a little bit harder. Again, I'll make it easy. Oh wait, no, sorry, jumping the gun. So just to help us along here, if you reverse the endianness of this, then you start to see patterns. Here we've got the data that we find in these bit positions, and if we go a step further we can start to correlate and map some of those Hamming FEC fields into these interleaver code word positions. So these six fell out pretty nicely. Interestingly, Hamming parity bits three and four were flipped, and that's something that was just deduced experimentally by looking at the actual values versus the expected parity here. And yeah, that's a huge step, we're almost there. If we apply the forward error correction to these bits that we have here, we've got it. That's it, that's the whole thing. So that's the PHY. We just went from that weird diagonal chirp-modulated signal to the data that's in there, just based on some open source documentation and gumption.
So how were we able to do this? We have these four steps, right, and I hinted at this earlier: we were able to control for the first two and make an assumption about the last step, the FEC, based on some documentation that was leaked. And basically what that meant is that the only experimental variable that we really had to solve for, the one that was really hard, was the interleaving. But since we were able to control these other three, we only had one variable that was unbounded, and then we could solve it like algebra. Okay, so what's next? You guys want to play with this, right, and you guys want some tools. Working on that. So I want to briefly introduce GNU Radio LoRa. It's an out-of-tree GNU Radio module that implements the LoRa PHY. I'm working on it currently. It's going to be released between now and GNU Radio Conference in September, so this will be out there and you guys can grab it and play with it. It's going to have GRC and all that good stuff. I have a proof-of-concept receiver complete. I just got the time warning, so I don't think I have time for a full live demo, but if anyone wants to see it I will be happy to show it to you right after this talk. So finally, to conclude: these LPWANs have a ton of momentum and are rapidly proliferating. You're going to be seeing these everywhere. With all the money they're raising, with how much everybody's emphasizing IoT, it's inevitable these things are going to be everywhere. One other point I want to make is that RF stacks are becoming more diverse, right? When you think about wireless, oftentimes in the enterprise we think about 802.11, but wireless is not just Wi-Fi anymore. You have to start to think about these protocols, Bluetooth, cellular, it's all out there and it's all starting to become connected to the internet and more and more interrelated.
Third, we've shown how to go from some obscure RF signal to bits, and fourth, we're adding a new tool to the RF security researcher's arsenal, so you can start to take a look at this brand new network that is totally greenfield and start to find things and make it better. Finally, I want to acknowledge my team at Bastille, especially Balint Seeber, for helping make this happen, the open source contributors for starting to look at this, and the Wireless Village for hosting. Thank you very much, I'm happy to take your questions. Okay, so the first question is, have I heard from the company in question, and I actually have. I first published this research back in May and not too much happened, but finally I did get an email from one of the founders of the startup that invented this modulation, which was bought by Semtech. So the guy that invented LoRa shot me a note and said he thought it was awesome. So I haven't gotten any C&Ds or anything legal, but I did get a digital high-five from the guy that made the PHY.
No, it's being applied in the inverse order. So yeah, that's a good question. I actually haven't looked at that, because I only have one radio. Yeah, I did look at some basic jamming. LoRa's really good in contested channels, so even just using a USRP on max power I wasn't really able to jam it with wideband noise or anything. I really wonder what would happen if you were to stack a whole bunch of LoRa carriers on top of each other, if that would just totally throw a receiver off. Yeah, exactly, just imagine drawing diagonals all over the 900 megahertz spectrum. Would that work for MFSK? Basically, though, the whole point of dechirping is that it converts those signals from chirps to those nice linear pulses, so we can basically treat the whole thing like multiple frequency shift keying. No, I hadn't thought of that. So hardware's out there. You can get dev modules on Digi-Key, they're not too expensive. Of course, ICs are available in volume. But honestly, there are more and more signals in the wild. I mentioned I had that slide about my road trip to Portsmouth. I had to go to the source to find it there. I see LoRa everywhere now. I actually haven't checked here in Vegas, but all over Boston, all over San Francisco, it's popping up, so it's out there. So it's kind of all over the board. I mentioned the heating oil company, Senet, that's monitoring gas tanks in New England. I think there are people doing vehicle fleet monitoring, just reporting GPS and things like that. One really interesting case that I heard about: I met this guy at a conference who was making 802.15.4 connected rat traps. These are a thing, I know, right? IoT. You put these things in your crawl space or whatever, and it used a thermal sensor to detect when a rat had been caught and actually killed, right? It looked for the temperature going down to room temperature. This is a real thing.
I'm not making this up. But yeah, so he had made that on 802.15.4 and ZigBee, and he was psyched about LoRa and couldn't wait to make LoRa-connected euthanization for rats. Anyone else? Okay, I'll be around. Come find me if you want to talk about this. Thanks again. Okay, thanks Matt.