So today I want to talk about security in cognitive radio networks. Now, there are a lot of companies right now that are spending a lot of money on a prediction. And the prediction is that over the next ten years, the number of connected devices and the amount of data that they're going to use is going to increase very, very fast. And so new protocols and new specifications and even new hardware are being developed to deal with this engineering problem of how do we add billions more devices to networks that are already really congested, especially wireless networks. And a lot of the time when these new protocols are designed, it's really important obviously to think about how we make sure data integrity is maintained. And in the kind of networks I'm going to talk about today, not a whole lot of thought has really gone into it yet. And I think the way that security research kind of normally works is we start out with a system that's deployed over a really large area and then someone discovers a problem with it, right? And then we talk about it and everyone freaks out. And then it's up to the vendor or whoever is responsible to try to repair it. And then everyone sort of relaxes a little bit until it happens all over again. And so we sort of have this cyclical band-aiding of almost everything that we interact with. And I think we're at a really unique point in cognitive radio networks because it's far enough along now that there are actual deployments in the field, but it's not so far along that, if we discover problems and make suggestions for how things should be changed, it's going to affect millions of people. But to understand how we can make these improvements, first we have to understand the system that we're talking about. So initially we had radio, right? In radio you've got a transmitter and a receiver. And if you want to change how the radio operates, then you have to turn a knob.
And you've got to be standing there physically to turn the knob. And the knob only turns so far one way or another. Okay, so then technology advanced a little bit further and we came up with software-defined radio. Now we can turn those knobs in software and we can do it remotely and we can even control how far the knob turns one way or another. So sort of the next logical step from this is cognitive radio, which is adding a feedback loop into the radio itself. So now the radio is capable of observing its surroundings and then changing itself, changing its own parameters to optimize whatever it's trying to do. And if we put a whole bunch of these together, we have, of course, a cognitive radio network. So now not only are individual radios trying to understand how they can increase their own performance, but they can talk to each other and they can talk back to a central base station somewhere and inform each other about what's going on. So another word for this might be adaptive network, where they're actually teaching each other. Now, to follow the rest of this, it's important that we understand how information is actually sent in the physical layer, otherwise it will be a little difficult to follow. To send information over a wave, we can fully define a wave by three parameters: its frequency, which is how often a certain point reappears, its amplitude, and its phase. And fiddling with one or more of these parameters is what we call modulation. So that's how we actually represent information in a wave. Now there are lots of different ways you can do modulation, of course. If we start with a simple cosine wave like this, the top is the time domain and the bottom is the frequency domain. If we frequency modulate this, then in the time domain, we see the frequency is actually changing with time. And in the frequency domain, you can see all those frequencies. There are more complicated ways to do this too.
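As an aside that was not part of the talk, here is a minimal Python sketch of the frequency modulation just described. The function name `fm_modulate` and every number in it (sample rate, carrier, deviation, the 5 Hz test tone) are assumptions picked for illustration. The message nudges the instantaneous frequency of a cosine carrier, and the phase is accumulated sample by sample.

```python
import math

def fm_modulate(message, carrier_hz, deviation_hz, sample_rate):
    """Frequency-modulate a cosine carrier with message samples in [-1, 1]."""
    phase = 0.0
    out = []
    for m in message:
        # Instantaneous frequency = carrier plus a deviation scaled by the message.
        inst_freq = carrier_hz + deviation_hz * m
        # Advance the phase by one sample period at that frequency.
        phase += 2.0 * math.pi * inst_freq / sample_rate
        out.append(math.cos(phase))
    return out

# Illustrative values only: one second of a 5 Hz tone on a 1 kHz carrier.
sample_rate = 8000
message = [math.sin(2 * math.pi * 5 * n / sample_rate) for n in range(sample_rate)]
signal = fm_modulate(message, carrier_hz=1000, deviation_hz=200, sample_rate=sample_rate)
```

Plotting `signal` against time would show the cosine bunching up and stretching out as the message rises and falls, which is the time-domain picture described above.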
For instance, this is quadrature amplitude modulation. It's more difficult just by eye to see what's going on here. But the point is, there are lots of different ways you can do this, and each different form of modulation has different tradeoffs. So by changing individual parameters in each different form of modulation, we can trade off things like how fast it will go versus the bandwidth that it will take up and stuff like that. So cognitive engine is sort of the hyped up word for what you might describe as the thing that actually controls and commands individual cognitive nodes. So there are lots of different parameters that the cognitive engine controls. And these are examples of some of them. Now, you'll notice that these are all on different layers of abstraction. So some of them are all the way down in the physical layer, and some of them are higher. And some of them even depend on each other. So the cognitive engine is trying to accomplish some task. Like let's say it's trying to minimize interference, and it's trying to maximize data rate. So we would call that its objective function. So it's got a bunch of inputs to its objective function. These are examples of inputs that it would have. And by changing those inputs, it's going to achieve some output of this function, and it's going to either try to minimize or maximize this function. So if we were to plot that, this is a simple three-dimensional example. Normally, you would have way more than three dimensions, so it would be impossible to visualize like this. But a simple technique that a cognitive engine might use to try to optimize whatever it's trying to do is something like gradient descent or gradient ascent or other simple machine learning techniques. And this works relatively well, but obviously there are dangers of hitting a local maximum instead of the global one. And so there are several other techniques that it can use. Another one is game theory.
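To illustrate the local-versus-global maximum problem from a moment ago, here is a small Python sketch. It is not taken from the talk or from any real cognitive engine: the two-input `objective`, its two peaks, the step size, and the starting points are all made-up assumptions. It runs plain gradient ascent from several starting points and keeps the best result, which is one simple way to reduce the risk of settling on a local maximum.

```python
import math

def objective(x, y):
    """Toy score to maximize: a low peak near (-2, 0), a higher one near (2, 1)."""
    local_peak = 1.0 * math.exp(-((x + 2) ** 2 + y ** 2))
    global_peak = 2.0 * math.exp(-((x - 2) ** 2 + (y - 1) ** 2))
    return local_peak + global_peak

def gradient(f, x, y, h=1e-5):
    """Central-difference estimate of the gradient of f at (x, y)."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy

def gradient_ascent(f, start, step=0.1, iterations=2000):
    """Repeatedly step uphill along the gradient from the given start."""
    x, y = start
    for _ in range(iterations):
        gx, gy = gradient(f, x, y)
        x += step * gx
        y += step * gy
    return x, y

# Hypothetical starting points; the first one sits in the basin of the low peak.
starts = [(-3.0, 0.5), (0.5, 0.0), (3.0, 2.0)]
results = [gradient_ascent(objective, s) for s in starts]
best = max(results, key=lambda p: objective(*p))
```

The run started at `(-3.0, 0.5)` climbs the nearby low peak and stays there, while the other two reach the higher one, so taking the best of several runs recovers the better setting in this toy case.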
And this actually works really well, because spectrum is a resource that is being fought over by lots of different people. And so we can model this in game theory really easily. And one of the ways that we can do this is to try to achieve what's called Pareto optimality, which means that if you've got a single cognitive radio network in a room by itself and there's no one else there, then when it's at its Pareto optimum there's no change it can make that will increase one node's performance without also decreasing the performance of another node. So this is sort of like the idealized case where you win every time. Now in real life, of course, this is never true. You're always competing with other people, and there's other things going on in the network. And maybe there's malicious users in the network. So in that case, you can aim for a Nash equilibrium, which means that now instead of trying to win everything all the time, you're basically trying to not lose all the time. In other words, as long as everyone else keeps the same strategy, there is no change that any one player can make to their own strategy that will improve their own outcome. Now again, this gets difficult to actually achieve in real life, and so you can try to approximate it. But the point is there's a lot of really interesting ways that a cognitive engine could try to optimize this objective function. Back in the 1800s when wireless telegraph was first showing up, it was really, really spectrally inefficient. And that was because it used spark-gap transmitters, which just trashed the spectrum. So when operators were trying to talk to each other, they would have to listen to see if anyone was there. And if there wasn't, then they could go ahead and start talking. And so as you can imagine, this did not end up working very well at all.
And in fact, one of the more common messages that people sent was GTOOMQRT, which stood for go to hell, old man, I'm trying to transmit. And so they would just yell at each other until someone gave way and they could actually transmit. And this really became a big problem when the Titanic sank, believe it or not, because after they investigated the sinking, they realized that some of the shore transmitters were actually interfering with some of the ship rescue efforts. And so the Radio Act of 1912 was passed, which created licensing, the job the FCC handles today. So now people realized that to make sure that no one is interfering with each other, we need to have licenses. Everyone is responsible for their own chunk of spectrum and you're definitely not allowed to transmit where anyone else is. And this worked well for a while. But eventually, as more and more wireless devices came online, and especially in the 80s and 90s when cell phone companies started becoming big and they started aggressively lobbying Congress to buy more and more spectrum, the spectrum was divided up into smaller and smaller pieces and some rudimentary spectrum sharing began, because obviously as long as you're not in the same physical location or you're not operating at the same time, then you're not going to interfere with each other. Now, more recently, something really interesting has happened. Back in the 90s, when people started seeing this problem of, okay, maybe we should figure out how to ration spectrum a little better, people began setting up spectrum observatories, which are basically just a spectrum analyzer on a building somewhere with an antenna, and they would watch this really wide band and see what was actually going on. How are people really using their licenses? And they found out that surprisingly it's actually full of holes, and a lot of the spectrum that people are buying isn't actually used, or at least it's not used very efficiently.
So several years ago, the FCC, after being petitioned by Google, decided to make the old analog TV channels unlicensed. So this is called TV White Space. And Google was kind of the primary driver of that. So the problem is, because the licenses are not the same everywhere, stuff like White Space isn't available everywhere. So this is a plot, for example, of one channel of TV White Space across the United States. And all the blue is where it's available, and the green is where it's not available. So as you change channels, the availability of different frequencies changes as well. So this is the entire United States. It's a little deceiving on this plot, because it's not plotting density, it's plotting different channels in different colors. But you can see it's mostly around a couple of big cities where there's not a lot of spectrum available, but everywhere else in the United States, in the middle of the desert, there's no TV channels out there, so why not use it? And there are several companies now that have begun really taking advantage of this. The FCC has set up specific databases with Microsoft and a company called Spectrum Bridge and Google, and the way it works is you query these databases, and you tell them where you are, and they'll return to you a list of the frequencies that you're allowed to use without paying for them. Microsoft and Google have also begun using this in what some people are calling super Wi-Fi, which is basically just taking 802.11 and shifting it down to these frequencies, which are in UHF, so around 500 to 600 megahertz. And they're doing lots of trials right now. There's around 40 experimental installations in the United States. They've also got trials going in Kenya, South Africa, Tanzania, Singapore, Senegal, everywhere. So this is really interesting. But there's other uses for this as well.
There's a company in France called Sigfox that, rather than use this as just another way to do Wi-Fi, is trying to use this for long range wireless sensors that are really low power. And so rather than connect people in a traditional network, they're connecting, for instance, farmers who need to measure the moisture of their field, and you've got a really large area. This is the kind of stuff that they're working on. So especially for these low power devices, we need new protocols, we need new ways to deal with these interesting physical and political properties. So there's another company in England called Neul that is developing a protocol called Weightless. And Weightless is kind of interesting because it's set up as a special interest group, but it's set up as a private special interest group. Now, Bluetooth did the same thing. So that means that if you wanna contribute to the spec, you have to pay a bunch of money. And in the case of Bluetooth, the way it worked was you paid a bunch of money to contribute and then afterwards, once they released the spec, you could download the entire spec for free. Weightless for some reason is working a little differently, and if you wanna just read the spec, which they've now released as version 1.0, it costs almost $1,000. And they claim that it's an open spec, so I'm not really sure how that works, but I hope that they perhaps change course, because I'm sure there'd be a lot of people interested in how this actually works and in poking around at it. So now briefly I wanna talk about some of the kinds of attacks that specifically apply to cognitive radio networks. Now, a lot of traditional network attacks will also work on cognitive radio networks, but a lot of times they'll work in different ways.
And so obviously I'm not gonna enumerate every single kind of attack here, but I wanna give you an idea of the kinds of things you have to think about when dealing with networks like this because it takes a little bit of different thinking. So I'm sure one attack everyone in here is familiar with is a replay attack, right? You take some traffic off the network and then you store it and then you play it back at a later time. So in a regular network, if you don't handle that correctly, that can be bad, but on a cognitive radio network, it can mean different things because your cognitive engine is constantly monitoring the network and trying to decide how it can better optimize it, what improvements it can make and how it can make sure it's not interfering with everyone else. So if it sees traffic returning to it that it has already seen, then it may assume one of two things. It may assume that there's a routing problem, especially if it's an ad hoc kind of network, or it may assume that there's some weird RF thing happening, perhaps if it's seeing a large reflection off of a surface or something, and it may try to adjust for it. So taking advantage of the assumptions that the cognitive engine makes is I think a large attack surface. One of the more maybe obvious methods that you might think of when attacking these kinds of networks is changing the observations that individual nodes can see. So say a legitimate node can observe an incumbent on some channel. There's several different ways it would observe an incumbent. It can use something as simple as energy thresholding, so if there's some power above some threshold, it decides that there's a person there. Or it can use more complicated ways, for instance, cyclostationary analysis or wavelet analysis, and it can actually try to characterize the signal more.
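The simplest of those detectors, energy thresholding, fits in a few lines. The following Python sketch is an illustration added here, not code from the talk: the helper names `average_power` and `incumbent_present`, the 3 dB margin, and the synthetic noise and tone are all assumptions, and a real detector would also have to estimate the noise floor rather than be handed it.

```python
import math
import random

def average_power(samples):
    """Mean squared amplitude of a block of real-valued samples."""
    return sum(s * s for s in samples) / len(samples)

def incumbent_present(samples, noise_power, margin_db=3.0):
    """Declare the channel occupied if power exceeds the noise floor by margin_db."""
    threshold = noise_power * 10 ** (margin_db / 10)
    return average_power(samples) > threshold

# Synthetic data for illustration: unit-power Gaussian noise, with and without a tone.
rng = random.Random(0)
noise_only = [rng.gauss(0.0, 1.0) for _ in range(4096)]
with_tone = [rng.gauss(0.0, 1.0) + 2.0 * math.cos(0.3 * n) for n in range(4096)]

idle = incumbent_present(noise_only, noise_power=1.0)
busy = incumbent_present(with_tone, noise_power=1.0)
```

Here `idle` comes out `False` and `busy` comes out `True`, because the tone adds roughly twice the noise power on top of the noise. Note that the detector only measures power, so it says nothing about who is transmitting, which matters for the attacks that follow.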
I'm not gonna go into exactly how those work because the math is really hairy, but in any case, it discovers that there's a person there. And it will forward this message along the network until it hits perhaps a compromised node, in which case the message can be changed. And once this is forwarded along to the base station, this can cause different decisions to be made. And this does two things. First of all, it means that the real incumbent is now gonna be ignored, and you can effectively turn the entire network into your own jammer. And the other advantage of this particular attack is that you don't have to transmit anything to make it work. So rather than having to set up your own radio and potentially be triangulated or something, you can simply change traffic to change what appear to be observations. A simpler version of this would be a routing disruption. Again, another attack that is well-documented in traditional networks. But if a node either starts dropping packets or completely drops off the network, then this can be really bad for the cognitive engine because if that particular area physically where the node is located is collecting really valuable data, then it's now blind in that part of the network. And so that can drastically change how the entire network will behave. By the way, you don't need some kind of complicated exploit to make a small node, especially the kinds that are typically used in sensor networks, act like a black hole, right? A baseball bat will also work to take the node off the network. The Sybil attack is another one, originally described for peer-to-peer networks, the idea being that if you've got a trust relationship between individual nodes or the base station, then you can take advantage of that by either taking over additional nodes or adding more to the network. So especially in these cases, it's really important to know who you can trust information from and when you can trust that it's real.
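A toy Python sketch shows why added identities matter. It assumes, purely for illustration, that the base station fuses per-node sensing reports by simple majority vote; the talk does not specify a fusion rule, and the function `fuse_reports` and the node names are hypothetical.

```python
def fuse_reports(reports):
    """Majority vote over per-node reports; True means 'incumbent detected'."""
    votes_for = sum(1 for detected in reports.values() if detected)
    return votes_for * 2 > len(reports)

# Three honest nodes all see the incumbent.
honest = {"node-a": True, "node-b": True, "node-c": True}
decision_before = fuse_reports(honest)

# The attacker joins four fake identities that all report a clear channel.
sybils = {f"sybil-{i}": False for i in range(4)}
decision_after = fuse_reports({**honest, **sybils})
```

With only the honest nodes the vote says the incumbent is present. Once the four fabricated identities outnumber them, the same rule says the channel is clear, even though no honest report changed.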
And so keeping track of individual nodes and whether or not they're being suspicious is really important. So if you get enough of your own nodes on the network, then you now have basically a voting majority and your compromised nodes can vouch for each other. And in this way, you can indirectly control the decisions that the network is gonna make because you can just feed it whatever you want it to hear and then you can get a pretty good idea of what it's gonna have to do in response. A priority attack is another interesting attack that is perhaps unique to this kind of network. The idea is that you've got sensors in different places. Like let's say you've got a sensor in a laboratory that's measuring the moisture of a fern or something. And then you've got another sensor in the same network that is measuring toxic fume levels in the lab. Okay, well clearly the one that's measuring fume levels should be much higher priority than the one measuring moisture in the plant. And so by telling the cognitive engine that you're higher priority than you really are, you can divert resources away from places that really need them. Because especially in these cases where you're sharing spectrum, there's a finite amount of resources to go around. And so it's easier to sort of clamp those off. Whenever people are designing hardware, especially for these kinds of networks, it's really, really easy, especially when you're designing small nodes, to rationalize weak crypto. And the reason is that if you're working on a microcontroller and you're writing in assembly and you're trying to squeeze every cycle out, it can be really easy to say, you know, who's really gonna try to break into this, or it doesn't really matter if someone is able to read this traffic. And a lot of it also comes down to speed versus security, right, because in a network like this, you've got a lot more network overhead.
And so understanding what trade-off to make is really hard, but it's really, really important and it's easy to screw up. Data privacy on a normal network is obviously important, but in these kinds of networks, the data gives you more information about the nodes themselves than perhaps on a regular network. For instance, location can be much easier to discover for an individual node because the spectrum that it's observing is really, really specific to where it physically is. And individual trees and buildings and stuff around specific nodes can drastically affect the spectrum that they're observing and you can characterize that. And so you can figure out physically where they are. And this can be really bad if you're trying to keep that secure. So primary user emulation, I think, is the most challenging attack that we're gonna have to deal with in these kinds of networks. Well, first I guess I should explain what a primary user is in the context of the FCC. What they talk about there is, for instance, a TV transmitter who actually owns the license. They would be the primary user and then everyone who's sharing the spectrum with them is a secondary user. So normally the way this works is you look up the database and that tells you all the primary users and you know, okay, well, if they're not on this list then they must not be a primary user. So this is kind of an exploit of both technology and policy because the problem is as the law stands right now, there's no way to authenticate a primary user. So if you set up a radio somewhere and you start rebroadcasting episodes of Happy Days in the middle of a network, then you can potentially DoS the entire network off the air. And even though the network may very, very well suspect that you're doing something bad, there's nothing legally they can do about it because they have to get out of your way.
And so figuring out exactly how to deal with this has been really tricky and there's several papers that have been written on special cases for this, but no one has really figured out how to deal with it. And I think it's gonna require a combination of some really clever algorithms for characterizing real primary users versus fake ones and also some policy change on how we're able to detect them and what you're able to do once you discover that they're there. Because at this point, once they DoS the entire network, it essentially becomes a jamming problem and you can try to use something like spread spectrum to get around that, but it's not ideal. So those were obviously not every possible attack. Those are just general ideas. And similarly, these are just general ideas on countermeasures for how to deal with some of these things. And not all of this has been enumerated yet, but I think there are several key ideas that we're gonna have to think of when we're trying to solve these issues. The first is using cooperative intrusion detection. Traditionally, we see maybe a single intrusion detection system, but in the case where you've got a bunch of nodes that are all talking to each other, because they can inform each other, they should be able to inform each other about each other. So not only are they observing the spectrum in general, but they should be observing each other's behavior and keeping each other accountable. If they observe strange traffic, then they need to alert each other and adjust their trust functions accordingly. Device reputation is another really important thing. By keeping track of the quality of the spectrum that each individual node is receiving, again, as well as their traffic, you can build this trust function and you can sort of weight your decisions based on that. And this is not only about something malicious, right?
If there's something physically wrong with a node and it starts reading weird spectrum, then that's another legitimate reason why you need to know, okay, we should factor these observations less into our decisions. And device location, again, is another important aspect for this because physical security on nodes like this is a big deal. And having physical access to them, even a single node, can significantly affect the network and change the decisions that are being made. So why does this matter? Why should we try to work on some of these problems? Well, this plot right here has probably been seen by, I would guess, every major networking executive in the entire world. And this is showing, of course, the mobile data predictions over the next couple of years. And Cisco is predicting insane numbers. They're saying that by 2020, there'll be 50 billion devices connected together on the network. Right now there's about 10 billion. So there are a lot of these companies that are both predicting and kind of freaking out and preparing for what they think is gonna be this really big deal. So it almost doesn't matter whether or not this actually happens because it's sort of a self-fulfilling prophecy. They're predicting that it's happening and preparing for it. So I think that this will be relevant either way. But this is the spectrum map in the United States right now and you can see how fragmented it is. This goes from three kilohertz to 300 gigahertz. So this is the current solution, chopping it up into smaller pieces. And obviously we can't keep doing this. And eventually we're gonna have to figure out how to deal with that. So this is another application I think of cognitive radio specifically, cell phone towers. As the density of cell phone towers increases and as we see the proliferation of femtocells, the transmitters are getting closer and closer together and so they have to make sure they're not interfering with each other.
And if you've got a femtocell in every house, then you're gonna have problems. And some of these are already beginning to add some very, very simple cognitive aspects where they'll try to avoid each other. And I only see that continuing to increase because it's gonna let them be even more efficient. So to sort of do experiments with this and to play around with this, we need tools. And if you've ever done any work in RF before, then probably the first thing that comes to your mind is the USRP, which is this really neat little software-defined radio. The only downside is that it's a little expensive. I've actually written some experimental cognitive engine code, some base station code in GNU Radio that will run on the USRP and I'll link to it at the end if you wanna play with it. So this is good for acting like sort of the base station that will make decisions and command smaller nodes. However, if you're trying to do experiments with the network, then you typically need multiple nodes and maybe you can afford one USRP but you probably can't afford like five of them. So the other end of this is the really, really cheap XBee-type thing, which is just a little wireless module. You kind of put data in one side and it magically comes out the other side. And the good thing is that it's really, really cheap so you could buy a bunch of them. But the problem is they're not frequency agile at all and they're not very customizable. You can't control them very well. So I kind of wanted something in between. And there wasn't really anything at the time so I built something. So I built this board that I called level and it goes from 30 megahertz to 4.4 gigahertz. It outputs about 60 milliwatts. It uses a chip by TI that's based on the MSP430 which I'll talk about in a second. And so it's compatible with TI's really cool off the shelf mesh networking stack called SimpliciTI. And it fits onto Arduino shields.
It isn't an Arduino shield, and I'll clarify what that means in a second. And they cost about a hundred bucks. So this is what it looks like. And I'll briefly go over some of the topology here. This is the CC430 which is a microcontroller, like I said, by Texas Instruments. It's got an MSP430 core in it as well as a CC1101 transceiver core in it. It's low power and it's relatively low bandwidth as well. So it's good for doing low power sensor kind of stuff. The local oscillator is this part by Analog Devices. This is an ADF4351 wideband VCO. Those are mixed together in this ADEX-10L. This is a passive mixer. And because it uses a single antenna, it's got two RF switches that are controlled by GPIO on the MSP430 so you can switch between transmit and receive mode. And then it runs through a bunch of filters and some amplifiers. And I also added these two things which are optionally populated. These are directional couplers. And these basically let you tap into the RF signal that's coming out of the MSP430 directly and the ADF4351 directly without going through the mixer and filters and amplifiers and everything. So it helps with debugging. This is what I meant by it fits onto Arduino shields. As I was building this, I thought it'd be pretty cool if you could actually interact with other devices. For instance, your laptop obviously can do Wi-Fi but it can't do stuff at 500 megahertz. So I was working with TV White Space and I wanted to play around with that. So I realized that Arduino shields typically have similar SPI pinouts for a lot of these breakout boards for pretty much everything. And so it fits right on there. And once you've de-packetized whatever you're receiving on the top board, then you can just send it over serial to an Arduino shield which will typically do all the hard parts for you and turn it into 802.11 or whatever you want. So this one actually is on a Wi-Fi shield and this one is on an Ethernet shield.
So this board by the way, I would still consider it kind of a prototype. And I don't really have a way to mass manufacture them right now. However, code and firmware and schematics and everything are all on GitHub and I'll link that at the end. And if there's enough interest then we can see what we can do. There are other tools out there too that are really good for this stuff. The HackRF by Michael Ossmann, which just launched on Kickstarter a couple of days ago. Pretty neat tool. The BladeRF, which also was on Kickstarter earlier this year. And then there's another board that I found out about very recently called the MyriadRF, which is not quite as frequency agile as the other two boards but it's really neat. And so all three of these are really good tools for playing with this. And all three of them didn't exist when I was originally designing my board which is why I didn't use any of them. So what's next? Well, the whole spectrum crunch thing, depending on who you talk to, some people would say that it's imminent and we're all doomed. And some people will say, well maybe we have a little more time than we thought. There's new techniques people are using that might buy us some more time. But here's what we know. We know that a lot of these companies that have a whole lot of money are investing a lot of it in cognitive radio networks. They've been doing experiments in turning entire cell phone towers into cognitive nodes. And I think that we're at a really unique time because like I said, these are deployed to the point where there's actually real networks in the field right now. I mean in France with Sigfox, they've got at this point apparently thousands of devices connected to paying customers. And there's dozens and dozens of installations in the United States right now. Actually West Virginia University just a couple of weeks ago started serving Wi-Fi to some of their dorms over TV White Space.
And so we're at this really cool time where the networks actually exist but they're not used by so many millions of people that it's too late to really change fundamentally how they work. And so I think by attacking these kinds of problems and by trying to solve these really non-trivial issues, be they technological or political, we can really be on our way towards making sort of the next generation network and making sure that we're able to deal with whatever the results of these predictions end up being. Thank you.