Good morning everyone. I think we'll start right now. Hi, I'm Rabimba Karanjay. Today we are going to talk a little bit about something I termed Security Pi, which essentially is a big hack: a lot of scripts and a lot of open source software cobbled together on your Raspberry Pi, which can run and hopefully give you a little more insight into whether your Raspberry Pi, or your IoT device, is getting attacked or not.
So, a little bit of introduction. Right now I'm a student at Rice University. I used to contribute to Mozilla's emerging technology division, and I work a little bit on VR and other stuff. This work mostly started as a hobby project, with some push from Mozilla for a certain project which is not there anymore. So this is completely different from what I do in school.
So first, why? Whenever you go home, what are the devices you see? These are some of the things, I won't say in my room, but in a lot of people's rooms. You see a smart TV, I see a Nest, you have little bots which you can control from your mobile and which just run around. You of course have your mobile, which is essentially a mini computer that is always connected, and of course you have other intelligent devices like a little connected toothbrush, which can now even upload your data. So great, these are all connected devices. Yeah, but I'm assuming the computer is a little bit more secure from the operating system perspective. So I'm mostly going to concentrate on the devices which I'm assuming are not. But the solution applies.
Now let us see, how are they all getting connected? In your home, how are all these devices getting connected to the network, to the internet? Most of them don't have any inbuilt connection to the internet, so they rely on your home Wi-Fi to connect to the internet.
So we have a single point of contact, which is probably my Wi-Fi router, and if I can protect that, maybe I can protect these. It's the same with do-it-yourself IoT devices and other things; we can do that.
Now let me ask you a question. How many of you have at some point tried a do-it-yourself project with a Raspberry Pi or maybe a BeagleBone Black? Quite a lot. And how many of you just wanted to get the feel of the product? Okay, I'm going to try the new Raspberry Pi 3 or maybe the Intel Edison. I'm going to install something and just turn these lights on and off, maybe build a kind of coffee on-off machine or something like that. So when you do that, you quickly code, put something together, it works and you are excited. Yeah, it works. I have a dashboard, I built a dashboard. How many of you actually think, okay, these are getting connected? Do you do anything to protect them? Like installing a firewall? I see one person, two, three. Okay, that's even quite a lot. I myself didn't use to, because, okay, it works, it works. And when I was actually making something for a demo, I realized that the new Raspbian, even if you do an update, doesn't connect to HTTPS sources. It connects to HTTP sources for the repo.
Now, we need to protect these devices, this hardware which is mostly unpatched and unprotected because it is hard to upgrade. Some of you were here for the previous talk. You know how hard it is to upgrade embedded devices. So we need other ways. We need to protect the legacy devices.
Wait, do we need protection? This is a graph I just pulled out from an article from Symantec, where they were trying to see how attacks on IoT devices increased or decreased over time. This was published at the beginning of 2016, I think. As you can see, it sharply increased into 2015, and they predict it will keep on doing that. So yeah, we probably need some amount of protection.
Another case is the recent DDoS attacks. A lot of them now work by using our household devices. Maybe your refrigerator is taking part in a botnet. Maybe your Raspberry Pi, which you put there just to sense the temperature, is now part of a big botnet bringing down things like GitHub. So we need to protect it somehow, and we don't need to break the bank for that.
So my tools of the trade are a Raspberry Pi 3 with a case to protect it, a micro SD card for NOOBS, and a power adapter, which is very important in this case.
So why Raspberry Pi? When I asked how many of you have used a Raspberry Pi, a lot of hands went up. As a matter of fact, they had sold more than 10 million devices by September 2016, which is quite a lot. And these are cute little $35 computers which you can carry around or put into your project. They are capable of running everything a computer does, but, like you pointed out, I did not put the computer in my talk, because they are not as protected as a computer. So to do that, and to protect other devices, let me start. I'm going to install NOOBS, an image on that, mostly Raspbian, and we'll start doing something with it. The end goal of this talk is that we'll have a device which, if you plug it into your network, will try to protect your other devices, the ones which are connected to the network and which are not upgradable or patchable.
How to do that? What about my network? What do I know about my network? Mostly nothing, apart from the fact that there are packets flowing around all the time. So all the packets belong to us. Now we need to configure the network. There are three ways, as I found out, in which I can do that.
The first is that I define a gateway on my Raspberry Pi, I connect to that gateway, and all my traffic flows through it. That is not at all a good solution, because if somebody is trying to do something nefarious, and I'm assuming your device has already got hacked, they can just bypass it by changing the gateway settings.
Option two, which I found pretty elegant, is that some routers, some hardware, support a mirror port, where you actually have a port which takes all your traffic and sends it to your computer or Raspberry Pi. The problem with this approach is that it is pretty device specific, and most of the devices I found that support it are not exactly household devices. They are pretty big switches and costly devices. So not a good fit for a grad student.
So, the grad student way: put your Raspberry Pi in between your router and the internet, and I get all the traffic. The problem with this approach is that there are performance implications. My Raspberry Pi might not be able to handle all the traffic, and it might hinder my Netflix streaming. But let's try.
So what are the things I'm going to do? First I'm going to try to sniff the packets. I'm going to try deep packet inspection on my packets. Well, not exactly deep packet inspection. I'm going to go with a tool called Bro, which is an IDS. What it essentially does is sit on your device and capture all your traffic. It can take a pcap file, but it can also live capture your traffic, and it does a lot of things on that. Now, getting Bro onto the Raspberry Pi was a little challenging because it has a lot of dependencies, but there are a lot of great articles out there, so I quickly found out how to do that. And so we install Bro on the Raspberry Pi. All of this, which I'm going to show now and on other slides, is cobbled together in a script which I'm going to show the link to later. So don't worry about all the code. These slides look too dry.
Now once we have Bro, what happens is this. This is my packet, the captured packet, and with Bro I have something like this. Bro actually takes your traffic, parses it, and produces a lot of things like the connection log, DHCP log, and DNS log, and all of these have different indicators. We can actually act on these indicators, but I want to make it more intelligent. How can we make it more intelligent? We need a huge list of data to do that, or maybe labeled data, where I know which is an attack and which is not, or which are bad actors and which are not. How do I do that? I want to do fingerprint analysis like CSI does.
To do that, and to make Bro great again, I'm going to integrate Critical Stack. Critical Stack, I think, recently got acquired by Capital One, but this was a threat intel company, and once you go there they have a lot of different sources of data. For example, if you see, I have phishing domains, I have fault domains, and there are a lot of other things. If I get this data, then I have a way of knowing which are known phishing domains, so if something is running on my Raspberry Pi and sending to those links, I know there is something wrong. So this gives me a source of data. Installing it on the Raspberry Pi is fairly simple. There is an ARM distribution for it, you install it, and you're good to go. You need an API key to access it, and the API key is actually free. I even tried last night to see if it really is still free, and it's still free. These are all code snippets from the GitHub repository where it's hosted, which I'm going to show later.
Now once I have this, I also need a way to actually audit my device. I'm assuming my device got hacked, even my Raspberry Pi got hacked somehow. I can never ensure that it won't be attacked or hacked. But what I can do is have a way to audit it, so that I know how it happened. So I need a way to log it all and look at it. So what about my logs?
I'm going to stash the logs in Logstash. How many of you have worked with Logstash or Elasticsearch before? Okay. So I'm going to use three tools which are commonly called ELK: Elasticsearch, Logstash and Kibana. All three of them together will give me the system which I'm going to show today.
What Logstash does is take a data source through an input plugin, pass it through filter plugins, send it through an output plugin, and you get the logged data. On this data you can put different filters, which you can code yourself, and have different notifications on that. For example, I can make my Raspberry Pi inform me, notify me, in case something bad happens. This notification part Logstash will do. So in short, it can take different types of inputs, some of which you can see here. Then we have different types of filters like grok, search and space, a lot of things. Some of the ones I used are the grok and GeoIP filters, and some others. And then the output can go to different kinds of things. For example, to a database maybe, or a visualization tool, or somewhere you can actually see and analyze the data. In our case, I used Elasticsearch because that was easier and pretty.
So what will we do? We'll utilize some custom patterns to actually see what is going on. We'll use grok message patterns, we'll add some custom fields, and we'll use GeoIP. There is a very specific reason for that, which I'll show on a later slide. We'll use the date match and the translation of threat intel. In short, if somebody from a different country is connecting to your device, or your device is simultaneously sending data to a lot of weird countries, it will raise a flag: okay, now, what is going on? If I have different message formats, or if I have a package inside which is downloading, say, all the Phrack magazines, then it will show me that, okay, something weird is going on. So, getting Logstash.
Putting Logstash on the Raspberry Pi is a little problematic. There's the simple way of downloading it and trying to install it, which probably won't work, because it needs some other installation files.
Once we have all of that, how do I see the logs? All the logs we have will eventually go to Elasticsearch. To get Elasticsearch, I just download the deb file and install it. Once you install it, you will have to update the cluster name in the YML file. This you will have to do manually, because my script doesn't do that.
Now, this is all fancy, but I actually want to see. By see, I mean visually see what is happening. So I need Kibana. Kibana is a kind of visual dashboard where you have a way of showing all this data and digging deep into it. This is how we install Kibana. But if you try all this, which is the official method, you get an error, because it needs something built for ARM, which is the Node version. So we install the ARM version of Node, and then Kibana works. All this, by the way, takes a lot of time and trial and error to get done. Hence the script and the talk today. Once we have Kibana, we can configure an index pattern for how this all will work.
So now comes the hard part. Once we have the configuration file, you see where it is served and where the Elasticsearch node goes. And we have an option to provide the cluster name, which is elasticsearch in my case. Now we can provide different filters. This is a simple grok filter I have put in. What it does is take my whole message in string format and split it up: the IP, which will be my client; the word, which will be my method; and the URL it is connecting to. Once we have that, we save all these patterns in a file. Once they are saved, they are matched against the different messages, and we actually label these messages.
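As a rough illustration of what such a filter does, here is a minimal Python sketch that pulls a client IP, a method and a URL out of one log line. It is only a stand-in for a grok expression like `%{IP:client} %{WORD:method} %{URIPATHPARAM:request}`; the sample line, the field names and the function name `parse_line` are assumptions for illustration, not taken from the talk's script.

```python
import re

# Named groups play the role of grok's field names (client, method, request).
LINE_PATTERN = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3})\s+"  # dotted IPv4 address
    r"(?P<method>[A-Z]+)\s+"                   # a single upper-case word
    r"(?P<request>\S+)"                        # the path or URL being requested
)

def parse_line(message):
    """Return the named fields from one log line, or None if it does not match."""
    match = LINE_PATTERN.search(message)
    return match.groupdict() if match else None

# Hypothetical sample line, made up for this sketch.
sample = "55.3.244.1 GET /index.html 15824 0.043"
print(parse_line(sample))
```

Once a line has been split into fields like this, later searches can look at the fields directly instead of running a regular expression over the whole message again.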
So this number is my label. Why do I need the labeling and the fields? Once I have all this parsed data and I add labels and fields to it, the next time something bad is happening and Kibana has to search through everything, to check whether it is already in the database or not, it doesn't have to do the computationally expensive operation of running a regular expression over the whole message. Instead, it will just search for those specific tags, for those numbers, like 291009, which makes it much easier.
Now, what will I have with my GeoIP data? If I want to put in more GeoIP data than what I have, then I define a specific field here and put in my GeoIP file, if I have something from my honeypot somewhere else, which essentially I have.
Now, we are also going to see what happens if my device tries to connect to those bad IPs which I have already defined. By bad IP, I'm also assuming that I will have information from different sources. For example, the IPs which are Tor exit nodes, maybe. So if I put all the Tor exit nodes inside, and my device is connecting to any Tor node, and I don't use Tor, it will show me a notification that, okay, something is wrong. This is mostly how it gets done. If it gets something like this, then bad IP, something like that, very bad IP, just a normal search. These are the two sources where I get my IPs. The Tor exit IPs I get from here, and my bad IP source I get from here, and we'll have to input them into the Kibana database.
Now what do I know? I just told you we import a lot of files, we download a lot of files from a lot of websites and put them in our database. We still don't actually see what is going on in our device. We don't have insight. So to get the insight, this is a simple email we are going to get. I'm assuming I have a simple server which we have to protect.
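The lookup described here can be sketched in a few lines of Python. This is only a sketch under assumptions: the list format (one IP per line, with `#` comments) and the function names are hypothetical, and the addresses are placeholders from the RFC 5737 documentation ranges rather than real Tor exit nodes or real bad IPs.

```python
import ipaddress

def load_ip_list(lines):
    """Build a set of IP addresses from text lines, skipping blanks and comments."""
    addresses = set()
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        addresses.add(ipaddress.ip_address(entry))
    return addresses

def flag_connections(destinations, tor_exits, bad_ips):
    """Return (ip, reason) pairs for destinations found on either list."""
    alerts = []
    for dest in destinations:
        ip = ipaddress.ip_address(dest)
        if ip in tor_exits:
            alerts.append((dest, "tor_exit"))
        elif ip in bad_ips:
            alerts.append((dest, "bad_ip"))
    return alerts

# Placeholder data only; none of these are real list entries.
tor_exits = load_ip_list(["# sample tor exit list", "192.0.2.10", ""])
bad_ips = load_ip_list(["198.51.100.7"])
seen = ["192.0.2.10", "203.0.113.5", "198.51.100.7"]
print(flag_connections(seen, tor_exits, bad_ips))
```

Each returned pair is the kind of match that would then trigger a notification.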
And so if the search matches, and you have a device which connects to a Tor exit node, then you get a mail that somebody is connecting to Tor from one of your household devices. And am I using Tor? I'm not. Then what is going on?
Now, these are the different types of alerts this script is able to handle right now. The Tor IP and malicious IP lists you can update if you have more, and if you don't, you can use the existing lists and it will go on. The thing is that there is no learning part involved in it. The system does not learn yet.
Now if I look in Kibana, this is what we get. This is the imported data, and you see the attacks, the different IPs from different regions, and how it all works out. And the awesome thing about Kibana is that you can actually drill down into the data. So you can see what is going on, from which country, how many, and everything like that. I very much like this dashboard, but by itself it doesn't do us much good.
Now, all of this I told you assuming I already got attacked, and hacked even. So all of this is: okay, this already happened, and I am trying to prevent it from doing more damage. But what about proactivity? Here we have Nmap, which we schedule to scan the subnet. If somebody is scanning, or if you have a new device connected, or anything, it will parse the result and put it in a file where we can see: okay, these are the normal devices which get connected, and these are new devices which got connected.
All of this, wrapped up in a nice script, is here, and you can download it. If you just run the initial installation script, it will download and install everything for you. Something I missed on my slide is that there is another portion of code you will see there, which mostly deals with network discovery. So we try to actually see which devices are getting connected to or disconnected from my network. Now, learning more.
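The known-versus-new device comparison from the scheduled scan can be sketched as below. This is a minimal Python sketch under assumptions: the scan output is taken to be already parsed into a list of MAC addresses, the addresses are placeholders, and `diff_devices` is a hypothetical name rather than a function from the talk's repository.

```python
def diff_devices(known, scanned):
    """Split a scan result into new devices and known devices that are missing."""
    known_set, scanned_set = set(known), set(scanned)
    return {
        "new": sorted(scanned_set - known_set),
        "missing": sorted(known_set - scanned_set),
    }

# Placeholder MAC addresses standing in for a parsed subnet scan.
known = ["aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"]
scanned = ["aa:bb:cc:00:00:02", "aa:bb:cc:00:00:03"]
print(diff_devices(known, scanned))
```

Anything under "new" is a device that has not been seen on the network before and is worth a closer look.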
The learning part is not in that script; it is separate code. But I thought, okay, if I am getting this much data, why not try to do something more? Machine learning is all the rage nowadays, so I tried to do something along those lines.
This is specifically for a device which was kept open, with just SSH open to the public. By open I don't mean anonymous login is allowed. It's not. It has username and password entry, and it logs every access attempt, which is what it normally does. So I wrote a simple parser which tries to parse all the logs I get, see what is going on, and add all the insights I had from the different GeoIP data. As you can see, the same user, a demo user I had set up, logged in at the same time from two very different geolocations, which was, okay, that's weird.
Now on the right side, that's a graph showing how many users tried to log in. Since this is a honeypot, they can log in and they can try to do different stuff, which we'll see. They succeed, and after some point they just get disconnected and go onto the block list. So next time they won't get connected, at least from the same IP. These are the different login times I got over a time span.
And the most interesting thing is this. The X and Y axes denote how much time a user was logged in. These are all attack users, because I never actually gave the password or the accounts to anyone, so these devices actually got compromised. And this is the most interesting part: this user was logged in for a negative time. I actually got an SSH login who was logged in for a negative time period, which was awesome and interesting. I don't have an explanation other than that he changed geolocation, time zone, whatever, or my system just failed. So does it work?
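The two oddities mentioned here, one user logged in from two places at once and a session with negative duration, can be checked with a small Python sketch. The session records, the two-letter location codes and the function names are all made up for illustration; the actual parser from the talk may work differently.

```python
from datetime import datetime

def session_seconds(login, logout):
    """Duration of one session in seconds; negative means the timestamps disagree."""
    return (logout - login).total_seconds()

def find_anomalies(sessions):
    """Flag negative durations and overlapping sessions of one user from different places."""
    anomalies = []
    for i, s in enumerate(sessions):
        if session_seconds(s["login"], s["logout"]) < 0:
            anomalies.append(("negative_duration", s["user"], s["country"]))
        for other in sessions[i + 1:]:
            same_user = other["user"] == s["user"]
            different_place = other["country"] != s["country"]
            overlap = s["login"] <= other["logout"] and other["login"] <= s["logout"]
            if same_user and different_place and overlap:
                anomalies.append(("concurrent_geo", s["user"], (s["country"], other["country"])))
    return anomalies

# Hypothetical parsed SSH sessions; "XX", "YY", "ZZ" are placeholder location codes.
sessions = [
    {"user": "demo", "country": "XX",
     "login": datetime(2017, 1, 1, 10, 0), "logout": datetime(2017, 1, 1, 10, 30)},
    {"user": "demo", "country": "YY",
     "login": datetime(2017, 1, 1, 10, 10), "logout": datetime(2017, 1, 1, 10, 20)},
    {"user": "guest", "country": "ZZ",
     "login": datetime(2017, 1, 1, 11, 0), "logout": datetime(2017, 1, 1, 10, 55)},
]
print(find_anomalies(sessions))
```

A negative duration usually points at inconsistent timestamps, which matches the explanations guessed at in the talk (a time zone change or a logging failure).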
So this is something else I got. The left chart is from the same Symantec article from this morning, where they ran their own Symantec honeypot, and these are the different countries which attacked, and are attacking, mostly the Symantec honeypots. On the right side, these are my two Raspberry Pis, which got kind of owned. Now if you look at the country lists, they don't exactly match, but they match well enough. So this is just a trend: what Symantec is noticing with their very big infrastructure can actually happen to us as well. Another lesson is to not use SSH with only a username and password.
This is where the remaining part of the code resides, where you can get the code for the parser and see how you can actually cluster the data. It actually has a lot more going on. For example, we try to do a lot of k-means clustering to cluster different activities and so on, which I'm not covering in the talk. So this is where the code resides. Now, is there a commercial solution available?
There is, actually. ASUS and Trend Micro collaborated and made these awesome devices which do similar work, and they cost around 140 to 350 dollars. They call it AiProtection, so artificial intelligence is now protecting us. What it does is exactly the same. It sits in your home and acts as a barrier between your devices and the internet. And the most awesome claim it makes is that it can patch your devices on the fly without actually patching them. So I was mind blown. What is happening? How can somebody do on-the-fly patching? What it essentially turns out to be is that they do deep packet inspection and something very similar to what we do with Critical Stack. They have their own list of vulnerabilities for existing devices. Once your device gets connected to these routers, it gets registered: this device is connecting, and this device is connecting. When all your data goes through, they do deep packet inspection, and if they find a similar type of attack, they just block it. That's what their auto patching does. But since I realized this is too costly for me, I went with my Raspberry Pi, which is 35 dollars. The total solution will probably come to around 50 or 60 dollars maybe. So that was mostly all I had for today, and thank you everybody. I'm open for any questions you might have.
Yeah, so the question was: instead of using a shell script to download everything, can't we use Ansible or something like that, which makes life much easier when you maintain it later? The problem with that is that the Raspberry Pi has limited capacity, so I didn't want to install anything more than I needed to make it work. That is the same logic behind installing Bro instead of Snort. I actually tried to use Snort for the same purpose. It installed nicely, and then the device literally crawled, so I could not do anything on the Raspberry Pi at all after that. So that was one of the reasons. I have not tried it with Ansible on the Raspberry Pi. I'm pretty sure that if it can handle it, it will work just fine. This was just a kind of proof of concept of what you can do. I'm sure all of you can write a much better, easier script to do all of this in one go, but I tried to do it in shell script.
Yeah, which one? So there are two I did. Something I have left out is that after doing all this, I tried to install something which actually scans my environment, so OpenVAS, things like that. With just Bro and ELK installed, the Raspberry Pi works pretty fine. I can watch Netflix. I have tried watching Netflix in 4K, so the packets go through fine, and you can even log into the Pi and do simple stuff, and it works. The load is more than 50%. I didn't need a cooler or anything. With OpenVAS installed it still works, but it takes a lot of time. The next time you go in and try to do anything different, it kind of crawls, and you get very big lags and things like that. So up to that point it's okay. And that is on the Raspberry Pi 3. I have not tried it on the 2, so no idea.
I have not tried with OpenWrt and DD-WRT. I just got my hands on one in my lab for a different purpose, so maybe I'll try it later on. I did try it with similar things, for example the Intel Edison, but then again the Edison costs almost $100, which makes it much less desirable. But it works much faster.
Yeah, that's what I did. I just passed it through, so that's why I said inline. The elegant solution would have been, yeah.
So my limitations, or my assumptions, are that the bad actors won't have physical access to my router or anywhere near it. They will only have access to the Wi-Fi, so that once they connect to it, I can monitor the remaining part. It's not applicable if they have physical access to your network endpoint or anywhere they can bypass the monitoring.
Yeah, it's just for the Raspberry Pi.
I have had instances where it just randomly stopped or restarted with my mobile charger. So that is one essential thing I learned: you save a lot of debugging time if you have a proper 2.5 amp charger.
There are a lot of other threats. I just looked into a very specific set of them. For example, there are two files in the repo where you can put what other things to look for. For me, I just looked at IP, GeoIP, time and things like that. I wanted to do a time series kind of thing. But there are a lot of other aspects. For example, in this one, when we are extracting features, you can think of a lot of different things to incorporate as your features. And not only extraction. Think of Critical Stack: when you are getting a lot of indicators, you can mix and match them with different properties, and you have your own "this is bad" or "this is good for me".
Where I was testing this was my home, which had around 11 or 12 Wi-Fi devices. That was another eye-opener for me. Even when we think we don't have a lot of things, how many Wi-Fi devices we actually connect every day.
I have tried that, which essentially got me locked out of the system. In the code you will find instances where it uses fail2ban, just to use its block list, and even iptables will work for that. It is not a very good way of doing it. Getting a notification is okay, but if that was real traffic, or you are doing something on a laptop and it automatically blocks you, that might not be a very good scenario. But if you have pretty good confidence, a confidence score that says these are not good, so block anything like this, then you can do that. So essentially it is just iptables or block list integration. In the code there is a way to integrate it. It's a very bad way and I won't recommend it, since I got locked out, and there probably is a better way, but there is a way in the code. When you use fail2ban, it actually maintains two lists, which are good and which are bad, and you can just mimic the same behavior. If I think these IPs are doing this kind of thing, I just send it: okay, these tried five times, which is their indicator for putting it on the block list, and it will just block that specific IP.
Any more questions?
No, it does not do SSL stripping. So if you have SSL, certificate pinning, none of this will actually work that well.
Yes, you can.
So thank you everybody.