Thank you for coming to my talk, Chigula, a Wi-Fi forensics framework. There are a lot of talks going on, so I really appreciate you coming down, and hopefully it will be fun.

Before I begin, a little introduction about myself. I started as an electronics and communications engineer and worked with a bunch of companies as a layer 2 programmer and architect. Then I moved on to wireless security. I discovered the Caffe Latte attack, cracked WEP cloaking, and did a couple of other interesting things. Then I started my own training company. I also run SecurityTube.net and Pentester Academy. I have written a couple of books on wireless security. The latest one is Make Your Own Hacker Gadget, which we also have in the vendor area.

Before we begin with today's topic: this has been a team effort. I started off with the idea of creating a Wi-Fi forensics framework around January of this year. In the beginning I worked with a lot of interesting ideas, and I eventually figured the amount of work required a team. So Julian, Ashish and I were primarily responsible for creating this. Chigula will be available at pentesteracademy.com/chigula right after this talk. You can download it and try it out.

Okay. So what was the motivation? I have been doing wireless security for probably the last 10 to 15 years, and a lot of times I have struggled trying to use Wireshark to get meaningful information from packet traces. How many of you have tried to do forensics with Wireshark for Wi-Fi? How many of you succeeded? Given the sheer amount of traffic at the very same time, all those different packets, I figured Wireshark is of course great at being a sniffer, right? Showing you the packets, parsing them. Unfortunately, when you want a framework to build stuff, Wireshark is difficult to work with. Yes, you could write protocol dissectors, maybe in Lua, even in C for that matter, but the learning curve is actually quite high. So I really wanted a Wi-Fi forensics framework, first for myself.
I was the end customer to begin with. And I wanted something which pentesters can use, not just programmers, right? Because a programmer could probably write his own parsing code and do a lot of stuff. I wanted something which would allow pentesters who just knew SQL to go ahead, run arbitrary queries and get meaningful information.

So what is Chigula? It's a set of tools and scripts for wireless forensics. The core is written in C++. This basically involves all the packet parsing and pushing into the database. All the plugins which you can write are in Python. Actually, you could write plugins in any language which has a SQLite driver. You don't have to only know or use Python.

What platforms can Chigula work on? It can work on Windows as well as Linux, actually any POSIX system. The packet parser itself could be ported to any platform for that matter. So here is a screenshot of Chigula on Windows. Here is the same on Ubuntu 14.

Now, what can Chigula help us with? What we've done is we take a pcap file and pick up individual packets. We parse each and every header field and map it to a corresponding SQLite table. So you could actually write SQL queries which could be as granular as select star from MAC headers where the type equals something and the subtype equals something. And you can pretty much go ahead and write tools on top of that which work on multiple queries. We'll take up some examples today.

Now, Chigula can allow you to create blacklist signatures. So if you wanted to create a signature for airbase-ng, or any honeypot tool for that matter, or aireplay-ng, it can allow you to detect and visualize Wi-Fi attacks. We'll even look at how MAC address spoofing can be detected reasonably reliably using Chigula.

So what is the architecture? At a very high level, we have data sources which basically feed the packets into the actual packet parsing engine. The engine pushes that into a data store.
And through that data store, you can use any data query tool which can work with SQLite, or write your own scripts which query the database.

Now, Chigula can work both in offline and online mode. In the offline mode, it's going to take PCAP files. The online mode, which is currently experimental and works only with POSIX-based systems, can actually sniff packets off the air as well. Okay, it's POSIX, not POS. Looks like the captioning is probably a little off the mark. And of course I have an accent, so I don't know who's captioning it. Either way, as I said in my last talk yesterday, you'll have fun probably looking at Chigula or looking at the captions.

Now, the packet parser itself is a standalone tool which can be used for a lot of other purposes. In Chigula, we convert packets into SQLite databases, but the standalone tool, PCAP to XML and SQLite, can also convert packets into XML so that you can go ahead and write XQuery, XPath and all of that stuff would work as well. The focus of Chigula is to use SQLite, but you can use PCAP to XML and SQLite to generate both if you want.

We currently parse most management, control and data packets of interest. So all those little subfields inside a beacon frame can be parsed. You could write arbitrary queries which search for a specific information element in beacon frames across the packet traces which you have.

Now, the data store is a SQLite database. The reason we chose SQLite to begin with is that it's available on all platforms and does not require a server-based process. You could even port this to embedded systems, right? This isn't limited to just your workstations. Of course, at the very same time, you can migrate the schema to pretty much any database you like.

Once all the data is inside, you can use any data query tool: SQLite browser on Windows, or whatever is probably the best SQLite querying tool on Linux. Analysis scripts can pretty much be written in any language, as I mentioned.
So let me go ahead and show you a quick demo. Actually, the rest of the talk is really just demos. That's probably going to be the most difficult part because I have to understand, here Windows has color change. What's that? Screen background. I just said this. Is this better? Okay, great.

So I have a PCAP file here. Let me pick up a sample, right? I think this has around 400,000 packets in it. We've tested Chigula with PCAP files of up to 3 to 4 gigs. There hasn't been any problem, but if you find any bugs, you can always report them to us. So you can give the PCAP file as an input. The hyphen-S option stands for the output SQLite database.

Okay. Not the best thing to happen. Okay, so I figured out what went wrong. At the end we check if there is a newer version, so I have to do a try/catch there. So it works.

If we open up the database file, here is the schema. For literally every single header field, we actually have a table in there. As an example, let's begin with MAC headers. You have the protocol version, the type, the subtype. Every single field, including bit fields, has been mapped to an actual column. Now this extends across different frame types. And at the very same time, we even save all the TLVs, the type-length-values, which are pretty much always the IEs or information elements, separately so that you can query them.

Now here is what the data looks like. We can look at the MAC headers table, and you can see we have around 400,000 entries in there. Every single field is mapped.

So now let's actually run some interesting queries. I wouldn't want you to see me type everything out. Here is the first query: finding all beacon frames. This is something Wireshark can do as well, but let's try this with Chigula. Paste it in here, run the query and there you go. It returns 36 rows. There are 36 beacon frames inside the packet capture file. Now let's look at something Wireshark cannot do but Chigula can.
With Wireshark, can you figure out all the unique devices around and get a distinct list? I'd like to know just the unique list of all clients and APs. Now with Chigula, this is actually quite simple. All we have to do is a select distinct across address 1, 2, 3 and 4. And the moment we do that, you'll actually see how powerful the framework is. There you go. We have 395 rows in there. Right now, of course, the broadcast MAC is included, but you can always write a little wrapper to exclude broadcast and multicast addresses. This is just a query to look at all distinct addresses across the different fields. So we have 395 rows. There are actually 394 devices in there.

Now if you wanted to find all the access points, again, extremely easy. All we have to do is find the distinct addresses which send out beacon frames. There you go. 36 APs. Right? Did you ever manage to do this with Wireshark? And you can actually write your own scripts which query and take this out, put it inside reports and do a lot of other interesting things.

Now if you wanted to look at all devices sending data packets, you could basically just do a select distinct over address 1. Go in here and there you go. You have all devices in your PCAPs which send data packets.

Now we can actually even do other interesting macro-level statistics. What's that? So basically we have a fake AP in there which does interesting stuff. When we look at MAC address spoofing, you'll see that demo. So that's an interesting thing he noted, but you're using the same thing for the attacks as well.

So you can go in here and do a select average, and the average frame length is 106 bytes. Now here is another interesting query. You can do a time delta between packets to figure out the time difference between them. Of course, you can add the MAC addresses and all of that as well in case you're interested in just one device. There you go.
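The queries from this part of the demo can be approximated with Python's standard sqlite3 module. This is only a minimal sketch: the table name mac_headers and the columns type, subtype, addr1 to addr4 and frame_len are hypothetical stand-ins, because the real names come from the schema published with Chigula. The sample rows are made up so that the snippet runs on its own.

```python
import sqlite3

# Hypothetical schema for illustration; the real table and column names
# come from the schema published with Chigula.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mac_headers (
    frame_id INTEGER PRIMARY KEY, type INTEGER, subtype INTEGER,
    addr1 TEXT, addr2 TEXT, addr3 TEXT, addr4 TEXT, frame_len INTEGER)""")

BCAST = "ff:ff:ff:ff:ff:ff"
AP1, AP2, STA = "00:11:22:33:44:55", "00:11:22:33:44:66", "80:6c:1b:00:00:01"
conn.executemany("INSERT INTO mac_headers VALUES (?,?,?,?,?,?,?,?)", [
    (1, 0, 8, BCAST, AP1, AP1, None, 120),  # beacon from AP1
    (2, 0, 8, BCAST, AP2, AP2, None, 110),  # beacon from AP2
    (3, 0, 8, BCAST, AP1, AP1, None, 120),  # second beacon from AP1
    (4, 2, 0, AP1, STA, AP1, None, 90),     # data frame from a client
])

# All beacon frames: management frames (type 0) with subtype 8.
beacon_count = conn.execute(
    "SELECT COUNT(*) FROM mac_headers WHERE type = 0 AND subtype = 8"
).fetchone()[0]

# Distinct addresses across all four address fields. UNION drops duplicates;
# the small wrapper drops empty fields and the broadcast address.
devices = sorted(
    row[0] for row in conn.execute("""
        SELECT addr1 FROM mac_headers
        UNION SELECT addr2 FROM mac_headers
        UNION SELECT addr3 FROM mac_headers
        UNION SELECT addr4 FROM mac_headers""")
    if row[0] not in (None, BCAST)
)

# Access points: distinct transmitters of beacon frames.
access_points = [row[0] for row in conn.execute(
    "SELECT DISTINCT addr2 FROM mac_headers "
    "WHERE type = 0 AND subtype = 8 ORDER BY addr2")]

# Macro-level statistic: average frame length.
avg_len = conn.execute("SELECT AVG(frame_len) FROM mac_headers").fetchone()[0]

print(beacon_count, devices, access_points, avg_len)
```

Filtering multicast addresses would need one more check on the first octet of each address, which is left out here to keep the sketch short.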
Now we can do simple queries, but what about more complicated ones? What if we wanted to find the list of SSIDs and MAC addresses and return that as a table, right? This is something for which you currently maybe rely on airodump-ng to export everything into a CSV, and then you read the CSV. You can actually do this quite easily using a join in Chigula. Go in here, paste it and there you go. This also returns hidden SSID networks. Right? So we see the AP MAC address and the SSID it is broadcasting.

Now what we've done is we've parsed the packets and actually added more macro statistics in there. So for things like encryption, you would not have to go ahead and parse the IEs yourself. We already have it in tables which we create. So we can go in here. I'll come to that in just a bit, but you can also do the hidden SSID networks. Let me show you this first. So these are all the hidden SSID networks which are around, right? The whole distinct list.

Now if you wanted to figure out the different SSIDs on a given channel, you can do that as well. It's again a join between multiple tables. The entire SQL schema is published along with the tools so you can write your own queries. If you love SQL injection, you'll probably love Chigula. Well, for what it's worth, Chigula could have a SQL injection, right? If the packet parsing code isn't great.

Now one of the other interesting things which I always wanted as a pentester is a list of client MAC addresses and the SSIDs they are querying for, right? Again as a distinct list. This is something we can easily do with Chigula. Pick this up here, run the query and there you go. This actually gives you the MAC and the SSID. Now keep in mind every single client also sends a null probe request, right? So you have the null probes along with a ton of other SSIDs being queried for. You can even narrow this down on a per-client basis. So you can go ahead and see, hey, for this specific client, which are the SSIDs being queried for? So there you go.
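The join queries from this part of the demo might look roughly like the following. Treat it as a sketch under assumptions: the tables mac_headers and ssids, their columns, and the sample data are all hypothetical, and the real layout is whatever the published Chigula schema defines. A hidden SSID is modelled here as an empty string, although real captures may also carry SSIDs made of zero bytes.

```python
import sqlite3

# Hypothetical two-table layout: one row per frame header, one row per SSID element.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mac_headers (frame_id INTEGER PRIMARY KEY, type INTEGER,
                          subtype INTEGER, addr2 TEXT);
CREATE TABLE ssids (frame_id INTEGER, ssid TEXT);
""")

AP1, AP2, STA = "00:11:22:33:44:55", "00:11:22:33:44:66", "80:6c:1b:00:00:01"
conn.executemany("INSERT INTO mac_headers VALUES (?,?,?,?)", [
    (1, 0, 8, AP1), (2, 0, 8, AP2),                   # beacons (subtype 8)
    (3, 0, 4, STA), (4, 0, 4, STA),                   # probe requests (subtype 4)
    (5, 0, 4, STA), (6, 0, 4, STA),
])
conn.executemany("INSERT INTO ssids VALUES (?,?)", [
    (1, "CorpNet"), (2, ""),                          # AP2 hides its SSID
    (3, ""), (4, "HomeWiFi"), (5, "CoffeeShop"), (6, "HomeWiFi"),
])

# AP MAC address and the SSID it advertises, hidden networks included.
ap_ssids = conn.execute("""
    SELECT DISTINCT m.addr2, s.ssid
    FROM mac_headers m JOIN ssids s ON s.frame_id = m.frame_id
    WHERE m.type = 0 AND m.subtype = 8
    ORDER BY m.addr2""").fetchall()

# Only the hidden networks.
hidden_aps = [bssid for bssid, ssid in ap_ssids if ssid == ""]

# SSIDs one client is probing for, skipping the null (wildcard) probe.
probed = [row[0] for row in conn.execute("""
    SELECT DISTINCT s.ssid
    FROM mac_headers m JOIN ssids s ON s.frame_id = m.frame_id
    WHERE m.type = 0 AND m.subtype = 4 AND m.addr2 = ? AND s.ssid <> ''
    ORDER BY s.ssid""", (STA,))]

print(ap_ssids, hidden_aps, probed)
```

Passing the client MAC as a bound parameter rather than formatting it into the SQL string keeps the script safe even when the value comes from an untrusted capture.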
This client with the MAC address 80:6C:1B, etc. is actually querying for all of these SSIDs.

Now we were talking about macro tables, right? With the SSID, authentication, encryption and all of that. So here is a query which can actually give us the BSSID, SSID, authentication and encryption with just a simple query. There you go, right? You could have parsed the packets yourself and figured out a lot of this information, but Chigula holds many of these in macro tables as well.

You can reconstruct beacon frames. For people who were there in my talk yesterday on Chellam, beacon frames were one of the things we were using to create signatures. So you could construct entire beacon frames using the TLVs for any network in there. You can do the same for client probe requests as well, right? So these are all the TLVs sent by the client. The value is Base64 encoded. That just makes it easy for you to put it in your scripts rather than having to work with binary values.

Now we can even look at transmissions on a per-device basis. For a given MAC address, we can look at the different timestamps when the device transmitted. You could even join tables to figure out what the actual frames were. So this is Chigula querying data from the PCAP, which is there in the form of SQLite tables. These are queries which you can write, you can extend, and you can add your own if you like.

Any questions about the querying or anything about the architecture?

Now this is step one, right? Step two is being able to actually do forensics and IDS with this. Broadly, if you look at forensics or intrusion detection, the first thing is being able to detect signatures of existing attack tools. This is actually quite easy to do. Again, for people who came to my talk yesterday: most attack tools have fixed TLVs, and most of them don't even change the values. Of course, fingerprinting tools would depend on the version of the tool.
The author of the tool can change that, but you can do this with Chigula. I'll show you in just a bit.

The second class of attacks is replay attacks. How many of you have cracked WEP, where we replayed the same packet over and over again, right? Another example of a replay attack could be deauth. We send a huge burst of deauth packets from time to time to ensure that we can disconnect clients from the authorized network, right? So this is an example of a replay attack.

Then you have unauthorized devices and associations. For example, you can have an access point which you created for your network, but someone can create fake APs and honeypots, right? How do we go about identifying them using extremely simple scripts?

Last, but probably one of the more interesting ones: how do you detect MAC address spoofing? So let's look at each of these.

The first signature we look at is MDK3. Anyone used MDK3? A couple of people. Okay, MDK3 can create fake beacon frames. The most common use is to randomize the SSID so that your devices get confused. Everything looks completely garbled. Now if you analyze the beacon frames from MDK3, what you would actually find is that they have IEs with fixed values, and also that the SSID values overall, if you look at their alphanumeric nature, are very skewed. So we've written two scripts, one for each. Let me show you that.

I'm sorry. Wrong directory. Okay. Curl won't resolve because it's basically checking if we have a newer version. Now what you would actually find is that Chigula detects that all of these are SSIDs created by MDK3.

Let's actually look at how simple the script to do this is. How many of you have coded in Python before? Okay. So this is Python. Most of the stuff which you see in between is just arguments and help, just so that people can look at it later. So here is an example where what we check is if the number of alphanumeric characters on an average is above a specific threshold, right?
That's one way of looking at completely random SSIDs. Now the other interesting way is to look at the TLVs. Okay. The screen is actually giving me. So here is what we are doing. We open the PCAP file, and we've written a very simple library called models which you can use directly. Through that you can import most of the intelligence Chigula has. All you have to do then is basically tell Chigula what kind of packet you would like to select and analyze, right? So we have a MAC headers module, and you can query MAC headers, look at different BSSIDs and things like that, then associate them with SSIDs and check which ones are skewed.

Now we can actually do the same thing with airbase-ng. airbase-ng has a very specific signature. Does anyone know the only four elements airbase-ng has in its probe request packets and beacon packets? They have been the same for probably the last seven years, so it's quite easy to fingerprint. It has the SSID, of course, the channel, the rates and the extended rates. And at times some of the older versions even have a bug where the probe request packets add a null SSID as an additional tag.

Using that, we can create a signature and run it over the entire trace, and if there are any issues in packet parsing, it will tell you the specific packets. In most cases some of these packets have been truncated by Wireshark. So it can't complete the whole analysis, and it just tells you, I have to stop somewhere in between because I don't have the full packet. And there you go. It actually tells you that this is the MAC address which creates the fake AP.

We can verify that by looking at the trace file. This is something actually created at DEF CON. Let me go ahead and apply the filter. All of these beacon frames as well as probe requests have been created using airbase-ng. If you select any of these, you'll actually find that these are the only tagged parameters.
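A fixed-TLV signature like the airbase-ng one just described could be checked along these lines. This is a sketch, not the script shipped with Chigula: the tables mac_headers and tlvs and their columns are hypothetical, and the rule used here (flag a sender when every management frame it sent carries exactly the four tags) is one possible reading of the signature. The tag numbers are the standard 802.11 element IDs for SSID (0), supported rates (1), DS parameter set, which carries the channel (3), and extended supported rates (50).

```python
import sqlite3
from collections import defaultdict

# The four elements named in the talk, as 802.11 element IDs.
SIGNATURE = {0, 1, 3, 50}

# Hypothetical tables: frame headers plus one row per tagged parameter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mac_headers (frame_id INTEGER PRIMARY KEY, subtype INTEGER, addr2 TEXT);
CREATE TABLE tlvs (frame_id INTEGER, tag_id INTEGER);
""")

FAKE, REAL = "de:ad:be:ef:00:01", "00:11:22:33:44:55"
conn.executemany("INSERT INTO mac_headers VALUES (?,?,?)",
                 [(1, 8, FAKE), (2, 5, FAKE), (3, 8, REAL)])
conn.executemany("INSERT INTO tlvs VALUES (?,?)",
                 [(1, t) for t in (0, 1, 3, 50)] +
                 [(2, t) for t in (0, 1, 3, 50)] +
                 [(3, t) for t in (0, 1, 3, 5, 48, 50, 221)])  # a richer, real-looking beacon


def airbase_like_senders(conn):
    """Return senders whose probe/beacon frames all carry exactly SIGNATURE."""
    tags_per_frame = defaultdict(set)
    sender_of = {}
    query = """SELECT m.frame_id, m.addr2, t.tag_id
               FROM mac_headers m JOIN tlvs t ON t.frame_id = m.frame_id
               WHERE m.subtype IN (4, 5, 8)"""  # probe request, probe response, beacon
    for frame_id, addr2, tag_id in conn.execute(query):
        tags_per_frame[frame_id].add(tag_id)
        sender_of[frame_id] = addr2

    matches_per_sender = defaultdict(list)
    for frame_id, tags in tags_per_frame.items():
        matches_per_sender[sender_of[frame_id]].append(tags == SIGNATURE)
    return sorted(s for s, hits in matches_per_sender.items() if all(hits))


suspects = airbase_like_senders(conn)
print(suspects)
```

Comparing the whole tag set, rather than testing for the presence of individual tags, is what separates a sparse tool-generated frame from a normal access point that advertises many more elements.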
Most attack tools are actually quite easy to fingerprint if you look at the beacon frames. They're unbelievably easy to fingerprint.

Now attack signatures are of course easy to write. What about replay attacks? Replay attacks require us to take a single frame and then check the entire SQLite database to see if something matches exactly, and if the number of matches is above a certain threshold. Now keep in mind that in Wi-Fi there can actually be a lot of retries. So you need to check that this isn't a retry of a previous packet. With Chigula you can check for replay attacks for any frame. I mean, this is frame agnostic, even though you could then tune it further for, let's say, deauth frames or disassociation frames or anything you like.

Let me show you a quick demo. One of the things which we are adding is a hash for every frame, so that this check can be much faster. That's something we're already working on. Chigula tells you the senders and the number of frames replayed by each sender. It also gives you the frame ID, so you can go back to the pcap file, open it up and look at the very first frame in your pcap which was actually replayed. The replay attack script even has an option to isolate the frames which have been replayed.

Now detecting honeypots is basically a two-level problem. At level one, which is actually quite easy to do, the admin or whoever is checking has a list of allowed BSSIDs and a list of allowed clients. If you have any associations of clients or APs which differ from this list, then of course you have some kind of a honeypot around. The biggest problem with this kind of detection is AP MAC spoofing. What happens if an attacker spoofs the MAC of an access point? In that case such a detection would be very unlikely to succeed. So in the first run I'll show you the plain vanilla way of looking at it without AP MAC spoofing.
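The level-one whitelist check described above can be sketched in a few lines. This is an illustrative assumption of how such a script might work, not Chigula's own code: the function name, the input format (pairs of BSSID and SSID, such as rows from a beacon and SSID join) and the sample addresses are all made up.

```python
def unauthorized_bssids(beacons, authorized, protected_ssids):
    """Return (bssid, ssid) pairs that advertise a protected SSID
    from a BSSID that is not on the allowed list.

    beacons: iterable of (bssid, ssid) pairs seen in the capture.
    """
    allowed = {bssid.lower() for bssid in authorized}
    flagged = {
        (bssid.lower(), ssid)
        for bssid, ssid in beacons
        if ssid in protected_ssids and bssid.lower() not in allowed
    }
    return sorted(flagged)


# Hypothetical capture: two authorized APs, one honeypot, one unrelated network.
seen = [
    ("00:11:22:33:44:55", "CorpNet"),
    ("00:11:22:33:44:66", "CorpNet"),
    ("DE:AD:BE:EF:00:01", "CorpNet"),    # same SSID, unknown BSSID
    ("aa:bb:cc:dd:ee:ff", "CoffeeShop"),  # not our SSID, ignored
]
authorized_aps = ["00:11:22:33:44:55", "00:11:22:33:44:66"]

rogues = unauthorized_bssids(seen, authorized_aps, {"CorpNet"})
print(rogues)
```

As the talk points out, this check only works while the attacker uses a BSSID of his own; once he spoofs an authorized MAC, the list comparison passes and a different signal is needed.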
And in the next example I'll show you how to detect AP MAC spoofing. All these scripts are available, so you can try them out later along with your own.

So here it is. I already have a list of authorized APs. I've just put in two of them as a sample. And then I'm going to run it against the pcap. This gives me the list of all unauthorized BSSIDs advertising the same SSID, right?

Now this is of course quite simple and trivial, which brings us to the next part, MAC spoofing. What happens when an attacker spoofs the MAC of a legitimate access point? Can someone tell me how you could detect that using something like Chigula? Whoever answers that gets a free book.

The beacon frames might even look identical, actually. You can take an existing beacon frame and just replay it. A time difference might not be it, simply because the attacker's AP would be running concurrently with the enterprise AP. I mean, almost at the same time, always. There could even be multiple fake access points. This is probably one of the more difficult things to crack.

Transmission power is one dimension. Unfortunately, with just one sensor, it's difficult to ascertain that reliably, because in any wireless communication you have things like multipath fading. At the same location you can have maxima and minima depending upon how the environment changes. So all your commercial IDS/IPS vendors have at least three sensors to triangulate things like rogue access points on your floor.

Anything else? If an attacker clones, let's say, the beacon frame, or clones the MAC address, and he clones all the information elements as well, then how do you figure out the difference? If he doesn't clone the IEs, fantastic. That's what Chellam, which I spoke about yesterday, was about: the IEs cannot be cloned exactly because the attacker won't have prior knowledge.
But in this kind of a fake AP attack, where the attacker is probably in the parking lot or somewhere close to the enterprise, he can actually clone at least the beacons. The attacker could probably do an MITM attack where he reroutes the traffic back to the real network as well. He could have two client Wi-Fi cards and reroute the traffic back.

Anyone? Who's that? Okay. What are sequence numbers? How do you guess? Okay. You're on the right track. So a little bit more. A little bit more. Yeah. We might not be able to look at the protocol layer because it could be entirely encrypted. Right? It could be encrypted communication. And Chigula right now, I mean, we are just looking at layer two. We aren't looking at layer three and above.

Sequence numbers are the right starting point. I'm going to give you a book for that. But who can explain how sequence numbers can be used? Okay. For two books.

So sequence numbers, unlike things like TTL in the case of IP, basically just get incremented and then wrap around. I mean, unlike TTL, where you could say that a Linux-based system could have, I don't remember, but 64 as the default, and Windows could have 128 as the default. In Wi-Fi, the sequence number just gets incremented by one per frame, and when we reach the end we wrap around. So it's not operating system specific. Okay. Very good. Here you go.

So if you have multiple devices and we actually do a sequence number analysis for a MAC address, here is what we do. For every given MAC in the trace file which you have, do a sequence number analysis, and what you'll actually see is that you have two different sequence number progressions or clusters, right, which are moving in completely different ranges. Now you might say, what if the attacker tries to clone the sequence number as closely as possible? Then what you would end up having is duplicates, right. Either way it's going to be duplicates.
If the attacker, the moment he sees the real AP send out a frame, picks the sequence number and immediately replicates it in his next frame, right, you have duplicates. That's very difficult to do, simply because the attacker can't predict when the AP will speak next, right. The other possibility is of course that both the devices are progressing on their own.

So this is probably the standard way of detecting MAC address spoofing in almost all wireless intrusion detection systems. This is an extremely powerful metric. In 2007 I gave a talk called Cracking WEP Cloaking at DEF CON, which was basically about how you crack WEP in the presence of chaff packets, which is absolute garbage injected by wireless IDSs. One of the key metrics I used to separate different kinds of packets was actually sequence numbers.

So let's actually see how Chigula can allow you to do a very easy and simple analysis with sequence numbers, and we can open up the image. There you go. If you looked at the pcap trace file, which I'll show you in just a bit: in the very beginning we had one device, and really this curve could extrapolate here as well. It just depends on when you started your trace file, right. So you have one device, but after some time you have another device which seems to have a separate sequence number progression in a very different range.

Let me just show you how it looks in the sample. You can create a fake AP with airbase-ng using the same MAC address and try this out. This detection actually works quite well. So look closely at the SN. SN is basically the sequence number. As we scroll down, you currently see one device which has an incrementing sequence number. And now you're probably starting to see something in the range of 1000 and something else in the range of 3500, right. Interestingly, you can even detect the number of fake APs by figuring out each one of these progressions and clusters.
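Counting these progressions can be sketched as a small tracking function. This is an assumed approach for illustration, not the plotting script shown in the demo: the function name and the max_gap value are made up. The one fixed fact it relies on is that the 802.11 sequence number is 12 bits wide, so it wraps around at 4096.

```python
SEQ_MOD = 4096  # 802.11 sequence numbers are 12 bits and wrap around


def count_sequence_tracks(seqs, max_gap=64):
    """Count independent sequence-number progressions for one MAC address.

    seqs: sequence numbers in capture order for frames sent from that MAC.
    A frame joins an existing track if it is at most max_gap ahead of the
    track's last value (modulo 4096); a gap of 0 is treated as a retry.
    Otherwise it starts a new track, which suggests another transmitter.
    """
    last_seen = []  # last sequence number observed on each track
    for sn in seqs:
        for i, last in enumerate(last_seen):
            if (sn - last) % SEQ_MOD <= max_gap:
                last_seen[i] = sn
                break
        else:
            last_seen.append(sn)
    return len(last_seen)


# One real AP around 1000 interleaved with a spoofer around 3500.
spoofed = [1000, 1001, 3500, 1002, 3501, 1003, 3502]
# A single device whose counter wraps from 4095 back to 0.
wrapping = [4094, 4095, 0, 1, 2]

print(count_sequence_tracks(spoofed), count_sequence_tracks(wrapping))
```

The max_gap tolerance matters because a sniffer misses frames, so a legitimate track rarely advances by exactly one. Because a gap of 0 is accepted as a retry, this sketch would not flag an attacker who duplicates sequence numbers exactly; that case would need the retry flag or a duplicate count as an extra input.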
And Chigula allows you to do that, right. Because we are basically picking up the individual sequence numbers and plotting them. So it'll just work by default. I mean, there's no additional magic required to do this. And you can see that. Now if you started collecting the trace file at the very beginning, you would probably see this stretch out from the very beginning of your collection time as well, right. This has actually allowed me, a lot of times as a pentester, to scan an area and find rogue devices which are mimicking MAC addresses of authorized devices using Wireshark. But this really allows me to explain to an admin or people higher up that there are multiple devices.

Okay. So we are going to be releasing Chigula. You can download it. Today itself, maybe in the next hour, the link should be live. All the scripts are included. We are already developing more plugins, and more attacks and visualizations which can be detected. The community can contribute. It's going to be completely open source. So feel free to look at all the bad code we've written. You already witnessed a crash right in front of you.

The other thing which we're going to do is something like Metasploit autopwn. I don't know how many of you have used the autopwn feature when it was available, where Metasploit automatically tries every single exploit. What we'd like to do is, for every device, go through each of these individual plugins so that we can detect different attacks which may be happening. So we're streamlining it so that the moment you pass the pcap file, it can automatically tell you: these five MAC addresses are being spoofed, there is a replay attack happening for these frames, and all of the stuff which you saw right now but in a disjoint way for different MAC addresses. And finally, reporting capability.

So do we have time for questions? A couple of questions? Okay, questions. Yes.
Yes, so there is an option available where you can sniff packets live as well. You'll actually notice there are sniffing options available. This will only work on POSIX systems. So you can actually feed the interface in. Right? Any more questions? I have a question. Yeah.