My name is Isaac and this is Ian, and we're going to speak today about VoIP botnets. It's a very popular topic nowadays. With VoIP getting into more and more corporates and companies, people don't always understand the risk involved in implementing VoIP and the controls associated with it. So this is our research into the area, to see what can be done using VoIP capabilities. It's not about a specific protocol; it's not just for SIP or for Skype, but it could be leveraged over either. And we're going to show some demos of how we can do stuff over VoIP. It's going to be very cool. So, Ian.
Alright, let's kick this one off. VoIP. Intro to VoIP. And this is not an intro to VoIP talk. Alright, heard that. So we're not going to do an intro to VoIP. VoIP is everywhere. We've got tons of residential VoIP providers such as Skype, Vonage, Phone.com, Ooma. They pop up like mushrooms after the rain. In corporations, obviously, it's really the de facto standard. All the internal communications for every company that I walk into today is VoIP. Cisco, Avaya, Alcatel-Lucent, whatever. It's a vendor fest out there. And it just works.
Now, one of the biggest problems of VoIP is that it works so well. Okay? You get a vendor in, he drops a box, turns it on, and it works. You start plugging in extensions, having fun, conference calls, trunks, everything's very easy. Even the free stuff, like Asterisk. Okay? You just download the VM, it's ready-made, everything's installed. If you have a Digium card, it recognizes it; trunk it out and get a SIP provider, or just plain old PSTN. That's the biggest problem of VoIP, in our view, because that basic configuration is just not good enough. No one takes care to actually fully utilize the capabilities that VoIP provides you in terms of security. We're not saying that VoIP is really fucked up.
It's just kind of, and people just don't care, because it just works. SIP is the protocol that we like to play around with. Anyone here who doesn't know how HTTP works? Get out. Alright, so if you know HTTP, you basically know SIP. It's a request-response protocol. Forget the fancy graphics behind me. It's really simple. If you want to read more about it, Google it, because this is not an intro to SIP talk either.
So what's new with VoIP and security? Let's dig in. We're going to talk about two issues, or two areas of research, that we've been digging into. The first one is VoIP as a getaway car. We'll see some interesting techniques that relate more to data exfiltration and back channels, and see how VoIP can be utilized to do that and how people are doing that. And the second topic, which is really cool, is VoIP botnets. We're going to see how VoIP can be used as a command and control channel to pass messages between a botmaster and a lot of bots.
So let's start off with the getaway car. VoIP can traverse firewalls very easily. That's by design. It can pierce NATs. It can go over PSTN. If you're in a confined environment, let's say a very secure network that doesn't have an internet connection but has a VoIP network on top of it, there is some PSTN line that allows that VoIP network to go out, make calls, and receive calls from the outside world. And on top of that, no one really looks at VoIP. You're looking really hard at your web traffic and email traffic, because you've learned that all the bad stuff is there. That's where all the lulz are. Voice over IP kind of just works, which means you can exfiltrate. You don't have to go through the hoops of transcoding your data, modifying it, encrypting it, trying to hide it, and dripping it out bit by bit to avoid pesky stuff like DLP. You can just use voice over IP to push the data out. Second topic, voice botnets.
We're kind of rushing through this to get to the meaty stuff, and also because the demo is really prone not to work. It worked what? Twice at the office? That was like six months ago? Yeah. It bailed on us at BSides, and we gave it at a church there, so all our prayers weren't answered. Hopefully this is going to be a little better.
So, voice over IP botnets. Basically, take your bog-standard botnet and disconnect all the command and control channels. All the bot hunters these days look for specific communication. They look for IRC, for HTTP, they're looking for specific domains that host all those command and control servers, and they're doing a phenomenal job. I know that. I'm dealing with that on a daily basis sometimes. So disconnect all those command and control channels that everyone is looking at and replace them with voice. Simple so far, everyone on the same page.
You get a command and control channel that's fully mobile. NAT piercing, no problems in terms of running through different networks. It looks really legit, because it's voice over IP traffic, and we established that no one really looks at voice over IP traffic. Worst comes to worst, you'll see some phone calls coming in and out, and some discussions or sounds going over those phone calls if you really care to listen in on them, and it's much harder to peek into. Again, if you do start looking at the actual communication, you need to start deciphering what's going on over RTP and other streaming protocols, H.323, whatever, and start buffering it, decoding it, figuring out what the hell is going on there, because we're not talking about actual data, we're talking about sound. This is all streamed. Well, who actually needs a voice botnet?
When we look at what is being done right now on the criminal side of things, most of the work is focused on establishing secure channels, back channels, stuff that's not going to break, peer-to-peer communication, so that when my one point of failure fails, my botnet doesn't just disappear from the face of the earth. The botmaster is literally more mobile using a voice over IP channel, because you can actually walk around with a cell phone or pop into some phone booth, dial a number, connect to your botnet, and issue commands. So it really detaches the botmasters, the bad guys, from their laptops and their command and control servers. They can come to Vegas.
It's much more anonymous, and what we're demoing and looking at is command and control servers that are actually conference call numbers. You can get conference call numbers for free, no identification, no nothing. You can set up your own pretty easily. I mentioned Asterisk before, phenomenal, right? And it's ready-made. Just pop a VM, well, not pop a VM, take a VM, put it somewhere, use it, and you have SIP communication ready-made. And if you really want to go to the trouble, you can actually get a real phone number, it costs you a couple of bucks a month, and forward that straight to the conference call number.
And you can actually transfer fair amounts of data in and out. If you guys remember the good old modem days, I know I do, it's actually pretty fast. Especially if you're trying to just send a few commands and get some status, not dox someone's 20 gigs of whatever. And it is starting to show up as an alternative method of communication. We're seeing that already. So if anyone nods too vigorously right now, someone can grab him and take him to Spot the Fed. So let's see what, all right, botnet in action. We're being lazy today? God damn it. We practiced this.
Red team testing is one aspect of this, where we're using it and where other people are starting to use it, looking more carefully into those channels. And you will find, not you can find, you will find stuff that will make the hair on the back of your neck stand up. Scary stuff.
Botnets in no-internet or closed networks. I mentioned this before, and we're going to see it especially in the data exfiltration technique: ways to communicate out of non-networked networks, if that makes sense. Basically, high-security networks that are fully disconnected and air-gapped from the internet or from other networks. Okay, if you see voice over IP somewhere in there, just to facilitate internal communication or just phones, you can get out. And you can have a lot of fun there.
And the last thing is botnets for voice over IP phones. We're starting to get smarter and smarter phones. They're not there yet, but, you know, it's an attempt. And these things actually start to run SIP and voice over IP. And they're powerful enough to be considered PCs that are worthy of participating in a botnet. They're connected to high-speed networks, 3G, often Wi-Fi, so why not use them as part of a botnet that does DDoS or whatever it is. Are you okay now? Yeah, I'm great. Thanks.
Okay, so we're going to see how exactly this VoIP botnet can actually be implemented. The first thing we have to keep in mind is that we're looking at the telephony system, which has not changed in recent years, and we basically have two ways to communicate. We can have the botmaster calling each of the bots individually and sending them commands. That would be an ideal channel for a private conversation. But as the botnet grows larger and larger, the task of maintaining these phone numbers, these identifiers of the bots, gets harder. So the unicast method would not work for this kind of botnet.
What we're going to try to do is create something with the feel of an IRC channel over the telephony system, and we're going to do it using the conference call concept. A conference call is a very simple concept. Basically, you call in and you get to participate in a conference with other people. Whatever you say, they can hear, and vice versa. This makes an ideal IRC-style channel over VoIP.
So we have a conference call. A conference call can have a PSTN connection, which means you can call it from a pay phone or from your mobile phone, so you have a way to get in. Some conference calls are also accessible through pure SIP. This means you can have a pure TCP/IP connection to the conference call and, at the same time, have a person using a pay phone somewhere in New Jersey. So it's an ideal infrastructure, and it's what we're going to use today to demonstrate the VoIP botnet.
So this is an example of a very simplistic conference call. You can see the bots dialing into the conference call. They can either dial through SIP or they can have other access to the conference call. And the botmaster, as you can see below, can use a pay phone, a throwaway cell phone, or SIP as well to get on the call.
Like I mentioned before, the interesting thing about VoIP is that it actually bridges together two different entities. We have the PSTN, the telephone network, which is like the internet: it's external, and many people can get into it and out of it. And then we have the internet. So basically, both the bot and the botmaster have to make a choice, and they can either use PSTN to call the bridge and get online or use the TCP/IP option. The most interesting idea over here, and we're going to get to it a little bit later on, is that we're going to use only telephony technology.
I mean, if you're going to use the PSTN, and you're going to claim that you can operate the botnet using a pay phone in New Jersey, then you also have to stick to telephony-style technology, which means it will be accessible and the botmaster can control the botnet using a standard phone. And that's a very interesting thing that you're going to see later on. Everything we're going to present has already been built; it's not theoretical, the demos are practical. And you can of course download your copy right now.
Moshi Moshi is actually "hello" in Japanese, and it's basically the botnet that we're going to demonstrate today. The interesting stuff about how Moshi Moshi works: it's going to use SIP as the VoIP protocol. But again, it's not limited to SIP; that's just for ease of use. We're going to use DTMF tones for input. Again, to emphasize that even if you make the call through PSTN, with no internet connection, on a pure telephony system, you can still communicate with your bot and get it to do what you want it to do. And of course, since the telephony system is not visual, the bot cannot give you an image of the information or display it to you in a convenient way. So we're going to use text-to-speech engines as an output method. Basically, we're going to punch in some numbers and the bot is going to speak back to us.
Okay. So DTMF is just one of the ways that you can communicate over the phones. The reason we chose DTMF is that it's a standard. It means that you don't have to do any predefined negotiation. It will work from my phone, your phone, the pay phone outside, et cetera. This means that the bot can be communicated with from any phone that has DTMF tones in it. This is a very interesting feature. In earlier conversations and debates we also had people asking whether the bot can use DTMF to respond back. Well, the answer is yes. The bot could use DTMF to respond to the botmaster.
However, that's not going to be graphically visualized on the botmaster's phone, so he would have to use special software to decode it back. Now, I know on iPhone and Android there are tons of apps that do it for you. But then again, it's not generic and it won't work for a pay phone. But yes, it is possible to do DTMF both ways.
Like Ian said, we're going to show demos. We're going to have a two-laptop setup over here. Ian's computer is going to be the actual PBX, which I'm going to dial into. And my computer is going to be the actual infected bot. Now again, like Ian said, the very interesting thing about VoIP is that you can get it up and running in a matter of minutes. We actually did it the same way. We didn't take the time to study Asterisk or design our own PBX. We simply went and got the first thing that supposedly works in five minutes, and it is working in five minutes. So we're cool with that.
What you have to know if you're going to build your own conference call, rather than use an existing one, is that you have to pass the DTMF to the other participants. Normally in a conference call, if you've ever experienced it, when you punch in numbers it's actually a command for you as a user. For instance, you can mute yourself, you can skip to another conference call, et cetera. So per se, DTMF won't be relayed to the other participants unless you configure it that way. For our demos we're using the AsteriskNOW Linux distribution, and to have the DTMF relayed to the other participants we simply edited extensions.conf and added the F option to MeetMe, which is the conference call bridge for Asterisk. If you're going to use a commercial conference call, or any other conference call that you have in mind, check for this feature; you have to configure it in order to get the bot running.
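Since the bot has to recognize DTMF keys that the bridge relays to it, here is a minimal sketch of what a DTMF key looks like as audio and how a listener could identify it. This is not the Moshi Moshi code. The row and column frequencies are the standard DTMF values; the 8 kHz sample rate, the 0.1 second key length, and the function names are assumptions made for illustration, and detection uses the Goertzel algorithm on a clean signal with no line noise.

```python
import math

SAMPLE_RATE = 8000                    # assumed telephone-style sampling rate, Hz
ROWS = (697, 770, 852, 941)           # standard DTMF low-group frequencies, Hz
COLS = (1209, 1336, 1477, 1633)       # standard DTMF high-group frequencies, Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_samples(key, seconds=0.1):
    """Synthesize one DTMF key as a list of floats in [-1, 1]."""
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if len(key) == 1 and c != -1:
            low, high = ROWS[r], COLS[c]
            break
    else:
        raise ValueError("not a DTMF key: %r" % (key,))
    count = int(SAMPLE_RATE * seconds)
    # A DTMF key is simply the sum of one row tone and one column tone.
    return [
        0.5 * math.sin(2 * math.pi * low * t / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * high * t / SAMPLE_RATE)
        for t in range(count)
    ]


def goertzel_power(samples, freq):
    """Signal power at a single frequency, computed with the Goertzel algorithm."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_key(samples):
    """Pick the strongest row tone and the strongest column tone."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i]))
    col = max(range(4), key=lambda i: goertzel_power(samples, COLS[i]))
    return KEYPAD[row][col]
```

For example, `detect_key(dtmf_samples("5"))` recovers the key from the synthesized audio. A real line adds noise, level differences, and silence between keys, which this sketch does not handle.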
So the first idea that we're going to present is how to use the DTMF passing. Like I said, it's going to be a very simple demo. Basically, the botmaster is going to dial a DTMF sequence into the conference call, and the conference call in its turn is going to relay it to all the bots on the line. So the botmaster communicates once and all the bots hear his command.
One misconception about DTMF is that, since it's very limited or very poorly designed, it's not as good as any other language, or that it's limited in some way. Well, the truth is it's not limited in any way. You can go wild with it and create your own languages and your own syntax. Here's a quick example of what we did in Moshi Moshi. Basically, we have a grammar where asterisk is the end of line and hash, pound, is the delimiter. Then you can use examples like zero, pound, asterisk (0#*), which means invoke command zero with no arguments. You can do one, pound, one-two-three, pound, asterisk (1#123#*), which is invoke command number one with the argument 123, et cetera. So you're not limited in any way, and you can develop it into a much more sophisticated language. For the proof of concept, and just to show the capabilities, it's good enough.
Okay, so we're going to try to do the demo. We have two ways to do the demo. The first one is going to be through a mobile phone. Basically, I'm going to connect to the PBX, and Ian is going to call in through a landline and activate a simple application on my PC. If it works. Yeah. If it doesn't, we have a backup for it. So we're going to dial in right now. Are you connected to the DEF CON network? Come on. "Something went terribly wrong" is our IVR intro. Okay. You're in? Yep. Doesn't feel like it. Okay. So by now the bot should be analyzing the DTMF tones to actually see the commands coming in, but it's not working for some reason.
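The grammar described above (asterisk as end of line, pound as delimiter) is small enough to sketch in a few lines. This is an illustrative parser, not the Moshi Moshi source; the class name, the `parse_keys` helper, and the choice to return commands and arguments as strings are all assumptions.

```python
class DtmfCommandParser:
    """Collects DTMF keys one at a time and emits a command at each '*'."""

    def __init__(self):
        self._buffer = []

    def feed(self, key):
        """Feed one key; return (command, args) when a line completes, else None."""
        if key != "*":
            self._buffer.append(key)
            return None
        line = "".join(self._buffer)
        self._buffer.clear()
        # Fields are delimited by '#'; drop the empty field after the last '#'.
        fields = [field for field in line.split("#") if field]
        if not fields:
            return None
        return fields[0], fields[1:]


def parse_keys(keys):
    """Convenience wrapper: parse a whole string of keys into a list of commands."""
    parser = DtmfCommandParser()
    commands = []
    for key in keys:
        result = parser.feed(key)
        if result is not None:
            commands.append(result)
    return commands
```

With the two examples from the talk, `parse_keys("0#*")` yields command `"0"` with no arguments and `parse_keys("1#123#*")` yields command `"1"` with the single argument `"123"`. What each command number means is left to a dispatch table on the receiving side, which this sketch deliberately omits.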
So do we want to try again, or are we going to the backup? Yeah, we'll give it one more shot, and then we have the softphone demo, and the softphone works. Yeah. Okay. Sorry. It's enabled. It's enabled. It's enabled. It's actually working. Only person in this conference. That was supposed to be zero. Ah, shit. Okay. Okay. Seriously. Seriously. Okay. So we kind of jumped to the second demo right now, but we're going to take a step back and show the first one. That's the problem with live demos. One last. Yeah. We're going to have to do it. Okay. So right about now xeyes should pop up in the left corner of my screen. Here we go. I hope you guys saw it, because we're not going to do it again. It's the best use for a BlackBerry, by the way. Yeah. The only thing it's good for.
So basically what we saw right now: we had the PBX with a DID line in Vegas. Ian just called in, I connected through the SIP protocol to Ian's computer, to the PBX, and he just sent the DTMF command and it popped up xeyes. For the second demo, which we actually showed you prematurely, we're going to do a ping, just to show that you can pass arguments to the bot. It's not limited to one-way, straight commands. So we're going to try to pull that off from the phone. If it doesn't work, we're going to do our softphone demo, which is again using the same concept.
Somebody dropped off the call. We'll try again real quick. I did not drop off the call. Come on, try again. Are you in? I think the line quality is really bad, because somewhere on the way we're losing the DTMF. Oh. Says AT&T here. Okay. So yeah, we're going to move to the softphone demo. So basically Ian's going to call the PBX using a softphone, so it's going to be SIP directly to the PBX, and I'm connected as the bot to Ian's computer. Nothing shows. We're showing some faith. Kind of worked. Well, it worked the first time.
The gist of the demos is basically that you can do everything. It's a question of how you design your language, as Isaac said before, to respond to DTMF commands. So zero was popping xeyes. It could have been, you know, initiate a DDoS on some predefined address. I can chain commands. I can provide parameters. That ping will work. You can basically punch in any IP that you want. So again, you can walk around, jump into Hooters, get bad service, Hooters.com, and it's done. Okay. So it's working from a landline but it's working from a South America. Awesome. You connected to mine? Yeah. Okay.
While Isaac is setting up another VM with Asterisk, we're going to switch over and talk a little bit about data exfiltration, which is the second part of what we discussed before. So, data exfiltration. The research that started this whole thing basically started in a red team engagement where the situation was very simple. We could break into the network. That's not a problem. Drop a USB stick, talk to someone. You know the drill. There are like a gazillion talks here about breaking into networks. That's fine. The problem was that the network was basically air-gapped. No connection. The whole facility is fenced, barbed wire and whatever you want, and you need to get data out to show the risk. To show that it's possible, and to start working on mitigations and monitoring for voice over IP channels. What the...
So the deal was very simple. We saw, during the recon and intelligence gathering, that we had voice over IP accessible inside that network. There were a few softphones that were needed for some reason, and the rest was just handsets, which meant that the computer network overlapped with the voice over IP network. Which meant that our payload could talk to the voice over IP network. So what we wanted to do was take the binary data that we wanted to exfiltrate.
In that case it was a few Word documents that were classified or whatever it is. And basically modulate them to audio. We took half bytes, right, four bits, 16 possible values, and each half byte was translated to one of 16 different tones (we call them octaves) within the audible range of 200 Hz to 2,000 Hz. We generated a tone corresponding to that octave using the payload. The octaves were spread out evenly enough to be easily detected or identified later on, and we'll talk about that in a second. And we basically recorded a half-second tone of that octave. Stringing together all those different tones, you basically transcode any binary file to music, to tones.
So the demo that we're going to show here is very simple. We have proof-of-concept code written in Python, which is very optimistic and very simple and is not really designed to be used to actually do this. It's a proof of concept, okay? The main issues with the proof of concept: it's using only 16 octaves. You can easily use 32 or more, all the way up to transcoding whole bytes one at a time. You can shorten the length of every tone so you don't spend half a second just generating it. And basically you can take the Linux softmodem driver and get all those hints from old-school gangsta modem stuff to compress and transcode your data more efficiently over audio.
First of all, we have a message, alright? In this case it's message.txt. Some secret files, blah blah blah blah. The Python script is data to sound. We're taking an input of message.txt and an output of sound.wav. We make sure that the file we just generated is indeed a RIFF, a WAVE file, and we dial out. We dial out to a voicemail. In this case, and it's my usual preference, it's Google Voice, because you can get an email when you have a voicemail and you can, you know, get an MP3 of that voicemail. It's just super easy to use. So that was message.wav. At that point we've got some music playing in our voicemail.
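The half-byte-to-tone scheme just described can be sketched with nothing but the Python standard library. This is an illustration of the idea, not the authors' proof-of-concept script: the function names, the 8 kHz sample rate, and the exact 120 Hz spacing (16 tones spread evenly from 200 Hz to 2,000 Hz) are assumptions. The decoder assumes a clean recording with perfectly aligned tone boundaries, which a real voicemail or MP3 round trip would not give you.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000                            # assumed sampling rate, Hz
TONES = [200 + 120 * i for i in range(16)]    # 200, 320, ..., 2000 Hz (assumed spacing)


def data_to_sound(data, tone_seconds=0.5):
    """Encode bytes as a mono 16-bit WAV: one tone per half byte, high half first."""
    count = int(SAMPLE_RATE * tone_seconds)
    frames = bytearray()
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            freq = TONES[nibble]
            for t in range(count):
                value = 0.8 * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
                frames += struct.pack("<h", int(32767 * value))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return buf.getvalue()


def tone_power(samples, freq):
    """Power at one frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def sound_to_data(wav_bytes, tone_seconds=0.5):
    """Decode a WAV produced by data_to_sound back into bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    count = int(SAMPLE_RATE * tone_seconds)
    nibbles = []
    for start in range(0, len(samples) - count + 1, count):
        chunk = samples[start:start + count]
        # The loudest of the 16 candidate tones tells us the half byte.
        nibbles.append(max(range(16), key=lambda i: tone_power(chunk, TONES[i])))
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

A round trip such as `sound_to_data(data_to_sound(b"secret"))` returns the original bytes. At half a second per tone and two tones per byte, throughput is one byte per second, which is why shortening the tones and using more of them is the obvious first optimization.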
We downloaded that MP3 and transformed it to WAV, because my Python skills don't allow me to use MP3s. Oh, fuck. And we have another script. This is our VM. It's voicemail.wav. It is a RIFF. The second part of the proof of concept is basically sound to data, with an input of VM.wav and an output of output.file. We see that the file type is indeed ASCII, just like the one we sent, and there are the secrets. Ta-da.
Hi. So we have the demo ready for the rest of the Moshi Moshi botnet. So basically, again, I'm going to call the conference call here. Can you put the... What do you have two holes? Good question. That's what she says. True story. So I'm going to dial the conference call. I'm going to use this lovely softphone to show how we can do more than just a one-time command. Basically, I'm going to ping a computer using just the DTMF tones. So that's a really quick answer if somebody was thinking about denial of service and whether it's possible using VoIP botnets. It's network problem tracking. Yeah. Really? Thank you. Oh, yeah. Yeah, that's not going to work. Okay.
So far we only did botmaster to bot, and now we're going to do some reverse traffic. We're going to do reading out of information. Of course, one of the things that the botmaster would like to do is get information out of the computer network that he has access to. And like we said before, the telephony system does not provide a good visual way to communicate that. So we're going to do the text-to-speech engine trick right now. This is a very simple trick. It's picking out the root entry in my /etc/passwd. Again, I'm going to dial in to the conference call, and my computer is going to read out this line in a few moments. Yeah. That was a quick way to get a sentence out of the computer. Again, using only the phone, only DTMF, no other software involved.
Okay, so we're a bit short on time, so we're going to go to the best demo that we give. I'm going to show you right now. So basically, we said it's possible to do one line, but if we'd like to do something more, like take an entire file out, that's also possible. We have this very top secret document. As you can see, highly sensitive and confidential. And what we're going to do is ask the bot to take this Word file, convert it to text, and read it back to us. "Secret. Some company statement for 2011. We made $1 billion and we are expected to do $4 billion next year. DEF CON 19 rocks, exclamation." Thank you.
So I'm going to do a quick jump back to the presentation, just to go over some of the theoretical stuff in the time that we have left, and we'll take questions of course later. So we saw the DTMF demo and we got the idea that DTMF could be a very good, common way to communicate with any VoIP botnet. We saw that text-to-speech could be an ideal data leakage concept: basically, if you cannot see the data, the bot can read the data back to you. We saw Ian's concept of using the voicemail as a callback: basically, the bot calls the voicemail and drops the message. No, no, no, no.
The ultimate idea that we think the VoIP botnet could eventually provide is to basically create a VPN. And how will the VPN be created? By simply bringing some seriously old-school shit back to the game. Bringing back the modem, HDLC, and PPP protocols. Basically, the idea is that the bot will create a VPN: it will call back to the master and get an IP, and in that way you can see it's similar to a VPN. All the blue dots are basically what's going to be on the public infrastructure, and all the red dots could be another layer of communication. And that's going to work with hardware modems and software modems. It works within the voice frequency range; it worked in the past, it will work now.
It works under poor connectivity, and of course it's two-way communication. So basically, if you look at the last bullets, the botmaster can simply explore his bot. He can then use protocols such as IRC, HTTP, and whatever else he wants. He has IP access within the organization and he can do whatever he wants with it.
So, to conclude. What you guys heard today is that the VoIP botnet is as good as any other botnet out there. It's no lesser or greater than HTTP. Every communication protocol has its pros and cons, but we believe that VoIP has many more pros, and therefore we suspect that, if it's not already being used, it's going to be popular in the future. And it's something that the industry needs to think about. It's something that your company needs to think about the next time someone says, oh, we can save a lot of money by implementing this technology, and it's cool and it's fast and it's easy. But then again, there are no controls, there is no awareness, there is no help to understand what can actually happen. That's kind of the result.
A word about the countermeasures. The first thing would be to separate the VoIP network from the corporate network. I know it's tempting to have a softphone on your computer and have the ability to go on and off with it. But again, if there is a VoIP infrastructure that reaches all the way from a pay phone in New Jersey to your computer, then that is a risk you have to weigh against the advantage of having a softphone installed. Definitely monitor the VoIP activity. That's something that people don't do. Like Ian said, we do it for web, we do it for email, we do it for whatever else, but for VoIP it's like, it's going to be okay, don't worry about it. And that's the problem. And the last bullet is kind of an experimental thought: suppose the company, or you yourself, are using one very specific conference call bridge.
Then there is actually no reason for anyone to connect to a foreign conference call, for that matter. So again, although it's not going to solve the problem, it could be a step toward the solution: a better understanding that comes from auditing your bills and restricting which numbers calls can be made to.
What is the future going to be like? Well, the future is going to be very soundish. It's going to be speech-to-text as an input. Today we emphasized DTMF as the input vector. In the future it could be spoken commands: do that, give me that, attack this. It's capable of that. We have the mobile angle, which is basically going through SMS gateways to get a visual channel. The bot can text you, and you can text it back. And going back a bit, one way to get a screenshot, which is a very interesting vector, is basically to have the bot take a screenshot and then communicate it as a fax. So you can basically have a screenshot of the computer that you're looking at through the fax. And of course, for the internal VPN-style communication, bring back the modem protocols and that kind of protocol communication. That's about it.