Hi, my name is Ian Harris. I'm a professor at the University of California, Irvine, and I'm going to describe a fuzzer that we built for SIP, the Session Initiation Protocol used by voice-over-IP phones. It fuzzes the server side: we send messages to the phone to try to break it, basically. There have been several talks about fuzzing already, so I don't have to say much to introduce the ideas. The first co-authors are all my students at UCI, and the co-author on the right is from Fort Consult; that's Marcel Carlson, he's in the back over there.
All right, fuzzing basics, in case you haven't seen any of the numerous talks before this one. We're doing network fuzzing, so we're sending messages, SIP messages over UDP, to a SIP phone. SIP is a text protocol, so all you do is take fields and fuzz them with what I'll call fuzz functions; I don't know what the other speakers have been calling them. In the terms of the previous talks, we're doing generation-based fuzzing, not mutation-based. We don't take an existing sequence of messages and modify it. We generate the sequence of messages ourselves, and then we modify what we generated: we generate a good message and then randomly fuzz something in it. Here's an example of something we might fuzz. Actually, this isn't something we do, but you put some junk characters in here just to try to cause a buffer overflow, say.
I'm pretty sure most people know these things, but one thing you would do to a field is make it extremely long, to try to cause a buffer overflow in the parsing. Another is command injection: maybe the field is being passed directly to a shell, so you insert shell metacharacters just to see how the phone reacts. Another common fuzz function is SQL injection, where you put SQL commands in the field because maybe it goes straight to some kind of SQL interpreter, and you see how it reacts. Actually, we wouldn't do that here, because in this context it's unlikely that a field gets passed to an SQL interpreter. So that's the idea: we generate a sequence of these messages and then we fuzz them, randomly throwing in different types of garbage designed to trigger common coding errors.
So, Session Initiation Protocol. The job of the protocol is to start and end phone sessions, voice-over-IP sessions. It's the standard. Not everybody uses it; Skype uses something else, but basically everybody else does, and most of the other phones are SIP phones. I know the slide is a little hard to read, but what I'm showing here is a typical transaction. There's a user agent client and a user agent server. The client is the phone that starts the call, and the server is the receiving phone. Of course, any phone you get should be able to both start calls and receive calls, but you can segment the code into the code that does the server stuff and the code that does the client stuff. We're testing the server stuff. So our fuzzer acts like a phone that is starting a call with another phone.
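As a rough illustration of the generate-then-fuzz idea described above, here is a minimal Python sketch. It is not the fuzzer from the talk: the function names, the example addresses, the header values, and the payloads are all assumptions made up for illustration. It builds a well-formed INVITE, picks one header field at random, and applies one of three fuzz functions (overlong value, shell metacharacters, SQL injection).

```python
import random

# Hypothetical fuzz functions; names and payloads are illustrative only.
def overlong(value, rng):
    """Make the field extremely long to stress the parser's buffers."""
    return value + "A" * 5000

def shell_meta(value, rng):
    """Append a shell metacharacter followed by a harmless command."""
    return value + rng.choice([";", "|", "`", "&&"]) + "id"

def sql_injection(value, rng):
    """Append a classic SQL injection fragment."""
    return value + "' OR '1'='1"

FUZZ_FUNCTIONS = [overlong, shell_meta, sql_injection]

def generate_invite():
    """Return the header fields of a well-formed INVITE (made-up values)."""
    return {
        "Via": "SIP/2.0/UDP fuzzer.example.com:5060;branch=z9hG4bK776asdhds",
        "Max-Forwards": "70",
        "To": "<sip:target@phone.example.com>",
        "From": "<sip:fuzzer@fuzzer.example.com>;tag=1928301774",
        "Call-ID": "a84b4c76e66710@fuzzer.example.com",
        "CSeq": "1 INVITE",
        "Contact": "<sip:fuzzer@fuzzer.example.com>",
        "Content-Length": "0",
    }

def fuzz_one_field(headers, rng):
    """Copy the headers and fuzz exactly one randomly chosen field."""
    fuzzed = dict(headers)
    field = rng.choice(sorted(fuzzed))
    func = rng.choice(FUZZ_FUNCTIONS)
    fuzzed[field] = func(fuzzed[field], rng)
    return fuzzed, field, func.__name__

def render(headers, request_line="INVITE sip:target@phone.example.com SIP/2.0"):
    """Serialize a request line plus headers into SIP wire format."""
    lines = [request_line] + [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"
```

Calling `fuzz_one_field(generate_invite(), random.Random())` and passing the result to `render` yields one fuzzed INVITE as text. Actually sending it over UDP to a phone is left out of the sketch.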
And so it's testing that other phone, the server phone, to see if it has any bugs in its server-side activities. This is a typical conversation between a client and a server. You send an INVITE message to start the phone call; in fact, you have to. 100 Trying means the server got the INVITE. 180 Ringing is what it sends back to say, look, my phone is ringing now, we're waiting for somebody to pick up. Then it sends 200 OK after somebody has actually picked up, to say, okay, I'm ready to start the call. Then the client sends an ACK message back to say, I see that you're ready, so let's start talking. After that there's a media session. The media session isn't handled by SIP; it's handled by some other protocol, usually RTP, the Real-time Transport Protocol, but it doesn't matter what it is. At the end somebody hangs up. Say the client hangs up: it sends a BYE message, the server sends an OK, and then it's done. So that's what SIP does: it starts phone calls and it tears down phone calls.
I want to go over previous work in SIP fuzzers really briefly. The first, which you've probably heard of if you know anything about this stuff, is the PROTOS fuzzing suite. They don't actually provide you with a fuzzer; they provide you with a suite that was generated by a fuzzer, so a sequence of SIP messages that their fuzzer produced. It was done by a group at a university in Finland, I can't remember which one now, and Codenomicon is the company they spun off, so now it's an industrial project. It's a predefined test suite with this many test cases. It basically fuzzes the INVITE message: every time it sends an INVITE it throws lots of different weird things into it, and they have a long list of different types. After the INVITE they tear down the call with a CANCEL and an ACK, so the sequence is always INVITE, CANCEL, ACK, INVITE, CANCEL, ACK. I didn't go into any depth on it, but they detected a lot of vulnerabilities, and as far as I know it was the first fuzzing suite for SIP.
SNOOZE is another one, from the University of California, Santa Barbara. There are more besides these two, I know, but these are the only ones where I could easily find published data and find out exactly what's going on inside the fuzzer. What SNOOZE does is use a protocol state machine. SIP is a stateful protocol, so there's a state machine that describes it, and I'll show that in a couple of slides. A user has to describe that protocol in an XML format. The tool reads that in and then uses what they call a fuzzing scenario. Basically, the person who wants to do the fuzzing has to define the sequence of messages they want. Say I want INVITE and then CANCEL, ACK: I would define that sequence in a file, and I would also define which fuzzing primitives to use, maybe SQL injection, maybe command injection, whatever, and which fields to fuzz. Then it goes about automatically sending those message sequences according to my parameters. So it's not fully automated; the scenarios have to be written manually.
So, contributions, the new things in what we're doing. First, we're automatically exploring the state machine on the server.
Like SNOOZE, we take the state machine, but we explore it automatically. If I'm wrong about this and somebody from UC Santa Barbara is here, please tell me; Professor Vigna, I think, was doing it. What they do is follow a strict path specified by the user. So if you want to explore the whole state machine and go through lots of different paths, which is often important for testing lots of different code paths, you would need to do that manually; you'd have to enumerate all the different paths by hand. We do that automatically with a random walk through the state space.
Also, we evaluate the response messages. This is not completely new. One way to check whether you've caused a failure is just to see if you never get a response back: maybe you've killed the server, it's dead, and after a certain timeout you say, okay, succeeded, I crashed it. But that's a drastic way to die, and there are other errors. The server could send back a message, but the wrong message; it might respond the wrong way. So we check the messages that come back and verify that they are the type of message we expect according to the protocol as described in the RFC, which is what's in the state machine. We check that they're the right type, and we check the dialog information: there's tag information that uniquely identifies a dialog, the two tags and the Call-ID, and we check all of those to make sure they match what they should be.
Also, we control the server GUI during fuzzing, which is something other people don't do. If you look at the server state machine, which I'll show you in a slide, there are certain edges that can only be traversed if the user, the person sitting at the server phone on the receiving end of the call, does something. For instance, if I'm fuzzing a phone and nobody ever accepts the call, then you're not testing a lot of the state space, because there's a lot of interesting state space that you can only reach if somebody accepts the call. So we control the GUI: we force the accept or the decline to happen on the user side.
One drawback is that you might say, well, then maybe that's not a real vulnerability, because it doesn't depend purely on what the attacker does in terms of sending messages; it depends on what the person at the phone does. But take the bug that we found: the person has to accept the phone call, and it's very easy to get somebody to accept a phone call; it's not uncommon that someone does. So an exploit that you can only use if somebody accepts a call is still valid, still important, and we have to control the GUI to find those types of things. It's social engineering, and there's a lot of social engineering you can do. For instance, I'll show you in a few slides, there's a case where to get something to happen you need the person to decline a call. You can do that. You just call them a million times from the same number and eventually they're going to decline, and that one time is the vulnerable time. So in some ways you can control the person at the server end. If all you want is for them to accept or reject a call, you can get them to do that. So we do that, and that's different.
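The response check described above (no reply at all, the wrong type of reply, or mismatched dialog identifiers) could be sketched roughly as follows in Python. This is an illustration under assumptions, not the analyzer from the talk: the function names and the shape of the `dialog` dictionary are invented here, and the parsing is deliberately simplistic (no compact header forms, no folded lines).

```python
import re

def parse_response(raw):
    """Split a raw SIP response into (status_code, headers); None if malformed."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    match = re.match(r"SIP/2\.0 (\d{3}) ", lines[0] + " ")
    if match is None:
        return None
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return int(match.group(1)), headers

def tag_of(header_value):
    """Extract the ;tag= parameter from a From or To header value, if present."""
    match = re.search(r";tag=([^;>\s]+)", header_value)
    return match.group(1) if match else None

def check_response(raw, expected_code, dialog):
    """Return a list of problems; an empty list means the response looks right.

    `dialog` is a hypothetical dict with keys call_id, from_tag, and to_tag
    (to_tag is None until the server has chosen one).
    """
    if raw is None:
        return ["no response before timeout (possible crash)"]
    parsed = parse_response(raw)
    if parsed is None:
        return ["malformed status line"]
    code, headers = parsed
    problems = []
    if code != expected_code:
        problems.append(f"expected {expected_code}, got {code}")
    if headers.get("call-id") != dialog["call_id"]:
        problems.append("Call-ID mismatch")
    if tag_of(headers.get("from", "")) != dialog["from_tag"]:
        problems.append("From tag mismatch")
    if dialog["to_tag"] is not None and tag_of(headers.get("to", "")) != dialog["to_tag"]:
        problems.append("To tag mismatch")
    return problems
```

The expected status code would come from the state machine edge being traversed, so a wrong but well-formed reply is reported just like a missing one.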
Okay, here's basically what the system looks like; I'm almost out of time. We have the protocol description right here. That's an input to us: somebody gives us the state machine that describes the protocol. There's a sequence generator, which sends out requests, that is, messages, and sends out GUI commands, so press accept or reject. And there's a response analyzer, which gets back the responses and also uses the protocol description to check whether they match what the description says.
Here's the state machine. It's hard to read, but let me walk through one path through it, the common path. You're at the start state, you send an INVITE, and it goes to the invite state, where it stays just for a short time. Oh, I should describe how these are linked. You can't read it too well here; I'll show it on the next slide. So the phone rings, and you stay in the ringing state until somebody picks up the phone. Then you go to the OK state. Once it gets the ACK message back it says, okay, I'm ready to start, so you're in the media connection state, where you just sit and let RTP do all the work until eventually somebody says BYE, and when they do, you go back to the start state.
Okay, go ahead. Yeah, okay, so just to repeat what he said: he's talking about fingerprinting the type of phone you're using based on the bugs that you find in it. If I heard you right, if you find a certain type of bug you say, okay, that's got to be Cisco, right? We didn't do anything like that. Maybe soon. But we found other things, because when you go from one phone to another, they don't necessarily adhere to the RFC exactly. So you need to tweak the state machine? Yes.
Yes, absolutely, there are open, undefined things in the RFC. The state machine that we're using exactly fits the phones that we're using. The vast majority of it stays the same, independent of the phone, but there was a manual discovery process where you go to the vague parts in the spec, you just see what happens, and you fill in the state machine based on that. So the generation of the state machine from the RFC is tedious and manual, and there are these vague parts in the RFC that you have to deal with somehow. That's how we dealt with it.
Question? I'm sorry, I didn't hear it. God, I don't know. I mean, I only used two, so I don't know. Those two, KPhone and the other one, they're pretty close. Okay, so I believe you. We didn't test that. Like I said, we only used two phones, and they were actually very similar phones. Any other questions? I'd better hurry up, but if you have other questions I'd be happy to answer them in the question and answer room at the end.
All right, the algorithm. I hardly need to say this. You start in the start state; you assume the server is in the start state. You randomly pick an outgoing edge from that state, you take that edge, and you look at the inputs on that edge, meaning the events that trigger it. There are three types of events that can trigger an edge. One is receiving a message, say an INVITE: if that's the edge you want, you send the INVITE. Another is a GUI event, so somebody clicks accept: if that's the input on the edge, then you cause the GUI event. The third is a timeout. Say the phone is ringing, and after it has rung for a certain time limit it just stops ringing and goes to a new state.
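The random walk just described can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the edge table is a simplified fragment pieced together from the states mentioned in this talk rather than the full state machine, the timer values are placeholders, and `send_message`, `gui_event`, and `wait` are hypothetical callbacks standing in for the real UDP sender, GUI driver, and sleep.

```python
import random
from collections import namedtuple

# One edge: source state, trigger kind, trigger value, expected reply, destination.
Edge = namedtuple("Edge", "src kind trigger expect dst")

# Simplified, hypothetical fragment of a SIP server state machine.
# Timeout values are placeholders; expect=None means nothing is checked here.
EDGES = [
    Edge("START", "message", "INVITE", "100 Trying", "INVITE"),
    Edge("INVITE", "timeout", 0.1, "180 Ringing", "RINGING"),
    Edge("RINGING", "gui", "accept", "200 OK", "OK"),
    Edge("RINGING", "gui", "decline", None, "DECLINE"),
    Edge("RINGING", "timeout", 30.0, None, "START"),
    Edge("OK", "message", "ACK", None, "MEDIA"),
    Edge("MEDIA", "message", "BYE", "200 OK", "START"),
]

def random_walk(edges, steps, rng, send_message, gui_event, wait):
    """Walk the state machine at random, triggering each chosen edge's input."""
    state = "START"  # assume the server begins in the start state
    path = []
    for _ in range(steps):
        outgoing = [edge for edge in edges if edge.src == state]
        if not outgoing:
            break  # dead end in this simplified table
        edge = rng.choice(outgoing)
        if edge.kind == "message":
            send_message(edge.trigger)   # e.g. send an INVITE to the phone
        elif edge.kind == "gui":
            gui_event(edge.trigger)      # e.g. click accept on the soft phone
        else:
            wait(edge.trigger)           # let the server's timer expire
        path.append(edge)
        state = edge.dst
    return path
```

In a real run each step would also fuzz the outgoing message and compare the phone's reply with `edge.expect`; both are omitted to keep the walk itself visible.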
So the timeout is the third thing. Timeouts we basically fudge: we know what the timer is from experimenting with the phones, so we just wait for that amount of time.
This is just one example, a little zoomed in so you can actually see how an edge is labeled. Here are two states, start and invite, and this edge between them. It's labeled with something on the left, then a slash, then something on the right. The thing to the left of the slash is the input; that's what triggers the edge. So in this case you go from start to invite if the server receives an INVITE message. The thing on the right is the output, what goes from the server back to the client. Here it should send a 100 Trying response to the INVITE, so we check against that: did we receive that 100 Trying or not? That's for messages. But then there are other edges: if you're in the ringing state and the timeout happens, you go back to the start state, and if you're in the ringing state and somebody clicks decline, you go to the decline state. So those are the three things that can happen: a message triggers the edge, a timeout triggers it, or a GUI event does, and we just cause all of those to happen.
For the GUI events we use a test toolkit you can find on SourceForge. We can only deal with soft phones, and they have to have an X interface, because we're using X11::GUITest.
The bug we found was in KPhone. It was actually very fast to find. This phone had been fuzzed before in previous work and they didn't find anything, but we found it very quickly because we could cause the accept, or off-hook as I'm calling it here. What happens is: from the start state you send an INVITE, and the phone starts ringing. It sends you back a Ringing message and it starts a timer when it goes into the ringing state. Then if it goes off hook, so somebody either picks up the phone or clicks accept in the GUI, you go from the ringing state to the OK state. But there's a short period of time in there where, if you send a BYE message right before it gets to the OK state, it crashes the phone. So that's what happened. We randomly tried messages and eventually sent that BYE in the right timing window, and it crashed the phone. This slide is just detail about exactly what happened. It only had to traverse 8 edges, randomly, before it ran into the bug.
So, conclusion. We automatically explored the state machine randomly. We verified that the response messages were the right responses. And we controlled the GUI so that we could explore more of the state space.
Future work: test more phones, obviously, and we're doing that. Debug the phones: just because we crashed the phone doesn't necessarily mean there's an exploit there. Maybe it's just denial of service, but maybe it's a stack overflow, so I have a student trying to debug the phones right now. And then examine hard phones. We've got a Cisco phone that somebody from D-Link gave me, and I'm just going to open it up; I actually bought all my nice hardware equipment the other day. The button interface is just a bunch of switches, so we'll replace them with solid-state switches that are controlled from the machine. Then we can control its interface in hardware and test the Cisco phone the same way. That's the hope, anyway.
Okay, I don't know how to do that, but I'll ask you in just a second. Oh yeah, sorry: we could auto-answer, but if we auto-answer then it'll always accept, right? I want to be able to control it: accept now, decline now. You know what I mean? So I want to have control of it. Good?
He's asking how susceptible I think these SIP phones are to man-in-the-middle attacks. I don't know. I really don't know. I would refer you to David Endler and, who's the other guy who wrote that book? You probably know the book, Hacking Exposed VoIP. That's a good book. They talk about it in there, but I don't remember what they say. Thanks.
Okay, that's it then. We'll have to take questions offline. There's that room back across the hall, a little bit to the left, where I'm going to go right now, so if you have a question, please come over there. Thanks.