I'm Kevin Mahaffey. This is John Hering, and we're going to talk about mobile applications and some of the research we did into them recently. Really quickly, if any of you saw our talk last year: we've spoken for the past few years at DEF CON and Black Hat about mobile security. Last year we talked about fuzzing and attacking mobile operating systems at the OS level. This year we decided to level up a little bit and think about applications. Really, the past year has been amazing: we've seen an absolute explosion in smartphone adoption and, more importantly, mobile broadband usage. A big part of this is mobile apps, and we asked: what's happening in the mobile app ecosystem, and what can we do in the context of security research there?

We're going to start with a vulnerability, because that's always fun. For those of you who don't know much about Android: Android has a logging subsystem that you can use when you're developing applications. It's just like any other logging system, but this one is Android's. Specifically, there's a centralized system logging framework, and if you want to read the logs produced by the system, you can request the READ_LOGS permission, which allows you to access them. That's the background for what we're going to talk about. And if you're really interested in this sort of vulnerability, there's a really good talk at 3 p.m. in this room that you should see as well.

So there was a vulnerability in Android where the location manager service, the service the Android system uses to provide your location via GPS, cell tower, Wi-Fi, and so on, dumped your LAC and cell ID into the logs whenever you retrieved the location. Those are the two GSM identifiers that can be used to basically determine where you are.
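As a concrete illustration of what a log-scraping app could do with a leaked line like that, here's a minimal Python sketch. The log tag and field names below are invented for the example; the real leaked format differed, but the idea is the same: the identifiers sit in plain text waiting for a regex.

```python
import re

# Hypothetical logcat line of the shape the location manager leaked;
# the tag and "lac="/"cid=" field names here are illustrative, not the
# real Android log format.
sample_log_line = "D/LocationService(  92): getCellLocation(): lac=1234 cid=56789"

def extract_cell_ids(line):
    """Pull the GSM LAC and cell ID out of a leaked log line, if present."""
    m = re.search(r"lac=(\d+)\s+cid=(\d+)", line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

print(extract_cell_ids(sample_log_line))  # (1234, 56789)
```

Any app holding READ_LOGS could run the equivalent of this over the live log stream and map the LAC/cell ID pair to a physical location.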
So what that means is that any application that has the READ_LOGS permission can grab those pieces of data out of the logs and determine where you are. And there are a lot of other things we've seen with respect to applications logging things furiously and disclosing all sorts of fun information. I liken it to dumpster diving on mobile.

Really quickly, I'm sure everyone saw on Monday an interesting example: Citibank had an announcement about their iPhone application leaking banking credentials to parts of the device where, in certain circumstances, a malicious application exploiting the device could grab them, and it was also syncing them to your desktop if you synced with iTunes. We've seen a number of issues like this, and it really underscores how bad it can be if you're logging information inappropriately. Imagine you're logging banking credentials to some public part of the device: any application that requests nothing more special than READ_LOGS can grab that data, and then you're on your own.

So here's an example of a vulnerability we found in the wild. It basically allowed somebody reading the logs on your device to hijack your session and log into your account. Not good. The background is that a lot of applications log URLs, with varying levels of parameterization, to the system logs. For example, the top log statement you see is what the Android browser emits when you type in a URL, so you can effectively see what query was entered. The second one is an example of what the vulnerable app we found in the wild was actually doing: it logged every URL it retrieved. Some of you will begin to see where the fail happens.

So let's imagine a hypothetical application with a WebView embedded in it, a mobile application that's effectively just a web browser. When you start the application, it retrieves the login form from the server.
The user goes tap, tap, tap, enters their email and password, and clicks the button. The app POSTs that information over SSL, and let's imagine they're using a single sign-on service, so one service that serves the form and validates the credentials. Assume the user entered the correct credentials: the single sign-on service says okay, you're authenticated, now redirect to the actual application. It says hey, go here, and in order to validate the session it adds your session ID and all these secret tokens to the GET query string. So the good app says, oh, got a redirect, I'm going to go retrieve that, and you're logged into your account. You see what goes on here?

Let's look at what happens in the logs. We get a GET to the single sign-on server, we POST, and we get our redirect, and that's an interesting secret key we see in the logs. So what bad could you do with this? Imagine a malicious application that reads the logs on the device and transmits that GET query parameter to a server. Literally all you have to do is paste it into a web browser and you're fully authenticated into that user's account. Just to be clear: imagine your banking credentials are logged. You grab this string, drop it in a browser, and you're logged into their bank account. Fun.

So, key lessons. If you take something away from this talk and you're writing mobile apps, or breaking mobile apps: app developers, seriously, please don't log confidential information. Period.
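To make the replay step concrete, here's a Python sketch of what a log-scraping attacker does with such a line. The domain, parameter names, and log tag are all invented for illustration; the point is that the logged redirect URL is the whole attack.

```python
from urllib.parse import urlsplit, parse_qs

# Illustrative log line: an app logging the full redirect URL it fetched,
# including the SSO session tokens in the GET query string. Domain and
# parameter names are made up for this example.
logged_line = ("I/WebFetch(  817): GET https://app.example.com/home"
               "?session_id=8f3a9c&auth_token=SECRET123")

def hijack_url_from_log(line):
    """Recover the fully-authenticated URL a malicious READ_LOGS app could replay."""
    url = line.split("GET ", 1)[1]
    query = parse_qs(urlsplit(url).query)
    # The attacker needs nothing else: pasting this URL into any browser
    # replays the victim's authenticated session.
    return url, query

url, query = hijack_url_from_log(logged_line)
print(query["auth_token"])  # ['SECRET123']
```

Moving the tokens into POST bodies or cookies, as suggested next, means the URL that gets logged no longer carries the session.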
Secondarily, web developers: if you're interacting with mobile apps, especially if URLs are being passed between applications on a device, try to pull sensitive things out of GET query strings. Put them in POST data or cookie data so they're not floating around, or at least so it's harder for them to float around.

So, we found this really bad thing in the wild, and we had a couple of questions. One, are there any other apps vulnerable to this sort of attack? And two, are there any applications in the wild actually reading the logs and trying to hijack these accounts? To be clear, the application we found this vulnerability in was a pretty serious application, and there's a lot of incentive for people to attack it. So a few months ago we asked: what if we could basically ask questions of every application in the world and get a reasonable answer? So we built something we call the App Genome Project.

What is the App Genome? Basically, we built an engine that allows us to crawl most of the major application environments; mainly what we're going to talk about today is Android and iPhone, so the Android Market and the iPhone App Store. We've seen about 300,000 applications, and the number is growing really, really quickly. We've deep-dived on about 100,000 applications. We'll talk a little more about it, but we've taken this engine that pulls down the data and built a system on top of it for automated analysis. We can look at metadata and application binaries and create a genome, if you will, of applications, such that when we say, okay, this one app has this sort of behavior or this sort of code in it, we can ask what other apps have it too, and use that to correlate different types of threats.
So when we understand one type of threat, imagine we can instantly scan 300,000 apps and see what else is exploiting it in the wild. That becomes a super powerful security response tool.

So, what we're going to talk about today. First, why you should care about mobile apps, and maybe why you should be skeptical about mobile applications. Secondarily, what are our motivations? What's the point? Why are we downloading all these applications? Third, how we built it; we'll go a little bit under the hood, and it's pretty cool. Then we're going to look at what we found. Specifically, I'm going to tell four stories, and I think they're pretty good stories. We're also going to talk about how we can use the App Genome Project in a security response context. A lot of you here disclose vulnerabilities, and there's often a big question of when to disclose publicly. A lot of the time the answer hinges on whether there's exploitation in the wild, and we happen to have a giant dataset in which we can see if things are being exploited in the wild. And then we're going to make some predictions, hopefully.

First, why should you care about mobile applications? Mobile apps are obviously becoming pretty standard in terms of how people interact with their mobile devices. I think we'll see more and more moving to the web over time, but right now it's all about native mobile applications. We've seen a pretty interesting trend of huge volumes of people downloading applications: on average about 22 apps, but there are people who literally have hundreds of apps on their phone. And the thing we've noticed is that most people pay very little attention to what they're actually downloading. People download apps ten at a time. Oh, I got a smartphone. That's great.
They're not paying attention to the source, to the developer, to what's actually happening there. It's just assumed trustworthy, which is not the best assumption to be making. And more and more apps are accessing sensitive information; we talked about examples like bank accounts, location, SMS. One of the other interesting things we're seeing, which Mikko Hyppönen spoke on at Black Hat, is premium-rate calls and premium-rate SMS. Imagine how you'd monetize an attack in a desktop environment: you might steal someone's credit card information, or turn the machine into a botnet node for spam. On a mobile device, you actually have an in-band payment mechanism. Literally, you can send a packet and money goes flying. That's never been possible before. We've seen examples of this where malware has auto-dialer mechanisms in it: they'll sit there at night, call Somalia, and ring up a huge, huge cell phone bill for you. Not cool.

So why care about mobile apps? What enables these attacks more easily than ever? Standardized APIs: it's very easy to say, okay, I want to grab contact information, I'll call that API and grab it. It's very scalable in terms of how easy it is to build an attack. And the capabilities are so rich and deep, combined with those APIs, that once you actually have an exploit, it's incredibly easy to grab data. And the incentives, like we talked about: there's clearly a monetary incentive, and then there's the sensitive information that's also driving those attacks.

So the question is, why won't mobile threats matter? We're skeptical dudes; we like skeptical people in general. And we continually hear from people: wow, mobile's never going to be a problem, we've got security handled. First argument: mobile's fragmented, right?
Well, just to set context: when we look at the desktop environment, one of the reasons Windows gets exploited so often is because it's a relatively homogeneous environment with a very large footprint. If you look at how fast mobile device shipments are growing, there are going to be half a billion of these things shipping a year within the next two to three years. It's going to be like three Windows-sized platforms: when you think about the size of the iPhone platform, the Android platform, and something like BlackBerry alone, literally each of those will be the equivalent of Windows. So yes, they're fragmented, but when you think about the attack surface, it's like the market just exploded three times, so there are three opportunities to get in there. And more importantly, within each platform everything is homogeneous: if I have one iPhone exploit, I get to nail every iOS device, iPads, iPhones.

Another argument: isn't there a sandbox? The sandbox, that means safe, right? Well, you can piss in the sandbox, right? We'll show you an example of how this works. And a sandbox is a great step, don't get me wrong; we think mobile OSes are pretty forward-thinking in the context of security, but nothing's perfect, and the goal of this talk is to help everyone understand how to take the steps to keep people safe. For example, the vulnerability we just talked about is an application-level vulnerability: even though the system is totally safe, an application is leaking credentials into the system logs. And I think there will be a lot more vulnerabilities like this, where the OS does a great job but applications screw something up and it's game over. Exactly.

And finally, to the argument that mobile devices have a small attack surface: it's actually the richest attack surface we've ever seen. Just look at the different types of exploits in the past couple of years.
We've seen everything from SMS bugs, like Charlie Miller's, to attacks against push messaging services and things like the App Store itself. And more and more, we're seeing web services associated with mobile devices. So imagine you're attacking and you nail the web service, which has a direct connection to the mobile device, and you use that as the in-road.

So what are our motivations? Why did we go do all this? Ultimately, our goal is to keep people safe. We believe that having good data can help everyone here, everyone who sees the research, make good decisions, whether you're a developer, a network operator, or an IT administrator. Understanding what is actually out there, as opposed to just kind of speculating, is a good thing. Secondarily, we want to identify threats in the wild: with a large dataset, we can actually ask very, very probing questions and figure out what's actually going on. Third, understanding platform differences. These two platforms, Android and iPhone particularly, have very different security models, and if we can understand how that impacts how bad things happen on devices, we can do a much better job than we could otherwise in securing them and preventing things before they get too bad. And the last thing is, we want to understand what apps are actually doing, which is not necessarily the same as what they say they're doing. Compared to your computer: a lot of people in this room, I imagine, inspect or reverse engineer the apps on their machines. In many cases, you don't have that ability on your phone, and we want to basically make it so.

How did we build it? A quick architectural overview: we built a distributed crawler that speaks Android and iPhone.
Effectively, it's a piece of software we built that communicates with each app store over its native protocol and interacts with it the same way you would when downloading something on your device. We originally tried to run this in a single thread and download every mobile app in the world that way. It takes a really, really long time. Try not to do that. So we distributed it out, and that made it a lot easier. Secondarily, we're storing everything we get so that we can look at trends over time and do offline analysis: we find a new vulnerability, and we can quickly write something to figure out if anyone in the wild is vulnerable to it or exploiting it. Third, we developed some custom analysis tools, because there really aren't many things meant for automated analysis on mobile right now, so we built tools that let us ask questions against the dataset en masse.

So, like I said, it's a distributed crawler. Essentially, it does the same thing you would do on an app store. First, imagine you're browsing applications: that's the first step, enumerating applications in a given category or set. Second, on Apple, you would click on an application to retrieve its data; on Android, that all comes back in the query, so it's a bit more optimized. Third, once you're on the application page, we download the application. We do this through a distributed job queue, and everyone's happy.

As for the data store, we're storing all of the application metadata, which might be the description, ratings, version, and so on, and we're actually tracking changes over time. Secondarily, application binaries. That's for free applications; if anyone wants to write me a check to download all the paid applications, I'll be available afterwards and very happy. And we're tracking all of this. It's quite a big dataset.
It's fun. So now that we have all this stuff, what do we do? We have a whole bunch of mobile applications, and I can only play so many games of Tetris. So we built some automated analysis tools and started extracting data. I'll go into detail about what we did on each platform, but generally speaking, we're looking at what APIs and framework constructs on a given platform an app references, what things are implemented inside the application, on Android what permissions are in the app, and sometimes we can get strings out of the application if there are specific things we're looking for.

So what do we do on Android? Dalvik is not only a fishing village in Iceland; it's also the VM on Android. It's actually pretty cool. It's similar to a JVM, with a lot of differences that frankly don't matter to our analysis right now, but it's really optimized for mobile. The application packages for Dalvik are called APKs. They're very, very similar to JARs, and APKs and JARs are basically just special zip files, so they're fairly easy to extract. The main executable in an APK is called classes.dex, so that's one of the main things we look at. The other thing of interest in an Android application is the Android manifest, a kind of proprietary binary-encoded XML document that describes the permissions of an application, its components, and so on.

Before we dive into what we did for analysis, a brief background on the Android security model. Android has granular permissions for all of its specific capabilities: up front, an app declares what it's going to do, and it never goes around that, hopefully. (Come to the 3 p.m. talk to have fun with that.) Also important: enforcement is at the process level, not the VM level.
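Backing up to the packaging point: since an APK is just a zip file, pulling out the two artifacts the analysis cares about takes only a few lines of Python. The "APK" built below is a stand-in with stub contents, not a real app.

```python
import io
import zipfile

# Build a stand-in APK in memory: an APK is just a zip containing, among
# other things, classes.dex and a binary-XML AndroidManifest.xml.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as apk:
    apk.writestr("AndroidManifest.xml", b"\x03\x00\x08\x00")  # binary-XML stub
    apk.writestr("classes.dex", b"dex\n035\x00")              # Dalvik executable stub

def extract_analysis_targets(apk_bytes):
    """Pull the two files the static analysis cares about out of an APK."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return {name: apk.read(name)
                for name in ("classes.dex", "AndroidManifest.xml")}

targets = extract_analysis_targets(buf.getvalue())
print(sorted(targets))  # ['AndroidManifest.xml', 'classes.dex']
```

From here, a real pipeline would hand classes.dex to a Dex disassembler and decode the binary manifest into readable XML.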
So if you find an Android VM vulnerability, it usually means absolutely nothing, because everything's enforced at the UNIX process level, the IPC level, and by what your user's permissions are on the actual device, which is really interesting.

So what did we do? As I said, Android has permissions, so first we looked at package permissions. But looking at permissions alone only tells you that an application wants to be able to access data; it doesn't tell you what it actually does. So we also did Dex static analysis, and one example of a question we can ask is: hey, is this permission requested, and is this API referenced? For the phone number, for example, we look for the READ_PHONE_STATE permission, which is needed to retrieve the phone number, and we look for a reference to the TelephonyManager.getLine1Number() API. With either one of those missing, the application is not likely to actually access the phone number. And we can run arbitrary analysis to ask other interesting questions of the data.

I think it's important to understand what we can do and what we can't do. For the assumptions: right now, we're assuming that applications don't get around the permission model, because, for example, if you're rooting the phone or using local privilege escalation exploits, you can most likely bypass it entirely. Secondarily, capabilities can be implemented outside of Dalvik, and this is really important to understand: you can bring native code into Android applications, and you can basically download ARM code from the internet and start executing it. Of course, our static analysis isn't going to see that, but I think it's important that everyone understands what we're looking at here. Specifically, we're not looking at code downloaded at runtime, and we're not looking at encrypted code.
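The permission-plus-API heuristic just described can be sketched like this. The inputs are stand-ins for a parsed manifest and raw classes.dex bytes; a real implementation would walk the Dex string table rather than byte-scan, but the two-part AND is the same.

```python
# A minimal sketch of the two-part heuristic: an app "accesses the phone
# number" only if it BOTH requests READ_PHONE_STATE in its manifest AND
# references TelephonyManager.getLine1Number in classes.dex.
def accesses_phone_number(manifest_permissions, dex_bytes):
    has_permission = "android.permission.READ_PHONE_STATE" in manifest_permissions
    # Dex files store method names in plain-text string tables, so a raw
    # byte scan is a cheap (if coarse) reference check.
    references_api = b"getLine1Number" in dex_bytes
    return has_permission and references_api

perms = {"android.permission.INTERNET", "android.permission.READ_PHONE_STATE"}
print(accesses_phone_number(perms, b"...getLine1Number..."))                   # True
print(accesses_phone_number({"android.permission.INTERNET"}, b"getLine1Number"))  # False
```

Requiring both signals is what keeps the false-positive rate down: the permission alone only means the app *could* read the number, and the API string alone means nothing without the permission to back it.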
So, for example, if there's a polymorphic engine, the current static analysis we're doing doesn't look at that. We're also not looking at dynamic linkage, so if you're using reflection or any sort of raw IPC calls (Android has an IPC mechanism called Binder), we're not looking at raw calls there. At some point we'll get there, but right now we're looking at the vast majority of apps that use the framework as it's built.

What do we do on iPhone? iPhone, as a lot of you know, uses process-level sandboxing, and the App Store enforces which APIs you can use. Unlike Android, there's usually no acknowledgement of permissions except for push and location, where a dialog asks the user: hey, do you want to allow this for this specific application? So what is an iPhone app when you actually download it? Like an APK, it's just a zip file. The application binary is typically in a certain place, in an executable format called Mach-O. Yeah, it's the same executable format as on OS X. The Mach-O header is a series of load commands, which specify how the binary is segmented in memory, what frameworks it's linked to at runtime, whether it's encrypted, and so on. There are three segments we care about today: TEXT, DATA, and LINKEDIT. TEXT is where the executable code and read-only constants are. DATA is where we have readable, mutable data and anything initialized. And LINKEDIT is all of our dynamic linker fun. If you're used to reverse engineering the Portable Executable format: there, the top-level construct is a section, whereas in Mach-O the top-level concept is a segment, and a segment has many sections. Totally confusing, but for whatever reason they decided to change the terminology.

But one of the big problems is, once we download apps, they're encrypted. So what do we do? I think it's important to look at how a Mach-O binary is loaded by the kernel.
So first, there's the file's segmentation, and the TEXT segment is encrypted on disk. What happens when you load an iPhone app is that the kernel effectively decrypts that encrypted segment and maps it into memory. But we don't have the keys to decrypt it ourselves right now. If you're interested in how this loading process works, I think there's a lot of rich research to be done here; it's all open source, available right there.

So the question is: if we're not decrypting the TEXT segment, what can we see? The great thing is, we get symbol tables, and almost every binary I've ever seen has very gratuitous symbols in it. These all live in the LINKEDIT segment, in plain text. The other thing we can see is frameworks, which are effectively the dynamic libraries, and we can see those because they're referenced in Mach-O load commands, all in plain text too, which is really nice.

So what do we do to analyze these apps? First, we look at the symbol table: defined symbols, the classes and methods implemented in the application, and undefined references, the things it imports from the platform itself. Then we look at the Mach-O load commands. And similar to what we said on Android, we define heuristics: how do we know if an application accesses the device's contacts? We look for a reference to any of the Address Book APIs on the device. And, like Android, we can run arbitrary analysis as well.

To understand what we're looking at and what we're not: we're not decrypting the TEXT section, and because of that, we can't look at dynamically loaded code. So if you're dynamically importing frameworks, or bypassing frameworks to access things via private APIs, or downloading code at runtime, we won't see it, though all of those things are probably going to get you rejected from the App Store anyway.
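The load-command walk described above is simple enough to sketch. The following parses segment names and linked-framework paths out of a 32-bit little-endian Mach-O header; the constants come from the public mach-o/loader.h, but the "binary" here is a hand-built, truncated stub (just cmd/cmdsize plus the fields the parser reads), not a real app.

```python
import struct

# Constants from mach-o/loader.h (32-bit variants).
MH_MAGIC, LC_SEGMENT, LC_LOAD_DYLIB = 0xFEEDFACE, 0x1, 0xC

def build_stub():
    """Hand-build a minimal 32-bit Mach-O: one segment, one linked dylib."""
    seg = struct.pack("<II16s", LC_SEGMENT, 24, b"__TEXT")  # truncated segment_command
    name = b"/System/Library/Frameworks/UIKit.framework/UIKit" + b"\x00" * 4
    dylib = struct.pack("<6I", LC_LOAD_DYLIB, 24 + len(name), 24, 0, 0, 0) + name
    header = struct.pack("<7I", MH_MAGIC, 12, 9, 2, 2, len(seg) + len(dylib), 0)
    return header + seg + dylib

def load_commands(binary):
    """Walk the load commands, collecting segment names and dylib paths."""
    magic, _, _, _, ncmds, _, _ = struct.unpack_from("<7I", binary, 0)
    assert magic == MH_MAGIC
    segments, dylibs, off = [], [], 28  # 32-bit mach_header is 28 bytes
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", binary, off)
        if cmd == LC_SEGMENT:
            segments.append(binary[off + 8:off + 24].rstrip(b"\x00").decode())
        elif cmd == LC_LOAD_DYLIB:
            name_off = struct.unpack_from("<I", binary, off + 8)[0]
            dylibs.append(binary[off + name_off:off + cmdsize].rstrip(b"\x00").decode())
        off += cmdsize
    return segments, dylibs

print(load_commands(build_stub()))
# (['__TEXT'], ['/System/Library/Frameworks/UIKit.framework/UIKit'])
```

Because load commands and the symbol table sit outside the encrypted TEXT region, this kind of walk works even on App Store binaries you can't decrypt.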
So it's probably not too impactful, but these are things we're looking at for the future. If you didn't understand anything I said, here's the summary: we downloaded a whole bunch of Android and iPhone apps. Lots. And we built a whole bunch of analysis tools so we can ask really probing questions against the dataset en masse. Oh, and we don't want to do it manually, because I don't have nearly enough Red Bull to go through 100,000 applications. We tried that.

So we're going to go through some results, and we're going to do it as a series of stories. Should be fun. The first story: in the beginning, there was data. When we first got this dataset, we were really excited, and the first question we asked was: are there any applications accessing the contacts on my device that maybe shouldn't be? We thought, oh yeah, we're going to find like ten applications that stick out like a sore thumb. It turns out that a very large number of applications access contacts, on both platforms. One of the motivations behind this question: a lot of the early variants of malware, especially on platforms like Symbian, used SMS and other channels to basically auto-spread. Exactly, auto-spread: they would go through your contact list and spread either via Bluetooth or via SMS or MMS.

So we started going through this data, and we found a whole bunch of soundboards. Why does a soundboard access my contacts? I'd love it to say, "Han Solo: I've got a bad feeling about this," whenever John calls me, but why? So this is the disassembly, through a tool called baksmali, a great set of disassembly tools for Android Dex files. And we have a smoking gun: it's accessing the contacts API. But what's the context? Well, it turns out that it accesses contacts to set custom ringers. This is a totally legitimate use case.
And the key message here is: if you look at an application and it seems weird that it's accessing something, dig in, because a lot of the time it's totally legitimate. If we look at the actual disassembly output, the access to contacts is in a method called assignRingtoneToContact. Totally makes sense, right? If I want to add a ringtone to a contact, the app should be able to access my contacts. So, what's the lesson? Not all apps that access sensitive data are bad. Just to be clear: not all apps that access sensitive data are bad. Not all apps that access sensitive data are bad. If you take anything away from this talk, please remember that.

Story number two. So we said, okay, crushing defeat: we didn't find anything wrong with the soundboard applications. Are there any applications using my location for purposes I don't know about? And what does this say about where mobile apps are heading? Nearly 30% of all the mobile apps we saw know where you are. It's interesting. So we went on a search. We started looking through Android and iPhone apps that access location, similar to how we did before. We went through a whole bunch of apps, and nearly every single one we encountered that didn't seem to have a legitimate use for location, but actually accessed your location, had a third-party advertising SDK in it.

This was really interesting, so we dug in a little deeper, and here's what we saw with one SDK in particular, Quattro Wireless. In the Android emulator, you can simulate the lat-lon, so of course we set it to 3.1337; I think that's somewhere in the middle of the ocean. And we Wiresharked it, and look what we found: an ad request to the Quattro Wireless ad server, with a poorly rounded version of our latitude and longitude sent over to the server. Oh, yeah, and it's plain-text HTTP. Even better. Yay.
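For reference, simulating the lat-lon as described works through the Android emulator's console, which listens on a local telnet port and accepts a `geo fix <longitude> <latitude>` command. A minimal sketch (port and coordinates are just examples; the send function assumes a running emulator and isn't exercised here):

```python
import socket

def geo_fix_command(longitude, latitude):
    """Build the emulator console command that spoofs the device's GPS fix.
    Note the order: the console takes longitude first, then latitude."""
    return f"geo fix {longitude} {latitude}\r\n"

def send_to_emulator(command, host="localhost", port=5554):
    """Send a console command to a running emulator (default console port 5554)."""
    with socket.create_connection((host, port)) as s:
        s.recv(1024)                  # consume the console banner
        s.sendall(command.encode())

cmd = geo_fix_command(-122.4167, 37.7833)
print(cmd.strip())  # geo fix -122.4167 37.7833
```

With the fix set, any app (or embedded ad SDK) that asks the location APIs for a position gets the spoofed coordinates, which is what makes the Wireshark experiment repeatable.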
So we tried to zoom out a little and said, okay, it seems like almost every application we encounter has a third-party SDK in it. Given that an application has a third-party SDK, how likely is it to access location? We looked at that across some of the largest, most prevalent SDKs out there. Wow: a lot of SDKs are accessing location, and you'll notice a lot of these numbers are higher than the average for the platform.

To put these numbers into context, we have to look a little deeper into how we analyze things. On Android, if a developer brings in an SDK but doesn't request the location permission, we don't say the app accesses your location, because it can't; it doesn't have access to the APIs. So you'll notice the Android numbers for SDKs accessing location were lower. Our analysis takes this into account: if a developer brings in an SDK that tries to access location, but the app doesn't actually request the location permission, the SDK doesn't get location. I think that's a big credit to the permission-based model on Android and to having user privacy in mind in developing it.

On the iPhone, it's fairly interesting: an application is only allowed to use location if there's a legitimate... I don't want to say legitimate... if there's a benefit to the user for using it. For example, if Steve likes it. If Steve likes it, yes. This actually came out as an App Store rule earlier this year: they will reject your application if it uses location only for ad-serving purposes, which makes sense.

And so we looked at third-party code as a whole, and it's actually surprisingly prevalent in applications.
One of the messages we have for developers: when you're bringing in code, understand what it's doing, because a lot of the time it's closed-source. There are a lot of novice developers building mobile applications, and we think that's a great thing, because it's so easy, and I don't like wrestling with bad development frameworks either. But it also brings in people who maybe aren't experienced in development, and they just throw ads into their code without thinking about the data being collected. Our message is: understand the data you're gathering about your users, because ultimately users trust developers, and as a developer you want to be responsible with the data you collect and to notify your users about what you're collecting. One of the most common things we're seeing is first-time developers: a lot of developers on Android and iPhone have never written an app before in their lives, and they introduce vulnerabilities through poorly written code, or they bring in a third-party SDK with no real idea what it does; they want one piece of functionality, and it exposes all this other functionality.

One SDK in particular we tip our hat to: AdMob. They've actually done a really good job of educating their developers about what data they're accessing. Specifically, the AdMob SDK defaults location collection on iPhone to false, and they encourage their developers to think about the ramifications of gathering location. I would love all SDKs to encourage developers to think about things this way.

So the lesson we learned here is: developers don't always know what's in their apps. If you're developing mobile applications, look closely at what data you're collecting personally, and what data the third-party libraries you're bringing in are collecting. Story number three. We found SDKs.
We found a soundboard app that turned out to be bad. Are there any apps that are just flat-out bad? We said, okay, let's dial up our heuristics and find applications that are accessing a ton of data. What app accesses every permission? We found a lot of applications in the market that are called "system utilities," and that's literally the text, nothing more, from a developer called RxS. The application names sound like workout plans: Android 15x, Android 16x. And inside, we see code talking to mobilespilogs.com. And... yeah. I don't know about you, but whenever I see anything going to that URL, I get a little scared. So, okay, we dug in. What happened? Get SMS details. Get contact details. Get URL details. Get call details. That's a lot of data. What else did we find? mobilespilogs.com is a well-known purveyor of mobile spyware. What they do, in order to get it installed on somebody's device, you know, a jealous ex-girlfriend or something to that effect, is basically say: hey, go to the market, search for these kind of cryptic names, and install that on somebody's device. And to be clear, this is not something you are likely to install yourself. But these things are in the market, and they're called "system utilities," so if this is on your phone, you don't really have much recourse for knowing what it actually is. On iPhone: Nicolas Seriot, at Black Hat DC, had a really great talk about a spy phone application, and it gathered a whole bunch of data on iPhone using allowed APIs. And the interesting thing is, on Android you see the permissions an application is accessing, but on iPhone you don't: you can see location and push, but that's about it. I thought the keyboard cache was one of the coolest parts of that research. All right, so the talk's available online.
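"Dialing up the heuristics" can be as simple as counting how many sensitive permissions each app requests and flagging the outliers. A toy sketch of that kind of triage (the app names and threshold are made up for illustration; the permission names are real Android ones):

```python
# Permissions we treat as sensitive for triage purposes.
SENSITIVE = {"READ_SMS", "READ_CONTACTS", "READ_CALL_LOG",
             "ACCESS_FINE_LOCATION", "READ_LOGS", "RECORD_AUDIO"}

# Hypothetical corpus: app name -> permissions its manifest requests.
apps = {
    "flashlight":    {"INTERNET"},
    "soundboard":    {"INTERNET", "ACCESS_FINE_LOCATION"},
    "system_util_x": {"INTERNET", "READ_SMS", "READ_CONTACTS",
                      "READ_CALL_LOG", "ACCESS_FINE_LOCATION",
                      "READ_LOGS"},
}

def flag_greedy(apps, threshold=4):
    """Flag apps requesting an unusually broad set of sensitive permissions."""
    return sorted(name for name, perms in apps.items()
                  if len(perms & SENSITIVE) >= threshold)

print(flag_greedy(apps))  # ['system_util_x']
```

A real pipeline would weight permissions and compare against per-category baselines rather than a flat count, but the flagged "system utilities" above fall out of even this crude filter.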
I highly recommend checking it out. And the code's on GitHub, so you can check that out too. Also, I don't know if any of you saw, last week there was a 15-year-old developer, literally a 15-year-old, who snuck a flashlight app into the iPhone App Store with a functional SOCKS proxy. Effectively, this turned it into a tethering application. It was accepted into the App Store, and once the internet found it, Apple removed it. And just to be clear, this wasn't malicious, but it challenges the notion that because a store is curated, hidden functionality can't make it in. This is a perfect example: a flashlight app with advanced hidden functionality made it through, and potentially millions of people could have downloaded it. So the lesson to be learned here is: apps aren't always up front about what they do, and it's up to the user to understand that and be a savvy consumer of applications. Story number four: the orange wallpaper. So we looked at a whole bunch of permissions. But what about more seemingly innocuous things? Your IMEI, your IMSI, your phone number. There are actually only a few hundred applications in the Android market that gather this sort of data. We diced this data and asked: are there any patterns? And we found two developers who between them had nearly a majority of the applications in the market that accessed all these capabilities. So let's dig in. What do we find? These are the applications from those two developers. Guess which one is the most popular? So let's download one of these applications. You know, I want to pimp my phone. We search for a wallpaper once we've downloaded the app. Select a wallpaper. Phone is pimped. Wait, so why, again, are wallpaper applications accessing my phone number, IMSI, and IMEI? Okay, so: good friend Wireshark. None of you can read this, but this is what happened.
We installed the wallpaper app, and then we saw this HTTP request in the clear being sent to a server: my SIM serial number, my subscriber ID, my phone number, and my voicemail number. In the clear. Wait, why do they need my voicemail number? Oh, and the interesting thing is, on some Android phones you can actually insert your password into the default voicemail number field. That's not my actual voicemail number; I'll change it after this talk. So, okay, let's dig in. We did some disassembly on the application, and we found that nearly all of these applications had the same class. And the great thing about the App Genome Project is that we can ask this question fairly easily. We don't have to go in and manually reverse engineer every application; we just said, oh, yep, they all have this thing in common. So what does this service do? Here's an excerpt from the disassembly of what actually happened: get device ID, get line 1 number, get SIM serial number, get subscriber ID, get voicemail number. These are all the Android APIs that access the data we just saw. So who owns the domain the data was being sent to? Yeah. But of course, nobody would ever download an application and agree to permissions without looking, right? Nobody's ever done that. Any guesses how many downloads these applications had? Yeah. And just to be clear, when we look at this app, and you can kick to the next slide, the information being collected isn't necessarily malicious. But you have to ask yourself: do users actually understand that when they download a wallpaper application, their phone number, their voicemail number and potentially their password, and their IMEI and IMSI, which can be used to get coarse location, are recorded? I think not. And this is just a really good example. Our goal here is to help developers understand that even if this isn't malicious, you're potentially compromising the privacy of your users. Think about this.
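The disassembly check described above amounts to scanning decompiled code for the real `TelephonyManager` getters that expose those identifiers. A minimal sketch (the smali-style excerpt is a hypothetical stand-in for the wallpaper apps' shared class; the API names are real Android ones):

```python
import re

# Real Android TelephonyManager getters mapped to the data they expose.
SENSITIVE_CALLS = {
    "getDeviceId":        "IMEI",
    "getLine1Number":     "phone number",
    "getSimSerialNumber": "SIM serial",
    "getSubscriberId":    "IMSI",
    "getVoiceMailNumber": "voicemail number",
}

def sensitive_calls(disassembly):
    """Return which sensitive identifiers a disassembly excerpt reads."""
    return sorted({data for call, data in SENSITIVE_CALLS.items()
                   if re.search(call, disassembly)})

# Hypothetical smali-style excerpt like the shared class we found:
excerpt = """
invoke-virtual {v0}, Landroid/telephony/TelephonyManager;->getDeviceId()Ljava/lang/String;
invoke-virtual {v0}, Landroid/telephony/TelephonyManager;->getSubscriberId()Ljava/lang/String;
invoke-virtual {v0}, Landroid/telephony/TelephonyManager;->getVoiceMailNumber()Ljava/lang/String;
"""
print(sensitive_calls(excerpt))  # ['IMEI', 'IMSI', 'voicemail number']
```

Run across a whole corpus, a scan like this is what lets you say "they all have this thing in common" without reversing each app by hand.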
If you're at a coffee shop on a wifi hotspot with your Droid, for example, all this information is being sent in the clear. If you did this at DEF CON, your IMSI and all this stuff would be up on the Wall of Sheep. I mean, it's ridiculous. The claim from the developer is that the phone number was being collected to preserve favorites across devices: if you install the wallpaper app, throw your phone away, and get a new phone, it will preserve your favorites. Google is investigating the apps right now; they've been taken down from the market until Google reaches a decision. So it'll be interesting to see what happens with this. So what's the lesson here? If applications request permissions, as a user you should assume they're actually using them. So, a summary of the lessons we learned today, so you guys take something home. First, I really want to underscore that just because an app accesses sensitive data doesn't mean it's bad. There are totally legitimate reasons, and that's actually the power of mobile operating systems: apps can access sensitive data, so we can have powerful apps. It's an interesting question whether a given access to sensitive data is something you as a user want or not. Second, developers, if you're in the audience: you may not always know what's in your apps. Please pay attention, and only put things in there that you actually want. Third, applications are not always up front about what they do, and one of the goals of the project is to extract what applications are actually doing and help users understand what's going on under the hood. And fourth, be careful what you download. Android is a great example: you can actually see what an application is doing. So look at that, assume that if an application accesses something it's for a reason, and it's up to you as a user to choose whether you accept that reason or not.
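What makes the coffee-shop scenario so bad is that the identifiers travel as plain query parameters anyone on the network can read. A small sketch of inspecting a captured cleartext request for them (the server name, parameter names, and values are all made up for illustration, not the actual request):

```python
from urllib.parse import urlparse, parse_qs

# A hypothetical cleartext request like the one captured with Wireshark.
captured = ("http://example-wallpaper-server.test/register?"
            "imei=490154203237518&imsi=310150123456789&"
            "phone=15551234567&vm=15551234568")

def leaked_identifiers(url):
    """List which watched device identifiers appear in a cleartext URL."""
    params = parse_qs(urlparse(url).query)
    watched = {"imei", "imsi", "phone", "vm"}
    return sorted(watched & set(params))

print(leaked_identifiers(captured))  # ['imei', 'imsi', 'phone', 'vm']
```

Anyone passively sniffing the hotspot sees exactly the same thing, which is why this data belongs behind TLS if it's sent at all.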
Right, so just imagine what's in the wild. The power of this project, as Kevin talked about, is that when we have a vulnerability, we now have the ability to ask a question about all the apps in the marketplace: does this code exist in other places? Does this type of capability exist in other places? And we can find that instantly. When Kevin and I initially looked at this stuff, we saw the one wallpaper app and said, oh, 50,000 people have downloaded this, this looks interesting. The way we were able to see that it was, what, 76 plus 8 apps in total, with between 1.1 and 4.6 million downloads, was the App Genome Project and the ability to say: this one piece of data we found, how does it look across all of the data in the mobile app environment? We think this is going to be the beginning of some really advanced new security response tools. Yeah, I mean, if you asked how many applications on Windows were affected by a given vulnerability, it's almost impossible to answer. So, we're just about finished here. I think there are three things that are really important. App store models will probably remove the really blatantly bad apps from the market. But as we've seen with a lot of the apps we've talked about today, there's context dependence: it's hard to say whether something's good or bad, especially if it looks good today, gets a whole bunch of people to download it, and then turns bad later, for example via an update. Imagine: the easiest way to get malware on a million phones is to release a game for free, get a million people to download it, then update it with some bad code, or flip a bit on the server and execute code dynamically. These are going to be some really interesting problems we have to deal with in app stores. And just as a final point: the future of mobile security is not going to be PC-like viruses.
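The "does this code exist in other places?" query can be sketched as fingerprinting each class in each app and then asking the whole corpus which apps contain a given fingerprint. A toy version (the corpus, app names, and class contents are hypothetical; a real system would fingerprint normalized bytecode, not source text):

```python
import hashlib

def fingerprint(class_bytes):
    """Hash a class's code so identical copies match across apps."""
    return hashlib.sha256(class_bytes).hexdigest()

# Hypothetical corpus: app name -> set of class fingerprints it contains.
shared = fingerprint(b"class CollectorService { ... }")
corpus = {
    "wallpaper_a": {shared, fingerprint(b"class UiA {}")},
    "wallpaper_b": {shared, fingerprint(b"class UiB {}")},
    "calculator":  {fingerprint(b"class Calc {}")},
}

def apps_containing(corpus, fp):
    """Ask the whole corpus: which apps contain this exact class?"""
    return sorted(app for app, fps in corpus.items() if fp in fps)

print(apps_containing(corpus, shared))  # ['wallpaper_a', 'wallpaper_b']
```

This is how one suspicious wallpaper app turns into a complete list of sibling apps in seconds instead of a manual hunt.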
It is going to be in this gray area, and we're going to be seeing more and more of this. So pay attention to what you download. Developers, be responsible about how you're developing your apps. And administrators: don't feel like you have to ban apps; just make sure your users know what is happening in the context of their mobile experience. So we want to thank everyone, and thanks to Google.