Good afternoon, everyone. So pleased to be back at EMF Camp. I have learned a lot of things this weekend, including that my previous understanding that the average conference presentation stage contains three spiders is actually a statistical artifact. The average conference stage contains zero spiders; EMF Stage A contains approximately 12 billion spiders and should have been excluded as an outlier. My name is Matthew. I'm going to be talking about some things, and I'm going to be telling you how you could do some of the things that I did. I would like to emphasize that, as my shirt says, I would obviously not do any cybercrimes. I will not be answering any questions that are answered by my shirt. In addition to that, if you have any concerns about anything, you should definitely speak to a lawyer and not me. I do not even play a lawyer on TV. Anyway, scooters. The scooters we're going to be talking about are the electric ones that you can walk up to with a phone, press a thing, they go beep, and then you drive away. That's all very straightforward. I have an app; the app does a thing in the real world. Wonderful. This sounds fine. When we think about devices that you plug into things, where you have a phone that does something and the two somehow interact with each other, this is something you might have some degree of recognition of. You may think of other things that exist in your life, or which you may have rejected entirely from your life, understandably, that have similar behavior. Basically, e-scooters are Internet of Things devices. And as we all know, IoT is terrible. That's what the T stands for. It's definitely not for security. So why have we decided, in many parts of the world, to fill our streets with IoT devices that are just sitting there with nobody watching them? This sounds like a bad idea, and it is, for a number of reasons which I'm going to go into.
But first of all, what is one of these things? You buy a scooter off the shelf. If you're an early-stage startup in this field (things have progressed a little since then), you basically just jam a GPS tracker into the scooter somehow, so you know where your scooters are and can plot their locations on a map in the app. You can't rely on there being enough of these everywhere for people to walk up to an arbitrary location and expect to find a scooter there. You need to tell them where the scooters are so that they can take themselves to the scooters. The scooters are thankfully not self-driving, and therefore you cannot have the scooter come to you. Now, this is the basic minimum approach. Some vendors did this and then had to work out how you lock and unlock the scooter, because obviously you don't want the scooters just sitting there so that an arbitrary person can walk up and steal one. Even if you've got GPS trackers on all of them, if you suddenly lose several thousand scooters, it's going to take you a while to go around to people's houses and ask for the scooters back, and if they say no, that's even harder. But that's real-world crime, not cybercrime, so we're going to ignore that issue entirely. Some of those were also implemented by doing the locking and unlocking with a command sent over Bluetooth, and it was the same command for every scooter they had, with no validation that you had actually paid them any money before doing so. Yeah, that company didn't do so well. So if you want to make a good one, you replace all the control logic with your own custom logic, and you add a new board that has a modem in it. Then it connects to the net, it's all amazingly IoT, and the scooter becomes extremely online. And as we know, very online things are very good.
So, bonus points, obviously, if you can build a scooter that doesn't disintegrate when people are using it, like so. And obviously you want to make a scooter that's as attractive to people as possible, so maybe you have a scooter that has extremely glowy lights all over it to make it more exciting. But, you know, those are nice-to-haves, not necessities. If you kill a few people, that's just the cost of doing business. Manufacturing is complicated. The companies that are still in this space have largely now gone to custom scooters rather than just off-the-shelf things. They're in many cases multi-sourcing. I haven't actually looked into what's been happening since the pandemic. I assume it's a combination of it getting harder to obtain these scooters and people not actually using them as much. But some of these companies still exist, so presumably they're still making things somewhere. Now, we have had some other issues; for instance, you should also probably not have scooters that catch fire. Not great for the brand, even if it does make them more, you know, aesthetically appealing for a short period of time. So once you put things on the Internet, things become complicated. Now, so far I've been talking about technology in an abstract way. My background these days is in security. I am, in a literal sense, unqualified to do that. All my qualifications are in fruit fly genetics, except for the RSA certificate in basic computer literacy and information technology that asserts that I am, in fact, qualified to use a word processor, database, and spreadsheet. And I did pass that with distinction. But we're going to start talking about the technological aspects of this as opposed to the logistical aspects. And why do we care about this? Well, because if you get this wrong, then this sort of thing happens. So I will say that when the global scooter giant said the attack was not funny, they were in fact correct. It was not funny because it was very racist.
If it had not been racist, this could have been genuinely funny. Therefore, this was a terrible error on the part of whoever did this. If you are going to do cybercrimes, which obviously I do not encourage you to do, please do not do racist cybercrimes. Racist cybercrimes are pretty much the worst. But if you're building a scooter, for instance if you are a global scooter giant that does not find your scooters being hacked funny, then you should maybe not have a small plug in the scooter that is easily accessible by removing a couple of bolts, and then have USB running over that connector. And if someone plugs that connector into something, you should probably not make it extremely obvious that it speaks the Android Debug Bridge protocol. And you should definitely not then allow ADB to run as root, so that if someone types adb shell into a laptop that they've plugged the scooter into, they get root on the scooter. And you should definitely not then have all the samples that the scooter plays in various situations stored as WAV files that can be replaced with, say, racism. Don't do that. Otherwise, you end up in this situation, which obviously, as we've established, was not funny. Anyway, if you were, for instance, to have access to one of these boards from after the not funny incident occurred, you could in fact notice that they contain log files which show that headquarters has the ability to run commands on all the scooters. And if you look at those, you can see that they ran md5sum, which thankfully there was a copy of on the scooters, because why would the scooters not have md5sum on them? So they obtained a cryptographic hash of all the WAV files so that they could identify over the internet which scooters were now embodiments of racism, and disable those. Then they just killed adbd and made the startup scripts that launched it non-executable.
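As a rough illustration of the hash check described above, the following Python sketch computes MD5 digests of WAV files in a directory and reports any that differ from a known-good baseline. The function names, the directory layout, and the baseline dictionary are assumptions made for this example; the talk only says that md5sum was run over the WAV files, not how the results were compared.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_modified(directory: Path, known_good: dict) -> list:
    """List WAV file names whose digest is missing from or differs from the baseline.

    known_good maps file name -> expected hex digest (a hypothetical baseline).
    """
    modified = []
    for wav in sorted(directory.glob("*.wav")):
        if known_good.get(wav.name) != md5_of_file(wav):
            modified.append(wav.name)
    return modified
```

MD5 is adequate here only because the goal is spotting changed files against a trusted list; it would not be a sensible choice where an attacker can craft collisions.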
So, field-upgradable scooters: in order to prevent people engaging in not funny racist attacks, congratulations on actually implementing something that allows you to fix that in the field. This is a kind of miraculous future. Anyway, even after this, obviously you would not then, say, observe that the debug port that carried USB also contained another pair of pins, which appeared to be serial, and you would definitely not have a mechanism in your serial protocol to execute arbitrary commands on the scooter, which, for instance, allowed you to re-enable adbd. And it's fine, because the serial is running at 1.8 volts, so it's sufficiently obscure that literally nobody's going to identify this. As I said, not doing any cybercrimes. We're not going to be talking about real-world attacks; we're in fact not going to be talking about hacking the scooters at all. We're going to be talking about taking advantage of the functionality that is made available to the app that you use to hire the scooter, which is much more exciting and much more straightforward. But for that, we're going to stop talking about scooters and talk about some amount of Android reverse engineering, which is a topic near and dear to my heart. I'm sorry that this is not necessarily what you signed up for, but I promise that you will leave with a newfound confidence that you too can discover that people who implement backends for IoT devices are not necessarily considering security or privacy at any point in the process. Android apps come in the form of APKs, the Android package format. You can download the APKs for most apps from a variety of websites that basically mirror what is in the Google Play Store. So you don't need a Google account, and you don't even need an Android device, to obtain one of these APKs. And if you look at these APKs, you discover that they're zip files. We'll get to that in a moment.
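Since an APK is a zip archive, Python's standard zipfile module can open one directly. The sketch below counts archive entries by file extension. The in-memory archive and its entry names are made up for the example and only mimic the general shape of an APK; a real APK path could be passed to summarize_apk instead.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_apk(apk) -> Counter:
    """Count entries in a zip archive by extension. `apk` is a path or file-like object."""
    counts = Counter()
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            counts[PurePosixPath(name).suffix or "(none)"] += 1
    return counts


def build_fake_apk() -> io.BytesIO:
    """Build a tiny in-memory zip with hypothetical APK-like entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"placeholder")
        archive.writestr("classes.dex", b"placeholder")
        archive.writestr("lib/arm64-v8a/libexample.so", b"placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(summarize_apk(build_fake_apk()))
```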
Let's talk a little bit about how the apps speak to the world. The answer is that it's basically HTTP all the way down, except for the cases where it's WebSockets, which are generally established over HTTP anyway. The main reason people use HTTP for this is that it's very straightforward. There are a lot of frameworks you can use to build HTTP backend infrastructure, and a lot of HTTP frameworks you can incorporate into your app without having to write an entire wire protocol yourself. More to the point, it makes it very straightforward to use the same backend APIs in any sort of web interface. Various people who are doing this will have apps, and they'll also have a website that gives you access to the same APIs. Browsers all speak HTTP, so HTTP is generally the approach here. So, as I said, an APK is basically a zip file. You unzip it and you get a bunch of things, including some native libraries and some class files. I don't really know what a class file is. Java is not my field at all. But these are all binary. The nice thing about Java, though, is that it's a bytecode language, and it's actually fairly feasible to do stuff with it. So unzipping an APK will get you just the raw contents. Apktool is another tool, which gets you an intermediate format. JADX is a thing that will actually get you something that in many cases is surprisingly close to the original source code. So, unzip: you put an APK in, and you receive the raw assets. This is largely not helpful unless all the code you're looking for is in the form of native libraries, so compiled C or something similar that the app is making use of. If you unzip it, you will get those files, and then that turns into reverse engineering with something like Ghidra or IDA Pro or any other existing disassembly tool. That's out of scope here; I'm going to ignore that. Apktool turns the app into an intermediate representation.
It basically takes the Java bytecode and turns the individual instructions back into something that's vaguely human-readable. It's not assembly language, but it's sort of that kind of level, maybe a little higher than that. The nice thing about Apktool, though, is that you can modify that intermediate representation, rerun Apktool, and turn the thing back into an Android APK. You can then sign that, install it on your phone, and run the modified version of the app. This is useful for various things. If, say, there's a debug flag that enables a bunch of functionality in the app, you can just hard-code the debug flag to on, rebuild the app, run it in debug mode, and find all sorts of things that the vendor didn't really think they were giving you. If you're building Android apps and doing debug builds, you should not just have a runtime flag that is set to zero or one depending on the build. You should actually not include the debug code. If you don't want people to have code, don't give them the code. If you give me code, I will look at it. I'm sorry. It's an unfortunate habit. So, Apktool: intermediate form, hack, modify, recompile. JADX: I have no idea how this works. It is utterly magical, but you run JADX on an Android APK, and you get something which usually doesn't have the original function names, but is otherwise a surprisingly good approximation of Java. So it's a lot easier to read than the output of Apktool. The downside is that you cannot build it back. It's not sufficiently good at turning stuff into Java to be able to round-trip it back into an APK in the general case. Like I said, I'm a biologist, not a computer scientist. Please don't ask me to explain this stuff. It probably involves linked lists or something. I don't know.
Anyway, now that we have the ability to look at APK files and identify the contents of apps, what we're going to do is reverse engineer them. We are obviously not going to do any cybercrimes, as I said earlier. No cybercrimes here. Most of these apps are Java. You basically don't need to know Java to do the stuff that I'm going to be telling you about. If you want to reverse engineer an API from some code, the most straightforward way is just to grep for http:// or https:// and get a bunch of hostnames. Then look for where those are used, and you'll probably also get a list of API endpoints. Once you have those, you can just call them yourself using, say, curl. But what if you're not confident that you know how these APIs are called? What if you're not confident about how to authenticate in order to get the sort of authentication token that the API is going to require before you can actually do anything? This gets into a sort of semantic discussion. Obviously the binaries embody the API in some form. They are speaking to the API; they are aware of what the API is. Therefore, fundamentally, the truth about the API is embodied in the binary, but it can be somewhat tedious to find it. The wire protocol, on the other hand, what the app is sending to the server and what it's getting back, is an obvious embodiment of the API, because it's calling the API. You can see that happen. Except you can't see that happen, because it's happening over HTTPS, which is encrypted. As we know, encryption is good, except in this case. So the wire protocol is fundamentally the thing that we're going to be looking at here to make our lives more straightforward. We will still want to take the app apart for various reasons, but this is a great starting point. We want to be able to look at what the app is sending to the backend services.
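The grep step mentioned above, searching decompiled code for http:// and https:// strings, can be sketched in a few lines of Python. This is a minimal illustration only: the regular expression is deliberately simple and will miss URLs that are assembled at runtime, and the sample source string and its example.invalid hosts are invented for the demo.

```python
import re
from urllib.parse import urlsplit

# Matches http:// or https://, a hostname, an optional port, and an optional path.
URL_PATTERN = re.compile(r"""https?://[A-Za-z0-9.-]+(?::\d+)?(?:/[^\s"'<>]*)?""")


def extract_urls(text: str) -> list:
    """Return the sorted, de-duplicated URLs found in a blob of source text."""
    return sorted(set(URL_PATTERN.findall(text)))


def hostnames(urls) -> set:
    """Return the set of hostnames used by the given URLs."""
    return {urlsplit(url).hostname for url in urls}


if __name__ == "__main__":
    # Hypothetical decompiled snippet, not taken from any real app.
    sample = (
        'String base = "https://api.example.invalid/v1/scooters";\n'
        'String legacy = "http://example.invalid";\n'
    )
    found = extract_urls(sample)
    print(found)
    print(hostnames(found))
```

In practice this would be run over every file that a decompiler wrote out, and the hostnames it finds are the starting point for looking up where each one is used.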
We want to be able to examine that data to figure out what the API looks like, and then be able to make those calls ourselves. This is fairly easily done with a program called mitmproxy, which is a web proxy that has its own certificate. You convince your apps to speak to the proxy, and the proxy speaks to the real thing: it decrypts the data, prints it, re-encrypts it, and sends it on to the API. Now, this is not super straightforward, especially on Android, because obviously the app, and in fact your phone, are expecting to receive a validly signed certificate from the remote website that asserts its identity, and they are not going to accept the certificate that mitmproxy generated for you, because that would be a massive violation of trust. In the past, on versions of Android prior to 7, you could just install the certificate that mitmproxy generated, tell your phone to trust it, and everything would just work. Since Android 7 that doesn't work anymore; apps need to explicitly opt in. This is a privacy measure to stop your company or anyone else from being able to just push a certificate onto your phone and then decrypt all your web traffic. So that's good, except in our case that's bad. Android does not make this straightforward at the system level. There is a requirement that the apps opt in. If only we could modify the apps. It turns out that when you have software, you can just change the software. This is not as well understood as it should be. A piece of code that is running on a system you control can be modified, and you can make it do different things. If you look at this link, there's documentation on how you write Android network security configurations. You can embed one of those into the app, update the AndroidManifest.xml, repack it all, re-sign it, install it, and then everything works magically.
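For reference, a network security configuration that opts an app in to user-installed certificates is a small XML file. The Python sketch below generates one with the standard library. It is an illustration, not the exact file used in the talk: it assumes the common arrangement of trusting both the system and user certificate stores for all connections. The file would typically live under res/xml/ in the unpacked app and be referenced from the application element of AndroidManifest.xml via the android:networkSecurityConfig attribute; the file name is up to whoever repacks the app.

```python
import xml.etree.ElementTree as ET


def build_network_security_config() -> str:
    """Return XML that trusts both system and user-installed CA certificates."""
    root = ET.Element("network-security-config")
    base = ET.SubElement(root, "base-config")
    anchors = ET.SubElement(base, "trust-anchors")
    # "system" is the platform store; "user" is where a manually installed CA ends up.
    ET.SubElement(anchors, "certificates", {"src": "system"})
    ET.SubElement(anchors, "certificates", {"src": "user"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_network_security_config())
```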
Except that tells the Android system services, the system infrastructure, to trust that certificate. It doesn't tell the app to. In general that doesn't matter, but some apps are themselves opinionated, and those apps do something called certificate pinning. They know what TLS certificates they expect to get, and if they get something different, even if the operating system trusts it, they will reject it. It's tedious. Again, software is mutable. If you know that they're expecting a certificate, you can just unpack the app, replace the certificate with your own certificate, repack the app, and then it trusts it. That can be annoying, though: you need to find the certificate, you need to embed yours, blah, blah, blah. How about instead you just find the call in the app where it enables certificate pinning and delete that line? It turns out that being lazy actually works a lot better than you might expect, so that's straightforward enough. Alternatively, you can get the apk-mitm package. It's on GitHub. It's a... oh, spider. It is written in JavaScript, but whatever. Anyway, you just run that on an APK, and it disables certificate pinning in every web framework it knows about, and it also patches the app to let you MITM it. So: download that, run it, get a patched APK, and now you can put it through mitmproxy, which, as I mentioned, is at mitmproxy.org, very easy to remember. You set up an access point and configure it to redirect all traffic on port 80, or any other ports you care about, to port 8080, where mitmproxy is listening. The app doesn't have to care about what it's doing; it all just goes through mitmproxy, and then you can see every API call that the app makes. Anyway, back to scooters. An e-scooter app is a thing that shows you where scooters are and lets you hire a scooter, so the app needs to know where the scooters are in order to be able to show you a map with the scooters on it.
So this means that it needs to know the location of every scooter very accurately, because if it shows you a scooter somewhere it isn't, you're not going to spend money on that scooter. It also needs some sort of link between the scooter's identity in real life and the scooter's identity online, so it can do things like tell you how much battery the scooter has. That lets you make a reasonably informed judgment: do I take this scooter here that only has a small amount of battery, because I'm only making a short journey, or do I walk a bit further and get that scooter over there that has lots of battery? Anyway, what do we call an app that shows you the precise location of specific objects? Naively this doesn't seem like a problem, right? These scooters are on the map; what does that tell us? Well, what happens when someone hires a scooter? The scooter vanishes. What happens when someone gets off the scooter? The scooter reappears. So if we were, for instance, to query the API for the location of every scooter within an area, track when one of those scooters vanishes, and then track when it reappears, we can generally infer that someone hired the scooter and drove it to the other location. Alternatively, scooters teleport. I'm going to go with the former. That allows us to build a database of where journeys start and end. Now, the start points tend to be a bit fuzzy, because people need to go to where a scooter is. End points, though, tend to be the places people are actually going. So if your question is, Matthew, does this allow you to identify people who, for instance, work for specific government departments, and then figure out where they go to the pub or where they live, the answer is yes, it does do that. That's kind of coarse, though: it is only telling us that a journey started at a specific point and ended at a specific point. It doesn't actually tell us anything about what happens in between. It also lets us infer the times of each of those, so we can infer how long it took someone to do something. But we can do better than that, wonderfully.
mitmproxy will tell you about a bunch of API endpoints, and you can use all the ones that the app is using. But the nice thing is that once you know what the format of those endpoints looks like, you can go back to the app and just search it for those endpoints, and you'll usually find a file that contains the endpoints you've identified. It will probably also contain a bunch of other endpoints that the app doesn't commonly use but which work anyway. If you do that, you might discover that one vendor has an API endpoint called fetch_by_plate_number, and if you pass a scooter's plate number to that, you might discover that it gives you the precise latitude and longitude of the scooter even when it's moving, thus allowing you to track the scooter in real time. Obviously, if someone is trying to track every scooter in the world, the obvious way to prevent that is to implement some sort of rate limiting. I feel kind of bad about this, but I did at one point, with one vendor, just literally request the location of every single scooter in the world. That took me about five minutes, and the next day they implemented rate limiting, such that if you made more than five queries in too short a period of time there'd be exponential backoff. Anyway, it's a good thing that it's not possible to, say, just sign up for an account with one of these companies using a phone number, with the only validation being over SMS, and there being a whole bunch of websites that have anonymous SMS numbers and just print the token that is sent to them on screen. And it's a good thing that those sites aren't themselves using some sort of reverse-engineerable API that lets you scrape this automatically. That would be terrible.
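The vanish-and-reappear inference described earlier is simple enough to show on synthetic data. The sketch below takes a time-ordered list of snapshots, each mapping scooter IDs to coordinates, and emits a journey whenever a scooter that dropped out of the listing shows up again. The snapshot format, the Journey fields, and all of the IDs and coordinates are invented for this illustration; nothing here talks to any real service. Note that the start and end times are only bounds, since the actual hire and drop-off happened somewhere between polls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Journey:
    scooter_id: str
    start: tuple       # last position seen before the scooter vanished
    end: tuple         # first position seen after it reappeared
    last_seen_at: int  # timestamp of the last snapshot that listed it
    reappeared_at: int # timestamp of the snapshot where it came back


def infer_journeys(snapshots) -> list:
    """snapshots: time-ordered list of (timestamp, {scooter_id: (lat, lon)})."""
    journeys = []
    last_seen = {}   # scooter_id -> (timestamp, position) when last listed
    missing = set()  # scooters listed before but absent from a later snapshot
    for timestamp, positions in snapshots:
        for scooter_id, position in positions.items():
            if scooter_id in missing:
                seen_at, seen_pos = last_seen[scooter_id]
                journeys.append(
                    Journey(scooter_id, seen_pos, position, seen_at, timestamp)
                )
                missing.discard(scooter_id)
            last_seen[scooter_id] = (timestamp, position)
        for scooter_id in last_seen:
            if scooter_id not in positions:
                missing.add(scooter_id)
    return journeys


if __name__ == "__main__":
    # Synthetic data: scooter "A" vanishes at t=60 and reappears elsewhere at t=120.
    demo = [
        (0, {"A": (1.0, 1.0), "B": (2.0, 2.0)}),
        (60, {"B": (2.0, 2.0)}),
        (120, {"A": (1.5, 1.5), "B": (2.0, 2.0)}),
    ]
    for journey in infer_journeys(demo):
        print(journey)
```

That a dozen lines of set bookkeeping are enough is the point the talk is making: the privacy problem comes from what the listing exposes, not from any cleverness on the part of the observer.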
People could do awful things with that information. So if you asked for the location of every single scooter owned by one company a few years ago, you ended up with something that looks like this. What's interesting is that we see a lot of scooters in a bunch of places. You can see that a bunch of scooters are located just off the west coast of Africa, on Null Island. But you can also see that there are a lot of scooters in China, and moving from China into Europe, and the weird thing is that they all seem to be following some sort of strange line. You get blocks of them, and it turns out to map very well onto this. So the scooters travel by train. But also, if we sort of zoom, enhance further, we discover that apparently they also traverse the Suez Canal. Cool: scooters on boats. Anyway, as I said, we can track scooters in real time. I was living in Oakland at the time, so I did this sort of thing. Here we can see very precisely the start and end point of someone's journey through Oakland, around scenic Lake Merritt. I've actually fuzzed the coordinates a bit before plotting this, but the coordinate data it gave me was absolutely precise enough for me to narrow it down to about two houses in terms of where this person was going. So this is just someone who picked up a scooter from downtown Oakland and then went... fine, that seems innocent enough. What else could be observed using this information? Well, obviously international espionage, as I mentioned, could be a thing. Yes, you can absolutely figure out where people who work for specific government departments live without needing to get off your sofa, which is nice. But then here's a case where a scooter goes around Oakland for a while, takes a somewhat circuitous route, and doubles back on itself a few times. It's not obvious from here, but there are various points where it stops for about three minutes right outside houses. And if you said, Matthew, this seems very much like someone who might be delivering things that are not necessarily legitimate to people: yes, it does look awfully like that. So it is potentially the case that by doing this (obviously I would not say you should commit any crimes in the process) you might be able to observe people doing things that may or may not be crimes.
So the conclusions are: if you have information that's available via an API, you cannot assume that merely because you didn't document your API, nobody will be able to determine how to use it. You are probably not developing something so amazingly magical that people cannot reverse engineer it, especially when you've given those people an app that calls those API endpoints and have therefore effectively given them every piece of information they need to identify what's happening. An internet-connected device that has a GPS on it, yeah, that kind of collects a lot of information, not just about the location of scooters but about how the scooters are used. And if you don't take that into account when building your API, you have not merely built a venture-capital-backed unprofitable enterprise that you're really hoping will IPO before everybody notices that it will never make money. You have also built a publicly accessible massive surveillance system, which is probably not something that you put on your pitch deck. Anyway, thank you.