Hi, my name is Morgan Gangwere. I'm a student, but first: you've made it to a talk about ARM and all sorts of fun stuff.
So first, a little bit of a dedication to my father. My father's an event assistance engineer. What you're seeing right here is that he has one arm. He taught me to build my own tools. Keep building your own tools. You will learn so much just by exploring your world. He taught me that, and I want you to come out of this with at least that: explore your world. Know how it works. Thanks, Dad.
I'm the hoopiest frood that ever did cross the universe. I know exactly where my towel is. It's a little puck; I bought it from REI. I've been fiddling with Linux stuff for a good 10 years now. I got an old embedded ARM board from my father at one point. I've been doing ARM work for a while now. I've even built my own version of Cyanogen just to try and get it to build on a machine. I'm actually a student by day. I try not to talk about it much, because it's in a completely unrelated field. I'm here because it's fun.
So this talk is a fair amount of theory. There's a whole lot of complexity, and a bit of "ooh, look at this sweet trick that you can pull." This is because this sort of work, reverse engineering, is one part science, one part estimation, a dash of bitter feelings about everything in the world, and a little bit of "what the fuck was that EE thinking when they built this?" A lot of things come from experience. I can point the way, but I cannot see the future. You will have to learn many little tools and techniques on your own. There are a lot of seemingly random parts in this talk. They'll all come together.
So first, let's talk about ARM. From the BBC to your face. ARM originally stood for Acorn RISC Machine. If you think you are away from an ARM machine, you are sorely wrong. Acorn originally built the ARM1 chip for the BBC Micro as a lightweight RISC system. Acorn changed hands a couple of times.
ARM themselves have never cut silicon, but fun fact: Intel has cut silicon for a non-Intel platform multiple times, especially after they acquired StrongARM from DEC. The ISA actually hasn't changed much in about 20 or 30 years. You can still read ARM1 assembly from the 80s, kind of figure it out, and then run it on a brand new ARM chip today.
There are ARM devices all around you, including routers, cell phones, and NAS devices. QNAP actually has a version of their NAS that runs on an ARM Cortex system. Synology is another company that produces hundreds of devices that run on ARM chips, and even new laptops with Qualcomm Snapdragon chips are coming out today that run Windows, Linux, everything. In the non-SoC world, ARM covers everything from smartwatches to home lighting to your laptop. In the Surface alone, there are four or five ARM chips, just for things like firmware control and disk access; even the touchpad has two independent ARM chips. The Steam Controller is built around a Cortex-M0, and IKEA's Trådfri lighting system is built around an embedded ZigBee and Cortex-M0 base station. Sony has been building ARM into their cameras for a good long while, and your camera today probably runs ARM.
So let's talk about embedded Linux at a high level. An embedded Linux device looks like three parts: storage, SoC, and RAM. Everything else is a bonus. You're gonna see physical devices on I2C, or USB, or SDIO. There are standards for how cameras and displays work. A lot of this has been hashed out and is pretty straightforward. It looks a lot like a standard PC today, and the lines are getting really blurry.
So let's ask the question: what is the system on chip? It's a just-add-water, easy-design sort of part. What you're seeing here is an Orange Pi with an Allwinner chip. That is a quad-core ARM device that you can get for $10.
This is really amazing, actually, because now we live in a world where you can put this on a breakout board, add power, boot Linux from an SD card, and actually build a device. This has led to a lot of devices that use really cheap SoCs, including our phones. Everything looks about the same, especially when you look at the block diagram. Everything sits on an internal bus. There's a peripheral controller for external content. There's a storage controller for boot content, a GPU, a CPU, some external peripherals. Once you've seen one, you've seen them all, including ones from Intel. You don't always see ARM on an SoC. This is from a Bay Trail SoC that goes into a laptop. As you can see down there, there's even an old SMBus and what looks to be boot ROM and legacy blocks to hold all of the old x86 stuff that has been sitting there withering away in a corner, because nobody runs 32-bit x86 anymore, right?
Storage comes in a couple of different flavors. MTD, which is basically a cheap way of saying "ooh, this is flash." eMMC, which stands for embedded MultiMediaCard. SD cards, and if you've ever seen an SD card on a device, you have just won the jackpot. Then there's UFS. UFS is a new standard that's come along in recent years for cell phones, for higher-speed, higher-density storage. If you have a newer cell phone, you are probably using this. There are a lot of different variations on this. Some devices have a little bit of onboard flash on the chip so that they can load the first-stage bootloader. This is common on phones; stuff like fastboot boots from this. Every vendor has their own way of shoving bits onto a device. They all suck.
RAM: you're gonna see a whole lot of different variations of this. Vendors are notoriously tight-assed, especially after the terrific price-fixing that SK Hynix was found guilty of. But consider that there are a lot of devices that have eight megabytes of RAM, including the WRT54G.
And later, they did another revision that had four megs of RAM. But modern phones are hitting the realm of a gig, to four, to six gigs. In a pure flash environment, you might actually be losing RAM, because flash can sometimes be slow and it's faster to shove things into RAM.
When it comes to peripherals, you've got SPI, I2C, and I2S. These are common, but then you're gonna see really crazy shit in there. SDIO wireless cards are not uncommon. It turns out SD cards were also built for general-purpose I/O devices, and so they actually talk a slightly different variation of SPI. You're gonna see sound cards over I2S and GSM modems, most of which are just pretending to be Hayes AT modems. Power management is all the rage. You'll see LEDs on your GPIO pins. Go look around in CyanogenMod and you'll see all sorts of different examples of this.
Linux doesn't care if they're on die. This is an example from a Snapdragon 820. A PCIe link, a UART, and a PCM chain are all used just for the wireless LAN and Bluetooth. This is so that they can turn Bluetooth audio into just another channel on the audio stack and completely abstract away the actual hardware. The PCIe is typically for wireless LAN connectivity, and the UART is probably for part of a baseband controller for the Bluetooth. Again, Linux doesn't care if these are on die. You could have these emulated somewhere; you could have them completely non-present. Another good example of this is the Nextbit Robin. The flash on the camera is done through GPIO. It's a bi-color LED, but all of the LEDs in the back, including the bottom notification LED, are actually run off of a small TI LED controller that has its own ISA and is its own complete state machine.
When it comes to bootloaders, there's one predominant game in the market: Das U-Boot. It has a very simple scripting language, it talks over a serial port, and it can pull over TFTP, HTTP, all sorts of stuff.
You can shove kernels at this thing over XMODEM. It doesn't care. All you have to tell it is "I wanna put this thing in memory at this place, and then jump to this place and execute." Some devices don't use U-Boot, though. Fastboot is a very common thing to see on phones. Some Linux tablets are also built around U-Boot, but more often on commercial devices you might see something else; Samsung has their own stage-2 bootloader, and such. It's all something interesting.
So let's talk about the life and death of an SoC-based device. First off, it does a DFU check to see if there's anything that wants to load fresh code onto it. Then it loads its initial program loader. This typically comes from the vendor and is typically burned straight onto the die at the factory. It does any signature checking of the initial image. Then it pulls in the bootloader for early UART and network wakeup, especially if it's a network-boot device. Then you get U-Boot loading the kernel into RAM, plus some other fun stuff. It kicks the kernel, and then you're in user space.
The fun shit is in the DFU, because that's the first chance you have to actually attack the device. If you can get in at DFU, then you can run whatever you want on it. You have full control over everything that's happening. If you can interrupt U-Boot or any other bootloader, then you can run your own kernel. Now you can start attacking other pieces. If you're gonna get an attack in userland, that's still fun. You can go and attack the actual surface that most people would be hitting.
In order to start a Linux system, you need a root file system. It contains the bare minimum to boot Linux: any shared object libraries and the binaries. On a lot of routers, much of the content actually gets tarballed up. You'll see this on a consistent basis. This means that tmpfs is gonna look really huge, and if you try to extract content from it, it's gonna look weird. Sometimes there are actually multiple root file systems.
Newer Android phones are running off of a more consistent A/B scheme, which means that you have one version of Android that is known good to boot, and you have an unknown version that has just been applied. You might find yourself scratching your head at times, especially if you've gotten a dump off of a live system, as to why you're not booting that. That might be why. Some devices might actually try NFS. There's a whole set of devices that internally have a bunch of ARM SoCs running DSPs, and they're gonna boot off of NFS because that's the easiest thing to do.
Attacking these devices comes down to scoping out your device. Get to know what makes your device actually run. Get to know what it's running in terms of Linux, what its software stack is, et cetera. If you're gonna start attacking Cortex-M0 devices and friends, then what you're gonna start doing is looking at what SoC it has, what sort of JTAG ports it has, what kind of backup stuff it has. Is there a known way to dump the code off of your chip?
ARM executables are really generic. ARM really doesn't care, as long as you can wake up the peripherals in the right way. Under Linux, this means that you don't have to compile for a particular chip. Debian has armhf, and that's for any ARM that has hard float. If you have AArch64 binaries, they'll run on any AArch64 device as long as they can load up their loader and their necessary shared libraries.
Hardware vendors are also dumb and lazy. You will see a lot of devices that are just variations on existing devices, so don't reinvent the wheel either. Derivative devices like this are gonna be really common, really popular, and they're gonna be consistent. CenturyLink, for example, sends out really consistently similar devices. Kobo puts out devices where the only difference is what kernel they're running internally. Everything else gets loaded at run time.
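As a rough illustration of the "get to know what it's running" step (a sketch added here, not something shown in the talk), the first few dozen bytes of any ELF binary pulled from a device already say which architecture family it targets and, for 32-bit ARM, whether it was built for the hard-float ABI that Debian calls armhf. The function name `describe_elf` and the short machine table are assumptions made for this example; the field offsets and constants come from the ELF format itself.

```python
import struct

# A few e_machine values from the ELF specification (not an exhaustive list).
EM_NAMES = {3: "x86", 8: "MIPS", 22: "s390", 40: "ARM", 62: "x86-64", 183: "AArch64"}
EM_ARM = 40
# ARM EABI v5 flag marking the hard-float calling convention. Toolchains that
# predate the flag leave it clear, so treat a False result as a hint only.
EF_ARM_ABI_FLOAT_HARD = 0x400


def describe_elf(header: bytes) -> dict:
    """Summarise the first 52+ bytes of an ELF file (hypothetical helper)."""
    if len(header) < 52 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header (need at least 52 bytes)")
    bits = {1: 32, 2: 64}.get(header[4])       # EI_CLASS
    order = {1: "<", 2: ">"}.get(header[5])    # EI_DATA
    if bits is None or order is None:
        raise ValueError("unrecognised ELF class or byte order")
    # e_machine sits right after the 16-byte ident and the 2-byte e_type.
    (machine,) = struct.unpack_from(order + "H", header, 18)
    # e_flags follows e_entry/e_phoff/e_shoff, whose width depends on the class.
    flags_offset = 36 if bits == 32 else 48
    (flags,) = struct.unpack_from(order + "I", header, flags_offset)
    info = {
        "bits": bits,
        "endian": "little" if order == "<" else "big",
        "machine": EM_NAMES.get(machine, "unknown (%d)" % machine),
    }
    if machine == EM_ARM:
        info["hard_float"] = bool(flags & EF_ARM_ABI_FLOAT_HARD)
    return info
```

Feeding it the first 64 bytes of, say, the busybox binary from an extracted root file system is enough to tell which emulator or cross toolchain to reach for.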
OWASP has a whole set of tasks devoted to looking at embedded IoT devices. Their little tool set can actually become really helpful for devices which talk over the internet, or when you're trying to figure out what's going on. Tools like Firmwalker and such are built to be turnkey: push this at your device and see what happens. They are intended to start interrogating devices and services, and especially running binaries, to see what's going on. If you are curious about this, there's a fantastic blog called the Firmware Security Blog. They have a whole list of tools.
So one option, especially if it's a Linux system, is to say, "It's a Unix system, I know this." If you've got a shell, beat against the shell. You have only what's on the target, only what's available at that moment. This is a bit like going into the wild with a bowie knife and a jar of piss. I mean, it's gonna work. You're gonna beat your head against a brick wall for a while, but you'll eventually find something. But you have no debugger. You have no compiler. You have no fuzzer. You have none of the tools that most people like to play with.
The second option is to black-box it entirely. Don't even try to pull apart the system; just attack whatever is externally visible. I'm not a lawyer. If you think you might be touching something that's gonna violate some NDAs, get a lawyer, but you are less likely to run into some secret this way. You are only attacking this externally, as though it were just a black box. But unfortunately you've lost the jar of piss and all you have is the bowie knife.
Both of these options suck. So you go, okay, let's reverse it. Pull out IDA, radare, any of your standard reversing tools, grab a beer, learn the ISA, and off you go. This is a great way to start with stuff like embedded IoT devices, like those light bulbs from Philips, IKEA, et cetera, your smartwatch, et cetera. The problem is: how the fuck do you get the binaries? We'll get to that.
But I'm a lazy asshole. I wanna fuzz this thing. I don't wanna learn IDA. Well, then you emulate it. You have every tool at your disposal if you're emulating. Debugger, fuzzer, cool. But you still have that problem: how the hell do you get your actual binary of interest? You get the root file system, especially if it is a Linux device. If it's an ARM, such as a standard embedded device, a lot of these tools are gonna apply as well.
Easy mode is update packages. These are probably the fastest way to get a root file system if you have a complete OTA; a lot of routers just ship an entire, complete version of the file system. They open up the disk, they say, "cool, here's my update," and they just plow right over whatever is there already. Sometimes they are actual executables. The downside is that sometimes they're encrypted. Sometimes they're really obfuscated. Sometimes they're actually intended to keep you from doing this.
The second trick is in vivo extraction. You're gonna need a shell. Hijack some administrative interfaces. Go back to step zero and start looking at what you can do. Is there a known command injection attack? Can you start netcat? Can you explore what's on the file system through some blind command injection? You might need some kind of packer. There's a lot of stuff built into most BusyBox implementations. You're gonna have to find a way to get it somewhere. Netcat is a good example. Curl can do some amount of push. You might actually have an httpd on the device to fall back on, if you can set up a symlink into whatever it's serving. And you might need some creativity. I've actually done this at one point: I simply had the device send its entire disk straight over the local network to a broadcast address. I used Wireshark to capture all the traffic and then pieced it back together.
So let's take a look at what that looks like. So this is me.
The first time I looked at the Technicolor C100T, this is a DSL router sent by CenturyLink. I've turned on the admin interface here and I'm just poking around at what the admin interface gives me. It gives you some information about memory allocation. There's some DNS redirecting that you can do. This is really cool. It's been running for a couple of minutes. I can actually muck with the WAN interface here. Let's see if we have a shell. Oh, look, we have a root shell. The world is our oyster at this point.
So let's go peek around. They put fucking plaintext credentials all over ps. But what I'm starting to see is stuff like gdbserver. I'm starting to see config stuff, TFTPD. I'm starting to find some really cool stuff. If you look very carefully there, you can actually flash from wget straight to the device. Yeah, this is definitely the "is this security?" part. This is available to CenturyLink over their back door.
So I start figuring out: can I netcat this thing off? And then I go and plug a flash drive into the back, because this is also meant for hanging printers off of for printer sharing. So I'm gonna copy over the MTD flash blocks directly. There's a handful of them. I don't care how many it has; I'm gonna copy them all over just because I think I can. Some time passes. This takes about 10 minutes in real time.
And let's go poke around some more. Turns out we have scratchpad stuff. There's actually some arbitrary memory read and write that I end up finding here. This is some useful information about how MTD is laid out. It looks like there's a rootfs and a rootfs_update which do not appear to be intentionally overlapping, but they are overlapping entirely. I know what kinds of file systems it has access to, and I'm just kind of playing around. So what did we get out of that? We got a whole bunch of information.
We know that this thing has read and write commands that can touch arbitrary chunks of memory, that it can do things like DNS redirection, and that this is available to CenturyLink through the administrative interface. But more importantly, we got the full root file system. These are ext2 file system blobs just hanging out on a flash drive now.
The other method, surprise, is direct extraction. A lot of these devices have an SD card, so pull it out and image it. And you're telling me, "okay asshole, tell me something I can't already figure out my own damn self." SD cards are hiding in fucking plain sight. There are Kickstarters for devices that are literally just a Raspberry Pi. There's an anti-villain box, for example, which is intended for Tor security. Over on the bottom left is the first-generation Amazon Kindle. It just had a standard off-the-shelf two-gig SD card that you could pull. Oh, you need more space? We'll put a fucking 16-gig card in there. It doesn't care. Kobo is the same way. That's an SD card, just a SanDisk SD card, which you can pull out, image, expand the file systems on, and off you go.
eMMC is just a variation on SD. They're actually both simply MultiMediaCards. It can be done, but you're gonna have to pull some stuff. You're gonna need to understand how the disk is laid out in the end. Having some in vivo information is really helpful: doing some reconnaissance, shelling into your device, playing around like a traditional UNIX hacker would when given an unknown random machine. Some things are made a little bit harder by the fact that eMMC is now also bound into eMCP, which is LPDDR and eMMC combined. However, all of this is made simple by China, because they need to work on iPhones, and iPhones use eMMC devices. So these are SD card adapters that you slot your eMMC device into, and they turn it into a USB device or an SD card. Or you can make your own. It's just an SD card; it talks 4-bit. It's a little bit slow, but it works.
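For the "understand how the disk is laid out" step, here is a minimal sketch (an addition, not from the talk) that reads a classic MBR partition table out of the first sector of a card image. It assumes 512-byte sectors and an MBR rather than GPT or a vendor-specific layout, which an eMMC dump may well use instead; `parse_mbr` is a hypothetical helper name.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def parse_mbr(sector0: bytes) -> list:
    """Return the non-empty primary partitions listed in an MBR sector."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        start = 446 + 16 * index  # four 16-byte entries begin at offset 446
        entry = sector0[start:start + 16]
        # status, CHS start, type, CHS end, first LBA, sector count
        status, _chs_first, ptype, _chs_last, lba_start, sectors = struct.unpack(
            "<B3sB3sII", entry
        )
        if ptype == 0 or sectors == 0:
            continue  # unused slot
        partitions.append({
            "index": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "offset_bytes": lba_start * SECTOR_SIZE,
            "size_bytes": sectors * SECTOR_SIZE,
        })
    return partitions
```

Each `offset_bytes` value is what you would hand to `losetup --offset` or `mount -o loop,offset=...` to get at a single partition inside the image.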
And these things are everywhere. These are Raspberry Pi based devices. The ModBerry and Revolution Pi are both DIN-mountable. These are going into industrial control systems.
When all else fails, solder to the rescue. You might need to desolder some storage, especially if you're on eMMC. You might need to drag out some traces. You're gonna look for JTAG. If you're interested, this weekend go check out the Hardware Hacking Village and start looking at what you can do. At this point you might be looking for logic analyzers. Saleae makes a really good one. It's cheap, runs over USB, and does a lot of the decoding for you. There are also hardware interfaces to talk to a lot of this stuff. On the left is the Bus Pirate. It is an open source, SRAM-backed protocol analyzer. It has a lot of stuff baked in. On the right is the SPIDriver. It's a little USB device that lets you make up an SPI interface, but you can run it at any speed. You can see and visualize what's going on on the wire.
So now that we have that, what the hell do we do? You try extracting it, you mount it, you take a look at what you've just pulled. eMMCs regularly have real partition tables, and SD cards more than likely do, because somebody on a PC actually had to touch this thing. Go back to the reconnaissance, go back to step zero, and see what you found. But then, what automation can you put to work? Look at binwalk. Binwalk is a fantastic tool, especially for anybody who is getting into reverse engineering. Tools like PhotoRec might actually be useful. You might have to get a little bit more creative. losetup and friends can do a lot of stuff, like finding partition tables. They can find certain records that mark that this is a file system. If you're only looking for stuff to feed IDA and radare once you have your binary of choice, automation might be the easy way to get it, and this is kind of where you're gonna stop. However, if you're looking to actually attack this thing live, you want something like QEMU.
Because the problem is, these devices are slow. The device that sits and does my internet at home is an 18 megahertz device. It has shit all for RAM. I don't want to sit and try to figure out how to cross compile my fuzzer, run it on there, and try to beat it against the thing. It takes five minutes to boot, and every time I touch a bad chunk of memory it reboots, and then I'm down for another five minutes and back to where I started.
So QEMU lets you do some stuff. It's a simple, fast processor emulator for all sorts of things: mainframes, ARM, MIPS, OpenRISC. You can pretend you're an STM32F4, or you can pretend that you are a DEC Alpha. You can run OS X on Amiga. You can even run Haiku on BeOS.
There are two ways to run QEMU: as a full-fat VM or as a translating loader. With the full-fat VM, you have full control over everything. You are the hardware. This is an in-circuit emulator for everything. You've got GDB at kernel level, because you can step through the entire processor. It requires zero trust in whatever binary you're working with, especially if you've got something that you think might be malicious. You probably want a special kernel if you're working with this. However, there are a lot of ways to make QEMU do what you need it to do. Any tools that you need, you're gonna have to push into the target environment. And I hate cross compilers.
So that's why I typically use a translating loader. You have access to whatever you're doing. The downside to translating loaders is that it's kind of like Wine, except it can't run Windows executables. It's intended to run Linux executables directly. You can run it in a container, so now you can completely automate a process and run it in parallel. You don't need a container, though.
So this one is actually a full-fat VM. I have a friend that needed some work done.
This is a nine-track tape drive from Overland Data, with a laptop running MS-DOS 6 as a full-fat VM, interfacing with real hardware over the parallel port. Here I'm actually reading blocks out of an AOS install tape from a Data General mainframe that we had picked up. This is all just pretend to the system. It thinks it's running on an old DOS machine. It's old, it's slow. I'm running on a Core 2 Duo. This doesn't require a whole lot of power, all things considered. But what it does mean is that you don't have to worry about whether the hardware is actually there. Do I have to drag out a DOS machine? No, you just boot DOS out of QEMU.
As a loader, you're relying on binfmt. Long ago, Linux gained the ability to say, "hey, this is the interpreter for this type of executable." It was originally for running JARs, so you can run JARs from the command line. Turns out it's a great place to put emulators. QEMU has a static version, which runs entirely in user space, and it uses the magic number system that's baked into binfmt. Debian ships this in their binfmt package and a couple of others.
Without a container, it's dumb simple to set up. You just call qemu-whatever-static, you give it your binary, and it all goes. You have to trust that your executable isn't malicious. You also have to have all of your local libraries match whatever libraries it had. This works best for big, static, monolithic executables like BusyBox. With a container, however, you can bring that whole root file system along. You can bring those weird versions of glibc. You effectively have a little jail. You can run in Docker. You can run in systemd's machine containers. It's great for when your binary is linked against some weird "oh, we had this idea, let's go mangle glibc" library.
For this next demo, we're gonna look at a piece of software that's running on what is more normally known as the IBM z14. If anybody is familiar with s390x, it's actually compatible with the System/360.
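To make the "magic number system" behind the binfmt-based loader a little more concrete, here is a small model (added for illustration, not shown in the talk) of how a masked magic comparison picks an interpreter from the first bytes of a file. The two rules are written in the style of the entries QEMU registers with binfmt_misc, but treat the exact byte strings, the interpreter paths, and the `pick_interpreter` helper as assumptions for this sketch rather than a copy of any distribution's configuration.

```python
# Each rule: (name, magic, mask, interpreter). A mask bit of 0 means "ignore this bit".
ARM_MAGIC = b"\x7fELF\x01\x01\x01\x00" + bytes(8) + b"\x02\x00\x28\x00"
ARM_MASK = b"\xff" * 7 + b"\x00" + b"\xff" * 8 + b"\xfe\xff\xff\xff"
S390X_MAGIC = b"\x7fELF\x02\x02\x01\x00" + bytes(8) + b"\x00\x02\x00\x16"
S390X_MASK = b"\xff" * 7 + b"\x00" + b"\xff" * 8 + b"\xff\xfe\xff\xff"

RULES = [
    ("arm", ARM_MAGIC, ARM_MASK, "/usr/bin/qemu-arm-static"),        # assumed path
    ("s390x", S390X_MAGIC, S390X_MASK, "/usr/bin/qemu-s390x-static"),  # assumed path
]


def binfmt_matches(header: bytes, magic: bytes, mask: bytes) -> bool:
    """True if header equals magic on every bit that the mask keeps."""
    if len(header) < len(magic):
        return False
    return all(((h ^ g) & m) == 0 for h, g, m in zip(header, magic, mask))


def pick_interpreter(header: bytes):
    """Return (rule name, interpreter) for the first matching rule, else None."""
    for name, magic, mask, interpreter in RULES:
        if binfmt_matches(header, magic, mask):
            return name, interpreter
    return None
```

The `0xfe` in each mask is what lets both ordinary executables (`e_type` 2) and position-independent ones (`e_type` 3) hit the same rule, and the zeroed byte ignores the OS ABI field.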
So this is a quick user demo. So here we are. We're starting an s390x container from systemd. We have, you know, the standard thing. Here you can see that it's changed a little bit in htop, but to everything inside, it's a standard s390x system. But for fun, we can also do this for ARM. Again, my machine just says, "yeah, you're an ARM system." Never mind that this is actually internally just an x86 box. It's actually a VM running inside a VM.
QEMU in this way is really astonishingly powerful. The big part is that you can run AFL internally. AFL is a fantastic fuzzer, and AFL has support for QEMU. There's a little bit of setup. You have to bring a copy of QEMU built for AFL for your target alongside. I've done this in a VM. It's really slow, and it works. So here we are: we're gonna run our s390x container, and we're gonna bind a couple of things over from the Debian machine that I'm running this on. I've compiled AFL for x86 already. Here's our target executable. It's a 64-bit IBM S/390 executable. We're gonna tell AFL, "hey, here's your AFL path." And AFL starts up. It's got a couple of small, simple test inputs. And this is actually really slow. But here we can see AFL attacking an s390x binary. I don't have s390x hardware. I have a cheap laptop with a VM.
So, what we learned is that hardware vendors are lazy. We saw that a lot of these devices are gonna be very similar to each other. There is a lot of duplication in so many things. Attacking hardware means getting really creative. You're gonna see a lot of stuff. That whole process of pulling off a CenturyLink router's firmware was the first time I had seen the device, and it took me just over an hour to get it, take it, and peek around. You spend a lot of time doing reconnaissance. You start looking at what your device is doing. QEMU is pretty neat. It is fundamentally a way to make a piece of software believe that it is on the actual piece of hardware that it expects.
QEMU can actually be used for way more. There is, again, an STM32 port. You can run it on embedded ARM. You can run QEMU on ARM to pretend that it's ARM, if you have the wrong kind of ARM. For a little while I was doing this so that I could use ARM64 binaries on ARM32 systems. It's slow, but it works. AFL also runs really slowly when you're emulating x86 stuff on s390x, or the other way around.
And remember rule zero, okay? This is the big part here. That little rule zero: a lot of reconnaissance and a little bit of planning can save you hours and hours of headache. Sometimes simply noticing that there's an eMMC device means, "oh, I'll desolder it, pop it into a reader, and I'm done; I don't have to worry about how I'm gonna extract this." Then you find out that it's encrypted. Well, all right, then you go back and start looking. You start sniffing the wires. A lot of creativity comes with things like when fail0verflow did their attack on the PlayStation 4. They found out it was just an x86 system. So what they did is intercept a lot of the communications between two sides by running PCIe over a UART at 9600 baud, so they could watch each frame visually as it went through their entire system.
And people are lazy. Remember that the Xbox One, the PS4, and probably upcoming versions are just x86 systems. They're just putting newer and more creative forms of control on how code gets loaded. Attack early and find interesting ways in. You will pore over documents in broken Chinese. You will pore over documents in broken English. I have spent many hours sitting in front of datasheets that I copied and pasted out of a PDF into Google Translate, because the company that makes the part only documents it in Chinese, because all the engineers are Chinese. You'll see stuff. Don't be afraid to look for TFTP. Your device might actually pop open TFTP early. Meraki devices did this for a long time, and it was really interesting. Yes?
The comment was that a lot of consumer routers will boot from TFTP if they can't boot from their normal file system. And yeah, that happens. If you can induce that failure, it might be as simple as a paperclip. A paperclip is how you defeat the lockout mechanism on a lot of older ThinkPads. You cause the thing to misread the boot configuration as it boots, and out comes a blank slate. You go back into the boot configuration, the BIOS, and it goes, "I don't have a password. All the encryption keys have been wiped." So I just have to reboot and give no encryption keys. This sort of stuff is pretty common.
Again, if you're interested, go look at the Hardware Hacking Village. Explore what hardware engineers have to build, because you will wonder, "what the hell was that EE drinking on that day?" Because people put PCIe devices on ARM. There's even a GPS receiver there as well, and that's going over its own custom proprietary bus. There are many different ways to attack these. AFL is an amazing tool if you can get it working. Great.
So, any questions? Yes? So the question was: for an SoC, how do I emulate peripherals in QEMU? I've worked predominantly in environments where I don't have to worry about peripherals. I know that there is a framework for saying, "when you write to this memory address, really write to this device." Check the QEMU documentation. They have a lot on how to emulate external peripherals. Something like the SPIDriver or Bus Pirate can be used as a bridge to the real hardware as well. You would have to check the documentation. Remember, I can't see all the way back.
Fantastic. Thank you so much. There are more resources available on the internet. Definitely check them out if you're interested in more of this. At Recon 2010 there was a fantastic talk by Igor Skochinsky on reverse engineering for PC reversers, and there's the fantastic "JTAG Explained" article. Go look at eLinux and Linux-MIPS. There are many targets that you can attack.
There are ARM devices and embedded Linux devices all around you, like right here. This is a 100 megahertz ARM chip. There are ARM devices on your body. There are possibly ARM devices in your body. There are embedded systems everywhere. Keep on hacking.