The Layman's Guide to Zero-Day Engineering is our next talk, and my colleagues out in Austin who run the Pwn2Own contest assure me that our next speakers are really very much top of their class, and I'm really looking forward to the talk for that. A capture-the-flag contest like that requires having done a lot of your homework up front so that you have the tools at your disposal at the time, so that you can win. And Markus and Amy are here to tell us something way more valuable than the actual tools that they found: how they actually arrived at those tools, and the process of getting there. And I think that is going to be a very valuable recipe or lesson for us. So please help me welcome Markus and Amy to a very much anticipated talk. All right. Hi everyone. Thank you for making it out to our talk this evening. I'd like to start by thanking the CCC organizers for inviting us out here to give this talk. This was a unique opportunity for us to share some of our experience with the community, and we're really happy to be here. So yeah, I hope you enjoy. Okay. So who are we? Well, my name is Markus Gaasedelen. I sometimes go by the handle gaasedelen, which is my last name, and I'm joined here by my co-worker Amy, who's also a good friend and longtime collaborator. We work for a company called Ret2 Systems. Ret2 is best known publicly for its security research and development. Behind the scenes we do consulting, and we've been pushing to improve the availability of security education and specialized security training, as well as raising awareness and sharing information like you're going to see today. So this talk has been structured roughly to show our approach to breaking some of the world's most hardened consumer software. In particular, we're going to talk about one of the zero-days that we produced at Ret2 in 2018. And over the course of the talk, we hope to break some common misconceptions about the process of zero-day engineering.
We're going to highlight some of the observations that we've gathered about this industry and this trade over the course of many years now. And we're going to try to offer some advice on how to get started doing this kind of work as an individual. So we're calling this talk a non-technical commentary about the process of zero-day engineering. At times it may seem like we're stating the obvious, but the point is to show that there's less magic behind the curtain than most of you spectators probably realize. So let's talk about Pwn2Own 2018. For those that don't know, Pwn2Own is an industry-level security competition organized annually by Trend Micro's Zero Day Initiative. Pwn2Own invites the top security researchers from around the world to showcase zero-day exploits against high-value software targets such as premier web browsers, operating systems, and virtualization solutions such as Hyper-V, VMware, VirtualBox, Xen, whatever. So at Ret2 we thought it'd be fun to play Pwn2Own this year. Specifically, we wanted to target the competition's browser category. We chose to attack the Apple Safari web browser on macOS because it was new to us, it was mysterious, but also to avoid any prior conflicts of interest. And so for this competition we ended up developing a type of zero-day typically known as a single-click RCE, or a Safari remote, in industry language. What this means is that we could gain remote root-level access to your MacBook should you click a single malicious link of ours. That's kind of terrifying. A lot of you might feel like you'd never click a malicious link or get spear-phished, but it's so easy. Maybe you're in a coffee shop, maybe someone just man-in-the-middles your connection. It's a pretty scary world. So this is actually a picture that we took on stage at Pwn2Own 2018, directly following our exploit attempt.
This is actually Joshua Smith from ZDI holding the competition machine after our exploit had landed, unfortunately a little bit too late. But the payload at the end of our exploit would pop Apple's calculator app and a root shell on the victim machine. This is usually used to demonstrate code execution. For fun, we also made the payload change the desktop background to the Ret2 logo, so that's what you're seeing there. So what makes this zero-day a fun case study is that we had virtually no prior experience with Safari or macOS going into this event. We literally didn't have a single MacBook in the office; we actually had to go out and buy one. And so as a result you get to see how we, as experienced researchers, approach new and unknown software targets. So I promised that this was a non-technical talk, which is mostly true. That's because we already published all the nitty-gritty details of the entire exploit chain as a verbose six-part series on our blog this past summer. It's hard to make highly technical talks fun and accessible to all audiences, so we've reserved much of the truly technical stuff for you to read at your own leisure; it's not a prerequisite for this talk. So don't feel bad if you haven't read those posts. With that in mind, we're ready to introduce you to the very first step of what we're calling the layman's guide to zero-day engineering. At the start of this talk I said we'd be attacking some of the most high-value and well-protected consumer software. This is no joke, right? This is a high-stakes game. So before any of you even think about looking at code or searching for vulnerabilities in these products, you need to set some expectations about what you're going to be up against. And so this is a picture of you. You might be a security expert, a software engineer, or even just an enthusiast.
But through some odd twist of self-loathing, you find yourself interested in zero-days and the desire to break some high-impact software like a web browser. But it's important to recognize that you're looking to defy some of the largest and most successful organizations of our generation. These types of companies have every interest in securing their products and building trust with consumers. These vendors have steadily been growing their investments in software and device security, and that trend will only continue. You see cybersecurity in headlines every day: hacking, systems compromised. It's only getting more attention. There's more money than ever in this space. So this is a beautiful mountain peak that represents your mission of "I want to craft a zero-day." But your ascent up this mountain is not going to be an easy task. As an individual, the odds are not really in your favor. This game is sort of a free-for-all, and everyone is at each other's throats. So in one corner is the vendor, who might as well have infinite money and infinite experience. In another corner is the rest of the security research community, fellow enthusiasts, and all the other threat actors. All of you are going to be fighting over the same terrain, which is the code. This is unforgiving terrain in and of itself, but the vendor has home-field advantage. So these obstacles are not fun, but it only gets worse for you. Newcomers often don't prepare themselves for the kind of timescales they should expect when working on these types of projects. For those of you who are familiar with the capture-the-flag circuit, those competitions are usually time-boxed to 36 to 48 hours, normally over a weekend. We came out of that circuit. We love the sport. We still play. But how long does it take to develop a zero-day? Well, it can vary a lot. Sometimes you get really lucky. I have seen someone produce a Chrome/V8 bug in two days.
Other times it's taken two weeks. Sometimes it takes a month. But sometimes it can take a lot longer to study and exploit new targets. You need to be looking at time on these kinds of scales. It could take three and a half months. It could take maybe even six months for some targets. The fact of the matter is that it's almost impossible to tell how long the process is going to take you. And so unlike a CTF challenge, there's no upper bound to this process of zero-day engineering. There's no guarantee that the exploitable bugs you need to make a zero-day even exist in the software you're targeting. You also don't always know what you're looking for. And you're working on projects that are many orders of magnitude larger than any sort of educational resource. We're talking millions of lines of code, whereas your average CTF challenge might be a couple hundred lines of C at most. So I can already see the fear and self-doubt in some of your eyes. But I really want to stress that you shouldn't be too hard on yourself about this stuff. As a novice, you need to keep these caveats in mind and accept that failure is not unlikely on this journey. So please check this box before watching the rest of the talk. So, having built some psychological foundation for the task at hand, the next step in the layman's guide is what we call reconnaissance. This is kind of a goofy slide, but yes, even Metasploit reminds you to start out doing recon. With regard to zero-day engineering, discovering vulnerabilities in large-scale software can be an absolutely overwhelming experience. Like that mountain: where do I start? What foothill do I go up? Where do I even go from there? To overcome this, it's vital to build foundational knowledge about the target. It's also one of the least glamorous parts of the zero-day development process, and it's often skipped by many.
You don't see any of the other speakers really talking about this so much. You don't see blog posts where people say, "I Googled for eight hours about Apple Safari before writing a zero-day for it." So you want to aggregate and review all existing research related to your target. This is super, super important. So how do we do our recon? Well, the simple answer is: Google everything. This is literally us Googling something. And what we do is we go through and we click and we download and we bookmark every single thing for about five pages. And you know all those links that you never click at the bottom of Google, the "related searches you might want to look at"? You should definitely click all of those. You should also go through at least four or five pages of results and keep downloading and saving everything that looks remotely relevant. So you just keep doing this over and over and over again. And you just Google and Google and Google everything that you think could possibly be related. The idea is that you want to grab all this information. You want to understand everything you can about this target, even if it's not Apple Safari specific; I mean, look into V8, look into Chrome, look into Opera, look into Chakra, look into whatever you want. So the goal is to build up a library of security literature related to your target and its ecosystem. And then I want you to read all of it. But don't force yourself to understand everything in your stack of literature. The point of this exercise is to build additional context about the software, its architecture, and its security track record. By the end of the reconnaissance phase, you should aim to be able to answer these kinds of questions about your target: What is the purpose of the software? How is it architected? Can anyone describe WebKit's architecture to me? What are its major components? Is there a sandbox around it? How do you debug it?
How do the developers debug it? Are there any tips and tricks? Are there special flags? What does the security track record look like? Does it have historically vulnerable components? Are there existing write-ups, exploits, or research on it? Et cetera. All right, we're through reconnaissance. Step two is going to be target selection. There are actually a few different names you could give this. Technically, we're targeting Apple Safari, but you want to try and narrow your scope. And so what we're looking at here is a treemap visualization of the WebKit source. The Apple Safari web browser is actually built on top of the WebKit framework, which is essentially a browser engine. It's open source. So yeah, this is a treemap visualization of the source directory where files are sorted by size. Each of those boxes is essentially a file. Well, the big gray boxes are directories; all the sub-squares are files. And each file is sized based on its lines of code. The blue hues represent the approximate maximum cyclomatic complexity detected in each source file. And you might be getting flashbacks to that picture of the mountain peak. How do you even start to hunt for security vulnerabilities in a product or code base of this size? Three million lines of code. You know, maybe you've written 100,000 lines of C or C++ in your life, let alone read or reviewed three million. So the short answer to breaking this problem down is that you need to reduce your scope of evaluation and focus on depth over breadth. And this is most critical when attacking extremely well-picked-over targets. You know, if you're probing an IoT device, you can probably just sneeze at that thing and you're going to find vulnerabilities. But you're fighting on a very different landscape here, and so you need to be very detailed with your review. So: reduce your scope.
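The per-file line counts behind a treemap like this are easy to gather yourself. Below is a minimal sketch of that tally; this is not the tool we used, and the file-extension list and blank-line handling are arbitrary choices:

```python
import os

def loc_by_file(root, exts=(".cpp", ".h", ".c")):
    """Count non-blank lines per source file under root (a rough LOC metric)."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    counts[path] = sum(1 for line in f if line.strip())
            except OSError:
                pass  # unreadable file; skip it
    return counts

def largest(counts, n=10):
    """Largest files first -- candidates for the biggest treemap boxes."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Feeding these counts into any off-the-shelf treemap library reproduces the kind of picture on the slide.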
Our reconnaissance and past experience with exploiting browsers led us to focus on WebKit's JavaScript engine, highlighted up here in orange. Bugs in JS engines, when it comes to browsers, are generally regarded as extremely powerful. But they're also few and far between, and they're becoming more rare as more of you are looking for bugs; more people are colliding, and the bugs are dying quicker. So anyway, let's reduce our scope. We reduce our scope from three million down to 350,000 lines of code. Here we'll zoom into that orange region. So now we're looking at the JavaScript engine specifically: the JavaScriptCore directory. This is the JavaScript engine within WebKit as used by Safari on macOS. And to further reduce our scope, we chose to focus on the highest-level interface of JavaScriptCore, which is the runtime folder. This contains code that maps almost one-to-one to JavaScript objects and methods in the interpreter. So for example, Array.reverse, or concat, or whatever; it's very close to what you JavaScript authors are familiar with. And so this is what the runtime folder looks like, at approximately 70,000 lines of code. When we were spinning up for Pwn2Own, we said: OK, we are going to find a bug in this directory, in one of these files, and we're not going to leave it until we have walked away with something. So if we take a step back now, this is what we started with and this is where we ended up. We've reduced our scope. This helps illustrate the whittling process. It was almost a little bit arbitrary. Previously, there had been a lot of bugs in the runtime directory, but it's really been cleaned up over the past few years. So anyway, this is what we chose for our RCE. So, having spent a number of years going back and forth between attacking and defending, I've come to recognize that bad components do not get good fast.
Usually, researchers are able to hammer away at these components for years before they reach some level of acceptable security. So for our sandbox escape, we simply looked at the security trends covered during the reconnaissance phase. This observation, that historically bad components often take years to improve, is why we chose to look at WindowServer. For those that don't know, WindowServer is a root-level system service that runs on macOS. Our research turned up a trail of ugly bugs in WindowServer, which is accessible from the Safari sandbox. In particular, when we were doing our research, we were looking at ZDI's website, where you can search all the advisories they've disclosed. In 2016, there were over 10 vulnerabilities reported to ZDI that were used as sandbox escapes or privilege-escalation-style issues. And these are only the vulnerabilities that were reported to ZDI. If you look at 2017, there were four, all again used for the same purpose. I think all of these were actually used at Pwn2Own in both years. And then in 2018, there was just one. And so over this span of three years, people were hitting the same exact component; Apple, or researchers around the world, could have been watching and listening, finding bugs, and fighting over this same land. It's pretty interesting. I mean, that gives some perspective. The fact of the matter is that it's really hard for bad components to improve quickly. Nobody wants to sit down and try to rewrite bad code, and vendors are absolutely terrified of shipping regressions. Most vendors will patch or modify old, bad code only when they absolutely must; for example, when a vulnerability is reported to them. And so, as listed on this slide, there are a number of reasons why a certain module or component might have a terrible security track record.
Just try to keep in mind that such a component is usually a good place to look for more bugs. So if you see a waterfall of bugs this year in some component like WASM or the JIT, maybe you should be looking there, right? Because that might be good for a few more years. All right. Step three: after all this talk, we're finally getting to the point where we can start probing and exploring the code base in greater depth. This step is all about bug hunting. As an individual researcher or a small organization, the hardest part of the zero-day engineering process is usually discovering an exploitable vulnerability. That's just from our perspective; this can vary from person to person. But we don't have $100 million to spend on fuzzers, for example. We literally have one MacBook, right? And so it's kind of like looking for a needle in a haystack. We're also well-versed in the exploitation process itself, so that part ends up being a bit more formulaic for us. So there are two core strategies for finding exploitable vulnerabilities. There are a lot of pros and cons to both of these approaches, but I don't want to spend too much time talking about their strengths or weaknesses; they're all listed here. The short summary is that fuzzing is the main go-to strategy for many security enthusiasts. One of the key perks is that it's scalable, and it almost always yields results. Spoiler alert: later in this talk, you're going to see that we fuzzed both of our bugs, both the bugs that we used for our full chain. And it's 2018, and these things are still falling out via some very trivial means. OK, so source review is the other main strategy. Source review is often much harder for novices, but it can produce some high-quality bugs when performed diligently. If you're looking to just get into this stuff, I would say start real simple: start with fuzzing and see how far you get.
So yeah, for the purposes of this talk, we're mostly going to focus on fuzzing. This is a picture of the dashboard of a simple, scalable fuzzing harness we built for JavaScriptCore. This is from when we were ramping up for Pwn2Own and trying to build our chain. It was a grammar-based JavaScript fuzzer based on Mozilla's Dharma. There is nothing fancy about it. This is a snippet of what some of its output looked like. We had only started building it out when we actually found the exploitable vulnerability that we ended up using. So we haven't really played with it much since then, but it shows how easy it was to get where we needed to go. So something we like to stress heavily to the folks who fuzz is that it really must be treated as a science for these competitive targets. Guys, I know code coverage isn't the best metric, but you absolutely must use some form of introspection to quantify the progress and reach of your fuzzing. Please don't just fuzz blindly. Our fuzzer would generate web-based code-coverage reports for our grammars every 15 minutes or so. This allowed us to quickly iterate on our fuzzer, helping it generate more interesting, complex test cases. A good target is 60% code coverage; you can see that in the upper right-hand corner. That's kind of what we were shooting for. Again, it really varies from target to target. This was also just us focusing on the runtime folder, as you can see in the upper left-hand corner. And so something that we've observed, again, over many targets and exotic targets, is that bugs almost always fall out of what we call the hard-fought final coverage percentages. What this means is that you might work for a while trying to build up your coverage, trying to build a good set of test cases or grammars for fuzzing. And then you'll hit that 60% and you'll be like, OK, what am I missing now? Everyone gets to 60%, let's say. But once you start inching a little bit further is when you start finding a lot of bugs.
And so, for example, we'll pull up code and we'll be like: why did we not hit those blocks up there? Why are those boxes gray? Why did we never hit those in our millions of test cases? And we'll go find that it's some weird edge case or some unoptimized condition or something like that. And we will modify our test cases to hit that code. Other times, we'll actually sit down, pull it up on our projector, and talk through some of that code. And we'll be like, what the hell is going on there? This is actually a live photo that I took during our Pwn2Own hunt. As cliche as this picture is, hackers standing in front of a dark screen in a dark room, this was absolutely real. We were just reading some code. And so it's good to rubber-duck among coworkers and to hash out ideas, to help confirm theories or discard them. And so, yeah, this kind of leads us to the next piece of advice, for when you're doing source review. This applies both to debugging and to assessing those corner cases and whatnot. If you're ever unsure about the code that you're reading, you absolutely should be using debuggers and dynamic analysis. As painful as it can be to set up JavaScriptCore or to debug this massive C++ application that's dumping these massive call stacks 100 frames deep, you need to learn those tools. You're never going to be able to hold the amount of context necessary for some of these bugs and complex code otherwise. So for example, one of our blog posts makes extensive use of rr to root-cause the vulnerability that we ended up exploiting. It was a race condition in the garbage collector, a totally wild bug. I'd say there were probably three people on Earth who could have spotted this bug through source review. It required immense knowledge of the code base, in my opinion, to be able to recognize it as a vulnerability. We found it through fuzzing. We had to root-cause it using time-travel debugging with Mozilla's rr, which is an amazing project.
And so yeah, absolutely use debuggers. This is an example of a call stack. Again, just using a debugger to dump the call stack from a function that you're auditing can give you an insane amount of context as to how that function is used, what kind of data it's operating on, and what areas of the code base it's called from. You're not actually supposed to be able to read this slide; it's a backtrace from GDB that is 40 or 50 frames deep. All right. So there's this huge misconception among novices that new code is inherently more secure, and that vulnerabilities are only ever being removed from code bases, not added. This is almost patently false, and it's something that I've observed over the course of several years, countless targets, and code from all sorts of vendors. There's this really great blog post put out by Ivan from Project Zero this past fall. One year ago, he fuzzed WebKit using his fuzzer, called Domato. He found a bunch of vulnerabilities, he reported them, and then he open-sourced the fuzzer. But then this year, this fall, he downloaded his fuzzer and ran it again with little to no changes, just enough to get things up and running. And he found another 8-plus exploitable use-after-free vulnerabilities. What's really amazing about this is that, when you look at these last two columns that I've highlighted in red, virtually all the bugs he found had been introduced or regressed in the past 12 months. So yes, new vulnerabilities get introduced every single day. The biggest reason new code is considered harmful is simply that it hasn't had years to sit in the market. This means it hasn't had time to mature. It hasn't been tested as exhaustively as the rest of the code base. As soon as a developer pushes it, whenever it hits release, whenever it hits stable, that's when you have a billion users pounding on it, let's say on Chrome. I don't know how big that user base is, but it's massive.
And that's a billion users around the world just using the browser, who are effectively fuzzing it just by browsing the web. And so of course you're going to manifest interesting conditions that cover things that are not in your test cases or unit tests. So yeah, the second point down here is that it's not uncommon for new code to break assumptions made elsewhere in the code base. This is actually extremely common. The complexity of these code bases can be absolutely insane, and it can be extremely hard to tell if, let's say, some new code that Joe Schmo, the new developer, adds breaks some paradigm held by, let's say, the previous owner of the code base. Maybe he doesn't understand it as well. Or maybe it's an expert developer who just made a mistake. It's super common. Another piece of advice; this should be a no-brainer for bug hunting. Novices often grow impatient and start hopping around between code and functions, getting lost, or trying to chase use-after-frees or other bug classes without truly understanding what they're looking for. A great starting point is always to identify the sources of user input, or the ways that you can interface with a program, and then just follow the data. Follow it down. What functions parse it? What manipulates your data? What reads it? What writes to it? Just keep it simple. And so when we were looking for our sandbox escape, we knew we were looking at WindowServer. We didn't know anything about macOS, but we read this blog post from Keen Lab that said there are all these functions that you can send data to in WindowServer. Apparently there are about 600 of them, all prefixed with "__X". And so these 600 endpoints will parse and operate upon data that we send to them.
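As a sketch of that enumeration step: handler names like these can be pulled straight out of a binary's symbol table, e.g. from `nm`-style output. The sample output below is made up for illustration; real WindowServer symbol names differ:

```python
# Hypothetical nm-style output (address, symbol type, name per line).
SAMPLE_NM = """\
0000000000001a20 T __XCreateWindow
0000000000001b40 T __XMoveWindow
0000000000001c00 t _internal_helper
0000000000001d10 T __XSetWindowProperty
0000000000001e00 T _CGXServer
"""

def x_endpoints(nm_output):
    """Pull out exported (type 'T') symbols carrying the __X handler prefix."""
    endpoints = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "T" and parts[2].startswith("__X"):
            endpoints.append(parts[2])
    return endpoints
```

A list like this gives you the full attack surface to aim a fuzzer at, before you understand what any individual endpoint does.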
And so, to draw a rough diagram, there's essentially this big red data tube from the Safari sandbox to the WindowServer system service. This tube can deliver arbitrary data that we control to all those 600 endpoints. We immediately thought: let's just try to man-in-the-middle this data pipe so that we can see what's going on. And so that's exactly what we did. We just hooked Frida up to it; Frida is another open-source DBI, it's on GitHub, it's pretty cool. And we were able to stream all the messages flowing over this pipe. So we could see all this data being sent into WindowServer from all sorts of applications, actually. Everything on macOS talks to this. WindowServer is responsible for drawing all your windows on the desktop, your mouse clicks, your whatever. It's kind of like explorer.exe on Windows. So we see all this data coming through. We see all these crazy messages, all these unique message formats, all these data buffers being sent in. And this was just begging to be fuzzed. And so we said: OK, let's fuzz it. I remember we were getting all hyped, and I distinctly remember thinking maybe we can jury-rig AFL into WindowServer, or let's mutate these buffers with Radamsa, or... why don't we just try flipping some bits? And so that's what we did. Halvar actually had a very timely tweet just a few weeks back that echoed this exact experience. He said: looking at my security/vulnerability research career, my biggest mistakes were almost always trying to be too clever. Success hides behind "what is the dumbest thing that could possibly work?" The takeaway here is that you should always start simple and iterate. So this is our fuzz farm. It's a single 13-inch MacBook Pro. I don't know if this video is actually going to work; it's not a big deal if it doesn't. I'm only going to play a few seconds of it. So this is me literally placing my wallet on the Enter key. And you can see this box popping up. And we're fuzzing.
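The "dumbest thing that could possibly work" mutator just described amounts to a few lines. A sketch, assuming you've already captured a message buffer to replay; the flip rate here is an arbitrary starting point:

```python
import random

def flip_bits(data, rng, rate=0.001):
    """Return a copy of `data` with roughly `rate` of its bits flipped."""
    out = bytearray(data)
    n_flips = max(1, int(len(out) * 8 * rate))
    for _ in range(n_flips):
        i = rng.randrange(len(out))      # pick a random byte...
        out[i] ^= 1 << rng.randrange(8)  # ...and flip one of its bits
    return bytes(out)

# In the real setup, each mutated buffer gets re-sent to the target endpoint
# (e.g. via the interception hook); here we just mutate a captured message.
```

That's the whole mutation strategy: mostly-valid messages with a bit or two corrupted, which sail past the parsers' outer checks and land in the interesting code.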
Our fuzzer is running now and flipping bits in the messages. And the screen is changing colors. You're going to start seeing the boxes freaking out. This is because the bits are being flipped; it's corrupting stuff, it's changing the messages. Normally, this little box is supposed to show your password hint. But the thing is, by holding the Enter key on the lock screen, all this traffic was being generated to WindowServer. And every time WindowServer crashed, you know where it brings you? It brings you right back to your lock screen. So we had this awesome fuzzing setup just by holding the Enter key. Yeah. And we lovingly titled that picture "Advanced Persistent Threat" in our blog. So this is a crash that we got out of the fuzzer. This occurred very quickly, actually; probably within the first 24 hours. We found a ton of crashes. We didn't even explore all of them; there are probably a few still sitting on our server. There were lots of null derefs, lots of garbage. But then this one stood out in particular. Any time you see this thing up here that says EXC_BAD_ACCESS with a big number up there, address equals blah, blah, blah, that's a really bad place to be. And so this is the bug that we ended up using at Pwn2Own to perform our sandbox escape. If you want to read about it, again, it's all in the blog; we're not going to go too deep into it here. So maybe some of you have seen this infosec comic. It's all about how people try to do these really cool, clever things. People can get too caught up trying to inject so much science and technology into these problems that they often miss the forest for the trees. And so here we are in the second panel. We just wrote this really crappy little fuzzer, and we found our bug pretty quickly. And this guy's really upset, which brings us to the misconception that only expert researchers with [blank] tools can find bugs. And you can fill in the blank with whatever you want.
It can be cutting-edge tools, state-of-the-art, state-sponsored, magic bullet. This is not true. There are very few secrets. So the next observation: you should be very wary of any bugs that you find quickly. A good mantra is that an easy-to-find bug is just as easily found by others. And what this means: soon after our blog post went out... well, actually at Pwn2Own 2018 we already knew we had collided with fluorescence, one of the other competitors. We both struggled with exploiting this issue. It was a difficult bug to exploit, and we had a very creative exploit; it was very strange. But there was some discussion after the fact on Twitter, started by Ned. He's actually speaking out here tomorrow; you should go see his talk about Chrome IPC, it should be really good. And Niklas, who's also here, said: well, at least three teams found it separately. So at least us, fluorescence, and Niklas had found this bug, and we were all at Pwn2Own. So you can imagine how many people out there might have also found it; there are probably at least a few. How many people actually tried to weaponize this thing? Maybe not many; it was kind of a difficult bug. And so there are probably at least a few other researchers who are aware of this bug. So yeah, that kind of closes this point: if you found a bug very quickly, especially with fuzzing, you can almost guarantee that someone else has found it too. So I'm going to pass the next section over to Amy to continue. All right, so we just talked a bunch about techniques and expectations for when you're actually looking for the bug. I'm going to take over here and talk a little bit about what to expect when trying to exploit whatever bug you end up finding. So exploit development is the next step. OK, you found a bug, right? You've done the hard part. You were looking at whatever your target is.
Maybe it's a browser, maybe it's WindowServer or the kernel or whatever you're trying to target. But the question is, how do you actually do the rest? How do you go from the bug to actually popping a calculator onto the screen? The systems you're working with have such a high level of complexity that even understanding enough to know how your bug works might not be enough to know how to exploit it. Should we try to brute-force our way to an exploit? Is that a good idea? Well, before we try to tackle your bug, let's take a step back and ask a slightly different question: how do we write an exploit like this in general? Now, I feel like a lot of people consider these kinds of exploits to be in their own league, at least compared to something like what you'd do at a CTF competition, or something simpler like that. And if you were, for example, given a browser exploitation challenge at a CTF, it may seem like an impossibly daunting task has just been laid in front of you if you've never done this stuff before. So how can we change that view? It might be kind of cliché, but I think the best way is through practice. I know everyone says, "How do you get good? Oh, practice" — but it really is valuable here. And what practicing actually looks like is this: earlier we talked a lot about consuming everything you could about your targets — searching for everything public, downloading it, trying to read it even if you don't understand it, because you'll hopefully glean something from it. It doesn't hurt. But maybe your goal now could be to actually understand it, at least as much as you can. It's not going to be easy. These are very intricate systems we're attacking here, so it will be a lot of work to understand this stuff.
But for every old exploit you can work your way through, the path to exploiting these targets becomes clearer. Because I focus mostly on browser work — I did the browser part of our chain, at least the exploitation part — I have written a lot of exploits and read a ton of browser exploits. And one thing I've found is that a lot of them have a very, very similar structure. They use similar techniques, similar primitives to build up the exploit. To illustrate that, I have an example. Alongside us at Pwn2Own this spring was Samuel Groß of phoenhex — he's probably here right now. He was targeting Safari, just like we were, but his bug was in the just-in-time compiler, which converts JavaScript to machine code. Our bug was nowhere near that; it was over in the garbage collector, so a completely different kind of bug. But his bug was super reliable and very, very clean — I recommend you go look it up online, it's a very good resource. Then a few months later, at Pwn2Own Mobile — another Pwn2Own event — we had Fluoroacetate, an amazing team who managed to pwn pretty much everything they could get their hands on at that competition, including an iPhone. The iPhone of course uses Safari, so they needed a Safari bug. And the Safari bug they had was very similar in structure to the earlier bug from that year, at least in terms of how it worked and what you could do with it. So you could exploit both of these bugs with very similar exploit code, almost in the same way. There were a few tweaks you had to make, because Apple had added a few things since then, but the path between bug and code execution was very similar. Then, even a few months after that, there was a CTF called Real World CTF, which took place in China.
And as the title suggests, they had a lot of realistic challenges, including Safari. So of course my team, RPISEC, was there, and they woke me up in the middle of the night and tasked me with solving it. So I was like, okay, okay, I'll look at it. And it was a JIT bug. I had never actually looked at the Safari JIT before, so I didn't have much prior experience there — but because I had taken the time to read all the public exploits, all the other Pwn2Own competitors' exploits and the things people were releasing for different CVEs, I had seen a very similar bug before, and I knew how to exploit it. So I was able to quickly build the path from bug to code exec, and we actually managed to get first blood on the challenge, which was really, really cool. So what does this mean? Well, not every bug is going to slot into an existing exploit that easily, but I do think that understanding old exploits is extremely valuable when you're trying to exploit new bugs. A good place to start, if you're interested in looking at old bugs, is something like js-vuln-db, which is basically a repository of a whole bunch of JavaScript engine bugs with proof-of-concepts, and sometimes even full exploits. If you were to work through all of those, I guarantee that by the end you'd have a great understanding of the types of bugs showing up these days, and probably how to exploit most of them. But there aren't that many bugs published with full exploits — only a couple a year, maybe. So what do you do once you've read all of those and want to learn more? Well, maybe start trying to exploit other bugs yourself. For example, I like Chrome, because they post a very nice list of all their vulnerabilities every time they ship an update, and they even link you to the issue.
So you can go and see exactly what was wrong. Take one of these — at the very top we have an out-of-bounds write in V8. We could click on that, see what the bug was, and then try to write an exploit for it. And by the end we'd have a much better idea of how to exploit an out-of-bounds write in V8, and we'd have done it ourselves too. So this is a chance to apply what you've learned. But you might think: okay, that's a lot of work, I have all kinds of other stuff to do, I'm still in school or I have a full-time job — can I just play CTFs? It's a good question: how much do CTFs actually help you with these kinds of exploits? I do think you can build a very good mindset from them, because you need a very adversarial mindset to do this sort of work. But a lot of the time, the challenges don't really represent real-world exploitation. There was a good tweet just a few days ago saying that random libc-style challenges are often very artificial and don't carry much value for the real world, because they're so specific. Some people love those very specific CTF challenges, but I don't think there's as much value in them as there could be. However, there have been a couple of CTFs, recently and historically as well, that have had pretty realistic challenges in them. In fact, the 35C3 CTF is running right now, and they have three browser exploitation challenges: a full-chain Safari challenge, a VirtualBox challenge — it's pretty crazy. And it's crazy to see people solve those challenges in such a short time span, too. Even if you don't manage to get through one of those challenges today, it's something you can look at afterwards and keep working on.
And so these newer CTFs are actually pretty good for people who want to jump into this kind of real exploit development work. However, it can be kind of scary for newcomers to the CTF scene, because suddenly it's your first CTF and they're asking you to exploit Chrome, and you're like, what is going on here? So it is a bit of a double-edged sword sometimes. All right, so now we've found a bug and we have experience. What do we actually do? Well, you have to get a little lucky, because even a ton of experience doesn't necessarily mean you can instantly write an exploit for a bug. Our JavaScript exploit was kind of like that — it was nice, we knew what to do right away — but our sandbox exploit did not fit into the nice box of any previous exploit we had seen, so it took a lot of effort. Quickly, this was the actual bug we exploited for the sandbox escape. It's a pretty simple bug: an integer issue where the index is signed, which means it can be negative. Normally it expects a value like 4, but we could give it a value like -3, and that would make it go out of bounds and let us corrupt memory. A very simple bug, not a crazy complex one like some of the others we've seen. But does that mean the exploit is going to be really simple? Well, let's see. That's a lot of code. Our exploit for this bug ended up being about 1,300 lines. That's pretty crazy, and you're probably wondering how it gets there. I just want to say: be aware that when you find a simple-looking bug, it might not be that easy to exploit, and it might take a lot of effort. But don't get discouraged if that happens to you. It just means it's time to ride the exploit development roller coaster. And basically what that means is that there are a lot of ups and downs to an exploit, and you ride the roller coaster until, hopefully, the exploit is finished.
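The signed-index pattern described above can be sketched in a few lines. This is not the actual WindowServer code — the element size and base address are invented — it just shows how an attacker-controlled 32-bit value, reinterpreted as a signed index with no lower-bounds check, computes an address below the start of the array:

```python
import struct

ELEM_SIZE = 8          # hypothetical element size in the target structure
BASE = 0x7F0000400000  # hypothetical heap address of the array

def element_address(raw_index_u32: int) -> int:
    """Mimic C code that treats an attacker-controlled 32-bit field as a
    *signed* index without checking for negative values."""
    # Reinterpret the unsigned wire value the way a C `int index` would.
    (signed_index,) = struct.unpack("<i", struct.pack("<I", raw_index_u32))
    return BASE + signed_index * ELEM_SIZE  # note: no `index < 0` check

in_bounds  = element_address(4)           # the expected case: BASE + 32
corrupting = element_address(0xFFFFFFFD)  # wire encoding of -3: below BASE
```

A negative index like -3 lands 24 bytes before the array, so a write through that address corrupts whatever object happens to sit there — simple to describe, but as the talk notes, still 1,300 lines away from a working exploit.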
And we had to do that for our sandbox escape. To start, we found the bug and we had a bunch of great ideas. We'd previously seen a bug like this exploited by Keen, and we'd read their papers, and we had a great idea. We were like, okay, it's going to work — we just have to make sure this one bit is not set. And it was in a random-looking value, so we assumed it would be fine, you know? But it turns out that bit is always set, and we have no idea why, and no one else knows why either — so thank you, Apple, for that. So we're like, okay, maybe we can work around it, maybe we can figure out a way to unset it — oh yes, we can delete it, it's going to work again, everything will be great — until we realize that actually breaks the rest of the exploit. So it's this back and forth, up and down, and sometimes when you solve one issue and think you've got what you need, another issue shows up. It's all about making incremental progress toward removing all the issues in your way, until you get at least something that works. As a quick aside, this all happened within about 60 minutes one night. Amy saw me just out of breath, like, are you kidding me? There were two quirks that tripped us up and made this bug much more difficult to exploit, and there was no good reason for either of them to be there. It was a horrible experience — but one I'd recommend. Yeah. And this roller coaster ride applies to the entire process, not just the exploit development: you'll hit it when you find crashes that don't actually lead to vulnerabilities, or unexploitable crashes, or super unreliable exploits. You just have to keep pushing through until, hopefully, you get to the end of the ride with a nice exploit. So now, let's assume we've written an exploit at this point.
Maybe it's not the most reliable thing, but it works — we can get to code exec every now and then. So we've got to start talking about the payload. What is a payload, exactly? A payload is whatever your exploit is actually trying to do. It could be opening a calculator on the screen; it could be launching your sandbox escape exploit; it could be cleaning up the system after your exploit — and by that I mean fixing the program that you're exploiting. In CTFs we don't get a lot of practice with this, because we're so used to doing system("cat flag"), and then it doesn't matter if the entire program crashes down in flames around us, because we got the flag. In that case, yeah, you cat the flag and it crashes right away, because you didn't have anything after your ROP chain. But in the real world it matters a little more. Here's an example of what happens if your exploit doesn't clean up after itself: it just crashes and dumps you back to the login screen. That doesn't look very good. At a competition like Pwn2Own, this won't fly — I don't think they'd let you win if this happened. So it's very important to go back and fix up any damage you've done to the system right after you finish. Now, about actually running your payload: a lot of the time in public exploits, you'll see that "code exec" is just a string of 0xCC bytes — the int3 instruction, which tells the program to stop, trapping to a breakpoint. Most exploits you see just stop there; they don't show you any more. And to be fair, they've gotten code exec — they're only writing about the exploit. But you still have to figure out how to run your payload, because unless you want to write those 1,300 lines of code in handwritten assembly and turn it into shellcode, you're not going to have a good time.
So we had to figure out a way to take our payload, write it to the file system in the only place the sandbox would let us, and then load it as a library so it would go and actually run our exploit. And now that you've assembled everything, you're almost done. Your exploit works: a calculator pops up. This was actually our sandbox escape running and popping a calculator, proving that we had root code exec. But we're not completely done yet, because there's a little more to do: exploit reliability. We need to make sure the exploit is as reliable as we want it to be, because if it only works one in a hundred times, that's not going to be very good. For Pwn2Own, we ended up building a harness for our Mac that would run the exploit multiple times and collect information about it. We could look at it and see very easily how often it failed and how often it succeeded, and then get more information — maybe a log, and other things like how long it ran. This made it very easy to iterate on our exploit, correct issues, and make it better and more reliable. We found that most of our failures came from our heap groom, which is where you try to line up memory in certain ways. There wasn't much we could do about that in our situation, so we made it as good as we could and accepted the reliability we got. Something else you might want to test on is multiple devices. For example, our JavaScript exploit was a race condition, which means the number of CPUs in the device, and their speed, might actually matter when you're running your exploit.
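The harness idea — run the exploit many times, tally outcomes, iterate — can be sketched like this. The exploit itself is stubbed out with a simulated success rate (the 85% figure and the whole stub are invented; a real harness would launch the exploit against a device and check for a callback or crash):

```python
import random

def run_exploit_once(rng: random.Random) -> bool:
    """Stand-in for one real exploit attempt; here we just simulate a
    heap groom that fails some fraction of the time."""
    return rng.random() < 0.85  # hypothetical ~85% success rate

def measure_reliability(attempts: int, seed: int = 0) -> float:
    """Run the exploit `attempts` times and return the success fraction."""
    rng = random.Random(seed)
    successes = sum(run_exploit_once(rng) for _ in range(attempts))
    return successes / attempts

rate = measure_reliability(1000)
```

Even a loop this simple turns "it sometimes fails" into a number you can track between iterations, which is what lets you tell whether a tweak to the heap groom actually helped.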
You might want to try different operating systems, or different operating system versions, because even if they're all vulnerable, they might have different quirks, or require tweaks to make your exploit work reliably on all of them. We wanted to test on the macOS beta as well as the normal macOS release, so that we'd be covered in case Apple pushed an update right before the competition. That meant some parts of our exploit code had to be interchangeable — for example, we had addresses that were specific to the operating system version, and we could swap those out very easily by changing one part of the code. Also, if you're targeting browsers, you might be interested in testing on mobile too, even if you're not targeting a mobile device, because a lot of the time the bug — or at least the initial bug — will also work on a phone. So that's another interesting target you might not have been thinking about originally. In general, what I'm trying to say is: try throwing your exploit at everything you can, and hopefully you'll recover some reliability percentages, or figure out things you overlooked in your initial testing. All right, I'm going to throw it back over for the final section. So, I didn't get to spend as much time as I would have liked on this section, but I think it's an important discussion to have here. The very last step of our layman's guide is about responsibilities. This is critical. You've listened to our talk; you've seen us develop these skeleton keys to computers and systems and devices. You can create doors into computers and servers and people's machines; you can invade privacy; you can deal damage to people's lives and companies and systems and countries. So you have to be very careful with these.
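One concrete way to make version-specific addresses interchangeable, as described above, is a small lookup table keyed by target version. The version strings and offsets here are entirely made up for illustration — real values would come from reversing each build the exploit has to support:

```python
# Hypothetical per-version constants; a real exploit would fill these in
# from reversing each supported macOS build (release, beta, etc.).
ADDRESSES = {
    "10.13.6":    {"gadget": 0x1234, "target_symbol": 0x5678},
    "10.14-beta": {"gadget": 0x1111, "target_symbol": 0x2222},
}

def addresses_for(os_version: str) -> dict:
    """Select the constant set for the detected target version, failing
    loudly on anything the exploit wasn't prepared for."""
    try:
        return ADDRESSES[os_version]
    except KeyError:
        raise RuntimeError(f"unsupported target version: {os_version}")

gadget = addresses_for("10.13.6")["gadget"]
```

Keeping all the version-dependent constants in one table means a last-minute OS update before the competition only requires adding one entry, not touching exploit logic.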
And so, everyone in this room: if you take any of our advice going into this stuff, please acknowledge what you're getting into and what can be done to people. There's at least one related example that quickly came to mind. In 2016 — I happen to remember the day, actually, I was sitting at work — there was this massive DDoS that plagued the internet, at least in the US. It took down all your favorite sites: Twitter, Amazon, Netflix, Etsy, GitHub, Spotify, Reddit. I remember the whole internet coming to a crawl in the US. This is the Level 3 outage map; it was absolutely insane. And I remember people bouncing off the walls like crazy afterwards, all referencing Bruce Schneier's blog. On Twitter there was all this discussion popping up that this was likely a state actor, a newly sophisticated DDoS attack. Bruce suggested it was China or Russia or some nation state, and his blog post was literally titled "Someone Is Learning How to Take Down the Internet". But a few months later, we found out this was the Mirai botnet — actually just a bunch of kids trying to DDoS each other's Minecraft servers. And it's scary, because — I have a lot of respect for young people and how talented they are — people need to be very conscious of the damage these things can cause. Mirai wasn't using 0-days per se — later variants nowadays do use 0-days, but back then it didn't; it was just an IoT-based botnet, one of the biggest in the world, with some of the highest throughput ever seen. But it was incredibly damaging. It's hard to recognize the power of a 0-day until you are wielding it. And that's why this isn't the first step of the layman's guide: once you finish this process, you will come to realize the danger you can cause.
But also the danger that you might be putting yourself in. And I want to close on that: please be very careful, right? So that's all we have. This is the conclusion — the layman's guide, that's the summary. If you have any questions, we'll take them now; otherwise, if we run out of time, you can catch us after the talk, and we'll have some cool stickers too. Wow, great talk, thanks. We have very, very little time for questions. If somebody's very quick, they can come up to one of the microphones in the front, and we'll handle one. But otherwise, will you be available after the talk? Yeah, we'll be available after the talk. If you want to come up and chat — we might get swarmed — but we'll also have some cool RET2 stickers, so come grab them if you want them. And where can we find you? We'll be over here; we'll try to head out to the back. Yeah, because we have another talk coming in a moment or so. Okay, I don't see any questions, so I'm going to wrap it up at this point. But as I said, the speakers will be available. Let's give these great speakers another round of applause. Thank you.