So let's talk about deserialization vulnerabilities. Before I get into it, just a couple of words about myself. My name is Ian Haken. I'm a senior software security engineer at Netflix. I'm on the platform security team, where we build a bunch of tools to keep our microservice ecosystem safe. Download the slide deck afterwards; we talk a lot about all the cool stuff we do, so you can check that out after the talk. But today I'm talking about deserialization gadget chains. So I'm gonna start by answering the obvious question, what is a deserialization vulnerability, and then get into the question of what is a deserialization gadget chain. Ultimately what I wanna talk about is a new tool that I built for understanding gadget chains and, of course, the fun stuff at the end: some of the new exploits that the tool is able to uncover.
So what is a deserialization vulnerability? In object-oriented languages like Java, and I'm mostly gonna be using Java examples in this talk, code is contained in classes, and classes hold your data alongside the code. That's the whole point of object-oriented design, and it gives you cool features like polymorphism. But this means that if you control the type of data, if you're able to specify what data type something is, then you're implicitly controlling what code gets run.
So let me give you an example. This is kind of a classic Java deserialization vulnerability. It's a REST endpoint that reads in a POST body and passes it into an ObjectInputStream, and then you read some object out of it. In this case we're casting that object to a User and calling render on it. What the developer might intend is that this is some User class that exists on the class path, and so the POST body that gets sent in is some serialized version of it. It has a name; when you call render on it, it returns that name. Totally innocuous. Nothing interesting can really happen with this.
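The endpoint described above can be sketched roughly as follows. This is a minimal illustration, not the code from the slide: the class name `VulnerableEndpointSketch`, the `handle` method standing in for the REST handler, and the `serialize` helper are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class VulnerableEndpointSketch {

    // Hypothetical stand-in for the User class the developer expects to receive.
    public static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;

        public User(String name) {
            this.name = name;
        }

        // Intended behavior: rendering a user just returns its name.
        public String render() {
            return name;
        }
    }

    // Stand-in for the REST handler: the POST body goes straight into ObjectInputStream.
    // The serialized stream, not this code, decides which class gets instantiated.
    public static String handle(byte[] postBody) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(postBody))) {
            User user = (User) in.readObject();
            return user.render();
        }
    }

    // Helper a well-behaved client would use to build the POST body.
    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle(serialize(new User("alice"))));
    }
}
```

With a benign client the handler simply returns the user's name; the risk comes from the fact that nothing in `handle` restricts which serializable class the body may name.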
But where you start getting into dangerous territory is if maybe you had something like this on your class path. It extends User and it's a ThumbnailUser. The intent is that there's some member that specifies a file path with the thumbnail of that user, and when you call render, it reads that file from disk. So if an attacker sends a ThumbnailUser to this endpoint instead of a regular User, then when it calls user.render, they can read any file off the disk and get that returned. That's what I mean when I say controlling data types means you end up controlling what code gets executed.
So why am I talking about deserialization today? That's a 2016 topic; this is not new. This is something that we've been thinking about for a little while. But honestly, this class of vulnerabilities goes back to even before 2016. Some of the first mentions of it go all the way back to 2006. Marc Schönefeld gave a talk at Black Hat that year and identified how some application containers were subject to this kind of vulnerability. They were using ObjectInputStream in an unsafe way and you could get code execution on them.
But the talk that really put the spotlight back on this subject was given by Frohoff and Lawrence in 2015 at AppSec Cali. That really blew up this vulnerability class, because they showed that there are gadget chains in all sorts of open source libraries, which means basically any application that's doing unsafe deserialization is subject to some kind of RCE, because it uses these libraries that have RCE gadget chains in them. In the year that followed, I heard a lot of application security researchers refer to that as the Java deserialization apocalypse, because everyone realized that their application was vulnerable to this sort of thing. So every talk, every conference, every convention had someone talking about this stuff in 2016.
My favorite talk from that year was probably by Luca at an OWASP meetup, where he did a really good job of explaining what these vulnerabilities are, what they look like, what exploits look like, and how you should remediate them. So if you really want to dive into this a bit more afterwards, that's definitely a good talk to go look at.
But you might have thought that was the end of it. If 2016 was the Java deserialization apocalypse, then it's all said and done. But at last year's Black Hat, Muñoz and Mirosh gave a survey of JSON parsing libraries that showed how all these other libraries can also potentially do unsafe deserialization, and you can be subject to just as much dangerous behavior as if you're using the Java ObjectInputStream. Up to that point, most of the focus was really on the Java ObjectInputStream, and they did a survey, not just in Java but across other languages like C#, of other JSON parsing libraries where things can go wrong. And in case you think that was the last talk, or that this is the last talk, this vulnerability class isn't going away. In October at AppSecUSA, there's someone talking again about deserialization vulnerabilities and why you've got to do stuff to protect yourself from them, because we haven't solved this yet. It's not gone.
So why are deserialization vulnerabilities so bad and interesting? If they were all really just like that first slide I showed, then they actually wouldn't pop up that much, because it's not that often that you have some class on your class path that does something dangerous and overrides something where you meant it to do something safe. The reason they're so bad is that there are these things called magic methods. What those are is methods on classes that get automatically invoked by the deserializer before the deserializer ever even returns.
So that means that dangerous behavior that's implemented in one of these magic methods can get invoked regardless of what data type you meant to be returned from that deserializer.
So here's another example. This is exactly the same vulnerable endpoint from that first slide. But let's say there's some bad class you have on your class path that's doing something unsafe inside one of these magic methods, like readObject. In this case, it's just executing some string that it's reading out of that ObjectInputStream. Even though my application isn't using EvilClass at all, even though it expects a User to come back, the deserializer is going to execute that readObject magic method before it ever returns. So it's gonna execute that Runtime.exec before the cast to User. It doesn't matter what my application actually expected the data type to be.
So what's the deal with magic methods? Maybe you've never even heard of them before. How common can they actually be? The answer is they're actually really common, because all sorts of classes inside the JDK implement magic methods. HashMap and PriorityQueue are a couple of good examples, but they're all over the place. The reason these magic methods exist is that they allow classes to customize how they serialize and deserialize their data. If you had a HashMap that just used the default serialization strategy, where it serialized all of its hash tables and different maps and bins and buckets, then that serialized version of the HashMap probably wouldn't be interoperable between Java versions, because they may change the implementation under the hood and then everything would break when you tried to deserialize it. So instead it implements these magic methods, where when you try to write out the object, instead of writing out all of its hash tables, it just writes out a list of key-value pairs.
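The point about a magic method running before the cast can be sketched as follows. This is an illustrative stand-in, not the slide's code: `EvilClass` here only flips a flag instead of calling `Runtime.exec`, and the names `MagicMethodSketch`, `sideEffectRan`, and `castFailedAfterSideEffect` are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class MagicMethodSketch {

    // Records whether the magic method ran; stands in for a dangerous side effect.
    public static volatile boolean sideEffectRan = false;

    // The type the application expects to receive.
    public static class User implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Hypothetical class that merely exists on the class path; the application never uses it.
    public static class EvilClass implements Serializable {
        private static final long serialVersionUID = 1L;

        // Magic method: ObjectInputStream invokes this while it is still reading the stream.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            // The talk's example executes a command here; this sketch only sets a flag.
            sideEffectRan = true;
        }
    }

    // Returns true when the cast failed but the magic method had already run.
    public static boolean castFailedAfterSideEffect(byte[] payload)
            throws IOException, ClassNotFoundException {
        sideEffectRan = false;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            User user = (User) in.readObject(); // EvilClass.readObject runs inside this call
            return false;                       // only reached if the payload really was a User
        } catch (ClassCastException e) {
            return sideEffectRan;
        }
    }

    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(castFailedAfterSideEffect(serialize(new EvilClass())));
    }
}
```

The cast still throws a `ClassCastException`, but by then the side effect has already happened, which is why the expected return type offers no protection.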
And then inside its readObject method, it expects to be able to read in a list of key-value pairs, and it calls this.put on the key and the value. That means that for each key, at least, that it's reading in from that input stream, it's calling hashCode and equals on it in order to put it into the HashMap. So this gets you some additional known entry points, because if you have some class on your class path that does something dangerous inside hashCode or equals, we know we can wrap that class inside a HashMap and get from its readObject magic method into the dangerous hashCode method. And this is how we start building up a gadget chain.
So here's a really specific example of what a gadget chain might look like. Here's more or less what HashMap does inside its readObject method. All it's doing is basically what I just said: it's reading keys and values out of a list and then calling put on them. In particular, it calls hashCode on the keys it reads out.
So let's say there's this class that exists on your class path. This is an example of a class out of the Clojure library. It's basically a proxy object where, inside hashCode, it looks up an IFn implementation inside its map for hashCode and then invokes it. Inside that Clojure function map, we, the attacker, could serialize some interesting IFn implementation. As an example, you could supply the compose function, which just has two member functions inside of it that it composes. As one of those functions, we could supply the constant function. And as the other function, we could supply eval. Then basically when you wrap all of this up in a nice package and tell your deserializer to deserialize it, it's gonna automatically call readObject on your HashMap.
That's automatically gonna call invoke on this compose function, which is automatically going to call invoke on this constant function and pass that into the eval function and then do arbitrary code execution.
So this is an example of what that payload might look like using Jackson-style serialization, and it's exactly what I just described. You wrap things in a HashMap as its members. You use this AbstractTableModel class with the dangerous hashCode implementation. As hashCode, you use this compose function, and then you supply the values you want for each of those two functions inside there, and then you can execute whatever binary or command you want.
So the important thing to understand about gadget chains, and the thing that makes deserialization vulnerabilities so dangerous, is, as I showed you in that example and alluded to earlier, that what gadget chains can be constructed has nothing to do with what your application actually does. If there are classes on your class path, they can be specified by the serialized payload, and your application can therefore be made to construct them and run whatever magic methods exist in those classes. So your code, as with that example, wouldn't have to have called any of those things. In fact, maybe there's no code anywhere, even transitively, that called any of those methods, but by the mere fact that they exist on your class path, they can potentially be exploited.
So what Java libraries are vulnerable? Again, I'm focusing on Java, but this is definitely something that applies to C# and PHP and lots of other languages. In Java, the ObjectInputStream, the one that's built into the JDK, is probably the most well-known and most studied one. But XStream is another library. It's an XML parser in its default configuration.
That can be used unsafely, and all these JSON parsing libraries have unsafe configurations, where they can basically be induced to deserialize arbitrary types and therefore potentially do dangerous things. If you're interested in exactly when those libraries might be dangerous, you should definitely spend some time reading Muñoz and Mirosh. They did a really good survey of how and when these kinds of libraries can be misused. But what's important is that as you begin studying these additional libraries beyond just the ObjectInputStream, libraries end up having different magic methods that will automatically get invoked, and they have different notions of what can be serialized. That's gonna be really important as I keep talking about this later in the talk.
So how do you know if your application is vulnerable? Finding potentially vulnerable applications is basically the same thing as with a lot of other application security vulnerabilities. As with things like XSS or SQL injection, all the vulnerability really is is some kind of attacker-controlled input flowing into one of these dangerous libraries. In this case it's the ObjectInputStream or XStream or Jackson. Existing tools are already good at understanding how to find those vulnerabilities, because it's exactly the same thing as looking for some kind of attacker-controlled string going into some kind of SQL statement. So I'm not too interested in digging further into how you find those vulnerabilities, because existing tools are really good at that.
But what do you do once you do find a vulnerability? That's the big question that I wanted to talk about. One of the simple answers is, why don't you just use a better serialization strategy? Why use one of these dangerous libraries? Use something that's safe. Luca has this great quote from his talk in 2016: It's 2016, there are better options. Why do you still use ObjectInputStream?
And I think that's really good advice if you're working on a new project, if you're building a new service. But what happens if you're not working on a new project?
So who recognizes these guys? Or in particular the thing on the left? That's the original Netflix disc that got sent out to owners of a Wii so that you could stream Netflix from a Wii. That's got client code stamped on a disc that was sent out in 2010 that we still have to be able to speak to. So you might be in situations where you don't control your clients and can't readily update your IPC mechanism. The guy on the right is the first-generation Roku, and it's exactly the same thing. It's got firmware in there that needs to be able to talk to upstream services. And even if you're thinking you can just update firmware and update your IPC mechanism, if someone's got one of those in a closet and they pull it out in two years, at the very least we need to be able to talk the IPC mechanism that tells them they need to go fetch an update. So you can't necessarily just turn things off easily.
And even if you're not in one of these contexts where you've got some clients that you can't easily update, it's still a very costly operation to start ripping apart your IPC mechanism. If you need to update your server to speak something new, something other than JSON or XStream or the ObjectInputStream binary format, then you've got to update your server, then make sure you update all of your clients, and only once you finally tear down everything on the server side would you be safe. That's just a lot of work, even in an ecosystem where you control both the client and the server. At Netflix, where we've got a microservice ecosystem, we've got thousands of applications, and we're coming across these things, and we have to decide what to tell a developer about how important it is to patch this issue we found.
We need to answer the question: is it worth the effort to drop what I'm doing and spend three or four weeks, or maybe more, doing exactly that process I've described of updating all your clients and services in order to patch that vulnerability? Is the deserialization vulnerability that we just found even exploitable? That's something that's not immediately obvious when all you know is that some kind of untrusted input flows into one of these unsafe libraries.
So how do you find exploits for a deserialization vulnerability? How do you find these gadget chains? ysoserial is one of the most well-known projects in this space; Frohoff maintains it, and it's got a bunch of gadget chains for the ObjectInputStream. marshalsec is another project in this space that's got some wider breadth and understands some gadget chains for some of these other deserialization libraries. But they're both basically projects that have these known gadget chains, and you can compare your application to that list of bad libraries, where you know there's some version of a particular library in which you can construct a gadget chain. That doesn't tell you about something that might be unique to your application. Maybe there's a gadget chain that only shows up when there's some class in your application plus some other classes in these other libraries that, only when all put together, end up giving you some kind of interesting gadget chain. Furthermore, those are all bound to these known deserialization libraries. What if you're using something new or something custom that is vulnerable to these same kinds of attacks but isn't one of these well-studied ones? How do you answer the question, is my vulnerability exploitable?
Besides the couple that I mentioned, there's a bunch of other existing tools in this space. Joogle is a good tool for programmatically querying metadata about your class path.
There's the Java Deserialization Scanner, which is a Burp Suite plugin that mostly uses payloads from ysoserial in order to detect whether or not you're vulnerable to one of these known gadget chains. The NCC Group Burp plugin is something that was released earlier this year. Again, it's another dynamic scanner, mainly based on payloads from Muñoz and Mirosh's work at last year's Black Hat, so it's more focused on the JSON deserializers. But again, these are all tools that might help you but don't immediately answer that question: is there something unique to my application that makes it vulnerable to one of these exploits?
So given that I wasn't able to find a tool that did exactly what I wanted, I went about the task of asking, how can we evaluate the risk of this kind of vulnerability, and what do we really want to be able to answer? What we want to be able to answer is, what is the risk? How important is it to remediate a vulnerability? We want to know if that deserialization vulnerability is exploitable. And if it is exploitable, what exploits are possible? RCEs tend to be much more interesting than DoS. If that's our goal, just to evaluate the risk, we don't necessarily have to be perfect. We don't have to set out to solve this problem once and for all. A reasonable overestimation of risk is acceptable. And we don't actually have to generate payloads if we don't want to. Knowing what kinds of payloads might be constructible is also a really useful piece of information.
So if those are the requirements, then specifically what I'd like to do is build something that finds gadget chains. I'm not looking for vulnerabilities. I'm only gonna use this new tool if I already know my application is vulnerable. But it needs to be able to look at the entire class path, because of what I said at the beginning. It doesn't matter what code is in my application.
What matters is the sum total of classes on my class path. It should err on the side of false positives, because a reasonable overestimation of risk is more useful, and I don't want to tell developers to drop what they're doing and fix something unless I have good reason to believe that there's something exploitable in it. And lastly, it should operate on Java bytecode, because we've got like a million plus one languages written on top of the JVM now, and I don't want to write something that has to understand Groovy and Scala and Clojure and Kotlin and whatever comes out next week. If I just operate on bytecode, then I've got it covered.
So I put together a tool that I called Gadget Inspector, which is a Java bytecode analysis tool for finding gadget chains. That's what it does. The way it works is it operates on a class path. You specify either some JARs and their dependencies or an entire WAR, basically your entire application. Then it reports discovered gadget chains, which are really just sequences of method invocations where one invokes the next, starting at some known entry point and getting to some kind of dangerous behavior. It does a little bit of simplistic symbolic execution to figure out when attacker-controlled arguments can get passed into a method and then on to the next one in the chain. And most importantly, because of the context we're working in, this tool is able to make a lot of simplifying assumptions that actually make this pretty easy to do. It's not something where you have to have written a thesis on symbolic execution in order to understand or implement it.
So all right, specifically, how does this tool work? The first step is just enumerating everything on your class path.
You wanna figure out the whole class hierarchy and all the method hierarchies, so that when you see a magic method calling some other method, like HashMap calling hashCode, you know all the implementations of hashCode that you might jump to. So the first step is just enumerating all that stuff, and that's not terribly difficult. You can use the plain old Java reflection APIs to do that if you want to. But it's an important first step for the rest of the analysis.
Where things start getting interesting is when I want to understand the data flow inside an application. The first thing that I wanna discover is what I call passthrough data flow. What I mean is, if an attacker can control the input to a function, does that attacker-controlled data get returned back out of the function? In this case, with the constant function, if an attacker controls the implicit this argument, then they're gonna be able to control this.value and therefore the return value.
So that's one of the first assumptions that goes into this. This is basically taint analysis that I'm doing here, and if you're not familiar with that or don't really know what I mean by taint in this context, all I really mean is that I'm thinking of something as being attacker-controllable. The assumption is that if an object is tainted, then every member of that object is also considered tainted. That's a pretty reasonable assumption, because if we are thinking of an object as being attacker-controlled, that means it came out of the serialization library, so all of the members of that object are also in that serialized payload.
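A toy version of that passthrough rule is sketched below. It is only an illustration of the idea, assuming a tiny hand-written expression model instead of real bytecode; the names `PassthroughSketch`, `Arg`, `Field`, `Literal`, and `passthrough` are all hypothetical and are not taken from the tool.

```java
import java.util.Set;
import java.util.TreeSet;

public class PassthroughSketch {

    // Tiny expression model, far simpler than real bytecode (all names are hypothetical).
    public interface Expr {}

    // Argument number `index` of the method under analysis; index 0 is the implicit `this`.
    public static final class Arg implements Expr {
        final int index;

        public Arg(int index) {
            this.index = index;
        }
    }

    // A member read such as `owner.name`.
    public static final class Field implements Expr {
        final Expr owner;
        final String name;

        public Field(Expr owner, String name) {
            this.owner = owner;
            this.name = name;
        }
    }

    // A value the attacker cannot influence, such as a literal.
    public static final class Literal implements Expr {}

    // Returns the argument indexes that, if attacker-controlled, taint the returned expression.
    public static Set<Integer> passthrough(Expr returned) {
        Set<Integer> sources = new TreeSet<>();
        if (returned instanceof Arg) {
            sources.add(((Arg) returned).index);
        } else if (returned instanceof Field) {
            // Assumption from the talk: every member of a tainted object is itself tainted.
            sources.addAll(passthrough(((Field) returned).owner));
        }
        return sources;
    }

    public static void main(String[] args) {
        // Models a constant function whose invoke() body is `return this.value;`
        Expr constantInvoke = new Field(new Arg(0), "value");
        System.out.println(passthrough(constantInvoke)); // prints [0]
        System.out.println(passthrough(new Literal()));  // prints []
    }
}
```

The result `[0]` reads as "if argument 0, the implicit this, is attacker-controlled, so is the return value", which is the summary the analysis records for the constant function.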
So that means when we look at a function like this, we can enumerate this piece of passthrough data flow. All this funky custom syntax means is that if the attacker controls argument zero, which in this case is the implicit this, then the return value is also considered attacker-controlled, and that's just because we return this.value.
As one other example, where things start getting a little hairier, there's this default function, which wasn't on a previous slide. All this does is look at an argument, and if it's not null, it returns it, and otherwise it invokes some other function, like a constant function. In this case we've got a branch condition, which is something that's really hairy to deal with if you're doing any kind of static or symbolic analysis. But here we make another assumption, which is that all branch conditions are satisfiable. I'm not gonna worry about whether or not I can go down different paths. This is probably one of the weakest assumptions that's made in this, but it's also one of the easiest ones to make, because in practice, if you're inside these magic methods or going down a gadget chain, all this stuff is attacker-controllable, since it would have to be for you to get there in the chain. So basically all of the variables and arguments going into a branch condition are attacker-controllable, and usually an attacker can tweak these things to get down whatever branch they want to.
So if we assume all branches can be walked down, then we end up with these passthrough data flows. In this case, the first argument just gets directly returned if we go down the true path, and if we go down the false path, based on the first passthrough data flow we discovered, the return value of f.invoke is gonna be considered tainted as well. So we enumerate that. Step three is basically exactly the same thing.
It's the same symbolic execution of just walking through what data flows where, but this time, instead of looking at return values, we care about where data flows into subsequent method calls. And we're gonna use the data from step two to enhance this enumeration.
Let's look at that dangerous hashCode method that we had earlier and see how that shakes out here. In this case we would end up enumerating this passthrough call graph of method calls. Again, it's some funny custom syntax, but all I'm saying here is that if argument zero, the implicit this, is attacker-controllable, then that's gonna flow in as argument one to what I call IFn.invoke. In this case all I know is that it's the IFn interface. We get that literally because this gets passed in as argument one to that function there, so that one's kind of easy to figure out. But in f.invoke, f comes out of this map, which is a member of this. So again, because of the assumption that all members are attacker-controllable, we know that f would be attacker-controllable, so f, which gets passed in as the implicit this to IFn.invoke, would also be attacker-controllable. That's where we get that from.
Just to go through one more example, this is what you would get if you looked at the compose function, again from the previous slide. All we're really doing in this symbolic execution is stepping through bytecode one instruction at a time, and it's actually kind of easier to understand what's going on when you look at it that way. But at a higher level, what we do is we see argument one gets passed in as argument one to IFn.invoke. Then we see f1, which is a member of argument zero, the implicit this, gets passed in as the implicit this to IFn.invoke.
And finally, the value that gets returned from that, based on our analysis from step two, would also be considered attacker-controllable, and that gets passed in as argument one to f2. So it's just a lot of walking through these functions and enumerating these things. Really, there's nothing very deep going on here; it's just a lot to keep track of, but computers are good at that.
Step four, the next-to-last step, is just enumerating known entry points, and that's basically using all the known tricks that researchers have come up with over the last few years to figure out how to get into interesting gadget chains. For example, we see this hashCode method, we know it overrides Object.hashCode, so we can enumerate that as an entry point. So that step's super easy, especially after the last few. But this does highlight one limitation that I wanna point out, which is that this relies on known tricks. Knowing that we can get to hashCode is something we could have derived from this analysis, just by going through the symbolic execution of the readObject method of HashMap. But there are other clever tricks that researchers have come up with, like wrapping things in a dynamic proxy that then calls InvocationHandler.invoke, that we wouldn't be able to derive. So there's definitely room for more gadget chains that this thing might be missing, just because there might be more clever tricks that aren't hard-coded into it.
So all right, very last step. Now that we've enumerated all that stuff, the only thing we have left to do is literally just an Algorithms 101 breadth-first search on this call graph, in order to see if we can get from one of these known sources to a method that does something interesting. We use exactly the stuff we've enumerated to build up that gadget chain from some of those first slides.
We would look at that entry point, and then, looking at the methods that it calls, we'd want to step into each of those and see what methods those would subsequently call. Here's where we make one of the last assumptions, which is that any method implementation can be jumped to. Down here we see we're calling IFn.invoke, and we don't have a specific implementation that we're jumping to there. So as we're going through this call graph, we're gonna go look at every implementation of that, as long as the class is considered serializable. The reason we assume we can do that is literally because that's how we build up gadget chains. If we control the data type of one of those members, which determines what implementation of IFn this is, then we can build up our gadget chain in such a way as to get to whatever implementation we want to get to.
So for example, we might use this call in order to get into the compose function's invoke. Then, looking at what functions that calls, we're gonna end up walking through each of those invocations, and one in particular might be calling IFn.invoke, where we pass in a tainted argument one and use eval as our implementation. Inside there we would see that we call Runtime.exec, and we know that does something interesting and dangerous, so we would output this as our gadget chain. So by walking through all those steps, this thing would look at that library and spit out this gadget chain.
The one last limitation that I will point out here is that this of course relies on knowing what the interesting methods, or interesting sinks, are that we should output gadget chains for. There's lots of good stuff in the JDK: reading files, writing files, Runtime.exec, opening up a URL, doing DNS lookups, sleeping. There are all kinds of side effects that you might be interested in.
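The search described above can be sketched as a plain breadth-first search. This is a simplified illustration under stated assumptions: the call graph is a hand-written map that loosely mirrors the chain from the slides, the method names in it (such as `AbstractTableModelProxy.hashCode` and `FnCompose.invoke`) are placeholders rather than real class names, and the taint information the real analysis tracks on each edge is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GadgetChainSearchSketch {

    // Breadth-first search from an entry point to the first reachable sink.
    // Interface calls are assumed to be expanded already to every serializable implementation.
    public static List<String> findChain(Map<String, List<String>> callGraph,
                                         String entryPoint,
                                         Set<String> sinks) {
        Deque<List<String>> queue = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        queue.add(Collections.singletonList(entryPoint));
        visited.add(entryPoint);

        while (!queue.isEmpty()) {
            List<String> path = queue.poll();
            String last = path.get(path.size() - 1);
            if (sinks.contains(last)) {
                return path; // reached a method with an interesting side effect
            }
            for (String callee : callGraph.getOrDefault(last, Collections.emptyList())) {
                if (visited.add(callee)) { // add() returns false if already seen
                    List<String> next = new ArrayList<>(path);
                    next.add(callee);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList(); // no chain found
    }

    // Hand-written toy call graph; every name here is a placeholder for illustration.
    public static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("AbstractTableModelProxy.hashCode",
                Arrays.asList("FnCompose.invoke", "FnConstant.invoke"));
        graph.put("FnCompose.invoke", Arrays.asList("FnConstant.invoke", "FnEval.invoke"));
        graph.put("FnEval.invoke", Arrays.asList("Runtime.exec"));
        return graph;
    }

    public static void main(String[] args) {
        Set<String> sinks = new HashSet<>(Arrays.asList("Runtime.exec"));
        System.out.println(findChain(exampleGraph(), "AbstractTableModelProxy.hashCode", sinks));
    }
}
```

Because the search is breadth-first, the first chain it returns is a shortest one in terms of method hops; the visited set keeps it from looping on recursive call graphs.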
So adding more to its list of interesting sinks is a way to improve this tool, but even a limited set of known interesting sinks gets you pretty far.
One of the things that I mentioned at the top of this talk that was really important to me is that there are a lot of different libraries now where we know there are deserialization vulnerabilities. As part of this analysis I mentioned a few times that there are things like known entry points that we want to start with, or that we consider any class that's serializable to have a method we can jump to. All those things are parameterizable in this analysis. For JRE deserialization, anything implementing Serializable is considered a serializable class, but for XStream it depends on what converters you've enabled, so it depends on how your application is set up. For Jackson, basically any class with a no-arg constructor is considered serializable. For Jackson you can also only jump into constructors as your entry points. So there are lots of differences between libraries, but all those things can be easily tweaked and parameterized in this analysis. This is what I think makes this tool especially powerful: you might be working in some kind of custom context where you're doing unusual forms of deserialization that happen to be unsafe but aren't yet well studied by a project like marshalsec or ysoserial, and this is a tool that can help give you insight into those kinds of libraries.
So all right, I described this tool. It does a whole bunch of funky things that maybe you did or didn't follow, depending on how much sleep you guys got last night. And I claim that at the end of the day this thing can find some gadget chains. So does it live up to the hype? The first thing I did after writing this thing, on like a 10-hour flight to Europe, was run it on some open source libraries to see, does this thing actually do anything useful? Can it find some stuff?
Because at the very least it should be able to find gadget chains we know exist because of the stuff that Frohoff and Lawrence discovered in 2015. So all right, I built this tool, ran it against the 100 most popular libraries, at least according to mavenrepository.com, and looked for exploits against the standard Java deserialization library. It did successfully rediscover some known gadget chains. So cool, it's at least doing what I claim it's supposed to do. It didn't find a ton of classes implementing Serializable, so it didn't have a ton of new findings, but it did have some, and so I'm gonna talk about those. And it did have a handful of false positives, because this does try to err on the side of false positives, but not as many as you'd expect, like just a dozen or so that are easy to rule out, and it's mostly because reflection is hard to reason about. So all right, old gadget chains: what did it discover? It rediscovered the Commons Collections gadget chain, and the reason this gadget chain was so interesting when Frohoff and Lawrence first discovered it is because it's the 38th most popular dependency, at least when I looked this up a couple of months ago, and so it's everywhere. Every application more or less ends up pulling this thing in as some kind of transitive dependency, and this is more or less what that gadget chain looked like. You wrap your object inside a dynamic proxy, and then you get into this invocation handler, and then you go to the Commons Collections LazyMap, which ends up doing some reflection things and lets you basically call any method you want. And so hooray, it found that old gadget chain. It's finding things that I expect it should be able to find. But the first thing that it actually found was this new gadget chain inside Clojure, and this is basically the gadget chain that I've been discussing and using as the example leading up to this.
So this was super interesting because this was the sixth most popular dependency according to mavenrepository.com. What this gadget chain did, at least in the version that it originally found, is load a Clojure file from disk and execute it, which may or may not be interesting, but it also turned out that by tweaking that last step in there to call eval instead of load file, it would execute arbitrary Clojure that you pass in, so it's basically RCE. So that's super interesting: if there are people that patched their version of Commons Collections and decided that they're good now, and they're still doing unsafe deserialization, chances are they're probably pulling in this dependency, so they're still in hot water. Hopefully in the last couple of years people have figured out that they shouldn't be doing unsafe deserialization, but people continue to surprise me. So I did report this to the Clojure dev mailing list when I discovered it, and they decided, who's even serializing this class anyway? We're just gonna turn off serialization for that class, and that's great. So all releases since 1.9.0 have disabled serialization on that AbstractTableModel class, so that hashCode entry point doesn't exist anymore. Yay, we're making the world safer one gadget chain at a time. More recently I discovered some new gadget chains in Scala using this tool. Scala is the third most popular dependency according to mavenrepository.com. This gadget chain isn't an RCE, so maybe not as interesting, but it does allow you to write or overwrite a zero-byte file on disk, and that's an interesting DoS exploit because you can overwrite some application resource file, zero it out, and then your app goes down, so that's possibly interesting.
There's a very similar one that Gadget Inspector also found that can do an SSRF, so it does a GET on an arbitrary URL, and it's basically the same thing. This is something that Gadget Inspector spat out, and I've got examples of the actual gadget chain payload on my fork of ysoserial that you can check out after this talk. So these are not just things that it found where I'm claiming they could work if you built the gadget chain. I did actually build the corresponding gadget chain and verify these things work, so cool stuff. So just before this talk, a couple of weeks ago, I re-ran Gadget Inspector on the latest release of Clojure, and it turns out that exact same gadget chain I found before still exists in Clojure, just with a different entry point. There's another class implementing hashCode that delegates to a function, and so you can actually do exactly the same gadget chain using this different entry point, and that's been in every release since 1.8.0. So apparently there's still an RCE gadget chain in every release of Clojure that's out there. So I need to follow up with the Clojure guys and see if they wanna lock down this one too, but really I hope this is just hammering in the point that you've gotta stop doing unsafe deserialization, guys. There's gadget chains everywhere. But all right, enough of that. I looked at open source libraries, but what I was getting at at the top of this talk was that what I really wanted to find was gadget chains that are specific to the applications I'm looking at, so that I can go back to developers and tell them how important it is: do you need to patch this thing right away, or can you wait until your next release so that you can finish up these critical features? So let's look at vulnerable web app number one.
So this was making some potentially dangerous use of Jackson deserialization. An attacker could specify any class to instantiate and put an arbitrary body in there, but there were a lot of limitations on it. It was using more or less the default configuration of Jackson, so you could only deserialize classes with no-arg constructors, your only entry points are gonna be no-arg constructors, and most of the time classes don't do anything terribly interesting in constructors. But it did have a 200 megabyte class path and was bringing in like six dozen dependencies, so there might be something there, and I don't really have enough time to manually go through every constructor of every class on that class path to find out if any of them do anything interesting. So I ran Gadget Inspector and it found nothing. So all right, that wasn't the cool exploit and bombshell you guys were hoping for, but it saved a bunch of time, because no one had to go through every constructor and decide if it was important to remediate this vulnerability. We could tell the developers that, hey, it's cool if you wait until the next time that you're able to get to this. But the story doesn't end there. So, internal web app number two. This one was really interesting because it used a non-standard deserialization library, something that had some custom in-house tweaks to it and that had some really unique constraints on it. It invoked readResolve-style magic methods but not readObject, and it was able to deserialize any class on the class path, which didn't have to implement Serializable, except for the rest of these constraints down here. One is that its member field names couldn't have dollar signs in them, because that screwed up the binary format of this thing. Non-static inner classes always have this implicit $outer member name, so basically anything that happened to be a non-static inner class was not serializable. Furthermore, it didn't have any support for serializing arrays or generic maps, and most importantly, every
member value had to be non-null, and that meant that the type of every member value also had to satisfy all of these constraints, because you couldn't leave it null, so you had to serialize it as something. That also meant that you couldn't have any data types that had any character arrays or byte arrays in them, and you couldn't have any data types that had some kind of self-referential or recursive type, because Thread, for example, has a parent which is a type of thread, so there's no way to have non-null members for all those and stick in a payload. So it was really, really hard to determine what classes were even considered serializable in this context, much less whether or not you could actually build a gadget chain that went through those particular classes. But that's the sort of thing where Gadget Inspector has the functionality to stick in all of those constraints and then ask it to tell you what it finds, and this is what it found. This is a 12-step-deep gadget chain. It starts at readResolve, like I promised, and the bottom thing that it does here is copy a file from any arbitrary location to any other arbitrary location, and that was cool because it allowed us to do things like exfiltrate private keys off the box by dropping them in the web app resources directory. You can look at this really closely, but I feel like you don't really have to. The thing that's really interesting is just looking at the package names that are showing up here. Here I've highlighted the different dependencies that this gadget chain is flowing through, and if you count the app itself and the JRE, there's seven different libraries involved in this gadget chain. It's something that you would never have found by analyzing any of those individually, and you would never have found it by looking at the set of dependencies without also pulling in the classes from the application itself, but it's something that just lit up as soon as
I ran Gadget Inspector on it, and so that's super cool. That's what I'm talking about, where this thing is utilizing the power to look at the entire class path, and it's able to utilize the parameterization of what it means to be serializable according to your custom constraints. So that was a really cool gadget chain that this thing found, but also, spending just five or 10 minutes staring at this, you see this gadget chain method at step eight, which is StreamPumper.run. What that actually does is copy an input stream to an output stream, so if you look at this for just a few minutes, you realize you can tweak this last thing to copy an arbitrary string input stream to a file output stream and be able to write an arbitrary string to an arbitrary file. So I was able to write a JSP to my web app resource directory and get RCE with this gadget chain. This was a really cool result to come out of this, and it immediately allowed us to say, all right, we've got to fix this thing now, because you're getting RCE on this really sensitive service. So this was really powerful, and it saved us the time of trying to actually build up this thing by hand. As a matter of fact, this was a web app that we actually had a pen test team looking at, and they identified that it was vulnerable to this kind of vulnerability, but they spent a couple of days looking at it here and there and basically weren't able to decide whether or not you could do anything with it. Gadget Inspector took about 15 minutes to run on this application and spit this out, so that is a huge time saver and, I think, a huge win for both pen testers and AppSec engineers trying to understand deserialization vulnerabilities in applications. So obviously there's gonna be a lot of room for improvement in this kind of tool. Reflection continues to be the bane of existence for anyone doing code analysis of any form. It's hard to understand, and this tool basically just treats any kind of reflection as
an interesting sink, just because it doesn't know how to do any better, but that also leads to a lot of false positives and some blind spots, so it can be improved there. I also mentioned that there's a number of assumptions and limitations that I made in the course of building this tool, and while I think most of those were reasonable given the context we're working in, it's obviously something that could be improved. But that being said, I think diving down into this kind of automatic analysis for deserialization vulnerabilities is territory that has a lot of room for more discovery and more time spent on it, because this was something that's just kind of a functional prototype, but it's already saved us a bunch of time as we've been doing AppSec reviews of our internal applications. This was something that I specifically wrote for Java and to understand Java bytecode, but I think all the techniques I described here apply equally well to C# and PHP and all these other languages that have these kinds of libraries that allow you to specify data types and can therefore be used dangerously. This tool is open source, so I encourage you to go look at it, check it out, see if you wanna do a PR on it or improve it, or just use these ideas and build your own thing that's better. But also, most importantly, deserialization vulnerabilities aren't gone yet. They're still relevant and they're still interesting, and I think exploits can and will be more complex as time goes on. This is the first time I've ever seen a gadget chain that long, and I think we need better tools to help us better understand those sorts of vulnerabilities. So if you've got questions, we've got about five minutes. You can also hit me up online later, and thank you all for coming.