Hello everyone. Thank you for the introduction. I was so nervous before my first DevCon talk that I prepared 62 slides for this 20-minute talk. We will try to manage it. So, we'll talk about prototype pollution vulnerabilities in JavaScript. This talk is based on joint work with Musard Balliu and Cristian Staicu. In this paper we study prototype pollution vulnerabilities and their gadgets, and we implemented tools to detect them. I already presented part of this talk, the exploits that we detected in real applications and the details of the gadgets, at Black Hat Asia. Today I want to present more details about our methodology and about the tools.
Let's start with some introduction to JavaScript and how inheritance works in JavaScript, through an example. So, we run Node.js, which executes the code in the index file. The first line of the code creates an empty JavaScript object. The runtime allocates a new object with a built-in property __proto__ that points to the object prototype. The object prototype has a bunch of functions that we can reuse, for example, toString. To implement inheritance, JavaScript allows us to extend a prototype with a new property. In this example, we define the property x with value 42. Things get more interesting when we create some other object, in principle unrelated to the first one. Both objects share the same prototype. When the runtime executes the last line of the code to print the x property of the second object, it first tries to find the property in the object itself. Since x is undefined for the second object, the runtime looks up the property x in the prototype. In this case, it prints 42 to the terminal.
Well, let's consider the threat model of web applications to see how this feature can affect the security properties of an application. The index.js file creates a simple web app that handles two requests: update and backup. The attacker in our threat model can send any request to the server, for example, update with any parameters.
Let's see what happens when the parameters are as in this figure. The code creates an empty object. It reads the __proto__ property. And then the attacker adds the property shell with the value calc to the object prototype. So, this code pattern is called a prototype pollution vulnerability.
But how can it affect our application? Let's assume that this code handles the backup request. It just executes a backup script through a helper function. Notice that the attacker cannot control any arguments of this helper function. In this function, we can pass some options, such as which shell to use. If we don't specify this option, it will use the default shell and then run a new process. If the attacker sends a backup request after the prototype pollution, we execute the function, and options.shell reads the attacker-controlled property from the prototype and runs the calculator. So, we get remote code execution, and such a code fragment is called a prototype pollution gadget. If we find a gadget in the code of Node.js itself, the impact is much higher, because it potentially affects all applications. In summary, to achieve remote code execution, we need two steps: identifying prototype pollution and identifying gadgets.
How do we identify prototype pollution at scale? We implemented static analysis for Node.js applications and NPM packages. We used taint analysis, where we mark the attacker-controlled data with the input label. However, we cannot define the sinks syntactically, because not every property assignment leads to prototype pollution. Instead, we use what we call multi-label taint analysis to find the sinks. Let's see how it looks in this example. We assume that all arguments of the function are attacker-controlled and mark them with the input label. We propagate the input label, and if we have a property read with a tainted property name, we change the label to proto. It means the attacker can potentially read the object prototype here.
When the analysis detects a property assignment with a receiver that has the proto label, like this one, it reports the code fragment as a potential prototype pollution vulnerability. This is the main idea of our analysis. We implemented it on top of the CodeQL analysis framework and evaluated it on 100 vulnerable packages that we collected. The best result achieves 97% recall. It means we detected 97 vulnerabilities out of 100. High recall is what we need to find vulnerabilities in real applications.
The second question is how to identify the gadgets. We analyzed the Node.js code and used dynamic analysis to detect property reads from the object prototype. Then we used static analysis to find flows from these properties to internal Node.js function calls. You can find all the details of this analysis in the paper. In this presentation, I want to show some interesting results. So, we detected 11 different gadgets in Node.js APIs.
The first gadget is the spawn function, which executes a new process. Let's look at this code. The details are not important here, but we see that the property shell and the property env can be undefined. And there is a flow from these properties to the internal function call, which is actually vulnerable. It's a simplified version of the spawn function from the Node.js API implementation. Let's see how remote code execution can be achieved. Suppose that the backup handler calls spawn with no attacker-controlled arguments. The attacker first pollutes the prototype with an update request, as we saw earlier. They add the property shell with the value node to the object prototype, then the property env with another request, and then send a backup request to execute the spawn function. Let's see what happens. When spawn executes, it reads the values of shell and env from the prototype. This allows the attacker to run Node.js in debugging mode by controlling environment variables, and to connect remotely to execute arbitrary code. For this, we implemented a shell based on the Node.js remote debugging protocol.
You can see a short demo on the slide.
The second gadget is the require function. The require function is used to include external packages in an application, so each application has a lot of require calls. This is a simplified snippet of the require function. As we can see, it reads the package configuration file, package.json, and evaluates the entry point if it is defined in the property main. If main is undefined, Node.js uses a default value. Let's see how we can exploit it if the attacker pollutes the main property. To exploit the gadget, we need a require call for a package without the main property defined. An example of such a package is bytes, one of the popular packages in the NPM ecosystem. Let's see how it works when the attacker triggers this code to execute the require call. The attacker triggers the backup handler, which parses the config file of the bytes package. Since the main property is undefined, it looks up the property in the prototype. To achieve remote code execution, the attacker should control this malicious file. As you can guess, this is a strong requirement: the attacker should be able to upload some malicious file to the system, for example, using some other vulnerability.
Let's see how we can bypass this limitation by combining these two gadgets. The key idea is to use the require gadget to trigger the spawn gadget. To achieve this, we need to find an existing file that executes the spawn function. If we identify such a file in the default Node.js distribution, then we increase the impact of the exploit, because every application that uses Node.js already has this file. An example of such a file is npm.js, which runs a Node.js instance. Let's look at the end-to-end exploit. The attacker pollutes the main property with a path to npm.js, and the env property required by the spawn gadget. Finally, when backup is triggered, Node.js executes the require function, which loads npm.js, which calls the spawn function.
The spawn function reads the attacker-controlled environment variable from the prototype and runs Node.js in debugging mode, as we saw earlier. So the attacker achieves remote code execution from a require call without additional requirements.
The last question is how to exploit real applications using our tools and the detected gadgets. We crawled GitHub for Node.js applications and took the 15 most popular ones. We ran our tool and got some prototype pollution cases. As you can see, the prototype pollution pattern is rare in practice, so manual verification is feasible for the total number of detected cases. We confirmed eight detected cases as exploitable and reported them to the maintainers. You can see that we found two cases in the npm CLI, which I think everyone has used, as well as cases in Parse Server and the Rocket.Chat messenger.
The main problem with real exploitation of this vulnerability to achieve remote code execution is that polluting the object prototype can break the application, because some of its code does not expect any additional properties in the object prototype. It happened many times in these experiments, and I want to show one technique that you can use and also apply in your own research.
Let's look at the case of Parse Server. Parse Server is a server that provides a REST API out of the box and uses some database, for example MongoDB, to store data from requests. It also uses js-bson, a library from MongoDB that serializes and deserializes data to and from the binary MongoDB format. So, we detected a prototype pollution case in Parse Server. The details of this code example are not important. What is actually important for us is that the attacker can control the name and the value of the polluted property. We also detected a gadget in the js-bson package, and this gadget can be triggered when js-bson deserializes some data from MongoDB.
So this gadget calls the eval function on a function string from MongoDB that the attacker can store in MongoDB beforehand. But the problem is that this eval call is executed only if one option is enabled. As you can guess, it's disabled by default in the configuration file, but it can be polluted: the gadget reads it from the polluted prototype, so we can enable this feature at runtime. This is the main idea.
So let's look at the naive way to exploit it. The attacker first sends a request. Parse Server serializes it to MongoDB, deserializes the response, and triggers the prototype pollution. Good. We polluted the option that we need to trigger the gadget. After that, the attacker sends a request that should trigger the gadget, but the serializer of the js-bson library throws an exception and the application crashes. And this happens for any request after the prototype pollution: the application crashes on the first request that we try to send.
How can we bypass it? I found a way, a kind of race condition. Let's see. The attacker first sends the request that triggers the remote code execution gadget. This request is serialized and then takes a long time to be handled. At this moment, the attacker sends the second request, which should trigger the prototype pollution vulnerability. It is serialized, deserialized from MongoDB, and successfully triggers the prototype pollution. After that, MongoDB finishes handling the first request and passes the result to the application, and we trigger remote code execution.
I have used this technique many times. Usually you have a very short time gap to trigger the remote code execution gadget. One of the techniques that I use is to just send 20 to 50 requests that trigger remote code execution, with one request that triggers prototype pollution in the middle. Most likely we get a situation where we trigger remote code execution exactly after the prototype pollution in some thread.
Let's see how it looks in practice. A short demo. We are on Parse Server.
We need to implement a script to exploit this vulnerability, because we need to send the second request 100 to 300 milliseconds after the first one, so it's not possible to do it manually. So this script prepares the database, adding some data to MongoDB to make it a little bit slower, so that the first request is handled for a long time, and then sends these two requests. And you can see that we get a calculator from Parse Server. So it works.
So, in conclusion, we implemented and open-sourced tools to detect prototype pollution vulnerabilities and their gadgets. We detected 11 new gadgets in Node.js APIs. You can also find them on GitHub. Some gadgets are already fixed, but not all combinations of them. And we reported remote code execution vulnerabilities in popular open-source applications.
So, some interesting questions for future work. One of them: what is the most efficient way to detect gadgets? We have options like static analysis and dynamic analysis, and we already used a hybrid approach. I think it's a good question for future research. How many new gadgets are there in NPM packages? We did not consider NPM packages in this research, but it would also be worthwhile to explore them. Probably we can find some new ones. We continue researching in this direction. If you are interested in this, just follow me on Twitter. Thank you for your attention.