talk is called Damn GraphQL: Attacking and Defending APIs. I'm extremely excited to be here. My name is Dolev, and this is my first time speaking and participating at NorthSec. I'm grateful for the opportunity, and I'm excited to share what I've learned about GraphQL in the last six months or so. To get things started, I want to talk about the very basics of GraphQL. Whether you know what GraphQL is about or have no idea what it is, this is the right talk for you. I'm going to bring everybody to a point where we all understand what GraphQL is, what makes it unique, and what its attack surface looks like. Towards the end, I'm going to show a very practical demo of a real-world CVE. And lastly, I will share all the resources I could gather so that you can learn and practice GraphQL by yourself. So let's start with: what is GraphQL? GraphQL is a technology by Facebook that was released a few years ago. It's a query language for APIs. You can think of it almost as an alternative to REST APIs, except it works a little differently. It has three main operation types: queries, mutations and subscriptions. For the purpose of this talk, we're going to focus on queries and mutations, but just so you have some context, a subscription is basically a long-lasting read operation, typically a pub/sub mechanism that drives applications such as chats that require real-time messaging. GraphQL was born to resolve a few pain points that REST APIs have. There are two of them I want to mention. One is called overfetching: when you call an API, you typically get more data than you actually need. And underfetching is basically the opposite.
Sometimes you make an API request and you don't get enough data back, which forces you to follow up with an additional request in order to complete the original intent. GraphQL solves both, and as a result it reduces the number of round trips the client needs to make to complete a transaction. When we talk about GraphQL, we need to start with the data model, the schema. A schema is basically a definition of object types and their fields. Throughout this talk, I'm going to use the data model of Pastebin a lot, and it will make sense towards the end why I took that as an example. If you don't know what Pastebin is, it's a website where you can upload text snippets and share them with people, and anonymous users can also read them depending on the permissions. You can see on the right side we have an object type called Paste, and Paste has a bunch of metadata, or fields, associated with it. If I upload a code snippet, I'll be the author or the owner of it, the snippet may have a title and content, and maybe there's a permission flag for whether it's public or private. You get the idea: you define a type with its associated fields, and this is what constructs a GraphQL schema. Once you have a schema, you can start querying that data. When we want to read data from a GraphQL endpoint, we use something called queries, which are primarily for read operations. You can see on the right side we have an example request, a JSON request that asks for the content and title of all the pastes on the website. And you can see that the response is shaped very much like the request. We're getting an array of pastes, but we're only getting the title and content, which is exactly what we asked for.
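Since the slide itself is not part of this transcript, here is a minimal sketch of the request and response shape just described. The in-memory data, the field names and the toy resolver are assumptions for illustration only; a real GraphQL server would parse the query and resolve fields against its schema.

```python
import json

# Hypothetical in-memory data standing in for a Pastebin-style backend.
PASTES = [
    {"title": "Hello", "content": "print('hi')", "owner": "alice", "public": True},
    {"title": "Notes", "content": "todo list", "owner": "bob", "public": False},
]


def build_query_payload(fields):
    """Build the JSON body a client would POST for a `pastes` query."""
    selection = " ".join(fields)
    return json.dumps({"query": f"query {{ pastes {{ {selection} }} }}"})


def resolve_pastes(fields):
    """Toy resolver: return only the requested fields for every paste."""
    return {"data": {"pastes": [{f: p[f] for f in fields} for p in PASTES]}}


payload = build_query_payload(["title", "content"])
response = resolve_pastes(["title", "content"])

print(payload)
print(json.dumps(response, indent=2))
```

The response carries `title` and `content` for each paste and nothing else, even though the backing records also hold `owner` and `public`.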
You'll notice that the server didn't tell us the owner's name or whether a paste is public or private. We really got only what we asked for, and this is what GraphQL is all about. If you want to change data as opposed to just reading it, things like deleting or creating something, you use an operation called a mutation, which alters information on a web application. You can see on the right side an example of how we create a new paste on the website. This create paste operation takes two parameters, so we're passing the title and content, and the response reflects the exact paste that we just created. Now, a few GraphQL things that you will likely find quite interesting from a security standpoint. Requests are usually carried over POST. What I mean by that is whether you delete something, change something, or just read without changing anything, the request will typically be a POST. Another interesting thing is that GraphQL typically lives under a single route. /graphql is a very common one, but it can obviously live in other spots as well depending on the implementation. There are a few predictable locations where GraphQL typically lives, and if you are on a pentest and want to find out whether GraphQL is configured and where it lives, you can use an Nmap NSE script I put together. It's at the bottom of the slide, and it attempts to figure out where GraphQL lives. And lastly, GraphQL typically returns a 200 response. Let's say you're asking for information about an account on a web application. If the account doesn't exist, in the REST API world you would typically expect a 404 or something like that. In GraphQL, it's a little bit different.
In GraphQL, you would get a 200, and the indication that the account doesn't exist would be reflected in the response body. You might have a key called errors whose value says the account doesn't exist, but the status code is still going to be 200. The reason I list all these things, and the reason I think they're interesting from a security standpoint, is that if you do blue teaming, incident response or log analysis and you have an application backed by GraphQL, you will immediately notice that investigations are a little bit more difficult. You don't have self-describing routes like you would have in a REST API. And if you're used to looking for specific status codes to determine whether something went wrong, such as a forbidden or unauthorized request, you won't have that in GraphQL, because you will typically see 200s. So it's quite interesting from a blue teaming standpoint, and a little bit challenging. It changes the game a little bit. These are the core things that make GraphQL different from a REST API. Now that we know just enough about GraphQL, I want to jump into the GraphQL attack surface. From a pentesting standpoint, when you run into an application that is backed by GraphQL, one of the first things you want to do is figure out how to communicate with it. If you're lucky, you're going to run into an implementation that has introspection enabled. Introspection is a mechanism for GraphQL to self-describe what it knows about its data model. It's a feature that can typically be enabled or disabled, depending on the implementation, and some implementations have it enabled by default. It's not a vulnerability per se. It's a feature whose intention is to make it easy for people to integrate with your API, but obviously it has a security trade-off.
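To make introspection concrete, the sketch below builds a deliberately small introspection request and summarizes a response. The `__schema`, `types`, `name` and `fields` names come from the GraphQL specification's introspection system; the sample response and the Paste fields in it are made up for illustration, and nothing is sent over the network.

```python
import json

# A small introspection query; real tools usually send a much larger one.
INTROSPECTION_QUERY = "{ __schema { types { name fields { name } } } }"


def introspection_payload():
    """JSON body a client would POST to the GraphQL endpoint."""
    return json.dumps({"query": INTROSPECTION_QUERY})


def summarize_schema(response_text):
    """Map each type name to its field names, skipping built-in `__` types."""
    body = json.loads(response_text)
    summary = {}
    for t in body["data"]["__schema"]["types"]:
        if t["name"].startswith("__"):
            continue
        # Scalars report `fields: null`, so fall back to an empty list.
        summary[t["name"]] = [f["name"] for f in (t.get("fields") or [])]
    return summary


# Hypothetical response from a server with introspection enabled.
SAMPLE_RESPONSE = json.dumps({
    "data": {"__schema": {"types": [
        {"name": "Query", "fields": [{"name": "pastes"}]},
        {"name": "Paste", "fields": [
            {"name": "title"}, {"name": "content"},
            {"name": "owner"}, {"name": "public"},
        ]},
        {"name": "String", "fields": None},
        {"name": "__Schema", "fields": [{"name": "types"}]},
    ]}}
})

for type_name, fields in summarize_schema(SAMPLE_RESPONSE).items():
    print(type_name, fields)
```

A blue team can run the same request against its own endpoint as a self-check: if the response contains a `__schema` object rather than an error, introspection is reachable.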
So you really have to understand your own environment, decide whether you actually need it enabled, and act accordingly. If you are a pentester and you run into an introspection mechanism that is enabled, what you get in return is a response that describes the data model. You could obviously parse it manually and read through it, but that's a little bit overwhelming depending on how big the application is. Instead, you could use something like a GraphQL visualizer: you take the JSON response, feed it into the visualizer, and it draws a nice diagram of the relationships between the different fields and objects, which makes it fairly easy to read. From a blue team standpoint, if you are protecting a GraphQL application, do a self-check on whether you have introspection enabled. Disable it if you don't absolutely need it, or place it behind some kind of access control. Just so you know, GraphQL doesn't come with authentication mechanisms by default. It's something you have to add on top, so you have to take care of that yourself. And this is one of the problems with GraphQL: there are a lot of things you need to do in addition after you implement it. So really, disable introspection if it doesn't make sense in your own environment. Now, if you're a pentester and you run into a GraphQL application that has introspection disabled, what are you going to do? How are you going to figure out how to communicate with it? GraphQL has a feature called field suggestions, which is a way for GraphQL to tell you about fields in case you make a mistake. In the example at the bottom, you see I'm requesting the title and content of the pastes, but I made a typo and the word "pastes" is missing the letter E.
So the response you're going to get from GraphQL is something like: did you mean "pastes" or "paste"? It tries to find the closest match to what you supplied. And you can see how you can leverage that. You can build a very comprehensive word list of common English words and keep sending it to GraphQL until you've figured out enough fields to construct a proper query. What's interesting about this feature is that it's available in all the popular GraphQL implementations today, without any ability to disable it. Again, this is not a vulnerability, it's a feature, but it has a security trade-off because you might not want to reveal too much about how your data is structured. From a blue teaming standpoint, this is a little bit tricky because there is no option to toggle it off. You would have to go into the code and either comment it out or replace it with something else that makes sense in your context. Just so you know, there's a user experience impact here, because the feature helps people integrate with your GraphQL API. If somebody is integrating with your application and they make a typo, it's very convenient for them to know where they made the mistake so they can move on with their life. If you don't care about that, you can go ahead and patch it. Let's talk about denial of service, which is going to be a very common theme in GraphQL. GraphQL can support batching, and what I mean by batching is that the client can send a bunch of queries, even the same query multiple times, in an array, and the server will process them one after another. Batching is typically not available by default. You would have to either install a package or code it yourself. But if it's available, there are a few things you have to look out for.
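As a rough illustration of what a batched request looks like on the wire, and of the simplest thing a server can look out for, the sketch below inspects the raw JSON body before any GraphQL processing happens. The limit of 10 and the function name are assumptions, not settings from any particular library.

```python
import json

MAX_BATCH_SIZE = 10  # assumed limit; pick what makes sense for your clients


def check_batch(raw_body, max_batch=MAX_BATCH_SIZE):
    """Return (allowed, reason) for a raw JSON request body."""
    body = json.loads(raw_body)
    if isinstance(body, dict):
        # A plain, non-batched request is a single JSON object.
        return True, "single operation"
    if not isinstance(body, list):
        return False, "unexpected body type"
    if len(body) > max_batch:
        return False, f"batch of {len(body)} exceeds limit of {max_batch}"
    return True, f"batch of {len(body)} accepted"


# A batch is just a JSON array of ordinary request objects.
batch = json.dumps([{"query": "{ pastes { title } }"}] * 100)

print(check_batch(batch))
print(check_batch(json.dumps({"query": "{ pastes { title } }"})))
```

A length check like this says nothing about how expensive each element is, so it is a first filter rather than a complete defense.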
This is a very interesting method to bring down a service, and I will show you exactly what I mean by that later on. One thing you have to remember: I mentioned that GraphQL lives under a single route, /graphql. So if somebody is sending a lot of those expensive requests to a server, it's going to be tricky for a network or application control such as a web application firewall to come up with solid rate limiting rules against that. Just keep that in mind. From a blue teaming standpoint, if we have batching in our application, there are a few things you can do to protect it. The very first thing is to write some kind of middleware analyzer that checks the length of the array you're receiving. This is a very basic check, but if you get an array with 100 elements, maybe you want to drop it. Alternatively, you could do something a little more sophisticated called cost-based analysis. Think of it as assigning a value to a field or an operation, let's say a value of 10. In this example, you see we have an array where we're calling the operation backup system twice. Since we assigned it a value of 10, that adds up to 20, and since the max cost we allow on the backend is 10, we're going to drop that request because it's just too expensive to complete. Overall, batching is a feature you can disable, so only use it where it makes sense, and be aware of the trouble it can bring along with it. If you're a pentester and batching is disabled, there's an alternative way you can try to bring down the service. There's a concept called query aliasing: you supply an alias name, and you can then call the same query multiple times by supplying different alias names. It's not the same as batching, but it can definitely act the same from a denial of service side of things.
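The cost arithmetic and the aliasing trick can be put together in one small sketch: the same expensive field is requested twice under different aliases, and a naive analyzer adds up a per-field cost. The field name `backupSystem`, its cost of 10 and the maximum of 10 are assumptions taken from the example above. The hand-rolled tokenizer only understands simple queries (no fragments, block strings or directives); a production analyzer should work on the parsed query from the GraphQL library in use.

```python
import re

# Tokens we care about: string literals, punctuation, and names.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[{}():]|[_A-Za-z][_0-9A-Za-z]*')

FIELD_COST = {"backupSystem": 10}  # hypothetical expensive operation
DEFAULT_COST = 1
MAX_COST = 10


def top_level_fields(query):
    """Return the field names selected at the top level, resolving aliases."""
    fields = []
    brace_depth = 0
    paren_depth = 0
    tokens = TOKEN.findall(query)
    for i, tok in enumerate(tokens):
        if tok == "{":
            brace_depth += 1
        elif tok == "}":
            brace_depth -= 1
        elif tok == "(":
            paren_depth += 1
        elif tok == ")":
            paren_depth -= 1
        elif brace_depth == 1 and paren_depth == 0 and tok != ":" and tok[0] != '"':
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            if nxt == ":":
                continue  # this token is an alias; the real field follows
            fields.append(tok)
    return fields


def query_cost(query):
    """Sum the assumed cost of every top-level field, aliases included."""
    return sum(FIELD_COST.get(f, DEFAULT_COST) for f in top_level_fields(query))


def is_allowed(query, max_cost=MAX_COST):
    return query_cost(query) <= max_cost


aliased = "query { a1: backupSystem a2: backupSystem }"
print(top_level_fields(aliased), query_cost(aliased), is_allowed(aliased))
```

Because aliases are resolved back to the underlying field, renaming the call does not lower its cost, and counting the returned list also gives the number of aliased calls in the request.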
You can call a very expensive query 100 times in a single request, and the server will have to process all of them. And just like batched queries, this can be difficult to mitigate from a networking standpoint because it's all going to the same GraphQL route. From a blue teaming standpoint, to mitigate this you would again have to build some kind of analyzer or use the cost-based analysis I mentioned before. You could write a middleware that counts the number of aliases you receive and drops the request if the number doesn't make sense. But again, there's a significant lift that you have to do yourself in order to protect your own GraphQL implementation. I would say that GraphQL is a little bit vulnerable by default, and you will see what I mean by that throughout this talk. Another avenue for denial of service is circular queries. If you have a schema where two types reference one another, for example, in the Pastebin example, a paste has an owner, and the reverse also applies, an owner may have pastes, then somebody can create a query that keeps referencing these objects one from another. They can build a very complex and deeply nested query and bring down the server, or at least cause significant resource consumption. And if you chain that with what we learned earlier about batching or aliases, you can really amplify the attack just by abusing features. From a blue teaming standpoint, the fix is relatively trivial in this case. Some implementations, for example in Ruby, allow you to set a maximum depth limit. So, for example, you could set a max depth of 10.
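Where an implementation has no built-in depth setting, the idea can be approximated by hand. The sketch below builds a circular pastes/owner query and measures nesting by counting braces. The field names and the limit of 10 are assumptions, the depth it reports includes the outermost selection set, and brace counting is only a rough stand-in for walking the parsed query (it ignores fragments and block strings).

```python
import re

MAX_DEPTH = 10  # assumed limit

# String literals are blanked first so braces inside them are not counted.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def query_depth(query):
    """Return the deepest level of `{ ... }` nesting in the query text."""
    depth = 0
    max_depth = 0
    for ch in STRING_LITERAL.sub('""', query):
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def build_circular_query(levels):
    """Nest pastes -> owner -> pastes -> ... for the given number of levels."""
    names = ["pastes", "owner"]
    opening = "".join(f"{names[i % 2]} {{ " for i in range(levels))
    leaf = "title" if levels % 2 == 1 else "name"
    return "query { " + opening + leaf + " " + "} " * levels + "}"


def is_too_deep(query, max_depth=MAX_DEPTH):
    return query_depth(query) > max_depth


print(build_circular_query(3))
print(query_depth(build_circular_query(99)), is_too_deep(build_circular_query(99)))
```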
And if you receive a query that's nested 99 levels deep, you're going to drop that query altogether. Again, the value you set should be whatever makes sense in your own environment. You need some knowledge of the types of queries you have in your environment so that you don't cause trouble or downtime or drop requests that are benign. Operation name in GraphQL is interesting. The operation name is an optional text field that you can supply along with a query or a mutation, and it describes what the query or mutation is doing. This is, I would say, a free text field. You can supply anything, and it doesn't have to match anything. For example, you can see on the right side we're calling getUsers, but we're supplying getPaste as the operation name. The way you can leverage that from a pentesting standpoint is if the site operator is using operation names to figure out which operations are more common and doing some analytics on them. Maybe they're logging it somewhere. Maybe they're naive enough to make decisions based on it. Just keep in mind that this is a value you control as a pentester, and some implementations allow you to supply special characters as well. So this can become an injection opportunity as well as a spoofing opportunity. If you're supplying an operation name that's very benign but you're actually calling a query that could cause problems, you're effectively masking what you're doing if the logging on the backend is naive. On the blue team side of things, defending against this is pretty trivial. You need a list of acceptable operation names. If you receive an operation name that's not on that list, treat it as any other untrusted input: drop the request and log it in a safe way, so that you at least know you had a tampering attempt specifically in the operation name.
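A minimal sketch of that allowlist check follows. The allowed names are hypothetical, and treating a missing operation name as acceptable is a policy assumption. The character check mirrors the Name grammar in the GraphQL specification, and `repr()` is used so that newlines or control characters in a rejected value cannot forge extra log lines.

```python
import json
import re

ALLOWED_OPERATIONS = {"getPastes", "getUsers", "createPaste"}  # hypothetical list
NAME_RE = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")  # GraphQL Name grammar


def check_operation_name(raw_body):
    """Return (allowed, value_safe_to_log) for a raw JSON request body."""
    body = json.loads(raw_body)
    name = body.get("operationName")
    if name is None:
        return True, None  # the field is optional
    valid = (
        isinstance(name, str)
        and NAME_RE.fullmatch(name) is not None
        and name in ALLOWED_OPERATIONS
    )
    if not valid:
        # repr() escapes newlines and quotes; truncate to bound log size.
        return False, repr(name)[:80]
    return True, name


good = json.dumps({"query": "query getUsers { users { name } }",
                   "operationName": "getUsers"})
evil = json.dumps({"query": "query getUsers { users { name } }",
                   "operationName": "getUsers\n[INFO] admin login ok"})

print(check_operation_name(good))
print(check_operation_name(evil))
```

This only validates the supplied name. It does not verify that the name matches the operation actually present in the query, so analytics that rely on it should still take the operation from the parsed query itself.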
So that's a fairly trivial way to mitigate this. Next is field duplication. It's probably one of my favorite and lesser-known GraphQL things. Some implementations today don't really care if you supply the same field multiple times. What I mean by that is, you can see on the left side we're creating a query that gets the name of the owner of all the pastes on the website, and we specify the field only once. That's what a proper, valid query would look like. Now let's say you're doing some kind of timing analysis to see what the response looks like and how fast it is, and the server took 100 milliseconds to respond. If you take the field name and duplicate it 10 times or 100 times or whatever, and you see that the server takes longer to respond, you know that it doesn't do any kind of deduplication or intelligent query analysis. And you will soon see in this talk how you can leverage that to cause significant resource consumption. If you want to protect against this, one thing you could do is again write an analyzer that dedupes the fields. It's a little bit error-prone, so keep that in mind. Or you could use cost-based analysis like I talked about before. Another thing you could leverage is something called persisted queries. Persisted queries are an interesting mechanism because they allow the client to supply a hash that represents the actual query. The GraphQL server on the backend has a list of trusted hashes, and if the client supplies a hash that's not recognized, you drop that request. So you at least have some assurance that the query structure itself wasn't tampered with. But one thing you should know is that you can still pass variables alongside that hash, and those get interpreted by the server. So there are still injection opportunities, even when you use this kind of mechanism.
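The hash lookup can be sketched in a few lines. The request shape used here (an `id` key carrying a SHA-256 hex digest, plus `variables`) and the two trusted queries are assumptions for illustration; real implementations each define their own wire format.

```python
import hashlib
import json


def sha256_hex(query):
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


# Hypothetical queries registered by the application's own clients.
TRUSTED_QUERIES = {
    sha256_hex(q): q
    for q in (
        "query { pastes { title content } }",
        "query ($id: ID!) { paste(id: $id) { title } }",
    )
}


def resolve_persisted(raw_body):
    """Return the trusted query plus variables, or None for unknown hashes."""
    body = json.loads(raw_body)
    query = TRUSTED_QUERIES.get(body.get("id"))
    if query is None:
        return None  # unrecognized hash: drop the request
    # Variables are still attacker-controlled and need their own validation.
    return {"query": query, "variables": body.get("variables", {})}


known = json.dumps({"id": sha256_hex("query { pastes { title content } }")})
unknown = json.dumps({"id": sha256_hex("query { pastes { owner { pastes { title } } } }")})

print(resolve_persisted(known))
print(resolve_persisted(unknown))
```

The lookup pins the query text, which rules out duplicated fields, extra aliases or added nesting, but whatever arrives in `variables` passes straight through to the resolvers.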
In summary, GraphQL is not that different from REST APIs when it comes to the vulnerabilities themselves. The OWASP Top 10 still very much applies to GraphQL. There are security tools out there, both from a red team and from a blue team standpoint, but there aren't a whole lot of them. A few things you should know: OWASP ZAP has an add-on you can install that will test your GraphQL endpoint, fuzz it, and so on. There's also a Burp extension that helps you construct proper queries, so you can use Burp Suite to test GraphQL setups in a more convenient way. But overall, GraphQL is fairly young, and what I noticed from looking at the various implementations across different languages is that the security features they offer are misaligned. For example, PHP may have persisted queries available but Python may not. There's no consensus on which features should be available to everybody, and there are gaps in maturity between different languages. So if you do choose to implement GraphQL, make sure you have all the proper mitigations in place to protect yourself against everything I just mentioned, because these weaknesses basically come by default when you use GraphQL. With that, I want to jump into the attack demo. What I'm going to show today is a CVE that I disclosed just a few weeks ago, related to a GraphQL plugin for WordPress. WordPress has a plugin marketplace, and one of the plugins is called WP GraphQL, which gives you an out-of-the-box GraphQL interface for your own blog. It's pretty popular, with 100,000 downloads and 10,000 active installations, and it's very easy to download and is ready to use without doing anything special.
One thing you should know about this plugin is that once you install it, or at least this was true until a few weeks ago, batching is enabled by default and there's no way to turn it off. So you install it and there is no way to mitigate some of the attacks I just described. The plugin is not intelligent enough to drop or reject requests that look bizarre, are malformed, or appear to be constructed maliciously. And it's partially authenticated. What I mean by that is, WordPress allows anonymous users to view blog posts and comments and so on, and the GraphQL API follows that. All of these were resolved in, I believe, version 1.3.8, but I do want to show what it looked like a few weeks ago. So let's talk about the exploit. We're going to read this diagram bottom up. We're going to get the comments and posts from a GraphQL-backed WordPress, and then we're going to duplicate the comments field 10,000 times. Then we're going to take advantage of the fact that batching is enabled by default: we take these complex queries, put them together in an array, and send them to the WordPress instance. And we're going to do that using 300 threads. So you can see how expensive this is going to be for the server to complete. With that, I do want to show a demo, and I hope that it works. If it doesn't, I have a recording, but let's cross our fingers. When you install the plugin, you get this shortcut to an interface where you can interact with GraphQL. This is the admin view of WordPress. We can start writing a query here that grabs the content of the posts on the website. When you run this query, you get the comments on the page, just like you would expect.
What I'm going to do now is use all of these weaknesses to try to bring down WordPress altogether. I'm going to tail the logs on the server to see what happens when we actually execute this query. Fingers crossed. The server will try to process these requests because there are no mitigations in place against this, and there's no way for the site operator to handle it unless they put the site behind some kind of web application firewall. You can already see that we're running out of memory. The MySQL process actually died, and it's not able to recover at this point. And if you look at the WordPress instance, you can see that it's down. This took less than 10 seconds, and you can imagine the opportunity if you have a lot of nodes sending this kind of traffic towards a server with an unprotected GraphQL implementation. That gives you some context on how easy it is to bring down a server backed by GraphQL without the right mitigations in place. With that, I want to jump into how you can go and learn GraphQL yourself. I want to equip you with enough tools and resources to do that. When I started with GraphQL just a few months ago, the first thing I looked for was a way to practice GraphQL by hacking it, and there was no single solid and mature platform to do that. There are a few labs here and there, but I wanted something else. So what I did was come up with Damn Vulnerable GraphQL Application. If you're familiar with DVWA, it's very similar, except it's focused on GraphQL. Everything I talked about is in that web application right now. I put a lot of emphasis on the education part, so there are a lot of resources for you to leverage. And whether you're a beginner to GraphQL, like I was six months ago, or you've dealt with GraphQL just a little bit, there's something for everybody.
There are two modes you can switch between, a hardened and an unhardened server, depending on your knowledge of GraphQL. The link to the GitHub repo is at the bottom of the slide. When you install it, you get this dashboard. There are a lot of resources, tools and blog posts that I put in there so that you can learn about GraphQL as you go, and see what it looks like when you install GraphQL without any protections in place. And with that, I want to thank everybody for listening. I hope you enjoyed this talk and learned a little bit about GraphQL and how we, as a security community, can make it better. Thank you.