Hello, my name is Yan. I am a senior Envoy maintainer and a software engineer at Google, working on GCP core networking. I'm going to talk about a somewhat obscure subject today, but it is nonetheless very important if your Envoy enforces some sort of security policy, for example an access control policy for requests. Specifically, it is the effect of URI path normalization on the safety of security policies.

Before we go any further, here's a bird's-eye view of what happens within Envoy when it proxies a request. A request comes from a downstream client, and the first step is service selection, where Envoy takes the routing table and some properties of the request and determines the route where the request has to go next. The second step, if it's configured, is policy application, and conceptually it's the same process: Envoy looks at the list of policies, takes the request properties, figures out which policy to apply, and evaluates it. Depending on the result, it may reject the request or let it go through.

By far the most common request property used for both service selection and policy application is the path component of the request URL. Its syntax is standardized in RFC 3986. The path is the part that sits right after the authority (in this example, the host name) and goes either to the end of the URL or up to the question mark or the hash sign, depending on whether your request URL has query parameters or a fragment. It's just a series of segments separated by forward slashes. The representation should use seven-bit ASCII characters, and everything outside of that should be percent-encoded. It can contain dot and dot-dot segments, which work similarly to how they work in file system paths.
It may also contain parameters. The reason I go into these details is to show that a conceptually simple thing can actually have quite a bit of complexity. In fact, you can look at two URLs that are visually very different but may point to the same resource.

This is where path normalization comes in. It's a transformation that determines the canonical form of the path, which intermediaries like Envoy use to compare paths for equivalence. That is very important when you want to find the policy or the service that corresponds to a specific path. According to the standard, it's actually a fairly simple procedure with three steps. First, all percent-encoded sequences are normalized to upper case. Then anything that doesn't need to be percent-encoded is decoded. That's called the unreserved set, and if you look further down you'll see that it's a fairly small subset of characters. Finally, there is path segment normalization, where the dot and dot-dot segments are collapsed; a dot-dot segment also collapses the previous segment.

So standards-based path normalization is relatively simple. The problem is that actual implementations are very often not standard compliant. They evolved over decades, there's a lot of cruft in them, there are a lot of special cases, and very often the behavior is also controlled through configuration. So determining what the origin or upstream service will actually do for path normalization is tricky. So, let me tie this all together.
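The three standard steps described above can be sketched roughly as follows. This is a minimal illustration, assuming an absolute path with no query or fragment; the function names are made up for this sketch and it is not how Envoy or any particular server implements normalization.

```python
import re

# RFC 3986 section 2.3: the unreserved set, the only characters that
# are decoded during normalization.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

PERCENT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(path):
    """Steps 1 and 2: uppercase the hex digits, decode unreserved characters."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return PERCENT_TRIPLET.sub(fix, path)


def remove_dot_segments(path):
    """Step 3: collapse "." and ".." segments of an absolute path."""
    segments = path.split("/")
    output = []
    for segment in segments:
        if segment == ".":
            continue
        if segment == "..":
            # ".." also removes the previous segment, but never the root.
            if len(output) > 1:
                output.pop()
            continue
        output.append(segment)
    # A trailing "." or ".." leaves a trailing slash behind.
    if segments[-1] in (".", ".."):
        output.append("")
    return "/".join(output)


def normalize_path(path):
    return remove_dot_segments(normalize_percent_encoding(path))


print(normalize_path("/admin/change"))     # /admin/change
print(normalize_path("/user/../admin"))    # /admin
print(normalize_path("/user/..%2fadmin"))  # /user/..%2Fadmin
```

Note that the last path keeps its encoded slash: the forward slash is not in the unreserved set, so a strictly standard normalization only uppercases the hex digits.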
I'll give a quick example. We'll have a very simple policy where all requests to the admin prefix have to be authorized, say with some sort of authorization token attached to the request, or maybe a peer certificate. All other requests we're not even going to look at; authorized or not, they go through.

Here are some examples. Let's say there's a request to the /admin/change endpoint. Path normalization has no work to do here, so the path is unchanged. We find the admin prefix and the request undergoes the authorization check. Then something a little more complicated: /user/../admin. In this case the standard-compliant normalization takes out the dot-dot and the previous segment and produces /admin. The path matches our prefix and we check for authorization.

Now here's where the rubber starts to meet the road. Let's replace one of the forward slashes with its percent-encoded form, %2F. If you remember from the previous slide, we only decode unreserved characters, and the forward slash is not one of those. So path normalization doesn't change the path at all, it doesn't match the admin prefix, and such a request is going to go through.

Now the interesting question: did we just create a trivial bypass of the security policy? The answer is maybe. It depends on what your upstream server, your origin server, is. Just as an example, take Apache, a very common origin server. It depends on its configuration. If in your configuration you set a specific option to either off or don't-decode, there is no problem.
The request is either going to be rejected or normalized to the same path as in Envoy, and there's no policy bypass. However, if you configure Apache to decode percent-encoded slashes, there's a problem. Apache will decode the forward slash, normalize the path to /admin, and the unauthorized request will be forwarded to the admin endpoint.

Here are some more examples. That's actually a very small slice of what's possible, and it really depends on the type of backend. Go servers have their own set of possible bypasses, and Node.js has its own.

But the key takeaway I want to point out from this presentation is that there is no one-size-fits-all normalization that an intermediary can do. There is no one right way. Even strictly standard-compliant normalization is not enough, as I showed in the example. The right way of doing this is to match path normalization on Envoy with path normalization on the origin or upstream server. In that case you know for sure that your access control policy will be applied safely. As a slightly worse alternative, if Envoy performs a superset of the path normalization that the origin server performs, then at least you know that your security policies will be safe. The service selection may end up being wrong, but that is usually a much lesser evil.

So, practical suggestions. The best practical suggestion is to know your origin server: what path normalization it performs, what transformations it makes. Sometimes that's very difficult. Sometimes you have a container that you pulled from the internet and you don't know what's inside it: Go, Java, who knows. So there are some practical suggestions that can improve security considerably. The first one is simply enabling the option in Envoy to do path normalization. The option is off by default.
We're changing that in the future. We realize this is not a very safe way to operate Envoy, but it's very often overlooked. Just turning this on effectively enables most of the standard-compliant normalizations and shrinks the attack surface quite considerably.

The second is to enable merging of slashes. There are virtually no real-world systems that this option would break. I always recommend it, it's a no-brainer, and it also removes a very large attack surface.

The last option I wanted to bring up is how the %2F sequences are treated. The safest option is simply to reject those requests. This might not always work. There are some applications that actually use this feature and encode their slashes for one reason or another. In that case you have to be really careful if you use access policies. Take a look at what paths they're applied to and what the backend service is doing, because this is really one of the more dangerous ones.

So, things that are coming soon. I don't have a specific date, but we are working on profiles for path normalization. If you know the implementation and configuration options of the upstream, you can just enable the matching profile and not have to pick through what Node.js or Go does. You turn this option on and Envoy will behave just like that backend service.

Last but not least, a problem that hasn't been solved yet is what to do in service meshes where your backend services are heterogeneous and you may be running a mix of Apache, Node.js, and who knows what else. There's no solution for that right now, unfortunately. We're working on this. Contributions are always welcome. If you have ideas, come talk to me or any of the Envoy maintainers. That's it from me. Thank you so much.
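The %2F example from the talk, and the advice to match the proxy's normalization to the origin's, can be illustrated with a small model. This is only a sketch under simplifying assumptions: the proxy forwards the path it evaluated, the policy is a plain /admin prefix check, and the action labels "keep", "reject" and "decode" are hypothetical names for this sketch, not Envoy or Apache configuration values.

```python
import re

ESCAPED_SLASH = re.compile(r"%2f", re.IGNORECASE)


def collapse_dot_segments(path):
    """Simplified dot-segment removal for absolute paths (ignores trailing-slash rules)."""
    out = []
    for segment in path.split("/"):
        if segment == "..":
            if len(out) > 1:
                out.pop()
        elif segment != ".":
            out.append(segment)
    return "/".join(out) or "/"


def needs_authorization(path):
    """The example policy: everything under the admin prefix must be authorized."""
    return path == "/admin" or path.startswith("/admin/")


def proxy_path(raw_path, escaped_slash_action):
    """Path the proxy evaluates and forwards, or None if it rejects the request."""
    if ESCAPED_SLASH.search(raw_path):
        if escaped_slash_action == "reject":
            return None
        if escaped_slash_action == "decode":
            raw_path = ESCAPED_SLASH.sub("/", raw_path)
    return collapse_dot_segments(raw_path)


def origin_path(forwarded_path, origin_decodes_slashes):
    """Path the origin server ends up serving."""
    if origin_decodes_slashes:
        forwarded_path = ESCAPED_SLASH.sub("/", forwarded_path)
    return collapse_dot_segments(forwarded_path)


def is_policy_bypass(raw_path, escaped_slash_action, origin_decodes_slashes):
    """True when the proxy skips the check but the origin serves an admin path."""
    seen_by_proxy = proxy_path(raw_path, escaped_slash_action)
    if seen_by_proxy is None:
        return False  # rejected at the proxy, nothing reaches the origin
    served = origin_path(seen_by_proxy, origin_decodes_slashes)
    return not needs_authorization(seen_by_proxy) and needs_authorization(served)


request = "/user/..%2Fadmin"
print(is_policy_bypass(request, "keep", origin_decodes_slashes=True))    # True
print(is_policy_bypass(request, "keep", origin_decodes_slashes=False))   # False
print(is_policy_bypass(request, "reject", origin_decodes_slashes=True))  # False
print(is_policy_bypass(request, "decode", origin_decodes_slashes=True))  # False
```

In this model the bypass appears only when the proxy leaves the encoded slash alone while the origin decodes it. Rejecting such requests, or decoding them the same way the origin does, closes the gap.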