Hey, so this talk is about RP filter, as you could guess. I believe most of you know what RP filter is, or at least have some idea of what it does. In this talk, I'm going to make the scope a little bit larger, because I'm going to start by elaborating on what the IETF says about what I will call RPF. So for the purpose of this talk I will make a distinction between RPF, the algorithms defined by the IETF, and RP filter, the Linux kernel implementations. And as we will see, there are several of these implementations in the kernel.

So what is RPF? Let's say we have a router here in the middle, connected to different networks. With RPF, the general idea is that the router, instead of just routing packets depending on the destination address, also validates the source address of incoming packets. So here, for example, we receive a packet from network blue, and the router verifies that the source IP address of the packet actually belongs to network blue and not to network red.

The objective is to limit problems caused by IP address spoofing on the internet. Let's say, for example, we have an attacker inside network green, and this attacker wants to attack the victim in blue. So it sends a packet to the server in red, but instead of setting the source IP address to its own address, it uses the IP address of the victim. So when the server receives the packet, it replies to the victim. There are two benefits for the attacker. First, its IP address never leaks to the network, so the victim has no idea where the attack actually comes from. And also, if the server is chosen carefully, the attacker gets a nice amplification factor: it can send just small packets, and the server replies with much bigger packets. There are protocols that are famous for that, like SNMP, for example. So this problem has been known for a very long time.
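As a rough illustration of the source check just described, here is a minimal Python sketch. It is not from the talk: the talk only names the networks by colour, so the prefixes below are assumed documentation ranges, and `source_is_valid` is a hypothetical helper name.

```python
import ipaddress

# Assumed addressing plan (documentation prefixes); the talk only
# identifies the networks as blue, red and green.
NETWORKS = {
    "blue": ipaddress.ip_network("192.0.2.0/24"),
    "red": ipaddress.ip_network("198.51.100.0/24"),
    "green": ipaddress.ip_network("203.0.113.0/24"),
}


def source_is_valid(arrival_network: str, source_ip: str) -> bool:
    """Accept a packet only if its source address belongs to the
    network it arrived from."""
    return ipaddress.ip_address(source_ip) in NETWORKS[arrival_network]


# The attacker sits in green and forges the victim's address from blue.
honest = source_is_valid("green", "203.0.113.7")
spoofed = source_is_valid("green", "192.0.2.10")
print("honest packet accepted:", honest)
print("spoofed packet accepted:", spoofed)
```

With a check like this on the router facing network green, the forged packet never reaches the server in red, so no reply is reflected towards the victim.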
And even some old RFCs, like RFC 1812, "Requirements for IP Version 4 Routers", define a basic algorithm to do some source IP address validation. Unfortunately, even at that time, it was clear that it would break some routing topologies, so it was recommended not to turn it on. Then came RFC 2827, which is more commonly referred to by the Best Current Practice it belongs to, BCP 38. A BCP is just a collection of RFCs. Unfortunately, this RFC didn't really provide any technical solution. It just talked about using access control lists. It said that source IP address validation was important, but it didn't say how to do it. But the IETF has continued to work on this problem, and we now have two other RFCs, RFC 3704 and RFC 8704. We will see which algorithms they define.

So the first of these, RFC 3704, defines four different flavors of RPF: strict RPF, loose RPF, loose RPF ignoring default routes, and feasible RPF.

Strict RPF is really the simplest and most intuitive version. Here we have a router connected to two gateways, leading to two different networks. With strict RPF, when this router receives a packet from Gateway Blue, it looks at the source IP address. If this IP address belongs to Network Blue, it routes the packet normally. If it belongs to another network, either Network Red or one that is not routed at all, it drops the packet. This is the original idea that we found even in the oldest RFC.

The problem is that it has always been clear that this would break asymmetric routing. Here, for example, we have two nodes, node A and node B. They can communicate with each other through two gateways, blue and red. But node A is configured to only use Gateway Blue, and node B is configured to only use Gateway Red. So when node B receives a packet from node A through Gateway Blue, it does the strict RPF check, and it sees that node A is normally routed through Gateway Red.
But it received the packet from Gateway Blue, so it drops the packet. Same thing for node A: it receives packets from node B through Gateway Red, but it would route them through Gateway Blue. So strict RPF has actually broken communication between our two nodes.

So another variant of RPF that was defined is loose RPF. The loose version doesn't take the input interface into account. It just verifies that the source IP address of the packet is actually routable. So in this case, when the router receives a packet from Gateway Blue, it just does a route lookup on the source IP address. If the source IP address is routable, it routes the packet normally, even if the source address belongs to Network Red. Obviously, if the router has a default route, loose RPF is going to let almost any packet through, because as soon as the source is routable, the packet flows. That's why we have the special variant, loose RPF ignoring default routes, where the default route can't be used for the route lookup. But it doesn't really make much difference in practice. For example, in our case, an attacker from Network Blue, behind Gateway Blue, could still spoof an IP address from Network Red.

The fourth variant of RPF defined in RFC 3704 is feasible RPF. Feasible RPF is interesting because it uses some extra information that is not used in the other variants: all the information that is available through the different dynamic routing mechanisms. So let's say that we're using BGP. Here we have different autonomous systems. At the bottom, we have AS0, which announces the same route, 2001:db8:a::/48, to AS1 and to AS2. Eventually, AS4 learns about this route. The AS path is longer on the red side, so for routing purposes, AS4 doesn't take the route from AS3 into account.
But with feasible RPF, for incoming packets, AS4 would accept packets with a source IP address in 2001:db8:a::/48 on that side, even though it wouldn't use that route to send packets. That's because of feasible RPF, and because the route was announced on the other interface too.

But that wasn't enough for all practical use cases. So the IETF has continued working on this problem and has published a new RFC, RFC 8704, with two new variants of RPF. They're both based on the idea of feasible RPF, but the algorithms get a little bit more complex. Even the acronym, as you can see, is getting complex. The idea here is, again, to use the information we can collect with BGP. And instead of checking the origin of the source IP address, we check the reachability of the autonomous system that this source address belongs to.

So for example, this is algorithm A, the first algorithm defined in this RFC. We have AS0, which announces one route to AS1 and a different route to AS2. Finally, AS3 receives the announcements for both routes. With EFP-uRPF, the Enhanced Feasible-Path uRPF, AS3 will accept packets from any of these prefixes, no matter whether they're received from AS1 or from AS2, because both prefixes belong to AS0, and AS0 is the origin AS in both AS paths.

In practice, there are also problems with routes that are not propagated to all the autonomous systems. So the IETF has defined algorithm B, so that a network administrator can work around this problem by putting interfaces into an interface group. Here we have AS0, which announces a route to AS1 and AS2. It's the same route, but AS2 doesn't propagate this route to AS3. This is worked around by the administrator putting both interfaces into the same interface group. So now a packet with a source IP address belonging to this route will be accepted by AS3 even if it arrives from AS2, because AS2 is in the same interface group as AS1.
And it would be accepted if it had arrived from AS1. So as you can see, RPF has become something much more complex and complete than just a simple route lookup on the source IP address. And we need some information that is not available to the kernel to actually implement these algorithms.

OK, so enough with the theory. Now let's see how the Linux kernel implements RP filter. As I said, we have several RP filter implementations. We have one sysctl for IPv4, and then the iptables, ip6tables, and nftables modules, which all have their own implementation. So that's three IPv4 implementations and two IPv6 implementations. Here is how we can configure them; you can look at that in the slides offline. All these implementations support strict and loose RPF. There is a special option for loose RPF with all these modules. And beyond the limitations we already saw in the theoretical RPF algorithms, we have to face some kernel-specific problems. First, we have five implementations to keep synchronized. Then there's the fact that the kernel doesn't handle IPv4 and IPv6 routing tables in the same way. And we also have the kernel's advanced routing features. So we'll see in detail what these problems are.

First, let's just see how RP filter interacts with regular routing tables. Here we have a route that is defined to use two different output interfaces. What happens is that with most of the RP filter implementations, we can receive packets from any of these interfaces. Only the nftables IPv6 implementation allows just one of the interfaces. If we do the same kind of configuration, but with more recent commands, let's say, so using nexthop groups instead of a single route with several nexthops as we did before, we get the exact same results. I'm not going to go too much into the details of why that's the case, but the problem lies in how RP filter handles ECMP routes, because what we did in these two examples is create an equal-cost multipath route.
So yes, the IPv4 and IPv6 RP filter implementations handle that in different ways, which leads to different results.

Now let's see what happens when we define the same route twice, but with different gateways. We can do that with ip route append. This time we don't have an ECMP route. What happens is that most implementations only accept the first route, and only the ip6tables RP filter implementation accepts packets from any of these interfaces. And again, that's related to how the kernel handles ECMP, even though we didn't explicitly ask for an ECMP route: IPv6 internally converts these routes into a single ECMP route. The rest is implementation detail. I'm not going through that now, because we won't have time.

Now let's see a more common use case, which is to define the same route, but with different preferences. Here we have preference 1000 and preference 2000, and they use two different interfaces. Preference is like metric; whichever keyword you use, it works the same way. So the preferred route is through eth0, and eth1 is just the fallback route. What is RP filter supposed to do? Should it accept packets on eth1, or only on eth0? The result is that almost all implementations accept packets only from eth0. But again, ip6tables behaves differently here. The root cause is related to the one in the previous example: it comes from how this implementation constrains the route lookup for IPv6 RP filter. But again, that's maybe something we can talk about later if we have time.

These are problems that could probably be worked on in the kernel, at least to make the different implementations behave similarly. But there are also more fundamental problems, which don't only depend on how the code is written. So let's talk about policy routing.
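Before moving on, the RFC 3704 flavors and the preference example can be put side by side in a small Python model. This is only a sketch under assumptions, not kernel code: the `Route` class, the `eth2` default route and the mode names are hypothetical, and "feasible" is simplified to "any route for the same longest prefix, whatever its metric".

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str   # destination prefix
    dev: str      # output interface
    metric: int   # lower value is preferred


# Assumed table mirroring the preference example: same prefix twice,
# eth0 preferred (1000), eth1 as fallback (2000), plus a default route.
ROUTES = [
    Route("2001:db8:a::/48", "eth0", 1000),
    Route("2001:db8:a::/48", "eth1", 2000),
    Route("::/0", "eth2", 1024),
]


def matching_routes(src, routes, ignore_default=False):
    """Return every route sharing the longest prefix that contains src."""
    addr = ipaddress.ip_address(src)
    hits = []
    for route in routes:
        net = ipaddress.ip_network(route.prefix)
        if ignore_default and net.prefixlen == 0:
            continue
        if addr in net:
            hits.append((net.prefixlen, route))
    if not hits:
        return []
    longest = max(plen for plen, _ in hits)
    return [route for plen, route in hits if plen == longest]


def rpf_accepts(mode, src, in_dev, routes=ROUTES):
    """Simplified model of the four RFC 3704 flavors."""
    if mode == "loose":
        return bool(matching_routes(src, routes))
    if mode == "loose-no-default":
        return bool(matching_routes(src, routes, ignore_default=True))
    found = matching_routes(src, routes)
    if not found:
        return False
    if mode == "strict":
        # Only the best route's interface is acceptable.
        return min(found, key=lambda r: r.metric).dev == in_dev
    if mode == "feasible":
        # Any known route through the input interface is acceptable.
        return any(r.dev == in_dev for r in found)
    raise ValueError(f"unknown mode: {mode}")


for mode in ("strict", "feasible"):
    print(mode, "accepts on eth1:", rpf_accepts(mode, "2001:db8:a::1", "eth1"))
```

In this model the strict check rejects the packet arriving on the fallback interface eth1 while the feasible check accepts it, which is the same split the talk reports between most implementations and ip6tables. The model does not claim to explain why the kernel implementations differ.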
Policy routing, for those who don't know, is when we use different routing tables filled with different routes, and we jump from one table to another depending on some particular patterns. Here, for example, we have the main routing table that uses eth0, and we use eth1 for table 100. We decide whether we do our lookup in the main table or in table 100 depending on the destination port. So if the destination port is 50000, we jump to table 100, and if not, we stay in the main routing table.

So let's see what happens. In this case, only the RP filter sysctl, the IPv4 one, jumps to table 100. So it accepts packets from eth1 if the source port is 50000. Here, RP filter just swapped the source and destination ports, which is probably intuitive, because that's what happens for the IP addresses, but we'll see that it's not always the right solution. For all the other implementations, it's simple: the destination port is not taken into account, so we just never jump to table 100.

Jumping to table 100 when we have source port 50000 might look like a good idea, because when we send a packet to a destination port, the answer comes back with the source and destination ports swapped. But that's not always the case. Especially with UDP tunnels like VXLAN or Geneve, we always use the same destination port, and in this case this behavior is not appropriate.

So let's do policy routing on something different. Let's use a packet field that we can't swap between the request and the response: the DS field. The DS field is kind of like the TOS for IPv4 or the traffic class for IPv6. Those should be obsolete and replaced by the DS field, but you get the idea of what it is. Let's see what happens, and whether there are differences between our RP filter implementations. And actually, all implementations work more or less the same way.
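Going back for a moment to the port-based rule, the swap behavior and its tunnel problem can be modelled in a few lines of Python. This is a toy sketch with hypothetical names (`Packet`, `output_dev`, `rpf_accepts_swapped`), it keeps port 50000 from the example rather than a real tunnel port, and it assumes the tunnel peer's traffic arrives on eth1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    sport: int
    dport: int


# Policy from the example: dport 50000 -> table 100 (eth1), else main (eth0).
TABLE_DEV = {"main": "eth0", "100": "eth1"}


def select_table(dport: int) -> str:
    return "100" if dport == 50000 else "main"


def output_dev(pkt: Packet) -> str:
    """Interface we would use to transmit pkt."""
    return TABLE_DEV[select_table(pkt.dport)]


def rpf_accepts_swapped(received: Packet, in_dev: str) -> bool:
    """Strict check whose reverse lookup swaps the ports, as the talk
    describes for the IPv4 sysctl."""
    reverse = Packet(sport=received.dport, dport=received.sport)
    return output_dev(reverse) == in_dev


def rpf_accepts_ignoring_ports(received: Packet, in_dev: str) -> bool:
    """Strict check that never leaves the main table, as the talk
    describes for the other implementations."""
    return TABLE_DEV["main"] == in_dev


# Request/response: we sent (40000 -> 50000) out of eth1, the reply
# comes back on eth1 with the ports swapped.
reply = Packet(sport=50000, dport=40000)
# Tunnel-like traffic: the peer also sends to port 50000 (assumed to
# arrive on eth1), so nothing is swapped.
tunnel = Packet(sport=40000, dport=50000)

print("reply, swapped lookup:", rpf_accepts_swapped(reply, "eth1"))
print("reply, ports ignored:", rpf_accepts_ignoring_ports(reply, "eth1"))
print("tunnel, swapped lookup:", rpf_accepts_swapped(tunnel, "eth1"))
```

The reply is accepted only by the swapping lookup, and the tunnel-like packet is dropped even by that one, which is why swapping the ports is not a general solution.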
I'll skip some details, but for this talk we can consider that they behave the same way. What happens is that if the DS field on the return path matches the IP rule, then we do the route lookup in table 100. Again, that probably looks intuitive, but in reality the DSCP doesn't have to be mirrored on the return path. So if the return packets don't have the same DSCP value as the outgoing packets, we're not going to jump to the same table, and RP filter will break connectivity.

We can also do policy routing based not on IP packet fields but on metadata of the packets. For example, we could use the packet mark, the socket user ID, the input interface, and more, but we're only going to consider these examples.

So, packet mark. Most of, well, all of the RP filter implementations don't take the packet mark into account by default for the route lookup. The problem is that, again, we need to have the same packet mark on output and on input to jump to the right table, and it's often difficult enough to keep the packet mark symmetric between input and output. So we have to use special options to make RP filter, no matter the implementation, respect the packet mark.

We also have problems with the socket user ID. We can jump to a different routing table based on the user ID of the socket that sends the packet; that's for locally generated packets. But on the return path, of course, we don't receive packets from a socket. We receive them directly from a network interface, and we don't even know yet which socket they will be delivered to. So all the RP filter implementations do the route lookup with user ID zero, as if the packet had been sent by root.

For the input interface we have the same kind of problem. If we do policy routing based on the input interface, then when we send a packet that has just been routed, we know on which input interface we received it. But on the return path, how should we do our route lookup?
If we want to reverse the input and output interfaces, then we should say that the input interface is actually the output interface we would use. But this isn't always known either: it depends on which netfilter hook the rule is attached to, so it depends not only on the implementation but also on where the rule is inserted.

So to summarize the problem with policy routing: we have different sets of metadata available on the transmit and receive paths; we have packet information that might differ between the transmit and receive paths; and even for something as obvious as the source and destination ports, where we should just swap them, that doesn't work all the time. So we really have a fundamental asymmetry between the receive and transmit paths, which makes policy routing mostly incompatible with RP filter.

So to summarize the whole talk: RP filter, even if we consider just the theoretical part and only the IETF work, is not a simple on/off functionality. We have to select which flavor of the algorithm we want to use. For the advanced algorithms, we need some cooperation with a dynamic routing daemon. We might need, if you remember algorithm B, some special configuration from the administrator. We also have to keep in mind that RPF was designed for routers, and especially for ISPs, not for end hosts.

And then we have the implementation. We have five different implementations of RP filter, which are hard to keep synchronized. We have all the complexity of how the kernel manages route lookups, and that has some side effects on the implementations. There are some things we could improve, like the handling of equal-cost multipath, in order to make all implementations behave similarly. But there are also some fundamental incompatibilities with advanced routing, in particular every time, well, many times, where we use IP rules. Also, we don't support, and currently can't really support, the advanced RP filter
algorithms, because we need some cooperation with a BGP daemon.

OK, so thank you, and time for questions.

Yeah, so the question is: can we automatically detect whether some IP rules are going to create problems when we use RP filter? We could at least detect that we have some special IP rules and then consider this a potential problem for RP filter. But the biggest problem is: what are we going to do if we activate RP filter and we detect that there's going to be a problem? Do we just disable RP filter entirely? Because for some of these IP rules there's just no correct way to handle the problem automatically. We need some help from the administrator.

Yes, yes. So the question is: is it primarily a means to manage the problem of IP spoofing on the internet? Yes, from the IETF point of view, that's the reason why RPF was designed. That's really the main reason.

Yes, so what the IETF recommends is that you activate RPF as close to the customer as possible, and if possible, even on the first router. And then if you have a very big broadcast network behind this, well, you probably also have other security problems if you have a big Ethernet segment.

Yeah, again. On Linux, we can activate it on routers or on end hosts; the RP filter implementations work the same way on routers and on end hosts. It just makes less sense on end hosts, because it's not going to change anything for IP spoofing.

Yeah, OK, yes, I'll repeat the question. The question is: should we disable it on RHEL by default? I know this question was already answered some years ago, and the answer was no, we shouldn't disable it by default, because it's security and we don't disable security. There's no real technical argument; it's just a fear of accepting packets that were not accepted before. So, yeah, that's the reason.

Yes, no more questions? OK, thank you.