My name is Peter Hessler. I am with the OpenBSD project, and I also work for a company called Vantronix Secure Networks in Germany. My talk today is on using routing tables and routing domains in networks.

A lot of you are probably asking: what are these? Ordinarily you have a single routing table inside the kernel, and that chooses where packets go. What we give you is two additional mechanisms.

The first is an rtable, which is an alternative routing table, but still within the same overall instance. The key point is that IP addresses and networks cannot overlap; they have to be unique within the entire system. You can have multiple routing tables within the same rdomain.

Which brings us to: what is an rdomain? It is a completely independent routing table instance. For example, you can have the 10/8 network a dozen times on your system, and each instance can talk to the networks it is connected to without interfering with the others. In OpenBSD, the interface itself is assigned to a routing domain. That determines where packets leave the system, and it also determines which routing table is used when packets come in. An rdomain always contains at least one rtable. You can create more, but that is not a very common use case.

Big caveat for now: IPv4 only. Yes, thank you, Henning. I am working on bringing IPv6 support to this. Right now I can pass packets for about four seconds and then they get lost somewhere. So we are over 90% of the way there; we just have the remaining 90% of the code to write.

When using rtables and rdomains, there are two very common terms, popularized mostly by Cisco: VRF-lite and full VRF.
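To make the rtable and rdomain distinction concrete, here is a minimal Python sketch. It is a toy model, not how the OpenBSD kernel implements anything: the class names RoutingTable and RoutingDomain, the table key "main", and the interface names are all made up for illustration. The only point it shows is that the same 10/8 prefix can exist in two domains without conflict.

```python
import ipaddress


class RoutingTable:
    """Toy rtable: a list of (network, next_hop) with longest-prefix match."""

    def __init__(self):
        self.routes = []

    def add(self, prefix, next_hop):
        self.routes.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        matches = [(net, hop) for net, hop in self.routes if ip in net]
        if not matches:
            return None
        # The most specific (longest) prefix wins.
        return max(matches, key=lambda m: m[0].prefixlen)[1]


class RoutingDomain:
    """Toy rdomain: an independent address space with at least one rtable."""

    def __init__(self, rdomain_id):
        self.id = rdomain_id
        self.rtables = {"main": RoutingTable()}  # hypothetical key name

    @property
    def main(self):
        return self.rtables["main"]


# Two customers, each in its own domain, both using 10/8.
customer_a = RoutingDomain(1)
customer_b = RoutingDomain(2)
customer_a.main.add("10.0.0.0/8", "em1")
customer_b.main.add("10.0.0.0/8", "em2")

print(customer_a.main.lookup("10.1.2.3"))
print(customer_b.main.lookup("10.1.2.3"))
```

The same destination address resolves to a different interface depending on which domain does the lookup, which is the property the talk relies on.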
VRF-lite is multiple routing tables and multiple routing domains, configured by hand and all centralized within one or two machines. It is common in smaller setups with a single location or a small number of locations. The big advantage is that you need only a single system to use it. You don't need to connect to anybody else, and you don't have to explain your network configuration to them. Good examples are small to medium-sized ISPs with a number of customers connecting to them. You give each customer an individual routing domain, so they cannot get access to the routing domains of other customers.

Then you have full VRF. This is fairly commonly known as MPLS. Basically this is for large enterprises with many sites all connected together. It uses BGP and LDP, which I will explain in a few minutes. A good example of VRF and MPLS would be Deutsche Telekom. They have hundreds or thousands of locations all over the world, and they want a single network within their organization, but they don't want to advertise this network over the wider internet.

When you are doing this, there are a number of things you need to be aware of. The most important one is: default route all of the things. Always add a default route, because when we receive an incoming packet, we check whether we have a route to the destination. This happens before you have any chance to steal the packet, move it, or decide where it is going to go. It's something we should look into, but this is how it is today, and this is how it has been for quite some time.

This is an extremely common mistake. At the company I work for, Vantronix, probably 40% of our support calls of the form "there's a problem with my new rdomain" are this problem. It is very easy to overlook, which means debugging can be painful sometimes, because which route, which rdomain is it using?
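The ordering problem behind the "always add a default route" advice can be sketched in a few lines of Python. This is only a model of the behavior described in the talk (route check first, firewall redirection second); the function names has_route and receive and the return strings are invented for illustration.

```python
import ipaddress


def has_route(routes, dst):
    """True if any prefix in the receiving rdomain's table covers dst."""
    ip = ipaddress.ip_address(dst)
    return any(ip in ipaddress.ip_network(prefix) for prefix in routes)


def receive(routes, dst, redirect_to_rdomain=None):
    """Model of the input path as described in the talk.

    The route check on the receiving rdomain happens first. Only if it
    succeeds does a firewall rule get the chance to move the packet.
    """
    if not has_route(routes, dst):
        return "dropped: no route"
    if redirect_to_rdomain is not None:
        return f"moved to rdomain {redirect_to_rdomain}"
    return "forwarded"


# Without a default route, the redirect rule never gets a chance to run.
print(receive(["10.0.0.0/8"], "192.0.2.1", redirect_to_rdomain=0))

# With a default route (even one that points nowhere useful), it does.
print(receive(["10.0.0.0/8", "0.0.0.0/0"], "192.0.2.1", redirect_to_rdomain=0))
```

In this model the default route is never used to forward anything; it only exists so that the packet survives long enough to be redirected.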
It's not using the normal interfaces. It's using a slightly different area, which also raises the question: which route will it use? Ordinarily, traffic within a routing domain stays within that routing domain unless you steal it, possibly with PF. PF is able to make decisions based on which routing domain a packet came in on and which one it should go to.

Now a very common configuration of VRF-lite, which is the more common case that I've seen. As I mentioned earlier, this is very common in a shared infrastructure for an ISP. You have, let's say, 20 customers all connecting to your network. They need access to the raw internet and to your shared management systems, but they also need to be completely separated from the other customers. Everybody uses 10/8, everybody uses 192.168, and everybody uses all the other internal private networks. Before rdomains, you would have to have completely separate routers and then connect all the boxes independently to your main network. That's a lot of boxes, a lot of power, a lot of cooling, and it's going to cost you a lot of money to actually buy all this stuff. Instead, you can consolidate them all onto a single machine and have it route for your network.

Very commonly you'll have things like a shared backup system that connects to the customer's location, or monitoring of all the customer locations. Those need access to all of the different rdomains and all of the customer systems, so they use the rdomains to jump over and get access to them.

And then you have full VRF. This is, again, primarily MPLS. You use LDP, the Label Distribution Protocol, which is a way of telling the other MPLS routers on your network who you are and how they should propagate your routes. Then over this you use BGP to do the translation and to advertise your internal routes over this external routing protocol.
BGP builds up what are essentially just VPN tunnels, but with no encryption at all, so they can be really fast and done in hardware on the bigger, more expensive boxes, like Cisco routers and things that have no CPU to speak of but a lot of dedicated hardware to process your networks. Sadly, I don't have a lot of experience in running full VRF. The vast majority of our customers are doing just simple VRF-lite with rdomains on the machines themselves.

After we got this written, we started to actually deploy it. As I'm sure you all know, testing a system and designing something kind of new is one thing, but actually running it and having real traffic go over it is quite another.

The first bit is a way to execute commands within a dedicated rdomain. For this we added the exec command to route. Normally when you run a system, everything is in rdomain 0, but if you need to run, for example, ping or a server or anything else in another rdomain, you need to jump over to it. We originally developed route exec for testing, so that we did not have to add support to every program individually. It turns out it's very, very useful, precisely because we don't have to add support to everything. This is now the recommended way to start multiple services and multiple daemons in different rdomains. If you want to have a web server in, say, rdomain 20, you would just write route -T 20 exec and then Apache. But, of course, there are a few necessary network tools and a few daemons where this doesn't quite work out, which I will describe in just a bit.

The next part is what happens when you assign an rdomain to an interface. The question is what we should do if there is an IP address already configured on it. Should we keep it? Should we delete it? It turns out the best solution is to simply erase the configuration on that interface.
So you need to re-add the IP address, netmask and all the rest of the configuration after you assign the rdomain.

You can have dedicated tagged VLANs in a different rdomain than the parent interface. It's completely legal. It's just encapsulated packets. Not a problem. CARP, for failover, does need to be in the same rdomain as its parent, because incoming packets come in on the CARP interface and outgoing packets go out on the parent. Having them in different rdomains will create a lot of problems.

We ran into some problems with ftp-proxy. When going through that, you need a helper, because FTP is just an incredibly broken protocol. The problem is that sometimes you don't want to just change it from any rdomain to rdomain 0, or from rdomain 0 to any. You need to do both. Which is why the source and destination rdomains matter. You need to be able to make decisions on them and to assign them.

Now, as I mentioned, with route exec you will want to run some daemons multiple times. This gets extremely entertaining when you try to run ntpd. You want to start serving time to more systems. Well, that's always a very good thing. Everything should be in sync. For everything else, you just start it again. But ntpd tries to synchronize your clock as well. So you start two or five NTP daemons, and they're all syncing the clock, and they all think they're the master. After about 30 minutes of real time, my laptop is five hours ahead. And after leaving it overnight, it's next year. So you definitely don't want to do that.

Obviously we do, yes, because customers can be in different locations, and we need to be careful with this. So ntpd is one of the daemons that actually has internal knowledge of how rdomains work. That way you can say: I have my server in one set of rdomains, like the management network, and I am providing time for people in these other rdomains as well.
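The ordering rule for interfaces described above (assigning an rdomain erases the existing configuration on the interface, so addresses have to be added afterwards) can be illustrated with a small Python model. The Interface class and its method names are hypothetical and only mimic the behavior described in the talk; they are not an OpenBSD API.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Toy interface that forgets its addresses when its rdomain changes."""

    name: str
    rdomain: int = 0
    addresses: list = field(default_factory=list)

    def set_rdomain(self, rdomain):
        # As described in the talk: existing configuration is erased.
        self.addresses.clear()
        self.rdomain = rdomain

    def add_address(self, cidr):
        self.addresses.append(cidr)


# Wrong order: the address is configured first and then lost.
wrong = Interface("em1")
wrong.add_address("10.0.0.1/24")
wrong.set_rdomain(1)

# Right order: assign the rdomain first, then add the address.
right = Interface("em1")
right.set_rdomain(1)
right.add_address("10.0.0.1/24")

print(wrong.addresses, right.addresses)
```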
Another thing we needed to add: not just identifying a packet and sending it to a different rdomain or rtable, but being able to make filtering decisions in the firewall based on the incoming rdomain. With rdomains, since the same IP addresses can exist multiple times, the 10/8 network is no longer just the one network you've always been used to. So you need to be able to specify where the packet came in.

When designing your network, make sure you always add a default route. Again, it doesn't have to point to a real place if you're only going to be stealing the packets with PF and putting them into another rdomain. But you definitely have to have it. Normally you won't see this problem, because you always have a default route to your ISP, or if you're doing BGP then you get all of the routes from the internet anyway. It's so common that I need to emphasize it many times.

There are some neat tricks that you can do with PF, which I will explain in a few minutes. Just be aware of what you can and cannot do. It's very helpful to spend as much time as you can in the planning stage, because it is complicated. You will have to think about which rdomain traffic is coming in on, where you want to route it, and how you want to firewall these things off. Since a lot of people are simply not used to having multiple routing tables, it takes a few extra minutes to work through all these problems. So definitely plan it properly.

Here is a very, very simple setup. I assigned my interface to rdomain 1. Then I assigned, of course, the 10 network, because this is what everybody uses. And then I added a default route. Personally, as soon as I assign the first address, I add a route, so that I don't forget later.

And then you can do nifty tricks in PF, which I will explain in a bit more detail. We have anchors, which allow you to have a set of rules that only apply if the first line matches: basically an AND statement.
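The "rules that only apply if the first line matches" idea can be modelled in Python as a guarded rule list. This is a sketch of the semantics only, not PF syntax or PF's implementation: the anchor function, the dictionary-based packet, and the field names are assumptions made for illustration, and the sketch uses last-matching-rule-wins evaluation inside the anchor.

```python
def anchor(condition, rules):
    """Return an evaluator whose inner rules apply only if condition(pkt) holds."""

    def evaluate(pkt, verdict):
        if not condition(pkt):
            return verdict  # anchor skipped, verdict unchanged
        for action, match in rules:
            if match(pkt):
                verdict = action  # last matching rule wins
        return verdict

    return evaluate


# Hypothetical customer anchor: only packets in rdomain 15 see these rules.
customer = anchor(
    lambda p: p.get("rdomain") == 15,
    [
        ("block", lambda p: True),
        ("pass", lambda p: p.get("proto") == "icmp"),
        ("pass", lambda p: p.get("proto") == "tcp" and p.get("port") == 80),
    ],
)

print(customer({"rdomain": 15, "proto": "tcp", "port": 22}, "pass"))
print(customer({"rdomain": 15, "proto": "tcp", "port": 80}, "pass"))
print(customer({"rdomain": 3, "proto": "tcp", "port": 22}, "pass"))
```

A packet from any other rdomain passes through the anchor untouched, so the guard condition and each inner rule combine like a logical AND.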
For example, for the customer, all of their traffic is on rdomain 15. You block all traffic by default, allow ICMP so that ping and traceroute work, and then allow access to just HTTP. Of course, in practice you should be more specific, depending on which direction things are going, so that you don't have any stealth web servers. This is merely to show you that you can add all sorts of rules in here.

The next rule, pass in on rdomain 2 rtable 4, takes any incoming traffic that is received on rdomain 2 and sends it out of the system using rtable number 4. This rule does not do NAT, so in that case you probably should have separate IP addresses, otherwise things will get confused. But this is how you can just steal traffic and send it over.

The final rule is a NAT rule. You pass out from this network, and then you can simply assign it to the outbound table. This is actually a rule that one of our customers uses; that's how they get access to the internet for a lot of things. You can have your default route going over an rtable other than rtable 0. It's perfectly fine. They are basically full-featured routing tables.

So that's pretty much the end of my talk. I want to give special thanks to Henning for actually doing a lot of the low-level heavy lifting. Please mention why you did it, because I don't quite remember the story.

So I had a couple of cases that I could not cover with that. I had this shiny new rtable code, but I could not get rid of the stuff that I wanted this to replace. We had some ideas about rdomains, and I was like: the code's in, it's nice, but what do we do with this now? Rdomains are nice, but I'm not interested in this. And this is where it all got picked up. He's really the one who polished this, to the point that it's a little bit more complex.
Yeah, so part of what I do at Vantronix is write a little bit of code, but I mostly do customer support. So I'm able to remember the questions my customers are asking and the type of networks they're trying to design. I actually have to deal with this stuff, and so thankfully I'm able to help polish it up, get it working better, and discover all of these evil, evil things.

Claudio actually took a look at all the Cisco documents describing how they implement this, decided that it actually is a good idea, and that we should implement it as well, relatively similarly. He also had to put up with all my crazy questions in the beginning, especially the default route case, which screwed me over a couple of times at the very, very beginning.

Right, yeah. No, of course, yeah. As I mentioned earlier, we have multiple routing tables, but they all exist within the same routing domain. You can always move between the tables, but you cannot move to another domain unless you explicitly declare it in PF or you do a loopback connection back into the system. There are cases where I have wanted to do this, where that would have solved several problems. Yes.

So, any questions?

There is not a public benchmark, but at Vantronix we have benchmarked it and we found basically zero difference in performance with this. The only hit is a bit more memory used in the kernel to store an additional routing table. However, the additional routing tables and additional routing domains are only created on demand, so in the normal use case of "I just want my single default route", you're not wasting anything extra.

Yeah, so the question is: do we need dedicated physical interfaces, or can we virtualize this?
The answer is that in OpenBSD, VLANs are real interfaces, so you can have as many VLANs on top of the same physical interface as you would like, up to the standard maximum of, I think, 4,096, and each VLAN can be in a different rdomain if you would like. I believe we have a hard-coded limit of 256 or 1024 rdomains, because it's an array. It's 256, okay. So that's the limit on rdomains by default. It's one #define if you want to change this, but you would of course have to recompile your kernel. If you are using physical interfaces, you can have only one rdomain per interface. You cannot mix them unless you have a child interface, like a VLAN.

That code has not been committed. Yes. You can probably play evil games with the Ether and bridge interfaces and some of our other pseudo-interfaces. Yes, we can do stacked VLANs. Yes, the answer is: one physical interface, 100 VLANs, 100 different rdomains is completely legit. That is how most of our customers are running it: 10 gig to the switch, and then as many VLANs as they can throw on it.

So, any other questions?

That's something for the future. I personally have not looked at it. We do support frame sizes larger than 1,500 on many Ethernet cards, as long as the card itself and the driver support jumbo frames. You can of course configure this manually, but we don't have an automatic method of changing the MTU size for this.

I'm sorry? Yes, so the question is: how do you manage the ARP table? ARP is independent for each rdomain, and that's my problem right now with the v6 code: the Neighbor Discovery Protocol, essentially the IPv6 version of ARP. That still has a few bugs left in it that I need to fix. But I believe once that's fixed, soon, it can be committed.

Anything else?

At our largest customer, we have 40 rdomains. The design of the network is not as optimal as it could be. They only really need to use three.
Yeah, but they insisted on using this feature, and actually, thanks to them, that's how we discovered a lot of these issues and the things that needed more polish. Yes.

The fastest we can get is 9 gigabit: 9 gigabit with the 1,500 frame size, and that is without PF, that's pure routing. I believe with PF we can get 7.5 or 8 gigabit. That is with ix, which is the Intel 10 gigabit card, and this is on, I don't remember all the details, but reasonably new, super fast Xeon processors. It's the HP DL360, latest generation.

So the question is: does this use multiple threads? Is it MP in the kernel? And the answer, in OpenBSD, is no. Not yet. We can do MP in userland, but right now the kernel still has the big lock and still runs on a single CPU. There are a couple of syscalls that run outside it, but nothing within the routing area. So unfortunately, for now we are still limited to a single CPU. There is work being done on this. Of course we have a very high interest in this. The newer network cards allow you to assign interrupt queues to independent CPUs. That way you can actually get quite a bit of performance out of them.

Okay, so the question is, if I understand correctly: how many interfaces can be part of a routing domain? And the answer is as many as you want. In OpenBSD, the default routing domain is routing domain 0, and so all the interfaces show up there. You can assign as many interfaces as you want to any arbitrary rdomain.

So the question is: can you have the same routing table just copied multiple times? And the answer is that you would have to create it manually. There is no automatic way to simply copy it over, because that's an extremely rare case. In the vast majority of cases I've seen, it's been completely different networks, even within the individual routing domains. So in the most common case, you have to do it yourself.

Anything else? Okay, so thank you very much.