Okay, thank you. So, a small background: why would you have the idea of trying to do that? The situation I was faced with was that we had colocation space with very limited rack units available. We had a routed IP space, so the IPs were not just on the VLAN; they had to be routed. We have a multi-machine virtualization cluster, in this case with Proxmox, which works quite well, so we needed some infrastructure to route our internet traffic. Ideally, high availability is quite nice, not only for being available, but also because I had to build this for, in this case, the Free Software Foundation Europe, and when I'm not being paid, I usually have to do the work whenever I have time. So being able to do things like updates without downtime has its advantages, and if the setup is a bit resilient when something breaks, I might not have to jump in immediately and can maybe deal with it in the evening, which is also quite useful.

So the basic concept I came up with was: let's just have two redundant routing VMs for the routing, with floating IPs for the HA. We do have some minimal firewalling on the routing VMs. It's not so much for the real services, because the people who run the services on the VMs do their own firewalling. It's just that we have some central firewalling so that people don't by accident expose things like NFS to the internet, and yes, that has happened. I also really don't want to have the management interface of the virtualization cluster on the internet, because why should we? Yes, it's technically possible, but why should we do it? And then we have the small problem that if you don't pay nowadays, you don't really get unlimited IP addresses, especially IPv4; I don't know why people seem to want money for that stuff nowadays. So we had to have a concept for how to work with fewer IP addresses. In our case, we decided to use a reverse proxy, in this case HAProxy, to make the services reachable over IPv4, while most of the VMs only have native IPv6 connectivity.

Then we had to select the stack. We ended up, at the moment, with CARP for the floating IPs. For the minimal firewalling part, and something that will come up later, we use PF. We currently run FreeBSD. I'm also thinking about testing it with OpenBSD as well; I just didn't have the time, because, as always, once the concept was finished, somebody wanted to put it into production. Some of you might have had this problem before. And then some small things like pfsync for syncing the firewall state. It's more or less nothing special, really, but it had to be integrated in some way.

So what are the advantages and disadvantages of doing it in VMs? Normally, I tend to run the network infrastructure on physical machines. The problem with VMs is that, on the one hand, you're already depending on the cluster to work for the network to function. So if your cluster breaks, it can get complicated to fix the cluster if you don't really have access to the network. That's a bit of an annoying thing. Also, how to best put it, depending on how much bandwidth you have, you sometimes tend to have a bit of a performance bottleneck, and that's a bit harder to debug in a virtual machine. But on the other hand, what are the advantages, and why did we end up with the VMs?
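For illustration, a minimal sketch of what the CARP plus pfsync part of such a stack can look like in FreeBSD's /etc/rc.conf; the interface names, addresses, VHID and password here are made-up examples, not the actual production values:

    # First routing VM: preferred owner of the floating IP (advskew 0)
    ifconfig_vtnet0="inet 192.0.2.11/24"
    ifconfig_vtnet0_alias0="inet vhid 10 advskew 0 pass examplesecret alias 192.0.2.1/32"

    # Sync PF firewall state to the second routing VM over a dedicated interface
    pfsync_enable="YES"
    pfsync_syncdev="vtnet1"

    pf_enable="YES"
    gateway_enable="YES"
    ipv6_gateway_enable="YES"

The second routing VM would carry the same configuration with a higher advskew, so it only takes over the floating address when the first one stops advertising.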
So the one thing is, when you're running an NGO, it's cost: if you already have the virtualization cluster, you don't need extra hardware, which is also nice insofar as you don't have to maintain that hardware. It's less stuff that can break physically. And what was also important in this case is that you don't need the extra rack space for physical hardware.

So how did it go? At some point I started testing and immediately ran into problems. First, I have done FreeBSD firewalls before, and usually I didn't really run into performance bottlenecks with somewhat modern hardware. But I had this interesting problem where, in some situations, I could only get about, what was it, I think around 30 kilobits or something over the connection, which is somewhat less than the gigabit Ethernet interface that we have as an uplink. There were also some strange things happening where it made major differences in performance whether the routing VM was on the same host as the virtual machine actually creating the traffic. And with some debugging, we found out that we had major packet loss problems and some invalid packets going out that were dropped.

So, well, as you do, you have to start debugging. Most of the debugging so far has been done with KVM for the virtualization and FreeBSD as the guest. I'm currently running two production clusters with that, and I'm running a test system on my own hardware, but there it's a bit different because I only have one physical machine available. The interesting part is that the problems were similar, so I had this kind of similar experience, but it was a bit different.

One of the interesting things I discovered is that, for one reason or another, the virtio driver has flags for disabling hardware offloading. I haven't found out yet what exactly they mean by "hardware" here, because it's a virtual interface on a virtual bridge, but it made major differences in performance. On one of the clusters, I really had to disable all of the hardware offloading, so bugs in the physical hardware definitely played a part; on the other one, not so much. (I can also give you the slides afterwards, that's probably better quality than taking photos of them.)

Then one of the other findings while testing, which I didn't expect, was that some of this cloud-native DevOps stuff is not yet IPv6 capable. I didn't realize that. So, talking to GitHub: okay, well, they don't seem to have IPv6, I don't know why. The same was true, at least at the time of testing, for Docker Hub; I think they might have IPv6 by now. I also found out that the strange Docker stuff people for one reason or another want to use still seems to have problems with IPv6-only networking. I can't really understand why.

So I had to ask, well, what can I do to debug this further? I'm currently testing a bit with different hardware, to try to differentiate what is a problem of the hardware, what is a problem of the virtualization, and what might or might not be a bug in the guest driver, because all of them can be involved here. And the other thing I still have to debug further, to figure out where the problem actually originates, is that if I do a live migration of the routing VM, CARP gets kind of confused and kills all the networking.
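As a rough sketch of the offload tuning described above, assuming a FreeBSD guest with a vtnet interface and a Proxmox/KVM host (interface and tap names are illustrative, and which flags actually matter seems to differ from cluster to cluster):

    # In the FreeBSD guest: turn off checksum, TSO and LRO offload on the virtio NIC
    ifconfig vtnet0 -txcsum -rxcsum -tso -lro

    # Or persistently in /etc/rc.conf
    ifconfig_vtnet0="inet 192.0.2.11/24 -txcsum -rxcsum -tso -lro"

    # On the Linux host, the corresponding offloads on the VM's tap device
    # can be switched off with ethtool (tap100i0 is just an example name)
    ethtool -K tap100i0 tx off rx off tso off gso off gro off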
And then I have to access the remote access console of the physical host and fix that.

So, some fixes I found that improved the situation quite a bit: interestingly, disabling the offloading features in the VM. At least then some of the big problems disappeared, because from what I've seen, the traffic that originated on the FreeBSD firewall itself was fine, but most of the traffic that just passed through the router had invalid checksums and was therefore dropped somewhere on the way. And on the host, I also had to disable a lot of the hardware offload features, as I mentioned before.

Another important part is how to manage this stuff. Because it's all well and nice if you build it by hand, then you have your two routing VM firewalls, but it would be ideal if the configuration stays somewhat in sync, because otherwise you will just have chaos, and you also have to handle updates. In my case, I decided to script it with Ansible, because I'm relatively familiar with it. And for the problem of what happens if I destroy my routing because of some upgrade, one of the nice features is boot environments, so I have a relatively simple fallback even if I can't access the console. The other thing is that switching to a virtual console helps, because then I can access the VM even from the IPMI interface. As for making sure that the configs are really identical, I had to drill it into the other people who also have admin access to the machines: no, you're not going to manually change the firewall config on just one of the nodes, thank you. Because that can lead to some interesting problems in debugging.

It is in production and it works, but I still have some things that I really want to try. One of them is other combinations of hypervisors and guests. So I'm experimenting a bit with bhyve, to see if that behaves differently. On the one hand because I'm quite interested in migrating some of the stuff to bhyve, at least my personal stuff, but also because it would help me differentiate what problems originate in the guest and what problems originate in the virtualization. I also want to do more comparison tests using OpenBSD. On the one hand, yes, OpenBSD's PF has some nice features that PF in FreeBSD currently does not have, and there are probably also some performance differences, we will have to see; but I also really like to have multiple options, see what the differences are, and really test it.

The other thing that I'm still in the process of, which unfortunately takes longer than I would have liked, is that I do want to document and publish the work, so that other people can see what can be done and maybe use it directly, if it's useful for them, of course, but also to have something to compare against if you run into similar problems. It's always nice to see what other people have done, where the problems are, and how you can solve them. And the other thing I really want to do is try to switch the failover from CARP to BGP. Not that CARP is necessarily bad, it's just that I had some interesting interactions with the data center: even after they explicitly told me that I could use certain IDs, they then used them themselves, and then you have a bit of chaos. Yeah, communication is always hard. So, that's the basics.
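The boot environment fallback mentioned above is, roughly, the following workflow with FreeBSD's bectl (the environment name is just an example):

    # Before an upgrade, snapshot the current system into a new boot environment
    bectl create pre-upgrade

    # ... perform the upgrade or risky config change on the routing VM ...

    # If the result is broken, boot the old environment again
    bectl activate pre-upgrade
    shutdown -r now

Combined with a console that is reachable through the host, this gives a way back even when the VM's own networking is gone.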
Are there any questions?

Ah, yeah, good question. So, I do use OPNsense in production, yes. I run it both at home and at work, and I also run it as an HA cluster. It really works well, I like it. It's just that, in my experience, using it in this data center setup was a bit awkward. The problem was that in some situations I had to work more or less against it, because its basic concepts were not really what I needed, so that's the reason why I decided to do it by hand. And yes, it also does firewalling, but on the other hand my firewall config is like five lines of pf.conf, so that's not really the complex part of it. Also, what I forgot: at some point I had to implement IPv4 NAT because of the stuff I mentioned that's unfortunately not IPv6 capable, so I now have to run NAT in the data center.

Yes, for the incoming connections? The problem is not with the incoming connections. We do use HAProxy for the incoming connections, mostly with the proxy protocol, and then do the TLS termination on the machines themselves, because it makes it a bit easier to deploy stuff. The thing is that the admin team is a bit distributed, so we have people who are responsible for a specific service, and if the TLS termination is with them, we basically don't have to touch it; we just have to say, yes, this host name, this IP, thank you. So we have a central HAProxy that mostly does redirects to HTTPS and otherwise just proxies the encrypted TCP connection onwards: just go that way, please, thank you. There are some hacks we have because we needed to redirect a few things, and I'm currently in the process of also putting an IMAP server behind the HAProxy, because of reasons. Unfortunately, I couldn't tell the people to please just use IPv6.

Yeah. Please. Okay. I was quite astonished, because the thing is we use the virtio drivers. I would have expected that, yes, the Intel driver has some offloading features that might or might not do strange things in a virtual machine, I expect that to some degree, but for virtio it's like: why, what's going on? Yes. So, just repeating the question for the recording: it seems that this also can be problematic sometimes with the offloading. I really want to get more into it so that I can properly document exactly where our problems are, so that people can find that and don't have to find out the hard way. Because while I enjoy looking at things and finding out how they work, it's sometimes annoying if you're in a hurry, to put it mildly.

Any other questions? Okay. Since there are currently no questions, a few things that I only brushed by. I mentioned the problem with CARP. The thing is, some people overlook this sometimes: CARP and VRRP use the same protocol ID, and VRRP is used by some data centers and providers. Some also say, well, we only do BGP for failover. The problem I ran into was that we told them, yes, we're going to use CARP, and they told us, that's fine, but please don't use these IDs so that we don't have a collision. And then they seem to have changed the config on their side without notifying us, and that created some chaos.

Yes? Yeah, I will see.
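As a sketch of the HAProxy pattern described here (central redirect to HTTPS, TLS passed through to the service machines, proxy protocol so they still see the client address): the hostnames, addresses and the SNI-based routing below are illustrative assumptions, not the actual FSFE configuration.

    frontend http_in
        bind 192.0.2.1:80
        mode http
        # mostly just send plain HTTP to HTTPS
        redirect scheme https code 301

    frontend https_in
        bind 192.0.2.1:443
        mode tcp
        # wait for the TLS ClientHello so the requested hostname can be inspected
        tcp-request inspect-delay 5s
        tcp-request content accept if { req_ssl_hello_type 1 }
        use_backend be_wiki if { req_ssl_sni -i wiki.example.org }

    backend be_wiki
        mode tcp
        # pass the still-encrypted stream to the IPv6-only VM, with proxy protocol
        server wiki [2001:db8::10]:443 send-proxy-v2

And the outgoing IPv4 NAT mentioned above could, for example, be a single pf.conf rule of this shape (again with made-up interface and addresses):

    nat on vtnet0 inet from 10.0.0.0/24 to any -> 192.0.2.1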
I mean, the thing is, in general, another reason I really want to look into BGP is that I will probably also have to run a similar setup in a data center where they say we are only going to do BGP. And so far, I've seldom encountered a data center that says, well, we won't do BGP with you. So it seems to be a bit more of a portable solution to me, and that's also why I want to go in this direction. And for the relatively simple use case, BGP is not that complicated. It's not trivial, but it's not that complicated.

Any last questions? Yes, please. Well, that's probably a bit beside the point for the networking part. What we run is Ceph, which is quite well integrated with KVM and also well integrated into the Proxmox user interface, so it's relatively simple. Yes, you still have to know what you're doing, because otherwise you will shoot yourself in the foot. But it works reasonably well and has survived some interesting split-brain situations that I ran into because of major fuckups with the power outlets in the data center: I had some interesting situations where one server only had power on one power supply, the other only had power on the other power supply, and one switch didn't have power at all. Really nice.

Okay. If there are no further questions, then I think we will be able to nearly finish on time. Okay. Thank you guys, and, well, maybe see you in the closing ceremony, and otherwise next year.