Hello and welcome to yet another talk here on ChaosZone TV, live from our studio here in Potsdam. This talk is named "Reproducible Building Network Infrastructure" and will be given by Astro, who is located in Dresden at C3D2. For the German viewers: this talk is being interpreted, and I will now switch back to English. Five years ago, C3D2 moved into a new, renovated building. With this building came the opportunity to take over the network infrastructure and become the ISP for the whole community that lives in the building. Astro will tell us how all of this network infrastructure was structured, and maybe restructured, based on NixOS. I wish you lots of good information and lots of fun at this talk, "Reproducible Building Network Infrastructure" by Astro. The stage is yours.

Hello, welcome back, and sorry for the issues at the beginning. I am actually a software developer at heart, but I am also a computer administrator. I have a strong interest in usable network infrastructure because it is required for the communication programs I write. So over time I have developed a few strong opinions about how computer networks should be designed so that they actually serve their users instead of putting additional restrictions on them. And I've always asked myself how this can actually be done. Five years ago I had the opportunity to try it at a larger scale and learn a lot. That's why I'm doing this talk today: I want to share my experience and maybe inspire you to do the same for your community.

Before I begin, I will tell you a bit about the location, because where our hackerspace was located before, in an office complex, this wouldn't have been possible. But then we found the Zentralwerk. We met these creative, open-minded people in an old factory building, and we immediately got a seat at the table with them. The environment is really nice. For example, the old factory has towers where we can put up Freifunk nodes and install antennas and radios, and that provides a lot of opportunity to have fun with technology here. But the best part is that there is a ballroom, a beautiful event venue where we hold our annual conference. It's a really nice atmosphere.

These buildings have built-in Ethernet cables to all the apartments, and because there were no tech people before us, we were offered to run the network. I took this opportunity, because I was convinced that otherwise chaos would grow organically. So I invited all the interested technical people to propose a structure, and that's what we've been running ever since.

Our goals are, of course, first the eyeballs: people want to watch some Facebook. This is often disappointing for me; to be honest, most people would be very happy behind a network address translation gateway, alone on the Internet. Contrast that with Metcalfe's law: he postulated that the value of a network grows with the number of connected devices and users. So actually you don't want to be alone, you want to be connected, maybe even connected with faster cables to your neighbors. When people are connected in shared infrastructure, they can pool their resources, pool their money, to get a high-quality Internet uplink, which means that for less money there's more peak bandwidth for everyone. And because access is a fundamental requirement in the 21st century, networks need to be open. I think it's a shame that there is so little open Wi-Fi in Germany. When I run a network, there has to be open Wi-Fi.
And because we want to run open Wi-Fi in parallel, we actually have to run the wireless access points ourselves, which is a big difference from an ordinary ISP. And then we want to make the network useful beyond Internet access: we want to enable cooperation. That's why we uphold the end-to-end principle of the Internet in our network. There's no network address translation between neighbors, so peer-to-peer applications and sharing printers actually work. By now there are also a few servers for house-internal services, but that's not what this talk is about. This talk is really about the underlying infrastructure.

Because I do not want to be the sole person responsible for all that, I tried to enable cooperative administration from the beginning. Of course, I want to increase the bus factor, so that if anything happens to me, other people can continue to take care of it. But I also want to invite users to take care of the infrastructure beyond their premises, because that would be very attractive to me as well; I'm just projecting that onto them. And indeed, there are a few trusted neighbors from outside the hackerspace who have root access to the server. But in reality, no one really cares until there is an outage.

So how do we enable collaboration and transparency? Like a software project, we have all the configuration in a Git repository. Five years ago we started out with SaltStack, which was recommended to me at the time. I never got really happy with that and switched to NixOS this year. That means the setup is entirely reproducible from the code base, which is great for consistency: what you see in the repository is the documentation. There are no files in /etc that have been touched by someone. This is luckily kind of enforced by NixOS, but it should be done with any other deployment tool as well. Contrast that with the imperative style, where you actually touch the files in /etc: after a few days you will have forgotten which files have been modified and which are actually important. And this gets even worse when you collaborate with other administrators. So a central repository, where everything that is relevant is in one place, is really great for transparency.

I'm a really huge NixOS fanboy. When there's no lockdown, we have daily code exchange and progress review at the space, so we get accused of circlejerking; that's why this talk is not going to be a NixOS advertisement. And indeed, if you prescribe declarative administration to your collaborators, that can actually turn out to be a blocker for people who are not used to the declarative style, who are used to touching files in /etc. That is my experience. But it's better than touching /etc; you have to ignore these people.

People are always calling out for documentation. My experience is that that doesn't work: people who do not look into code do not look into documentation either. There's also the problem that external documentation gets out of sync with the actual state really quickly, so you best keep it in the repository, and that's what we do. We've written up quite some text so that every neighbor can get to know the network, but in fact no one ever reads it, no one ever scans those QR codes. And by now, because even the admin team doesn't like to read documentation, I package little scripts for the regular chores, so that nobody has to look up documentation; they just run commands that have the documented process built in (a small sketch of such a packaged script follows below).
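As a minimal sketch of that idea, here is what such a packaged chore script can look like in Nix; the chore, its name, and its steps are invented for illustration, not taken from the actual repository:

    # Hypothetical chore, packaged so that running one command replaces
    # a page of documentation. Name and steps are invented.
    { pkgs }:
    pkgs.writeShellScriptBin "onboard-neighbor" ''
      set -eu
      VLAN_ID=$1
      NAME=$2
      echo "Onboarding $NAME on VLAN $VLAN_ID"
      # Step 1: regenerate the switch configuration for the new VLAN.
      echo "-> run the generated switch script for VLAN $VLAN_ID"
      # Step 2: remind the operator of the remaining steps.
      echo "-> add a router container for net-$NAME, then deploy"
    ''

The point is that the steps live in the same repository as the rest of the configuration, so this "documentation" is executable and cannot drift out of date.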
So, how do we design our network? Before we connect people, we actually need to separate their networks, so that they cannot break a neighbor's network with a rogue DHCP server handing out bogus IP addresses, or, even worse, do ARP spoofing to redirect traffic. What we want is an isolated link between a neighbor and our server, and then to connect on the IP level with well-defined routing and maybe even firewalls. So what we need is virtualization, and there is a widespread technology for that called virtual LANs (VLANs), where you just give each network a number in a switch, so you can send multiple networks over one cable.

There is a well-supported VLAN packet format. At the top you can see a normal Ethernet frame, starting with the destination and source addresses, followed by the packet type: IPv4, IPv6, IPX, you name it. At the bottom you have a special packet type for VLAN, which is followed by the VLAN number, the network number, and then the rest of the original packet. So when such a packet is received by a network device that understands this format, it can look at the number in that packet and assign the packet to the proper network. This is understood by lots of configurable switches and by Linux and OpenWrt devices, and this is what we actually use.

At the top you can see cheap Wi-Fi routers that are supported by OpenWrt. Remember, we run them for our users so that they do not only run their private Wi-Fi networks but also the public network that is open for anyone. We run them with OpenWrt because it runs on many cheap devices, it means freedom, we get updates, and it's nice. People often tell me to just buy devices from the proper vendor, but that's not what we do here, because we don't actually have a budget. In the middle you can see big switches. These are not the 10 Euro switches you buy for your desk; these are manageable switches that can be configured so that you can put sets of ports into different numbered virtual LANs. And at the bottom there's the most important part we need: a Linux server. Once there is Linux, we have the freedom to do whatever we want.

So how do we configure these devices? The configuration should come from our repository, and from that data we just generate expect scripts. Expect is a domain-specific language that just sends output and expects input, and that's how we can control telnet and SSH connections to these devices (a sketch of such a generated script follows below). I recognize this as an archaic way to do it, but it works with any device that is configurable over a command line, and we don't need vendor-specific tools for our devices. Because we got them for free, they are old and there's no current software; some network switches require configuration via a web interface that only supports Internet Explorer 6, and it's really great if you discover, via the Internet, a hidden command line with which you can actually do these things reproducibly. But in the end this technology really sucks, and that there are no updates is a huge problem, because network infrastructure is not just a smartphone. I'd rather have a big Linux box with 48 Ethernet ports instead.
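A minimal sketch of such a generated script: the following hypothetical Nix expression emits an expect script for a telnet-managed switch. Address, password, and CLI commands are invented for illustration; the real scripts are generated per device from the repository's configuration data:

    # Hypothetical Nix expression generating an expect script for one
    # switch; address, password and CLI syntax are invented.
    { pkgs }:
    let
      switch = {
        host  = "10.0.1.2";      # address on the management network
        vlans = [ 2 100 101 ];   # VLAN ids this switch must carry
      };
    in
    pkgs.writeScript "configure-switch" ''
      #!${pkgs.expect}/bin/expect -f
      spawn ${pkgs.inetutils}/bin/telnet ${switch.host}
      expect "Password:"
      send "secret\r"
      ${pkgs.lib.concatMapStrings (id: ''
        expect "#"
        send "vlan ${toString id}\r"
      '') switch.vlans}
      expect "#"
      send "save\r"
      expect "#"
      send "exit\r"
    ''

Because the script is just a derivation, reconfiguring a switch amounts to running the corresponding package, exactly like the rest of the setup.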
So, how do we segment our network? We decided on the following network types. First, we have a management network, which is not generally accessible; it is just for reaching the network devices and reconfiguring them with our generated scripts. Then we have a core network at the center; this is where all the routers are. We run a routing protocol, Open Shortest Path First (OSPF), so that there's a consistent view of the specific routes, and all the routers are connected there. For every network where there are clients, we have a gateway between the core network and that client network. The client networks are: one for services, one for the hackerspace, one for the open Wi-Fi, and one for every neighbor. And then there are isolated networks, because sometimes people are afraid of doing networking with CCC people, so we can connect them to their own modem in the basement and provide an isolated Ethernet link. This has also been very useful for conferences like Datenspuren, where the VOC wanted to do audio-video bridges or put their compute nodes in another room. That's a lot of flexibility you get when you have manageable switches everywhere.

This is a visualization generated from our configuration data, just a side product. Rectangular are all the switches, and hexagonal are all the access points. And indeed, cheap OpenWrt access points are manageable switches too: a lot of them have configurable switching chips inside to separate their Ethernet ports.

So, now that the physical structure is solved, let's connect these networks logically with IPv4 and IPv6. Oh, that's the wrong image; this is still the physical structure. We have a star topology with the core at the center, and this is where the routers are. Running a routing protocol means we can add routers without reconfiguring all the others. So we have full flexibility; we have no constraints from embedded hardware. I know there is hardware that can do IPv4 and IPv6 routing, but on Linux there is just much more potential.

For the routers, I decided to have the granularity at the router level, so I put them in Linux containers. This is not like Docker; this is a full NixOS in LXC, where we can actually have multiple network interfaces per container. So we can bring the core network and one of the access networks into a container; the VLAN is handled by the Linux host, which bridges the network into the container. What do the router containers do? Of course they do routing: NAT routing for the Internet uplinks, and plain routing between the core net and the access networks. There are DHCP servers for handing out IPv4 addresses, there are router advertisements for handing out IPv6 addresses, and neighbors can optionally get a firewall if they do not want incoming connections.

When you have shared infrastructure, people are always afraid of the leechers and seeders. That's why we need packet scheduling on the sending interfaces. For quite some time we used CoDel (controlled delay), shaped to our upstream bandwidth; this shaper is also the default for OpenWrt by now, so it also keeps queues short on Wi-Fi. But by now there is the CAKE shaper (Common Applications Kept Enhanced), which builds on top of CoDel and brings many more integrated features for the snappiest Internet access. (A sketch of such a router container follows below.)
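A minimal sketch of such a router container, using NixOS's built-in declarative containers (which are systemd-nspawn based, a close analogue of the LXC setup described here); VLAN ids, interface names, and addresses are invented for illustration:

    # Hypothetical host-side excerpt; VLAN ids, interface names and
    # addresses are examples, not the actual Zentralwerk configuration.
    { config, pkgs, ... }:
    {
      # Receive 802.1Q-tagged traffic on the trunk port and put each
      # VLAN into a bridge, so it can be handed into a container.
      networking.vlans."core"     = { id = 2;   interface = "enp1s0"; };
      networking.vlans."neighbor" = { id = 100; interface = "enp1s0"; };
      networking.bridges."br-core".interfaces     = [ "core" ];
      networking.bridges."br-neighbor".interfaces = [ "neighbor" ];

      containers.router-neighbor = {
        autoStart = true;
        privateNetwork = true;
        # One leg in the core network, one in the neighbor's access network.
        extraVeths."core".hostBridge   = "br-core";
        extraVeths."access".hostBridge = "br-neighbor";

        config = { ... }: {
          boot.kernel.sysctl."net.ipv4.conf.all.forwarding" = true;
          boot.kernel.sysctl."net.ipv6.conf.all.forwarding" = true;

          networking.useDHCP = false;
          networking.interfaces."access".ipv4.addresses =
            [ { address = "10.0.100.1"; prefixLength = 24; } ];

          # Hand out IPv4 addresses to the neighbor's devices.
          services.dhcpd4 = {
            enable = true;
            interfaces = [ "access" ];
            extraConfig = ''
              subnet 10.0.100.0 netmask 255.255.255.0 {
                range 10.0.100.100 10.0.100.200;
                option routers 10.0.100.1;
              }
            '';
          };

          # Router advertisements hand out IPv6 (example prefix).
          services.radvd = {
            enable = true;
            config = ''
              interface access {
                AdvSendAdvert on;
                prefix 2001:db8:100::/64 { };
              };
            '';
          };

          # Packet scheduling on the sending side could be added here,
          # e.g. a oneshot service running:
          #   tc qdisc replace dev access root cake bandwidth 100mbit
        };
      };
    }

Because such a container only depends on the bridges the host provides, it can be started on any server that has them, which is what makes the cold-standby and Pacemaker ideas mentioned later straightforward.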
So, now that our network runs and provides Internet to people, I have spent a few thoughts on upholding quality. We run our network on cheap old devices for which there are no replacements; there's no redundancy because there are simply no similar devices left. That means we have the best monitoring: people call me when they don't have Internet. But of course I want to make people happy, because they have our Wi-Fi devices on premises, where we can't go and check cables. I was afraid of cables becoming unplugged, which would mean dysfunctional Wi-Fi networks. That's why I made a little cron job that runs every few minutes and checks whether the server is still reachable; if it's not, it shuts down the Wi-Fi, avoiding broken Wi-Fi networks.

By now, we have multiple Internet connections. Originally I planned that multiple neighbors could pool into one Internet connection each; it turns out one big Internet connection is enough for everyone. People don't actually care about which technology they use to get to the Internet; they just want Internet, and I want to give them the fastest experience. So actually everyone is routed over the fastest Internet connection, except for the public network, which we can't route directly, because we are in Germany; this is why we use VPN providers. But there are more Internet connections, and I use them as fallback.

Because we already run the OSPF routing protocol between routers, we have some dynamic sharing of the routing state. But the OSPF protocol maintains a consistent view of the routing table in a network: if you put multiple default routes for Internet access in there, the network will decide on just one router for everyone. So I was looking for a solution to that, and then I discovered that multiple OSPF instances can coexist on one network, and you can actually configure BIRD, the routing protocol implementation, to select routes from these OSPF instances with a specific preference. So we moved the decision process from the protocol into the routing suite, and we have a preference for which Internet router to take if the first one is down. This works very well (a sketch of such a BIRD configuration follows below).

Inside the network we don't have public IPv4 addresses. We do have public IPv6 addresses in the network, and failover works there too, because on the Internet uplinks that are not associated with these addresses, we use NAT66. And in the hackerspace, because we have multiple Internet connections, I actually provide multiple default routes, so users can add another default route and take another way to the Internet; that's just for technical people.
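A minimal sketch of that idea as BIRD 2 configuration, assuming two uplink routers that announce their default routes in separate OSPF instances; instance ids, interface names, and preference values are invented, and the exact multi-instance options should be checked against the BIRD version in use:

    # Hypothetical excerpt from a core router's NixOS module;
    # ids, names and preference values are examples only.
    { ... }:
    {
      services.bird2.enable = true;
      services.bird2.config = ''
        router id 10.0.2.1;

        protocol device { }

        protocol kernel {
          ipv4 { export all; };
        }

        # Primary uplink: its OSPF instance has the higher preference,
        # so its default route wins while this instance is alive.
        protocol ospf v2 upstream1 {
          instance id 1;
          preference 150;
          ipv4 { import all; };
          area 0 { interface "core" { cost 10; }; };
        }

        # Fallback uplink in a second OSPF instance on the same wire.
        protocol ospf v2 upstream2 {
          instance id 2;
          preference 140;
          ipv4 { import all; };
          area 0 { interface "core" { cost 10; }; };
        }
      '';
    }

When both instances offer a default route, BIRD prefers the one from upstream1; when that adjacency dies, the upstream2 route takes over without anyone reconfiguring the other routers.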
Because we have a few servers now that provide services, I am also looking into redundancy for the network. By now we have a second server that is on cold standby. I boot it every month to do updates and deploy the actual state, but then I shut it down again, because a server needs a lot of energy. I have also looked into using Pacemaker, because these containers are pretty independent from their host: they can be started on any host, on any server. So they are actually really well suited for high availability, to be started on another server when the first one has gone down.

Just to show you some code, here is a slide with code: this is how our Nix flake looks, in excerpts. We have Nix flake configuration for all the containers, for all the servers. From that we build packages, packages containing Linux systems that we can switch to very quickly, and in a kind of atomic way, using the switch-to-configuration tool; and we can also roll back in a very quick fashion. What you can also see here is that the device configuration scripts are just packages that you can run with nix run and do the deployment that way. So it's a really nice interface. And Nix flakes give us perfect reproducibility by pinning all the inputs, all the versions, all the state that goes into your code.

NixOS also provides facilities to start the system in a local virtual machine very quickly. So I can actually develop all of that on my local machine and test it, and once it runs, I can deploy it to production; that has worked really well all the time.

So, in all this time, we connected three isolated networks, one hackerspace network, and 42 neighbor networks. We run 41 Wi-Fi access points with open Wi-Fi and neighbor networks; if you listened to the numbers, that is fewer Wi-Fi access points than neighbor networks, because people can actually share plastic routers as well. We run seven manageable switches. We have one active and one cold-standby server. We host six services, and the entire house is reachable globally via IPv6 and reachable from DN42. We have DNS infrastructure and a few Internet connections. And I have to say, once the stuff is running smoothly, maintenance gets reduced to keeping devices updated, onboarding new participants, and extending the network.

I get a lot of satisfaction from this project because it actually provides an important service to people I know. I want you to think about it: you can do this too. Once you automate your infrastructure, you can do it very, very easily for others. I want you to ensure end-to-end reachability, which is easy with IPv6 nowadays, and to promote usage of the Internet how it is supposed to work, end to end, so that everyone can be a sender. Thank you for listening.

Thanks so much, Astro, for your great talk, "Reproducible Building Network Infrastructure". Back here at the stage, we of course collected some questions for you. The first one would be: did you have any non-technical issues with open public Wi-Fi? I mean, you already talked about it a bit; you explicitly use VPN for those networks. But what is your experience running those networks on a non-technical level? A legal level, I guess.

Because we were very aware of the legal issues in Germany, we decided to use a VPN provider from the beginning. But the other aspect is that people don't scan my QR codes, people don't read my documentation; they just say: oh, there's a public Wi-Fi, I can use it. That's okay. And years later they come to me and say: what, I can get a private Wi-Fi? I've been using this open Wi-Fi all the time, and people have been printing on my printer, I don't know who it was.

All right, so this is also a kind of positive experience that is possible, as long as you're legally safe to run your network. So, there's another question. Well, you made a bit of advertisement for NixOS, but there's a question: why didn't you choose Guix, the GNU Guix system, for your setup?

Too many parentheses. No, there's a big NixOS community here, and I actually like the language a lot.

Okay, so this would have been an alternative, but you're clearly a NixOS fan, and so you just decided to use that.

Yes, I am a NixOS fanboy. There's also SaltStack, Ansible, Chef and Puppet, but I really think NixOS is best.

Nice. Another question we collected here is: do you have or run a pipeline on your Git repository, which is kind of related to NixOS? The person that asks says: I have no idea how NixOS really works. So, is it a kind of pipeline where you can deploy every change to your overall network infrastructure? Or is that implicit within NixOS?

Kind of implicit. All the deployment is self-scripted and was very trivial to do with NixOS.

All right.
So, you just run the updates on NixOS and everyone gets the latest infrastructure. All right, with this, and with another thanks to you, Astro, in Dresden, we can end this talk. And again, thanks for watching here on ChaosZone TV, live from the Potsdam stage. See you at the next talk.