Hello, everyone. Today we will be having a talk on Building a Debian Derivative by Alex Doyle. So please enjoy. Thank you.

Hi. So let me just say straight up that I had allotted 20 minutes for this talk and then 10 minutes for questions, and I've got 20 total, so I will be blowing through a few things. Some of the slides will be posted up wherever we get the slides posted; I'll figure that out later.

Let me introduce myself. I'm Alex Doyle. I'm a build and release engineer at Cumulus Networks. I helped Debianize our build system about three years ago. It went from a monolithic image build to a build that actually built packages and could be upgraded a little more dynamically than it was. Because it's Debian, it's a much more sustainable development model, being able to churn out bug fixes and so on. So I've got some familiarity with how Cumulus Linux builds.

In today's talk, I'll be talking about what makes a Debian derivative different from Debian, and the components of a derivative build system, using Cumulus Linux as an example. I won't be doing a comparison with existing build systems; I think with derivatives, the requirements of what they're supposed to achieve differ enough that you can't really get a good one-to-one correspondence there. And I'll conclude the talk with problems to expect if you're dealing with a derivative, and lessons to be learned from that.

This is an introductory-level talk. I wanted to try and broaden the audience, so I won't be delving too deeply into build system details until the very end. There's going to be a lot of introduction at the beginning, so if you're familiar with how build systems work, the first five minutes of this may be a little boring.

So the example I'll be using is Cumulus Linux. It's an operating system for white box network switches. The logical question is: what's a white box network switch? It's a switch where the owner can install their choice of operating system. They're usually less expensive than switches with proprietary operating systems. You can think of it as an embedded PC where the network controller is a dedicated switching ASIC with a whole lot of network ports on it.

So as a Debian derivative, Cumulus Linux pretty much looks like Debian, plus switch ASIC code from vendors like Broadcom or Mellanox, plus Cumulus-specific software to interface to the chip and make everything manageable, and then you've got Cumulus Linux.

So why would anybody create a derivative? Well, basically a derivative is Debian plus other stuff, and the reasons for the other stuff may include software that can't be added to Debian; like I said, the ASIC code is proprietary. There may be strict package interdependencies: Cumulus uses its own version of the kernel, and the ASIC code is tied very hard into that. I'll talk a little bit about that later. You might need a custom installer for your target hardware; in our case we're installing on switches, which is a little different than PCs. And you may need to deliver bug fixes faster than upstreaming allows, which is not to say that Cumulus doesn't upstream fixes. We do. It just takes a little longer than getting them out into the field with a regular release. The Linux kernel networking and Free Range Routing projects are just two examples where we've contributed a lot of code.

So, common requirements for a derivative. Generally, to produce it you have to build custom packages, and you have to include Debian packages.
You'll need an installer to deploy it sometimes, and then you need to make the repositories and the installer available.

So, components you're really probably going to need. Build hardware, obviously. At Cumulus, we use shared build systems; we find it's easier to support and maintain a couple of identically configured build systems rather than everybody's laptop. You'll need something for source control. For us it's Git, because it's well supported, it works, and most people at least somewhat understand it. If you have multiple users using the same build system, you'll need some way of isolating them, because as they build, dependencies will come in that change the configuration of the system, and hilarity will ensue. So we use schroot. I don't know if anybody's got the definitive pronunciation on that. Okay. Good. I didn't lose a bet there. It creates a virtual file system to isolate builds, so when the dependencies install, they don't interrupt anybody else. Then, once you've got that environment, you need something to help build with it. We use sbuild, which is also used by the Debian build systems: it sets up the schroot build environment, installs dependencies, builds packages, runs sanity checks on the packaging, and cleans up after itself. Once you've got a package, you need to upload it to a repository; we use dput to do that. And then once it's in the repository, you need something to actually be the repository software. We use reprepro, which has the nice advantage of being able to sync some packages from upstream and then handle adding our own.

Then there are the things I found were nice to have when we started developing. Server configuration build scripts: you can use Bash, Ansible, whatever works. Basically, the idea is to get a reproducible build server setup. You don't want magic machines where your code only builds on this one particular system because reasons; it's super irritating to deal with. And if you have the scripts, you know what the standard is, and you can apply them to virtual machines and run experiments, so that when you're trying to introduce new features, it's easy enough to check, hide your mistakes from everybody else, and then publish the stuff that works. You may need build and test automation, so we've got Jenkins to run builds when code gets checked in and to run tests on new builds. Because we're a distribution, we'll be shipping a collection of packages, and we need to track that somewhere. So we have a release manifest file. It's basically a collection of packages and versions. It gets used by our installer code when installing on a switch, so it basically bundles up all the libraries and takes them with it. It's also handy for recreating previous releases for bug tracking purposes. There's our installer program, which I just mentioned, which puts our code on the target and enables any hardware quirks of the target. Hardware tends to have a number of differences, especially when you're dealing with switches, so being able to understand that this one is slightly different than that one, or this is a different model, is super helpful. And then in the course of developing this, I found I really needed one tool that would build any package. I'd originally thought I could do this with Jenkins, but I ended up writing a Cumulus master build tool. Other distributions may have different tools for this, but I needed something that would put all the package build quirks in one place, and it's a good interface for the automation, like Jenkins, to connect to.
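Just to make that tool chain concrete, here is a rough sketch of what a single package build looks like with those pieces. It's a minimal, hypothetical example: the Git URL, the package name, and the schroot and dput target names (cl-dev-amd64, cl-dev) are made up for illustration, and it assumes a schroot and a matching ~/.dput.cf entry have already been set up.

    # Check out the source (hypothetical repository and package name).
    git clone https://git.example.com/frr.git
    cd frr

    # Build inside an isolated schroot so build-dependencies don't
    # pollute the host. sbuild installs the build-dependencies in the
    # chroot, builds the package, can run lintian checks if configured,
    # and cleans up after itself.
    sbuild --dist=cl-dev --arch=amd64 --chroot=cl-dev-amd64

    # Upload the resulting .changes file to the local dev repository
    # (adjust the path to wherever sbuild left the .changes file).
    dput cl-dev frr_*_amd64.changes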
And then I also found out that, doing a derivative, it was really helpful to have a build tool wrapper, which helps enforce conventions when doing builds and can also simplify the user learning curve by setting defaults. I'll talk a little bit about this in a minute.

So the general workflow at this point: if you're a developer, you log into the build system, you check out your code with Git, you build using sbuild in a schroot, and you theoretically test your debs before you commit your code changes. Once the code is committed, Jenkins takes over, because it triggers on a code commit: it checks out the code from Git, builds the package using the master build tool wrapper, and uploads the built package into the local repository; reprepro updates its index of repository packages to make it available, and then the distribution manifest file is updated because the new package is now available.

These are general workflows that work in theory, obviously, but in the real world there are a couple of obnoxious details to take a look at, and I've picked out three of them.

One, there's a management problem of understanding where packages come from and what you're based off of. In Cumulus, we have three different types of packages, sort of defined by where they're sourced. First, there are unmodified upstream Debian packages, where we pull them from the upstream repo and just put them into our local repository mirror because we don't modify them at all. And I'd just like to take a minute here to say a big thank you to the Debian security team for those security patches, because every time one comes down and I don't have to deal with it apart from putting it in my repository, I am super happy. Then there are patched packages, where we take upstream source, put Cumulus patches on it, build it, and then upload it to our local mirror; the kernel and Free Range Routing, as I've mentioned previously, are two examples of that. Then there are Cumulus-owned packages, where we are effectively the upstream and we just build and upload. Examples of that would be the proprietary ASIC code from vendors that we have to manage, and interface code.

The second thing to consider is software release tracks. At Cumulus, we've got two software tracks. One is dev, which is the initial developer commit: it should work, but the code there needs testing. The other is release, where the code has been tested in dev, is promoted by the package maintainer, and is good to ship. Because we have two release tracks, we need to double everything else. We need dev and release versions in Git, obviously, and you promote from dev to release. We need two package repositories, one to hold the development code and one to hold the release code. And then we need build schroot environments that reference either one of them, depending on what we're building. So it's an additional complication.

The third thing was reasonable version numbering. One of the things I had to struggle with coming in was thinking, well, we're building our packages two or three times a day; aren't we going to have ridiculous version numbers by the time we release? Fortunately, Debian versioning has thought of this: if you have a tilde in the version number, it creates a pre-release package whose version is lower than the same package without the tilde. So, for example, package foo 1.2 is a higher version than foo 1.2~7, and foo 1.2~8 is a higher version than foo 1.2~7.
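If you want to convince yourself of that ordering, dpkg can compare version strings directly. Here is a quick sketch with made-up version numbers in the style described here (upstream version, Cumulus suffix, then a tilde followed by a timestamp and git hash on dev builds); the timestamp and hash values are hypothetical.

    # A tilde sorts lower than end-of-string, so a release version with
    # no tilde beats any pre-release of the same base version.
    dpkg --compare-versions "1.2~7" lt "1.2"   && echo "1.2~7 < 1.2"
    dpkg --compare-versions "1.2~7" lt "1.2~8" && echo "1.2~7 < 1.2~8"

    # The same trick with a dev-style version carrying a timestamp and
    # git hash: it still sorts below the eventual release version.
    dpkg --compare-versions "4.0+cl3u1~1536259740.9fab12c" lt "4.0+cl3u1" \
        && echo "dev build < release build"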
When we want to release, we build without the tilde in the version string, and it will install over the dev builds. This has a couple of advantages. Since we're messing around with versions anyway, if we put a timestamp in there, that guarantees new builds have higher priority than old ones, and we have an idea of when the package was built. And while we're in there anyway, we can put a git hash in, so QA people can immediately go to a developer and say, hey, this version of the code looks like it's causing the problem, take a look at it.

So, what this looks like in practice: this is a dev build of the Free Range Routing package. Just to break it down really quickly: the frr 4.0 is the upstream version it's based off of; the +cl3u1 is the Cumulus patch version that's been applied to that; then after the tilde we have the timestamp, starting with 15, and the git hash, starting with 9f; and then amd64.deb means it's an AMD64 binary. When you release, all that stuff with the timestamp and git hash and tilde is gone.

So, lessons learned, or: I didn't see that coming. Four things turned out to be more complicated than I initially expected.

The first thing was the master build tool. Getting back to this, there were a surprising number of build quirks that I hadn't expected. I thought I could just have Jenkins run sbuild and that would be it. But there are things like the kernel, where it has to run make to be able to generate its debian/control file so that sbuild has something to work off of. There's putting the pre-release package version in, with the timestamp and git hash. There's the release track you're trying to build for. What Git branch do you want to use? Being able to set that to head, or roll it back one version if there's something in there that's breaking the build for the time being, is really handy. There's determining what repository you're uploading to: is it dev, is it release? And there's updating the installer manifest, which has to happen after the build. The tool is also a good attachment point for the Jenkins automation, so our Jenkins master doesn't have to be an expert on package building and I don't have to know a whole lot about Jenkins. And it also gives me the option to debug a build without using Jenkins, which makes it a little more portable if you happen to be building on a laptop.

Then there was a surprising amount of social engineering involved in rolling out build tools and getting people to use them. Some things to keep in mind about some developers, and I certainly fall into this category, so I'm comfortable propagating the stereotype: they want to build fast more than they want to build clean. Nobody's rebuilding from scratch every time; you want to iterate as fast as possible and get reasonable results every time, because you're there to write code and run code, and build time is downtime. With this mindset, the build tools can sometimes be seen as restraints rather than supports. The corollary to that is: any build system that sucks will be worked around, and this will cause, to quote Thomas the Tank Engine, confusion and delay, as things end up slightly different for reasons that aren't immediately obvious. And I think there's maybe a bit more of this mindset with a derivative, where the end product focus isn't Debian. Not all developers want to have to understand things like packaging, because they're here for the other stuff. That's why they were hired. That's what they're working on.
It's not their area of expertise. So, given this, how do we work with it? The first thing I found is that the developers are customers. You incorporate their feedback wherever you can. Don't give them a reason to work around the build system; if there are features that will make their lives easier and you can implement them, do. Where you can, have the developer and build environments match: when developers start debugging something, they can use a copy of the environment that the build system uses, so they're starting from a common base, and that helps keep the build from breaking. I think maybe once every three or four months somebody checks in something that built in their environment but didn't on the build tools, so that's been really helpful.

And I found it was really useful to simplify common build tasks with a wrapper script. It helped enforce consistency, because default values were automatically applied. It reduced steps, because there would be a series of defaults that would otherwise have to be set, and by wrapping it, they would just be applied, so that cut down on the opportunities for mistakes. For me it just made support easier, because I had less to tell people on the other end about what to do. And wrapping things also gave me the opportunity to upgrade tools without changing the developer workflows: the default schroot somebody is supposed to be using has been upgraded, I change the version number that I'm using with it, nobody knows, and stuff still works. Or it doesn't, and I can immediately roll it back without disrupting people, provided nobody caught me.

One of the things that comes up unusually frequently: when somebody mentions "the repo", you need to get clarification. Are they talking about the Debian package repo? A Git repo? A Python repo? Did they really mean a Docker registry? Is my car being repossessed? Do I need a ride home? This comes up shockingly frequently, and because of the semantic overlap between source code repository and Debian package repository, you can be talking at cross purposes for about two minutes before somebody goes, wait a minute, you meant the other thing.

Supporting multiple architectures got a little bit tricky. Cumulus Linux supports AMD64 and ARM based switches, and it's been surprisingly difficult. I would have thought I could just find an ARM based server somewhere and build off of that, and that just hasn't been the case. We tried a number of workarounds, which I'll skip over here for time, but we ended up with one of our engineers figuring out how to run jessie on a Chromebook, and we built off of that, which I believe makes us an official Silicon Valley startup, because we built on hardware we bought at Fry's. We did eventually find ARM based servers, but that was three years after the fact.

Another interesting thing is architecture-independent (arch: all) packages. Because they run on anything, they're not architecture specific; if you think of things like Bash scripts or Python, they can basically run anywhere. And when you're building for multiple architectures, you don't want to build arch: all packages on every architecture.
You want to pick one architecture and go with it. The problem is that if a package builds both an architecture-specific binary and an arch: all package, the binary and the arch: all package from the AMD64 build will upload, and then the binary and the arch: all package from the ARM build will upload and displace the AMD64 one, which leads to checksum errors when you try to download on AMD64, because there's a mismatch in the repository: same package, same version, different checksum. So we just let AMD64 do those builds, because it's the faster architecture, and prohibit the ARM builds from building arch: all packages. That seems to work pretty well.

And lastly, I wanted to cover building things that depend on other things that you build. This was a question that had come up, I think in Montreal; somebody had asked it in a BoF. In Cumulus Linux there are a lot of packages that depend on the kernel. The build chain is: the kernel headers; the SDKs depend on those; the stuff the SDKs build depends on the SDKs; and the Cumulus code has to sync up with the SDK build. So kernel changes have to propagate downward and build all of this stuff at once. I'm sure there's a good algorithm someplace for doing this, but the number of packages affected was small enough that I just mapped it in the master build tool to say, hey, if anything in this chain builds, build everything below it.

There's also the question, when you're doing this, of the mechanics of getting the dependencies to the next package in the chain. Initially I thought, well, this is great: the kernel will build, it will upload its build products, the ASIC code needs to build, so it downloads the kernel stuff, puts its own stuff up, and it keeps going back and forth until finally everything is built. Which does work. The problem is this takes a while, and it leaves the repository in an inconsistent state where you've got future code mixed in with present code, and when somebody downloads it, it doesn't work. The solution for this, which is not an incredibly general solution, but something specific that took us a while to find and that I just want to share with everybody: if you're using sbuild, there's an option to take additional, locally built debs and bring them into your build environment. So in building the build chain, the locally built debs are kept around and added to the next build, and so on and so forth, and at the end everything's built and we push it all into the repository at once. There's a small sketch of this below.

So, in conclusion, this should be enough to get somebody started on a derivative. It's certainly the roadmap I would have liked to have had three years ago when I was doing the build system. Hopefully it's provided some insight into derivative requirements, ways to use existing build tools, deployment issues one is likely to encounter, and how to create a build system that keeps everybody happy. Are there questions? I think we've got two or three minutes left. Thank you.

Hi, my name is Riku Voipio, and I've also worked on derivative builds, and very much the same kind of stuff has come up in my job. We also started with Jenkins and building our own stuff around it, and eventually we ended up replacing all of that with the Open Build Service. That's a tool created by openSUSE, but it can also build Debian packages. Cool. It has its own idiosyncrasies, but it can handle all these things, like having arch: all, AMD64 and ARM64 packages coming out of the same source package and not overwriting each other. Excellent.
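Here is the sketch of the chained builds mentioned a couple of minutes ago: a rough example using sbuild's --extra-package option, which makes a locally built .deb available inside the chroot as a build-dependency. The package directories, file names, version, and the cl-dev upload target are all hypothetical, and the real master build tool does a lot more bookkeeping than this.

    # Build the kernel first; keep the resulting .debs around locally
    # instead of uploading them right away.
    (cd kernel && sbuild --dist=cl-dev --arch=amd64)

    # Build the ASIC SDK against the just-built kernel headers without
    # touching the repository: --extra-package makes the local .deb
    # visible inside the chroot. (Point the path at wherever sbuild
    # left the headers package; this name is made up.)
    (cd asic-sdk && sbuild --dist=cl-dev --arch=amd64 \
        --extra-package=../kernel/linux-headers-cl_4.1_amd64.deb)

    # ...and so on down the chain, adding each new .deb to the next build.

    # Only once the whole chain has built do we upload, so the repository
    # never serves a half-updated mix of old and new packages.
    for changes in */*.changes; do dput cl-dev "$changes"; done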
If you're free, I'd like to touch base with anybody that's dealt with any of these issues. I'll be at the conference all week, and being able to share similar points of view and run into different ways we've solved problems is one of the things I love about these conferences, so if anybody wants to look me up, please do. Any other questions? All right, I think we're good. Thank you very much.