This is a talk called The Reproducible Build Zoo. My name is Vagrant Cascadian, and I'll be telling you about the rest as we go. So, reproducible builds. One of the main places where a lot of this work is going on is reproducible-builds.org. And what we mean by reproducible is: packages with the same source code, built with the same toolchain, should come out identical, bit by bit, so that you can simply run a checksum on the result and it'll come out exactly the same.

All right. So, source code: hopefully you're all familiar with it. Toolchain: things like GCC, or possibly even what shell you're running. Though, well, actually, the shell shouldn't matter. And identical: we're talking specifically about bit-by-bit identical, not "it behaves the same" or anything along those lines.

Source code is readable and writable by people. Computers don't run source code directly; they run binary code. So how can we tell that the binary code the computer is running was produced from the source code? Because all of the claims about free and open source software rely on the source code you're auditing, editing, and fixing actually being what produced the binary.

So we can use checksums. I wrote a simple Python program that outputs two, and ran a checksum on it. Then we did a major refactor of this code here, and it still produces the same checksum. I'm really glad this presentation format was big enough to use sha256sum; in my last talk I had to use md5sum because it wouldn't fit on the screen. So you can take the result of your build process and make checksums of it. The source code, plus the build environment, plus the build instructions should result in bit-by-bit identical copies. And the important part is that anybody can verify the result. This is the key difference between reproducible builds and some other projects that have been working on similar things: here, a random person from the audience could recompile a package that I built, and when it's properly reproducible, they should get the same result. There's a more elaborate definition up there on the slide.

So there are some security implications to the fact that when you normally build a piece of software, it doesn't come out identical. A while back there was a remote exploit in OpenSSH that came down to a single-byte difference in the resulting binary. Tiny changes can make huge differences in the results. And more recently, and more specifically, one of the things reproducible builds attempts to address is compromises in the toolchain, whether malicious or unintentional. In 2015, XcodeGhost was discovered in the wild: a compromised Apple developer toolchain, and it compromised thousands of packages actually shipping in the wild. The current iteration of the reproducible builds work largely happened in 2013 and 2014, but at the time it was this theoretical thing. Well, along comes 2015 with XcodeGhost, and it demonstrates that this is in fact a real-world problem. We need to be able to identify compromised toolchains, not just audit source code.

How did I get involved in all this mess? I'm a Debian developer. I maintain a number of packages in Debian, including the U-Boot bootloader. Anybody familiar with U-Boot in the audience? Yeah, yeah, figured. So U-Boot was marked as reproducible, and I knew this just wasn't possible, because I ran tests on U-Boot boards, and every time I booted them, the build time showed up in the output of the binary.
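Just to make that concrete, here's a toy version of the problem, not the actual U-Boot code, just the pattern:

```sh
# Embed the wall-clock build time into an artifact...
printf 'U-Boot built on %s\n' "$(date)" > version-banner
sha256sum version-banner

# ...wait a moment, "build" it again, and the checksum changes,
# even though nothing about the source changed.
printf 'U-Boot built on %s\n' "$(date)" > version-banner
sha256sum version-banner
```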
So I knew that just wasn't possible; it couldn't be reproducible. The reason was that we were only testing the amd64 architecture. So it built the host tools, and those, great, actually were reproducible. But the actual binaries you run on an ARM board or something like that, those still had unreproducible things in them. So I started getting involved. And I figured I could build this infrastructure to test reproducibility myself, or I could offer it up to the community and test other packages as well.

So, just like in U-Boot, where it was embedding the timestamp in the binary, one of the main things you can do to improve the reproducibility of your code is remove embedded build-date timestamps. Now, timestamps typically aren't the most meaningful identifier of a particular build, so you can often just remove them. Or, if you really have to, say your project's been around for ages, it's always had build timestamps, and you've got somebody in your community who says no, we absolutely must keep the timestamps, you can use SOURCE_DATE_EPOCH. SOURCE_DATE_EPOCH is basically a variable set in the build environment containing a number of seconds since the Unix epoch. So it's a huge, long number of seconds; I think it's approaching one and a half billion seconds, if I'm not mistaken. It's a specification your build can respect: in U-Boot, we modified the code to honor this environment variable and inject the timestamp it specifies. That way you can use something meaningful, like the timestamp of a particular commit, or a timestamp from your changelog, anything along those lines. So if you really have to keep timestamps, use SOURCE_DATE_EPOCH. But it's even better if you strip the timestamps from your code entirely.

Another common problem is locales. I don't know how many of you feel like you natively speak C, as in the Unix locale. A linguist would probably have a hard time identifying the difference between the locale C and en_US.UTF-8, English as spoken in the United States rendered in the UTF-8 character encoding. But if you'll notice, the results of a sort are locale-dependent: with one you get capital A, capital B, then lowercase a, lowercase b, and with the other you get lowercase a, uppercase A, and so on. And there are locales that do even stranger things than this. So it's really important to take the locale of your builds into account.

File sort order is really important too. The filesystem you're building on can actually return files in a different order; I don't believe there are typically any guarantees about what order files come out in when you do a readdir. So typically you just have to add some sort of sorting step to make sure your inputs are sanitized.

And this is probably one of the hardest ones: the build path. Say you build in your home directory, or in some temporary directory. That shouldn't, in most cases, get embedded into the binary result, but it often does, especially when you get into things like debugging symbols. This is one of the main last-mile things we need to figure out for reproducible builds. There's been ongoing work in GCC and other major toolchains, some patches were already accepted, and we're developing a specification that'll be similar to SOURCE_DATE_EPOCH. Pulling those first few fixes together, a build script ends up looking something like the sketch below.
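Roughly like this; it's just the pattern, not any particular project's actual build script:

```sh
#!/bin/sh
# Timestamps: if you must embed one, honor SOURCE_DATE_EPOCH
# rather than the wall clock (falling back to "now" if unset).
BUILD_DATE=$(date -u -d "@${SOURCE_DATE_EPOCH:-$(date +%s)}" '+%Y-%m-%d')

# Locales: pin the locale so sort order, collation, and messages
# don't depend on whoever happens to be building.
export LC_ALL=C

# Filesystem order: never trust readdir order; sort explicitly
# before feeding file lists into the build.
find src/ -name '*.c' | sort > files.list
```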
So, on to the hardware, which is surely why you're all here. Your typical build farm might look something like this. In fact, the amd64 builders doing this same work probably sit in a farm much like this: rows upon rows of nearly identical hardware. And so I thought, well, we're trying to vary things like locale and the time of build and so on. Why don't we actually vary things at the hardware level too? There are lots of random ARM boards all over the place. I had a handful of them sitting at home doing nothing, saved for U-Boot testing now and again, but most of the time they just sat idle. They weren't very powerful or anything, but they would let us get beyond this homogeneity of the build environment.

So in late 2015, we enabled a few build machines, some dual-core and some quad-core. Nothing impressive, but it got the project started, and it also got our test environment working with remote execution environments. It went live in September, building around 200 source packages per day. But at that rate it would take over 100 days to build all of the Debian archive, and there would be packages uploaded in the meantime which might never get built; it was just way too long.

So I thought about this a while and decided, why don't I add some new boards? The first four boards I added, in the first iteration: one was a Banana Pi, fairly low specs, one gig of RAM, dual core, had SATA. And another one with fairly similar specs, dual core, one gig of RAM (this is kind of a pattern), mSATA instead of SATA, but basically the same thing. A Wandboard Quad: this was our first quad-core machine, with two gigs of RAM, and we could build a lot more with it; I think it gets roughly double what the other boards get per day on average. And here's another one, the Cubox-i4Pro, very similar specs to the Wandboard, but for whatever reason it doesn't seem to build as fast. I don't know.

Then I waited for a long time. The boards I could find that had SATA were a very limited pool. So I decided, well, let's try a board with USB 3; maybe that'll be fast enough to handle some of these builds. So we added the Odroid XU4. It's actually an octa-core with two gigs of RAM. When we first started building on it, it required a custom kernel build, but we eventually got those changes applied to the Debian kernel packages; I think most of the features have been mainlined at this point. It turned out to work really well, and this has been one of our top builders ever since. We ended up getting a couple more later. But they're kind of stuck at Linux 4.7 because of some USB issues I haven't yet tracked down, and they're running with USB root filesystems, so if the USB has issues, that's not going to work so well.

The other thing I wasn't super happy about was that it required a firmware blob. But then, when I was writing this talk, I thought: well, actually, that's great. Normally I think firmware blobs are something we should avoid, but here we want to provide an environment where you don't have to trust the build hardware, because we're dealing with reproducible builds. You can build on one trusted machine and dozens of untrusted machines, and that can still help verify the process. If the one with the firmware blob consistently produces a different result, then you know something suspicious is going on. Otherwise, it helps verify that the firmware blob is actually not doing anything malicious.
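To sketch what that cross-checking looks like in practice (the host and file names here are made up for illustration):

```sh
# Ask a mix of trusted and untrusted builders for the checksum of
# the same artifact, and see whether anyone disagrees.
for host in trusted-amd64 odroid-xu4 wandboard rpi2; do
    ssh "$host" sha256sum builds/hello_1.0-1_armhf.deb
done | awk '{ print $1 }' | sort | uniq -c
# One line of output means everyone agrees; a machine that
# consistently stands out is the machine to go investigate.
```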
The other thing I didn't like about this one was that it was the first one that actually used a fan for cooling; all the others had passive cooling, so it was a little bit loud, a little whining noise in the background.

So once I decided, hey, that USB 3 system is working fine, I've got this other USB 2 system just sitting around doing nothing, I'll give it a try. It's slow, it's only a dual core, but it keeps up with the other dual cores with similar specs, and that kind of opened the floodgates for the options. Up until now, I'd mostly used systems I already had on hand, or the Odroid I picked up just because I thought, yeah, let's try it, it looks like a nice little board. But once I realized USB actually is not the limiting factor, it kind of went wild from there. So I had another Raspberry Pi 2 sitting around and added that to the network. Similar specs, quad core, getting slightly better build results. Also a firmware blob, but wow, firmware blobs can actually be good in this environment, strangely enough. I can't believe I'm saying this, but here we are.

So then we added some Firefly boards, and these came from a grant: the Debian Project Leader approved a grant request to basically triple the capacity of our build network. 200 builds a day, even when we got it up to 300 or 400 builds a day, just wasn't going to cut it. So the grant funded a number of boards, and these were some of the earlier ones that got deployed. These have a quad core and two gigs of RAM. And a really interesting thing with these: for a long time I'd been running them at the default CPU speed, and then the Linux kernel got support for CPU frequency scaling, and all of a sudden they were performing better. This is one of those great things where I didn't have to do anything; I just kept pulling in the new versions and suddenly these boards started performing much better. That was fairly recently.

We added some Orange Pi Plus 2 boards. They're a little slower. Technically it looks like they have SATA, but it's actually bridged onto a USB bus in a pretty suboptimal way, so their build averages aren't coming out quite as great. But they're still pretty good, and they were cheap, so they do a good job there. They also don't yet have ethernet support in mainline, so I'm using a USB ethernet adapter, but that's not a huge deal.

Then we got a couple more boards, the first we had with significantly more than two gigs of RAM: 3.8 gigs. The hardware actually has four gigs, but the processor isn't able to address the full space. These are very similar to the Cubox-i4Pro but with significantly more RAM, and they're putting up some of our higher build numbers. Currently, to access the full RAM using mainline U-Boot, we actually had to patch it a bit, so we need to work a little more on getting patches in to auto-detect the full amount of RAM.

Then the BeagleBoard-X15, which apparently is hard to come by, but thankfully I got one donated. Is there a question in the back there? Yes, yes, we're building the Debian packages from the archive. So, right. The question is: how do you deal with different packages, some of which are larger and presumably take longer to build, versus smaller packages? We average it over time. The build numbers are an average number of builds per day, and in the case of really big packages, some of these boards are just too slow to ever finish them. So we actually have a timeout to make sure the really huge packages don't hold everything up, and we've also excluded some packages from the builds; the wrapper looks something like the sketch below.
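Something in this shape, though the actual durations and logging in our scripts may differ; this is just illustrative:

```sh
# Give each package build a hard upper bound, then kill it, so one
# huge package can't monopolize a slow board. Values are examples.
if ! timeout --kill-after=5m 18h sbuild --arch=armhf "$package"; then
    echo "$package: build failed or timed out" >> ~/build-timeouts.log
fi
```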
Does that answer your question? Great. So, the BeagleBoard-X15: hard to come by, but thankfully BeagleBoard.org donated one, and I worked on getting support enabled in Debian. This one, despite being only a dual core, has really proven to be one of the faster builders. It's got Cortex-A15 cores, so that probably has a lot to do with it, but even considering it only has two cores, it's faster than a lot of our quad-core boards. So that one's a bit of a mystery, but a pleasant mystery all the same.

Then our first board that actually had four usable gigs of RAM: the Firefly has a four-gigabyte variant. We had a patched U-Boot for a fairly long time, and then recently somebody got patches into mainline U-Boot that support detecting the full space of RAM. This is great; I love it when I can just pull in other people's work and profit. So rather than building a custom version for this one whenever I felt like it, I'm just using the standard Debian U-Boot on it.

And the Odroid U3 is another mystery. It's only got an Exynos 4412 processor, and somehow, for whatever reason, it's consistently been the best performer out of all of our builders. It's got USB 2.0, two gigs of RAM, and a quad core, but for some reason this is our fastest board. I can't explain it. This is a mystery I'd like to explore further. It also has a firmware blob: bummer, and yet not so bad in this use case.

The Cubietruck I got a little later in the process, and it's actually been unimpressive; I thought it would do a little better. It has basically the same specs as the Banana Pi but with twice the RAM, and it pretty consistently performs about the same. But it's one more for the pool.

And this is a fairly recent addition, the Jetson TK1. It quickly jumped up to, I think, typically around the second-fastest builder we have. It's got onboard SATA. Installing the firmware requires this proprietary tool shipped by NVIDIA, which makes it hard to do firmware updates, because I actually have to hook it up to a separate machine rather than upgrading it in place. I would love to be shown some simpler way of doing it, but as far as I know that's the only way we can update U-Boot on these. And I've struggled a little with the onboard ethernet, so that one's also using USB ethernet. Question? Okay, so somebody from the audience mentioned that if you just force it to 100 megabit, the onboard ethernet behaves stably. I'm glad I gave this talk. This board was donated by NVIDIA, and literally about an hour before this talk, NVIDIA offered to ship me two more boards, of the TX1, which is a 64-bit processor board. Which segues right into the next board.

The Pine64+: we've got two of them. Their build numbers aren't great, but they just got installed in the last week, so the average varies hugely day by day; if we look at the stats from today, they'll be totally different. I've been following a lot of the mainline development on this, and just recently a kernel from linux-next could work with it. It still needs a USB ethernet adapter, but this is our first board actually running a 64-bit kernel, and I've configured it with a 32-bit userspace. This is really good for testing, because some packages will occasionally embed the running kernel version or the running kernel type, or, the actual worst case scenario, they'll look at what kernel is running and then optimize for the running kernel rather than the running userland. That's not a good thing, and this setup helps detect those sorts of problems. We've done a lot of that by running some i386 builders as well, so you can detect similar kinds of issues there.
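The failure mode that catches looks roughly like this; the snippet is made up, but the pattern shows up in real build scripts:

```sh
# On the Pine64+: 64-bit kernel, 32-bit armhf userspace.

# Wrong: asking the running kernel, which reports the 64-bit CPU.
ARCH=$(uname -m)                    # prints "aarch64"

# Right: asking the userland/packaging system instead.
ARCH=$(dpkg --print-architecture)   # prints "armhf"

# A build keying off uname targets the wrong architecture, or
# "optimizes" for a CPU the resulting package shouldn't assume.
```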
All right. So, the troublesome boards. These were boards I had so much hope for. The Cubieboard 4: I think it's an octa-core, one of those big.LITTLE boards, but it just has never gotten mainline support yet. It's a work in progress; every once in a while a new patch trickles in. Similar story for the Cubietruck Plus: great-looking board, two gigs of RAM, eight A7 cores, but it's kind of languishing. Fairly soon we might see something, though.

The Odroid C1+: I tried running a vendor kernel on it, and it was just completely unstable. I couldn't use it in a build farm if it was crashing every time you tried to access the disk. So that one kind of failed. But the Odroid C2 has recently had some great improvements in mainline. I've gotten it working with linux-next, but the bootloader only boots over ethernet, which makes it kind of hard to maintain the operating system image. Given how good the support is in mainline, though, I'm thinking that might come online soon.

And the LeMaker HiKey. The HiKey boards have been underwhelming to most of the wider internet, so I'll put it up there. We got one donated by LeMaker, and I haven't been able to get it to do anything useful; it just didn't have sufficient mainline kernel support, and it didn't have mainline U-Boot, or possibly it did, or it also uses EFI boot. Okay, yeah. It's been some months since I worked on it, but apparently there is now mainline kernel support, TianoCore support, and U-Boot support. So, sounds like it's worth reviewing again. And with some of these troublesome boards, every once in a while I get a wild hair and go, well, let's check: things have changed, commits have happened, I'll give it a review again.

In upstream Linux, one problem I've found on a lot of these boards is that, because Debian uses a modular kernel, frequently when people are developing for a board they'll compile in support for all of the appropriate drivers. But oftentimes there are bugs in drivers that work fine when built in but don't work as modules, or they need a huge tree of other options enabled in just the right order to work. So that's been a bit of a challenge, but at the moment all of the boards we're running use a kernel produced by Debian, which is for the most part a mainline kernel at some point, except for the ones that are stuck on an old version. So we're basically running mainline on just about everything. The Pine64+ boards are just running linux-next.

Part of what actually got me a handful of these boards was that I offered to help add distro boot command support for a few of them, like the Wandboard and the Cubox-i boards and so on. For Debian, it's a huge, huge advantage to have that support, where boards boot in a consistent manner no matter which platform. Historically, every single board in U-Boot has had its own entirely individual boot logic, and that's really hard for a distribution like Debian to support. We have a handful of patches in Debian, including some distro boot command support patches that I need to get mainlined, and if anybody would like to help with that, I would love it. And there are also some ancient patches in there from long before I inherited the U-Boot package that I have no idea are still relevant.

So, when I'm bootstrapping one of these boards, I'll typically just use debootstrap, or qemu-debootstrap, which builds a small chroot. qemu-debootstrap is useful for building a chroot for a foreign architecture, although I have enough ARM boards sitting around that often I'll just do a native debootstrap. Then I'll install and configure a kernel and a user, give them the appropriate sudo rights, that sort of thing, and then do some additional package installation with Ansible just to make sure everything's the same. The bootstrap step itself is roughly the sketch below.
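The suite, paths, and mirror here are just examples:

```sh
# Native bootstrap, run on one of the ARM boards themselves:
sudo debootstrap sid /srv/armhf-chroot http://deb.debian.org/debian

# Or from a host of a different architecture, via qemu user emulation:
sudo qemu-debootstrap --arch=armhf sid /srv/armhf-chroot \
    http://deb.debian.org/debian
```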
Then I hand it off to Holger, another person on the reproducible builds team, and he goes through and adds all of our Jenkins test framework setup to the machines, and that's when they really get rolling. So we use Jenkins to manage our builds, at tests.reproducible-builds.org. It basically just executes some shell scripts that are installed on all the nodes. It'll run one job with a profile, say, this username with UID 1111, and then there'll be a second job, once the first one finishes, that runs with UID 2222, a different username, a different build path, and so on and so forth. So we run two builds for each package and see whether all of these variations we introduce produce a different result.

Once each build is done, it copies the result to the build server, and then it does the comparison using a tool called diffoscope; I should have mentioned this in my slides. diffoscope is this great tool. It's sometimes been called "diff on steroids": you give it two binary objects, and it unpacks them to the best of its ability, using various methods to identify what type of file each is. So say you're comparing two tar files: it'll unpack the tar files and diff the results of those, and if it finds, oh, there's a PDF file inside the tar file, it'll extract the PDF and diff that too. It results in really useful comparisons between binary objects. It just keeps going down the chain until it finds something it can't figure out, and then it gives you a straight binary diff. There are some amazing folks doing rapid development on that. There's also a web interface if you want to just try it, at try.diffoscope.org: you can upload two files and it'll show you the differences between them.
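Boiled down, a single package's test cycle looks something like this; the usernames, paths, and specific variations are illustrative, not the exact Jenkins configuration:

```sh
# First build: one user, locale, timezone, and build path...
sudo -u pbuilder1 env LC_ALL=C TZ=UTC \
    sbuild --arch=armhf foo_1.0-1.dsc

# ...second build: vary all of them and build again.
sudo -u pbuilder2 env LC_ALL=fr_CH.UTF-8 TZ=Asia/Shanghai \
    sbuild --arch=armhf foo_1.0-1.dsc

# Compare the results; diffoscope recursively unpacks whatever it
# can identify and shows exactly where the two builds diverged.
diffoscope --html report.html \
    first/foo_1.0-1_armhf.deb second/foo_1.0-1_armhf.deb
```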
So, here we are today. Thankfully it took me a while to put these slides together, so I have the most up-to-date information. Well, almost. We're running 98 cores with 46.8 gigabytes of RAM (I think I did my math wrong), and it's all under 225 watts. We're doing well over 1,700 builds a day; in actuality it's closer to 1,800 builds a day on average. And you can see here: this is the early days, when we had the small build farm, and then we added a few more boards just out of my craziness, and that got us to a certain point. Then we got the big grant from Debian and bought a bunch of boards, and a bunch of servers were offline for a while for other random reasons. So even though we had a bunch of boards ready to be installed, it took a bit for the curve to really skyrocket. But it really worked: you can see that over time we've averaged many, many, many more builds. There are some improvements I'd like to make to the build system that have consistently been backburnered, like running builds in parallel: if we know the build profiles, we can run the builds at the same time on both machines to minimize idle time. And yeah, that's about it. Any questions at this point? Go for it.

Right, right. Let me repeat the question: he was asking whether linking order potentially affects the reproducibility of builds, and I believe the answer is yes, the link order does matter. I don't know offhand what methods we've used to sanitize that, but let's see here, why don't I show you some build results? If you'll see here, these are the build results for various architectures. It's too small to read, but let's see if I can read it myself. Okay, so this is the i386 architecture. Early on, that was about how much built reproducibly. We hadn't built the entire archive yet, but once we get to the top there, we'd basically built the entire archive; there may be some packages at various points that aren't yet built. The green part is the part that is reproducible with our current toolchain. For the most part, I don't know that there are many, if any, patches left that aren't yet in the mainline upstream toolchains. Still can't read it; how about we go here? There we go. So the majority of Debian at this point is reproducible, and this is for unstable.

For a while we weren't varying the build path, because it was too hard, but at one point we said: okay, we've gotten this far, let's make it harder, let's step it up. That's a pretty common thing for projects doing reproducibility work: first you test with a few variations and make some progress, and once that's good enough, you make it harder on yourselves. So right around then we started introducing build path variations, which I think reduced the number of packages we treated as reproducible by about 14%. At the same time, we're also building Debian testing, and we're not doing build path variation there, so we have a pretty good idea of which packages are affected purely because of build path variations. I know that doesn't directly answer your question, but somehow we've addressed the linking order issue, if that many of 25,000 packages build reproducibly. I don't know the exact answer, but it's somehow addressed; apologies.

Any other questions, or ideas, or things you would like to see on the screen? Yeah, we're working on some of this. One of the main things we're doing is we've created these files called buildinfo files. We'll just pick a recent one; hopefully it's not a terrible example. So the buildinfo file will contain (that's not big enough for you, is it) the source package, the version that was built, the architectures it was built for, where it was built, and so on. It includes the build path, because we're kind of in this middle ground with build paths: we have some patches going upstream, but they're not ready yet. So it tells you where we were building it, and then we get checksums of the objects produced.
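Just to show the shape of what you can do with one of these (the file name is hypothetical, and real tooling is more careful about parsing):

```sh
# Pull the Checksums-Sha256 stanza out of a buildinfo file and
# verify the artifacts sitting on disk against it.
sed -n '/^Checksums-Sha256:/,/^[^ ]/p' foo_1.0-1_armhf.buildinfo \
    | awk 'NF == 3 { print $1 "  " $3 }' \
    | sha256sum -c -
```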
It'll also list the environment, because sometimes environment variables get embedded into the binaries. So it lists what environment variables were set; it lists SOURCE_DATE_EPOCH, and yeah, we're getting close to one and a half billion seconds. Well, let's see if we can look at an actual buildinfo file itself. Great. So here we have all of the installed build-depends: it lists all of the package versions that were installed. Ideally we want to get to the point where we actually have checksums for those, because a version number is trivially forgeable, but that gives you the idea of the direction we're going with this stuff. Does that kind of answer your question? So we're working on some tools that will basically take a buildinfo file, reproduce the build environment with all of those build dependencies installed, and then run the build, and then run it again. I think the one main tool we're working on there is called reprotest, and that'll basically do some of that.

Yeah, any other questions? So in the last few months we started building arm64 as an architecture. We got a number of boards donated by Codethink, and they're some of the fastest systems we've got. They've given us a number of Moonshot cartridges, if any of you are familiar with those, and they pretty much jumped straight in, rebuilt the entire archive in just several days, and they've been keeping pace. I'm not sure we're even using all of them to do the builds. Our full-archive rebuilds elsewhere take a month, give or take, and on these we can rebuild the entire archive in a few days. That's 25,000 or 24,000 packages, depending on whether you're doing unstable or testing.

Did you have a question? The question is: are you using any of these practices in other areas, such as continuous integration? I don't specifically know about continuous integration; I know other projects are incorporating it. I think the Tor Browser was one of the first projects to do this, and Bitcoin; obviously you can see why reproducibility would be important there. We've recently, in various capacities, had Fedora getting involved and more engaged, and we've had OpenWrt, so there are a number of other projects involved, but I'm not sure about continuous integration exactly. In a sense, this whole project is a continuous integration project.

Yeah, it's still in its infancy. We're edging towards the point where we're starting to think about how the end user might consume some of this data. You might get to the point where a user could define a policy like: I only want to install packages that have been verified by at least three different builders. Things like that; we're moving in that direction. So ideally, maybe you have, say, eff.org be a certification service for builds of packages, and then Debian builds a package, and the maintainer themselves uploads a build, and then you have some other random user. Anybody can submit buildinfo files to buildinfo.debian.net, and they have to be signed, I believe, using GPG. Does that kind of answer some of your questions? Yeah, so we're looking at ways to actually use this. We've been in the proof-of-concept phase for a few years now. We just recently got all or most of the changes necessary into dpkg, which is the main packaging tool in Debian. And I think they're making some progress in that direction with RPM; both the openSUSE and Fedora folks are working on that.

But yeah, mostly we've been working on fixing these things in the toolchain and pushing things upstream. We got some patches into GCC that allowed us to specify, what is it, the debug symbol paths and things like that. But then a lot of binaries end up compiling the command line into the binary, so we made huge progress forward but were still losing. That's why we're working on the build path prefix map specification.
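You can see the build path problem with plain GCC; this is a toy demo using -fdebug-prefix-map, the existing flag that this kind of specification generalizes:

```sh
# Same source, two build directories: with -g, the compile directory
# lands in the DWARF debug info, so the binaries differ.
mkdir -p /tmp/build-a /tmp/build-b
cp hello.c /tmp/build-a/ && cp hello.c /tmp/build-b/
(cd /tmp/build-a && gcc -g -o hello hello.c)
(cd /tmp/build-b && gcc -g -o hello hello.c)
sha256sum /tmp/build-a/hello /tmp/build-b/hello   # checksums differ

# Map the build path away, and the two builds should match again,
# at least with the toolchains I've tried.
(cd /tmp/build-a && gcc -g -fdebug-prefix-map="$PWD"=. -o hello hello.c)
(cd /tmp/build-b && gcc -g -fdebug-prefix-map="$PWD"=. -o hello hello.c)
sha256sum /tmp/build-a/hello /tmp/build-b/hello   # checksums match
```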
I'm running out of ideas, unless you have anything more to say. Anyone? Come on. All right. Well, thank you all. It's been a good time, and I hope to see you all integrating some ideas about reproducibility into your processes.

Oh, before we retire this, for real. No, not that. First, a word from our sponsors. A lot of this work over the past several years has been funded by the Core Infrastructure Initiative. They've really given us a lot of freedom to work on things as we see fit, and to do what we need to do to make this a reality. We want to see a more secure world, where things like XcodeGhost get caught and fixed as a matter of best practices; we want reproducibility itself to be a matter of best practice. They see that, and they've helped fund my work and several other developers' work: me only recently, several other people over the course of several years.

Several organizations have donated boards: LeMaker, TechNexion, SolidRun, Debian, BeagleBoard.org, and NVIDIA. And the reproducible builds folks are a great team. If you go to reproducible-builds.org, there are a number of ways you can get involved: mailing lists, IRC channels, and so on. We would love to welcome you into our communities and engage with you. So please join us; we'd love to have you build things reproducibly. All right, have a good one. Thank you.