All right, welcome everyone. Today I will talk about reproducible builds and sstate over a content delivery network. Essentially this is an update on previous talks about reproducible builds within Automotive Grade Linux (AGL), and on the next steps going on right now in upstream Yocto and within AGL. I'm Jan-Simon Möller. I'm the release manager, I also do the continuous integration and automated testing within AGL, and I sit on the Yocto Project board for AGL.

Today I will give a quick overview of reproducible builds to explain what they are. I will show how this is done within the Yocto Project and AGL, show you how you can use the sstate cache, and explain how one of the newer features, the hash equivalence server, works. Then I'll cover what is being done to improve speed by serving the sstate cache through a content delivery network, and end with an outlook on how everything fits together.

So, reproducible builds: what does that mean? It means that from the same sources and the same configuration we get the exact same binary: today, tomorrow, on my PC, on your PC. That's the goal. In other words, the build system needs to be deterministic and produce deterministic output.

Why is that not always the case? Here is one example of why two builds might not be the same: if the binary has timestamps in it, those will change between today and tomorrow, the binary will contain a different date, and so it won't be identical. Build IDs are another common example that makes the output different every time you build. There can also be folder names from the build environment, or commit IDs. Think about it: if you just change the documentation, the actual application binary should be exactly the same, but if we embed the commit ID, it isn't. Of course that helps during development, but later on, during distribution, it hurts. So better not to do that; use your release version and encode that instead.

For compiled languages and their toolchains, meaning GCC and Clang, there are solutions to deal with the build ID, the debug symbol paths and so on. This can already be handled with compiler arguments, so that part is under control. For Go and Rust, this is still in progress and not fully solved yet. Some of those toolchains emit a unique ID for each binary, which is a problem, and some encode paths or custom folders per binary, things like that. So they are not fully baked yet, at least with regard to reproducible builds. Remember, we want exactly the same binary output.

There is a project, reproducible-builds.org, which is the home for all efforts around this. The Yocto Project is not alone there; most distributions take part, and it's a collaborative effort. The Yocto Project, and especially OpenEmbedded-Core, the foundation of the Yocto Project, is fully reproducible: out of roughly 3,000 packages, I think one or two still need to be fixed. What is true for OE-Core, the core layer, is not yet true for additional layers. In AGL we are working on our layers: meta-agl is fully reproducible, while meta-agl-devel has more dependencies, and those dependencies still need work.

Since Yocto Project 4.0, reproducible builds are turned on by default, so we do not have to enable them ourselves. On top of that, we set timestamps: SOURCE_DATE_EPOCH for the sources and a fixed timestamp for the root filesystem.
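A quick aside to make the compiler side concrete: what "handled with compiler arguments" means in practice is path remapping. GCC and Clang accept flags of the -ffile-prefix-map=OLD=NEW family, which rewrite build-tree paths in debug info and __FILE__ strings to a canonical location. The line below is only an illustration of the idea, not something you need to add yourself; the Yocto machinery already passes equivalent flags through its DEBUG_PREFIX_MAP handling (${WORKDIR} is BitBake's per-recipe work directory):

    # Illustration only: remap the per-build work directory to a fixed path
    # so no local folder names leak into the binaries.
    TARGET_CFLAGS:append = " -ffile-prefix-map=${WORKDIR}=/usr/src/debug"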
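On the timestamp side, SOURCE_DATE_EPOCH is normally derived automatically per recipe from the source revision, so you rarely set it by hand. For the root filesystem there is a dedicated knob; a minimal, hedged local.conf sketch that pins it to a fixed date rather than "now" could look like this:

    # Clamp all file timestamps in the generated root filesystem to a fixed
    # point in time (the value is a Unix epoch; this one is 2023-01-01 UTC).
    REPRODUCIBLE_TIMESTAMP_ROOTFS = "1672531200"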
Now, why is this beneficial? Well, you can rebuild your release image at a later point in time. Also, in previous talks you have heard about things like SPDX, and SPDX tracks the binary. If your binary is different every time, you basically have to produce a new SPDX document for each binary; if you produce the same binary, you already have your SPDX information. Reproducible builds also increase build speed, obviously: if we get the same binaries, we can take them out of the cache. And fully reproducible builds improve support for offline builds, as they allow us to pull down all the resources up front and then build offline in a reliable way.

Here is a quick slide about the build workflow of BitBake. Essentially, we need a few host tools, and we have our sources. From the sources and the metadata we first produce the native tools; with those, the sources and the metadata we produce the target packages, and from there we produce the target image. You can see we have a set of hashes that identify those steps, and in Yocto we can track these forward but also backward.

To make use of this caching, there are two things you can do. First, regarding the sources: you can turn on BB_GENERATE_MIRROR_TARBALLS during your build, copy your download folder to a web server, and put that location into PREMIRRORS; that will speed up everything regarding fetching sources. Second, the sstate directory is a binary cache: not only compiled files, but a cache of many kinds of binary build artifacts that can be reused. You can put this sstate cache on a web server as well and point SSTATE_MIRRORS at it. That's exactly what we do in AGL: we have our source mirror and our sstate mirror, and reproducible builds help improve your build times because those binary artifacts already exist.

There is a newer component, called the hash equivalence server. We track changes in the recipes, and essentially any change in a recipe will force a rebuild. That is what we want: we build everything from source, and a change triggers a rebuild of every depending artifact as well. Now, let's assume we just fixed a typo somewhere, not in the code but, say, in the documentation, so nothing changes in the binary. This should not change the actual application binary, or the library. Still, up to now we would force a complete rebuild of everything that depends on our package. But we can detect that the hash of the binary artifact is already known. Remember, we do a lot of hashing, so we can notice: wait a second, we actually produced this binary before. So we can record that the new hash is equivalent to the earlier one.

Here is an example. Let's say we rebuild libY. Usually this would trigger a rebuild of the big app that depends on it. With hash equivalence, and a change that is just a documentation change which doesn't end up in the binary, we can now detect the equivalence and say: wait a minute, the new hash of libY is equivalent to the previous one, so we don't need to rebuild. We can short-circuit, save a lot of build time, and fetch the big app from the cache.

So that's the new development, which is now in upstream Yocto. We have started to use it in AGL as well: we have a hash equivalence server running and we are using it actively now. To summarize, reproducible builds are good for build performance, they are good for maintainability, and they go very well together with a local cache for BitBake.

Here is another piece of the puzzle. With the hash equivalence server we can say: wait a minute, we built this already, go look at it. That means we will send more queries to our sstate server. What's happening upstream is that the Yocto Project sstate cache is now available through a content delivery network. That is restricted to files smaller than 64 megabytes, but the good news is that this already covers 99-point-something percent of the artifact cache. So essentially all Yocto sstate files are available through a content delivery network, and the effect is that downloads will be faster if you use this as your sstate mirror.
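To make the mirror setup from a few minutes ago concrete, here is a minimal, hedged local.conf sketch; mirror.example.com is a placeholder for wherever you publish your download folder and sstate directory:

    # Produce tarballs of fetched sources so they can be published as a mirror.
    BB_GENERATE_MIRROR_TARBALLS = "1"
    # Check the source mirror before the upstream location (add patterns for
    # other protocols such as https:// and ftp:// as needed).
    PREMIRRORS:prepend = "git://.*/.* https://mirror.example.com/sources/ "
    # Fetch prebuilt sstate artifacts from the shared cache when hashes match.
    SSTATE_MIRRORS = "file://.* https://mirror.example.com/sstate/PATH;downloadfilename=PATH"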
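The hash equivalence side is driven by two BitBake variables; in recent Poky these are already the defaults, so take this as a sketch of the mechanism rather than required configuration (hashserv.example.com is a placeholder for a shared server such as the one we run in AGL):

    # Use the signature handler that understands hash equivalence.
    BB_SIGNATURE_HANDLER = "OEEquivHash"
    # "auto" starts a local hash equivalence server for this build...
    BB_HASHSERVE = "auto"
    # ...which can optionally be federated with a shared upstream server.
    BB_HASHSERVE_UPSTREAM = "hashserv.example.com:8686"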
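And to consume the CDN-backed upstream cache, the settings published with the announcement (see the references at the end) looked roughly like the following at the time of this talk; treat the host names as subject to change and check the announcement for the current, authoritative values:

    # Ask the Yocto Project's public hash equivalence server for matches...
    BB_HASHSERVE_UPSTREAM = "hashserv.yocto.io:8687"
    # ...and pull matching sstate artifacts from the CDN.
    SSTATE_MIRRORS ?= "file://.* http://cdn.jsdelivr.net/yocto/sstate/all/PATH;downloadfilename=PATH"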
Now, one step after the other. We have scaled out the artifacts; the crucial piece is then the hash equivalence server, so as a next step we need to scale that too. In upstream Yocto, hashserv runs on a single host and port. That is a single server instance, so it is basically a single point of failure, and it is reaching its limits. Lessons were learned, and patches were merged into upstream Yocto last week and are being improved now: they extend the hash server so it can run as multiple processes behind a load balancer, with a database backend, and so on. That is the next piece of the puzzle.

Essentially, we are gearing up for creating a binary distro; that is one project being worked on in upstream Yocto now. The work is sponsored by the Sovereign Tech Fund, a German fund, and it is ongoing just now. Automotive Grade Linux will follow when this code lands in the long-term support release of Yocto. At the moment we are also setting up the same bits: a CDN for our sstate cache to speed up delivering the binary artifacts, and a hash equivalence server set up in the same way, with a read-only port and a read-write port, plus a PR server, which is needed for a binary distro. So we are following the same setup.

Here are some references: the announcement of the CDN setup, and, from last week's fall Yocto Project Summit, two talks that deal with hash equivalence and with sstate. Any questions? Thank you for joining, and have a nice rest of the conference.