All right. I don't think it's a bold claim to say that every good software development process has continuous integration and continuous deployment in some form. So our next speaker is going to tell us a bit more about that, and how to maybe make it better. And with that, please give a round of applause.

Thank you very much. So, before this talk there was the talk by WarpForge, which was very exciting despite technical difficulties. I don't have working software, actually, only ideas. If anyone wants to collaborate, you can come to me. If not, maybe it stays on my backlog of projects that I'll probably start when I go into retirement or something.

So yeah, if you read the description of this talk, my main premise is that projects do want an evergreen master, which means that basically every commit, every revision, was tested and found OK. I mean, why do we even need an evergreen master? If some commit failed and the next one is OK, why should we even care? Sure, it's shiny; it's nice to have these green check marks all over the place. But also, if you allow bad revisions to reside in your repository, you may start accumulating a lot of errors. There are a lot of projects which do have a test suite, and a lot of those tests are known to fail and will not get fixed anytime soon. Also, if you find some error or some regression which wasn't previously covered by your test suite, you usually bisect. And it's kind of annoying if half of your commits need some special treatment because they are broken. If you have an evergreen master, the chances that you can just automate your bisect and be done with it are far greater. So can we actually have an evergreen main branch in our Git repositories? With all the CI solutions I have seen up to now: not really. I'll use a bit of common Git terminology throughout this talk.
When I talk about a job, it's a concrete test that I run, which basically describes steps to check your software and returns success or failure. And a job run is an execution of this job on a specific revision. For now, a job will denote a whole CI pipeline, meaning that if a job passed, then everything passed; later on I'll go into more detail about that.

So, the classical example. We have a few commits, and the last one is OK. And we have one developer who finds a bit of unused cruft and wants to clean it up. And another developer who wants to implement a new feature and is like, whoa, there's already code for that which I can just use. If you merge them one after another, the first one goes through, the other one gives you a bad revision. So now the maintainer can either decide to revert this commit and have this one bad commit basically stuck in the main branch. Or they can reset the main branch, which isn't a very good idea; a lot of people will yell at you if you just reset the main branch of your public project. Also, if you do this and another developer already based a feature branch on this bad revision, then a merge will just reintroduce this bad commit again.

So we can instead test before updating the target branch, which means that we do the merge and actually check the merge commit. We can do this without updating the main branch, at least in Git and also in a lot of other distributed version control systems, which are, hopefully, the only ones still around. And then we choose to update the main branch to the merge commit we deem the one we want. Of course, once we move the main branch, we have to do another merge to get the new result for the other branch.
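To make that control flow concrete, here is a minimal Python sketch. It assumes nothing about a real CI system: the commit names and the check are all invented, and the "CI run" is faked so that the cleanup and the feature each pass on their own but their combination fails, exactly the scenario above.

```python
# Toy model of "test the merge commit before moving main".
# All commit names and the check itself are hypothetical.

class Commit:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)

def merge(a, b):
    return Commit(f"merge({a.name},{b.name})", [a, b])

def reachable_names(commit):
    seen, todo = set(), [commit]
    while todo:
        c = todo.pop()
        if c.name not in seen:
            seen.add(c.name)
            todo.extend(c.parents)
    return seen

# Stand-in for running the full test suite on the merge commit's tree:
# the cleanup and the feature each pass alone, but the feature uses code
# the cleanup removed, so any tree containing both fails.
def ci_passes(commit):
    return not {"cleanup", "feature"} <= reachable_names(commit)

def try_advance(main, feature):
    candidate = merge(main, feature)   # created without touching main
    return candidate if ci_passes(candidate) else main

base = Commit("base")
cleanup = Commit("cleanup", [base])
feature = Commit("feature", [base])

main = try_advance(base, cleanup)      # passes: main fast-forwards
main = try_advance(main, feature)      # would break: main stays put
print(main.name)                       # merge(base,cleanup)
```

The point of the sketch is only the gate in `try_advance`: the merge commit exists and gets tested before any public ref moves, so a failing combination simply never lands.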
And then, if there are problems, we detect them early and just choose not to update the main branch, meaning that this bad revision won't go into the main branch, won't become part of the public history of our repository, and will not haunt us for the rest of our lives if, in five years, someone finds a nasty bug which was introduced before this point. If we find a bad revision and still want to merge this other branch, we can do some change on main again. In this example it would be a revert; it could be another thing we do to make the other branch work again. And then we do another merge, check it, find it's OK, and then update the main branch. Needless to say, this is not what CI systems usually do.

We can also, to make our lives easier, introduce references for the individual jobs we run. Because if those merge commits are not yet part of main, we still may want to look at them before actually choosing them; so it's basically a piece of convenience. Assuming we have this cleanup and the feature branch, we could denote those merges job one and job two, and job three the failed merge or something, which we can then identify by this reference name and investigate what went wrong, by just fetching the ref from the repository, checking it out, and running some local tests.

Now we can go a step further and say: as the maintainer of a repository, we want to say which commit we want to choose as our next main. So we can add a new test reference, which will basically serve as a pointer to the new merge commit we want to choose. So suppose this last dot is a merge commit from some feature branch. We wait until the check is complete, main gets updated, and we can automate that. And then we go on to the next merge, let it get checked and the main branch updated.
If we go on like this, we find ourselves anxiously staring at the screens of our CI service, which isn't too uncommon for existing setups, actually. I've found myself anxiously watching merges go through, and I guess I'm not the only one. But this test ref also has another nice feature. We can just say: OK, I have these feature branches, and I will merge them in order, update this test ref, and tell our CI system to check those in order and assure us that all of them are green, without actually waiting for each individual commit to go through. And this is actually the testing window which gives this talk its name. So in this example, our CI service can basically follow the test ref until all the merges we prescribed as a maintainer are checked and found OK.

So let's look at our choices when a test actually fails. Again, we have this testing window, with a few more commits, and each of the first three is found OK. But now we have a bad commit. And we can choose yet another reference, the bad reference, to point at that one, and basically tell our maintainer: OK, this revision is bad, go look at it. Then the maintainer can come up with another path to achieve the goal, by either skipping the merge of this feature, which may not be that important, or requesting an update of this feature branch from the contributor. And after we've updated the test ref, the CI service can carry on and check the rest of this bit of history.

If we have other contributors: everything which is in main is a possible merge base for feature branches. This is nothing new; this is just how we work currently. But of course, using the test ref, or anything between main and test, as a base for a feature branch is a bad idea, because we don't know whether our base will actually make it into main. So actually nothing really changes for contributors. For them, the test ref is something they can usually ignore.
It's mainly something for the maintainer. So given these examples, we can derive some rules. We can move test, the test reference, arbitrarily, as long as it's a descendant of main. We can force-push this reference, which is pretty important for this reaction to a bad revision. Main follows test and only moves forward, so we preserve the behavior that once a commit is in main, it stays there and can serve as, for example, the base for a feature branch or a release branch or something like that. The bad reference may be placed between main and the test reference, and main is never updated to any of bad's descendants. So basically, bad is a roadblock for the progression of main, and also a handy thing for the maintainer to have when figuring out what actually went wrong. Contributors, again, just see main and use that.

We can enforce these rules with Git hooks, or we can hack Gitea or GitLab or something to enforce this for us. One thing we have to keep in mind, which I'm still unsure about, is how to actually handle this test ref versus the main ref. Because you can say: a maintainer should know what they are doing; they just touch the test ref, and pushes to main are ignored or forbidden by a hook. Or you can choose the path where you say: if you try to push to main, what actually happens is that main stays the same and the test ref is what gets pushed.

So, earlier I said that a job, for now, was a complete CI run. But there are many reasons why we want multiple kinds of jobs. So again, we have a testing window, now with three commits. And we have some check we want to run. Maybe a simple compile test, maybe a check on formatting or something; cheap stuff, basically. And we can have another test which does some expensive operations and can take maybe an hour or something. You'll find many projects which have test suites that are very large.
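Here is a hedged sketch of how these rules might be checked mechanically, say in a pre-receive hook. The commit model and function names are made up for illustration; a real hook would shell out to `git merge-base --is-ancestor` instead of walking parents in Python.

```python
# Toy validation of the three ref rules:
#   - test may move anywhere that descends from main (force-push allowed)
#   - main only moves forward, stays behind test, never crosses bad
# Commit objects stand in for real Git commits.

class Commit:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)

def is_ancestor(a, b):
    """True if commit a is an ancestor of (or equal to) commit b."""
    todo = [b]
    while todo:
        c = todo.pop()
        if c is a:
            return True
        todo.extend(c.parents)
    return False

def may_move_test(main, new_test):
    # test may be force-pushed arbitrarily, as long as it descends from main
    return is_ancestor(main, new_test)

def may_move_main(old_main, new_main, test, bad=None):
    # main only moves forward, follows test, and never crosses a bad commit
    ok = is_ancestor(old_main, new_main) and is_ancestor(new_main, test)
    if bad is not None:
        ok = ok and not is_ancestor(bad, new_main)
    return ok

c0 = Commit("c0")
c1 = Commit("c1", [c0])
c2 = Commit("c2", [c1])
print(may_move_test(c0, c2))               # True
print(may_move_main(c0, c1, c2))           # True
print(may_move_main(c0, c2, c2, bad=c1))   # False: c1 is a roadblock
```

The last call shows the roadblock behavior: even though c2 is a forward move and equals test, main may not advance past the bad mark at c1.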
Yeah. We can have this quick check denoted by its own reference, which is called good/quick, and have this reference progress linearly, commit by commit. And in this situation, let's say the previous expensive test completed after two hours or something, and we got an OK from the nightly builds; then we have another reference, good/nightly. So we have different good references, where each denotes, for a different test: this commit passed this specific test. And we update main to basically the oldest good reference, because that one has passed all the tests we have. The quick test can then just progress, and, as the name nightly might have suggested, we can have the expensive one run fewer times, so not for every commit, and then update main again.

So why would I do this? Because earlier I said that we want an evergreen master and do not want to skip tests for some commits or have bad revisions in our history. Cheap tests are cheap; we can just run them on every commit in our main branch. But there are expensive tests, and most of them actually have very little chance of hitting actual problems. For example, performance regressions don't pop up and vanish with another commit. Or some other problems found through formal verification don't usually pop up and vanish from one commit to the other. And we can go crazy: we can have very expensive tests run seldomly and still have a relatively high assurance that the range of commits between the last run and the current run is still OK. It's a bit of a gamble, but you can do a bit of math on the probabilities and say, OK, it's not so bad. And remember, earlier we mainly wanted to eliminate bisection hazards.
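As a small illustration of "main follows the oldest good ref", here is a toy calculation. It assumes a purely linear testing window, and the commit and ref names are invented:

```python
# Several good refs on a linear testing window: main advances to the
# oldest good ref, i.e. the newest commit that *every* test vouched for.

window = ["c1", "c2", "c3", "c4", "c5"]   # commits between main and test

good = {                                  # last commit each job passed
    "good/quick":   "c5",   # cheap checks keep up with every commit
    "good/nightly": "c3",   # expensive nightly run lags behind
}

def new_main(window, good):
    # On a linear history the common ancestor of all good refs is simply
    # the one that sits furthest back in the window.
    return min(good.values(), key=window.index)

print(new_main(window, good))    # c3
```

On a non-linear history `min` over window positions would not be enough; there you would compute the actual merge-base of all good refs, as the rules later in the talk spell out.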
And usually cheap tests, like compiling (if it's not something huge like LibreOffice, which takes forever) or format checks or license checks, are sufficient for eliminating those hazards. We don't usually care that much about performance issues when we find a five-year-old bug in our software. Going a bit further, we may even prolong this testing window for an extended period of time. The nightly before implied that we run those tests daily. But if we have a really expensive test, like formally verifying LibreOffice, which is not going to happen, we can do this just once a week, because the probability that in this time frame of a week we introduced that kind of bug should hopefully be very low if it compiled, at least with modern programming languages.

Or you can say: OK, I have this test which isn't even that expensive in terms of license cost or power or computational time, but there is a manual step involved. We can do this once a week or every second day or something and still be OK, because we can use the good reference for this specific manual test to denote: OK, I checked this commit, this commit was OK, go further. If you happen to do DevOps, for example, you can use different good refs to make staging or canary deployments part of your actual testing. Not in the sense of fixing something after it hit main, which may end up introducing a very ugly hotfix, but finding the actual commit which introduced the issue. We'll see that later. And then you say: OK, I'll skip this for now, maybe move this feature to the backlog, or just skip it and go on with the other work which is not blocking. So as I said, we can make manual tests part of our CI, which we can use as a means for a maintainer to approve of changes, for example changes chosen by sub-maintainers.
If you are a bit familiar with the development of the Linux kernel, there's linux-next, which is basically a testing ground for merges from many, many sub-maintainers. And very often you find things like: OK, this didn't merge cleanly, I have this resolution for the merge, and you should pick this specific resolution when you do the actual merge. If you use a testing window, then this particular resolution is already a commit which is basically scheduled to be in main. Similarly, if you have subsystems which are not ready yet, because they are experimental or someone is just playing around, you can still have them covered by those cheap compile checks or something. But if the maintainer finds problems, again, he or she can just say: OK, move this back, push it out, and we'll look at this later.

The first idea you might have is: well, why doesn't the maintainer just update main themselves? That is an option, but I don't find it too nice. So I'll assume that you use a good ref to express maintainer approval. In this example, we saw that we had an old good ref, which is updated. And after these quick checks progressed, the maintainer can have a look at all those commits which were already found good, without investing time in checking merge commits which are known to fail, for example. This is another possibility for an implementation: to have one good ref depend on another good ref, so that it just always stays behind it, to not waste computation time or maintainer time on useless stuff. And assuming the maintainer found a bad commit, he can just complain to the sub-maintainers or something, and then we can come up with another history which maybe finds the approval of this maintainer, without wasting time manually tracking merge resolutions across different pieces of potential history.
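One way this dependency between good refs could look, as a toy calculation with invented commit and ref names: the maintainer-approval ref only ever advances over commits the quick checks already vouched for, so no review time is spent on merges that are known to fail.

```python
# Toy "good ref depends on another good ref": good/approved trails
# good/quick, even if the maintainer already approved later commits.

window = ["c1", "c2", "c3", "c4"]   # commits between main and test

def advance(window, upstream_good, my_checked):
    """Furthest commit that is both behind (or at) the upstream good ref
    and approved by this job; None if nothing qualifies yet."""
    limit = window.index(upstream_good)
    candidates = [c for c in window[:limit + 1] if c in my_checked]
    return candidates[-1] if candidates else None

quick = "c3"                     # good/quick: cheap checks done up to c3
approved = {"c1", "c2", "c4"}    # the maintainer looked at these

print(advance(window, quick, approved))   # c2: c4 waits for good/quick
```

Note how c4 is approved but ignored for now; the approval ref will pick it up once the quick checks reach it.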
Another thing we can do is bring our community into testing, because good refs are actually quite cheap, and it doesn't hurt too much to have lots of them. Each individual compile or format check can have its own reference, if it's not too messy to have all of them around. And if we have, for example, different users of a library, we can just let them contribute by checking the new version of our library, say not the main branch but some release branch, with their software. If it's OK for them, they can give feedback via a good reference which they place on the commit they actually tested. Or say: we found a serious problem, we are not OK with this, please fix it. Obviously, we would allow this only for users we trust. And so they can contribute and help in testing our software. We can also introduce timeouts for good references from, let's say, sources of varying degrees of trust, by simply checking how old the commit is that the ref points to. Because if we have a good reference from some library user pointing to a commit which is three weeks old, then we can assume that this particular user probably stopped testing our software or is just holding us back. Then we can have this reference removed and let main progress all the way.

There's another thing involving test failures which we can do to improve the situation. Let's assume we have an expensive test which only looks at every 10th commit, or at the newest commit every week or something, and we actually found a problem. We can bisect, because we removed all those bisection hazards, and we can probably do this automatically and then place a new good or bad ref, which then serves as the basis for an alternate history. And there are some considerations to make here. Because assume we have a test bench with like 100 different tests, and one of them is telling us there's a problem.
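A sketch of such a timeout, with made-up dates, ref names, and a two-week cutoff. A real implementation would read the commit date from Git (for example via `git log -1 --format=%ct`) rather than a dictionary:

```python
# Drop stale good refs from less-trusted sources: if a contributor's good
# ref points at a commit older than the cutoff, remove it so it no longer
# holds main back. All dates and names here are invented.

from datetime import datetime, timedelta

commit_date = {                      # committer date per commit
    "c7": datetime(2024, 1, 1),
    "c9": datetime(2024, 1, 20),
}

good_refs = {
    "good/our-ci":       "c9",
    "good/libuser-corp": "c7",       # external user stopped testing
}

def prune_stale(good_refs, now, max_age=timedelta(weeks=2)):
    return {ref: c for ref, c in good_refs.items()
            if now - commit_date[c] <= max_age}

now = datetime(2024, 1, 22)
print(sorted(prune_stale(good_refs, now)))   # ['good/our-ci']
```

After pruning, main is computed from the remaining good refs only, so a silent tester delays the project by at most the cutoff period.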
We can just pick this one test and use it for the bisection. But when we are moving a good reference, we usually want to run the full suite of all those tests, not just the one check the bisection was performed with. So this introduces a few more considerations to the rules. The rule for test stays the same. Main, as I already explained, we update automatically to the oldest good ref, or more precisely the common ancestor of all known good refs, which we define through our own tests, or let be defined by library users, kernel users, hardware vendors, or whatever. Good refs also only follow test, and basically serve as intermediate differentiation points between different tests. And they also only move forward, in the good case; in the bad case they are allowed to move to a different branch, but that's another story. We previously introduced a bad reference, and this we only allow to be placed between main and test. As we said before, bad serves as a roadblock for main, so it stops the progression of main. And a very important piece of information: bad is never updated to any of its own descendants, only to its ancestors. This is important because if we have multiple job executions for a range and different tests find problems, we don't want our bad reference to move forward; it should tell us the earliest problem, the earliest roadblock, in our history.

There are a few other things which I actually forgot to put in this presentation, but if someone's interested, I can elaborate further. I also came up with a few conventions, at least for Git, because that's what I tend to use most often; I'm not actually familiar with other distributed version control systems. So this would basically be the basis for an implementation.
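Because every commit in the window already passed the cheap checks, the automated bisect needs no skipping of broken revisions; a plain binary search with the one failing test suffices. A toy version, with an invented window and regression point:

```python
# Automated bisect over a testing window: find the first commit for which
# the (single, expensive) failing test fails, assuming all commits before
# it pass and all commits from it on fail.

def bisect(window, fails):
    lo, hi = 0, len(window) - 1      # window[hi] is the known-bad commit
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(window[mid]):
            hi = mid                 # culprit is at mid or earlier
        else:
            lo = mid + 1             # culprit is after mid
    return window[lo]

window = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
regressed_at = "c6"                  # hypothetical perf regression
fails = lambda c: window.index(c) >= window.index(regressed_at)

print(bisect(window, fails))         # c6
```

The result would then place the new bad ref (and the good ref one commit earlier), after re-running the full suite there as described above. With real Git you would drive the same search via `git bisect run`.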
And having this convention actually allows us to have interoperable implementations. Because if we have, for example, different kinds of test infrastructure which only modify the good refs and the bad ref according to the rules we previously defined, then we can have different implementations for different kinds of jobs, if those different implementations serve different purposes. Or if, for example, library users prefer a different implementation. So that was my talk. I hope I didn't bore you too much.

Awesome, so we have a bit of time left for Q&A. If you have any questions, please just put up your hand and I'll come get the mic to you.

I think you said you don't have an implementation yet, right? Yeah. So the basic ideas at least reminded me quite a bit of Zuul, with its speculative merging, basically building in parallel the commits that are scheduled for merging, and also its check and gate pipelines with the different levels of how much you would execute and test. Have you compared it to that?

No, I think I stumbled upon Zuul once, but I didn't have too deep a look, actually. But still, it would be nice to have different implementations which are interoperable for different purposes. So I'll have a look, yeah, of course.

I have two questions. First, what does the TW stand for in your title? The testing window. Ah, OK. And then the other thing is this Zuul; I have never heard of it. How do you write it? Z-U-U-L. Basically, I guess, like the bad guy in Ghostbusters. Or is it with O's or U's? U's. Yeah, then it's OK. Thank you.

I actually have a question myself. I was wondering if you've heard of bors yet. bors is a merge bot, especially in the Rust ecosystem, for, I assume, similar problems. Yeah, I've seen it once. But again, I think I was distracted at that point. OK, so thank you for your talk.
And I guess if any more questions pop up, feel free to reach out to him.