So, collaborative package maintenance with source-git. Welcome, everyone. Before we start — blank screen — I have four questions for you. Who in the audience is actually maintaining a package in Fedora, CentOS Stream, RHEL, somewhere? Quite a lot of people. Good. Who of those maintaining packages is actually doing it with someone else? Like, they are not the only maintainer, and they have an active relationship with someone else maintaining the package. OK, a few of them. A slightly redundant question: who knows what dist-git is? A few people, OK. And the most important question: who has any idea what source-git is? Wow. Why are you here?

OK, I'm going to start then by talking about dist-git a little bit. Dist-git is basically the place where sources are stored in all the distributions in the RHEL ecosystem, meaning Fedora Linux, CentOS Stream, RHEL, and so on. The structure of dist-git is something like this: there is a Git repository with some files in it. These files are the spec file, patch files if there are any downstream patches, and some other files, like the ones used for testing — gating.yaml and all these nice things. And then there is this special file called sources, which is actually just a pointer into the lookaside cache, which stores the source archive that is going to be retrieved during the build, have the patches applied, and so on. This structure is then populated in some way from the upstream projects. Some projects produce the source archive directly from the Git repository; other projects fetch it, like the official release tarballs, from whatever website they want. And the current developer workflow is that downstream maintenance happens in dist-git. But this is a little bit weird: when somebody joins Fedora and wants to start maintaining, it's like, OK, why do I have to do it this way? Why do we need to work with patch files and not regular Git commits?
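To make that structure concrete, here is a sketch of what a typical dist-git checkout looks like. The file names and the hash are illustrative, not taken from a real repository; the `sources` line follows the `SHA512 (<tarball>) = <hash>` format that Fedora dist-git uses today.

```
rpms/acl/
├── acl.spec                      # the spec file
├── 0001-some-downstream.patch    # downstream patches, if any (name hypothetical)
├── gating.yaml                   # CI gating configuration
├── tests/                        # test definitions
└── sources                       # pointer into the lookaside cache

$ cat sources
SHA512 (acl-2.3.1.tar.gz) = 5f11c...   # hash abbreviated, illustrative
```

The actual tarball never lives in the Git repository; the build system fetches it from the lookaside cache using this pointer.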
And why do we need these source archives in the lookaside cache? You have to learn all of this to be able to do downstream maintenance in these distributions. And whenever I ask these questions, I get the answer: well, historical reasons. And actually, I will share this nice quote from an email Jesse Keating sent in 2010, when the Fedora 14 branching and the dist-git rollout happened. Previously there was a thing called dist-cvs. The text says that, of course, this is just the first iteration of the transition out of the old version control system, but we have great plans, like automatic patch management with exploded sources, linking to upstream source repositories, automatic changelogs from Git commit logs, and things like that. And yes, 22 Fedora releases later, source-git still wants to do exactly the same thing.

So what is source-git, actually? A lot of people raised their hands earlier. Can somebody help me and give an explanation to the others? Volunteers? No, no, no. Didn't prepare for that? OK, then I think I lost this game. Source-git is just a regular fork of the upstream project. It's nothing fancy. Source-git tries to use Git as it was meant by Linus, meaning doing changes in Git commits instead of patches. And if you need something downstream, you just create a branch for it and use it. The way my team, the Packit team, envisions source-git working is really just as an add-on on top of dist-git, which gets into the middle between upstream projects and dist-git. Then all the maintenance work could happen in source-git, and you can use proper Git to work with your downstream packages. You just pull from upstream whatever change you need, you do it in your branches, and then the transformation to dist-git can be done by bots. Hopefully no future maintainer will need to learn how to do that by hand. So source-git is really just Git.
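The idea that a bot can regenerate dist-git's patch files from ordinary downstream commits can be sketched with plain git — no Packit involved. Everything here is hypothetical (repository, branch, and file names); `git format-patch` is simply what turns a commit back into the patch-file form dist-git expects.

```shell
set -e
# Hypothetical upstream repo with one downstream-only commit on a branch
rm -rf /tmp/source-git-demo && mkdir -p /tmp/source-git-demo && cd /tmp/source-git-demo
git init -q upstream && cd upstream
git config user.name demo && git config user.email demo@example.com
echo 'int main(void) { return 0; }' > main.c
git add main.c && git commit -qm 'Upstream release'

git checkout -qb fedora-rawhide                 # hypothetical downstream branch
echo '/* downstream-only fix */' >> main.c
git commit -qam 'Fix a downstream-only issue'   # a normal commit, not a patch file

# What a source-git bot would do on merge: regenerate the patch file
git format-patch -1 --output-directory ../dist-git
ls ../dist-git                                  # 0001-Fix-a-downstream-only-issue.patch
```

The maintainer only ever writes the commit; the patch file is a derived artifact that tooling can always recreate.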
But we had to put that "source" in the name because we already have dist-git, so we need some kind of differentiation. Another important thing to say: when we discuss source-git, some people think it is going to be required and everyone will need to use it. Well, no. We know it doesn't make sense for everyone, which is exactly why we want to make it an add-on. It doesn't make sense for packages which don't have downstream patches, which I think is the majority of packages, at least in Fedora Linux. But when you get to CentOS Stream, where you have long-living versions throughout the life cycle of the distribution, you start having more and more packages which need to carry downstream patches and rebase them periodically. In these cases, working in proper Git and reviewing actual pull requests with code changes, instead of reviewing a pull request with a patch file, should be much easier.

Workflow. The workflow we are trying to develop and propose for source-git is nothing new. Many teams already do this — pretty much all the teams who were present here in the first panel, systemd, the kernel, are doing this already. They simply fork their upstream projects and do their work in downstream branches. The only difference is that our goal is to come up with a unified workflow, one which could be adopted by any package, so that you don't have to develop your own solution to the same problem. And the overall workflow is really simple: work in downstream branches, open MRs or PRs for your changes, merge them, and everything is taken care of in the background by bots.

Tooling. The tooling really has two layers. One is the CLI, which I like to think of as the backup layer. Because while we try to automate everything with bots, bots can fail — those are also just applications. Infrastructure can fail. Things can go wrong. We need a backup plan.
And that actually comes from the philosophy of our group, where we say that bots are first-class citizens. They can do the same things engineers can do, and an engineer can take over a bot's job whenever they want. If the bot fails, a human can step in and do the work. So we have the CLI, which is the basis of the whole source-git workflow and which can be used by anyone to do all the operations the bots would otherwise do themselves.

And if I can find a window... OK, here is a short demo. The whole CLI is packed into the packit command. Because Packit is really an upstream tool and source-git is more downstream, there was confusion among our users about which commands are for what, so we started grouping everything under packit source-git. Currently there are really just four subcommands: init, which is a helper to initialize a source-git repository; status, which can tell the sync status between a source-git repo and a dist-git repo, whether they are in sync; and then the two update commands, which can transfer content from a source-git repo to a dist-git repo or the other way around. So let's see how this works. To set up a source-git repo, you really just need to clone the upstream and clone the downstream dist-git repository. So this is the dist-git repo. I like acl as an example because it has a single downstream patch — it's not too much, but it's just enough. So this one has a patch file, some other files for checking signatures, the sources file, and a tests directory. This is in dist-git. And then we can also get the upstream sources.
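The four subcommands mentioned above look roughly like this. This help text is a paraphrase for illustration, not copied from a real `packit` build, so the exact wording and flags may differ between Packit versions:

```
$ packit source-git --help        # output abbreviated and paraphrased
Commands:
  init               Initialize a source-git repo from an upstream repo and dist-git
  status             Tell whether a source-git repo and a dist-git repo are in sync
  update-dist-git    Transfer content from a source-git repo to dist-git
  update-source-git  Transfer content from dist-git back to a source-git repo
```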
Basically, locally, when working, I like to use the convention of cloning dist-git into rpms/<package-name> and the upstream sources into src/<package-name>, and then just working with these directories. Once we have this, we have to figure out which version is released in Fedora — now I'm looking at Rawhide, so that's version 2.3.1. Once I found it, it's possible to compose the source-git init command, which takes three things: the Git ref from which you are going to initialize your source-git repo — to be more correct, your source-git branch; the directory where the upstream sources can be found and the branch on which to initialize the source-git branch; and the same for dist-git. In the background, this is just going to run the prep section of your spec file and create a Git repo from it. In this branch you can see that there is going to be a commit with some initialization, and the patch is going to be there as a Git commit. Here we are using some Git trailers for backwards compatibility, so that when we transform back to dist-git, we can reproduce the same patch file name you were using before, or add those comments above the Patch lines in the spec file which say why you are carrying these downstream patches, and so on. Obviously, if you leave them out, there are going to be sane defaults.

OK, let's do some update in the source-git repo — sorry, forgot my own demo. One directory added in the source-git repo is the .distro one. This collects all the files from dist-git, plus it adds a source-git YAML configuration, which should help the tooling figure out how to do the transformations from one format to the other. This is a lot of text, but actually most of it is defaults and should be the same for each package, so with time it's going to get much shorter.
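Putting those demo steps together, a hypothetical init run and the kind of commit it creates might look like the session below. The option names, URLs, subject line, and trailer keys are my reading of the Packit source-git docs, not a captured session, so treat every detail as an assumption:

```
# Clone both sides, following the rpms/<name> + src/<name> convention
$ git clone https://src.fedoraproject.org/rpms/acl rpms/acl
$ git clone https://git.savannah.nongnu.org/git/acl.git src/acl   # upstream URL illustrative

# Initialize the source-git branch from the released upstream ref
$ cd src/acl
$ packit source-git init --upstream-ref v2.3.1 ../../rpms/acl .   # exact flags may differ

# The downstream patch becomes a regular commit carrying trailers:
$ git log -1
    Fix the libacl test suite          # hypothetical subject
    ...
    Patch-name: 0001-fix-libacl-tests.patch   # reproduces the old patch file name
    Patch-status: |
        # Why we carry this patch downstream
```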
Doing an update downstream is just like modifying any file. You don't need to generate a patch — you just commit everything the regular way, creating a branch. And once you have your commit there, you can check the status, which tells you that the repos are now out of sync and which commit you need to sync, and then you can use the update-dist-git command, which is going to update the dist-git repository and also add your patch file to the spec file correctly. We also have a command for the backward direction, because proven packagers are most of the time going to make their changes in dist-git, and we want to get those changes back into source-git. So we have the update-source-git command, which gets changes from dist-git back into source-git. I'm going to stop the demo here.

OK, back from the CLI. The other part of the tooling is the bots. We have a few of them deployed already in GitLab, namely in the CentOS Stream src namespace and the Fedora src namespace. For now, the only thing these bots do is that whenever you open a source-git PR, they mirror it in dist-git, then take all the CI results published in dist-git and bring them back to source-git so that they are visible to you. It's a little bit redundant, but it basically comes from the fact that source-git is just an add-on on top of dist-git. In the long run, we would like to have a nice user experience where you don't have to touch dist-git at all — you just do all your work in source-git.

Packit versus source-git — we also get this question a lot. The whole source-git workflow builds on top of Packit, as you've seen. The main difference is that Packit's main use case is to let upstream developers test their changes downstream, while source-git specifically focuses on downstream maintainers. Of course, many things are common, but the use cases are different, so priorities and functionality might differ. What is next?
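A downstream change then goes through roughly these steps. This is again a sketch: the subcommand names are the ones from the talk, while the branch name, file path, and argument order are hypothetical:

```
$ cd src/acl
$ git checkout -b fix-something                # hypothetical branch name
$ $EDITOR libacl/acl_get_file.c                # edit the file directly, no patch juggling
$ git commit -am 'Fix something downstream'

$ packit source-git status                     # reports the repos are out of sync
$ packit source-git update-dist-git            # regenerates the patch file in dist-git
# dist-git now has the new patch file and an updated spec file referencing it
```

The reverse direction, `packit source-git update-source-git`, replays changes made directly in dist-git (e.g. by proven packagers) back into the source-git branch as commits.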
Monorepo support — the research is done and implementation should start soon. This is going to handle packages where the upstream project holds multiple packages: it's packaged into multiple packages downstream, or maps onto multiple dist-git repos. This is going to be fun. Handling new versions, which is basically rebasing: what do you do when your upstream releases something new? We've seen in teams who are already using source-git that there are many, many ways of doing this, from rebasing to merging. So we are still exploring these, and we need to figure out what is going to make sense for most people. And then accepting changes: as you've seen, I didn't say anything about what happens when you merge your source-git MR. The workflow here is not really clear yet. We've had multiple discussions with teams. Some would like the tooling to first merge the dist-git side, so that they are sure everything works; other teams would like to just cancel the PR, pick the commits, and push them into branches. So yeah, this is still under exploration. If you have questions, please get in touch: the documentation is at packit.dev/source-git, and you can reach us, or me, on Matrix if you would like to.

Yes, Tomas. So the question is: what if there is one dist-git repository and there are two sources — two upstream Git repositories, one for the application and one for the tests? That's a good question. I haven't seen such a package. Which one is this? OK. I would definitely like to think about it. Honestly, I didn't know there was such a thing, but yeah, it makes sense. And they are pulled into one dist-git and released as one package downstream. This is why these sessions are good — you always find these cases. Yes, please. So the question was: how specific is this to Fedora? There is a reason why we propose the .distro name for that directory.
One of the reasons was that we want to use it in all the RHEL-ecosystem distributions. Our implementation is very RHEL-ecosystem and Fedora specific, but I really see no reason why the same format, with the right tooling developed, couldn't be used in other distributions. Actually, I know that Debian is doing something similar, and they have been doing it for a very long time. They also don't do it for all packages, but there are some packages which do this. Other questions? Yes, please.

So the question is: how are rebases done with source-git? As I said at the end, we are not sure yet. We have a proposal. The main problem is that in dist-git you cannot force-push to branches, and when you do a rebase, you would need to force-push. So that's a challenge. We have a proposal for source-git of actually allowing force-pushes to some branches, but still making sure we are able to track back past history, like where the rebase happened from. This would especially help us give some tooling to developers who pull from source-git to their local clones and suddenly notice that the branch they were building on was rebased. It's not impossible to get out of that situation, but people should know about it, and it would be better to have tooling that helps them get out of it easily. So we have a proposal for rebases. But we also know from the kernel team, for example — and if there's someone here from the kernel, they can chime in — that they sometimes just do merges and don't rebase; they just merge new versions into downstream branches. We are still exploring what the better solution is, whether we should support both of these, whether it should be customizable — we don't know yet.
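The kernel-style alternative mentioned here — merging a new upstream version into the downstream branch instead of rebasing onto it — never rewrites history, so no force-push is needed. A minimal sketch with plain git, all repository, branch, and file names hypothetical:

```shell
set -e
rm -rf /tmp/merge-demo && mkdir -p /tmp/merge-demo && cd /tmp/merge-demo
git init -q -b main project && cd project
git config user.name demo && git config user.email demo@example.com
echo 'v1' > version && git add version && git commit -qm 'Release v1' && git tag v1

git checkout -qb downstream                  # downstream branch carrying one fix
echo 'fix' > downstream.fix && git add downstream.fix
git commit -qm 'Downstream-only fix'

git checkout -q main                         # upstream moves on and releases v2
echo 'v2' > version && git commit -qam 'Release v2' && git tag v2

git checkout -q downstream
git merge -q --no-edit v2                    # no history rewrite, no force-push
# downstream now contains both the fix and the v2 release
```

The trade-off is a noisier history full of merge commits, versus the rebase approach, which keeps a clean patch stack but forces every consumer of the branch to recover from the rewrite.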