So, I'm Josh Triplett, and I'd like to tell you about a tool that I've been building to work with Git, called git-series. I want to start by explaining the problem that motivated creating it before going through exactly what it does and how it works. So let's say that you work on some major open source project and you have an interesting idea for a new feature. You go ahead and collaborate, you figure out how this feature ought to work, you send out an RFC, you say, hey, here's an idea, here's what I might want to do with it. You get all sorts of feedback telling you how it might work, how you should structure the patch, who might be interested in working on it. You go off and do a pile of development, a pile of cleanup, a pile of yak shaving, and figure out exactly how to get the feature to work the way you want. You make a series of Git commits, and when you're done, you have some stack of patches you want to get upstreamed. So you run something like git format-patch, you tell it, I've got three patches I need to format, and you get a patch series that looks like this. Patch one does a bunch of cleanup and preparatory yak shaving to figure out how to get the infrastructure to work. Patch two actually implements the thing you wanted, and patch three has all sorts of lovely uses for it. Now, format-patch isn't the only way you could collaborate. It's one of the more common, but you could also use git request-pull to send a "please pull from this tree that I've pushed publicly", or you could create a pull request on GitHub, GitLab, Gerrit, any number of interesting tools. But at the end of the day, you have a stack of patches and you want to get them upstream. So naturally those patches are perfect and flawless on the first try and get accepted with no further feedback and everything's glorious, right? Most of the time, not so much.
So you get some amount of feedback, varying from here's an interesting idea, here's some feedback, here's some problems, here's some bike shed painting, the wonderful world of emoji. In particular, they tell you that you need to split the cleanup patch and the yak shaving patch into two separate things, because they're logically different. They came up with a wonderful new use of the feature and would like to see you implement that as well. They would like some benchmark data to say, okay, here's how much better this makes it, or here's how much worse it doesn't make it. They'd like some tests to show how accurately it works and to make sure it continues to work in the future. And they need you to fix that typo in the commit message of the first patch. Now, at this point you're not going to produce an additional set of patches on top of what you've already done. You're actually going to go back and rewrite history as though you had done it that way to begin with. The reason you do that is so that when you send a stack of patches, that stack of patches, as merged into the public history, will look like a reasonable series of development changes that make sense, as opposed to a pull request that says implement the feature, fix the thing I just implemented, fix it some more, maybe it'll work this time. You don't want to see that in your public history, even though that is what you actually did. There's a lovely article I recommend reading by Joey Hess, "Our Beautiful Fake Histories", talking all about this, and there was a conversation yesterday in a presentation about how the Git histories we actually see are fairly artificial and curated, and that's perfectly reasonable for the public history. Git gives you all sorts of tools to do this. You can amend a commit, you can use the wonderful git rebase -i to interactively rearrange patches, fix patches, all sorts of things.
So this makes a change that Git calls a non-fast-forwarding change. Git calls a change fast-forwarding if the changed version is strictly a descendant of the commit that you're trying to push it on top of. And if a change doesn't fast-forward, you have to force push it, which is generally a sign that you might be doing something wrong, or at least something you should pay very close attention to. But okay, you've done this non-fast-forwarding step. You've run git rebase -i, you've rearranged the patch series and made it all wonderful again. And now you run git format-patch for version two, and now you've got six patches. So now you've got the separated cleanup and yak shaving, the fixed typo, the benchmark, the tests, multiple uses of the feature. And here's a version that people might be willing to merge. Development proceeds on from there. Maybe you have to do a v3, v4, v5. But what happened to version one? Did you save a copy of it anywhere? Do you still have it? What happens if you need to review it? Somebody makes an offhand comment: I liked the way you did this in v1 better, or, what's the delta here? They want to see the history of development, the real history of development and not the lovely curated history. There are a few ways you can do that. One of the most common, if you didn't plan ahead for this, is git reflog. This is a great emergency tool for "where did I leave that patch that I managed to throw away in a rebase?" You can dig through the reflog and see what you were doing when you rebased. There are several downsides to this. It's ephemeral. If you threw away history, then depending on the type of rewrite, after either 30 days or 90 days Git will start discarding those entries to save space. So if you think of it soon enough, this can help. It's also not something you can push and pull to other systems. So if you're trying to show somebody else your history, you'd have to go get the hash out of the reflog, save it to a real ref again, and push it somewhere.
So this is not usually a mechanism for anything other than local rescue. You could also dig a mail out of your sent mail folder, which I've had to do more than once. Or go grab it out of the public mailing list archives. But in general, the problem is that Git tracks history, but what it's tracking is curated history. We rewrite history all the time. So what we need is the history of history. We need to know what changed. We need versions of our versions. And Git doesn't normally do that. Now, there is one Git tool that sort of does this. It's not designed to solve this problem, but git submodule is capable of tracking this. Hands up, anybody who's had a wonderful and positive experience with git submodule? It definitely has options that could probably do this, and I'm sure that you could figure out what those options might be. But in practice, I've seen people solve this problem in two common ways, and they have one thing in common: they both involve pulling one of those two histories out of Git, either the history itself or the history of history. So one common solution is that you pull the patches out of Git. For example, you might pull them out into a quilt patch series. Quilt is a tool that manages a stack of patches and keeps a series file that says which patches to apply and in what order. You might also keep them in a distribution packaging system; Debian, for example, has a patches system that's a variation on quilt. But if you do this, you lose a lot of the power of working with Git. You don't have rebase, you don't have rebase -i. Git doesn't have any knowledge of where those patches fit in history. You don't know what they're based on or how to move them around. So it's workable, I've done this, but it's not quite what you would hope. And it still loses out on versioning the patches with Git. The other approach I've seen, and the one I've used most frequently, is pulling the history of the patches out of Git.
So you keep the patches themselves in Git, and you do your rewrites as needed. And every time you come up with a new version, you version the branch names. So you might have a branch named feature-v1, which was probably named feature until you renamed it feature-v1 when you realized you needed a v2. And then you have a v3 with that typo fix, and a v8 rebased on top of 4.6 with Alice's fix incorporated. And at this point, you just add a .pptx and attach it to an email. Everybody who's worked in a corporate environment feels right at home. We have a version control system; we should be past this. Naming our branches like we name our files in the absence of version control doesn't feel very satisfying. And if you think that example sounds a little bit artificial, it was pared down from a real internal example from a project that I collaborated on. So that's one of the problems: we have version control, and we should be using it for all of our history. There are a few other things we might want, as long as we're trying to build something to solve this problem. When you send out a patch series, you might want to have a cover letter. So in addition to patch one through patch six, you might have a patch zero of five, or zero of six, or whatever, that says here's a summary of the idea I'm trying to implement. It isn't tied to one patch; it's more an overview of what I'm trying to do. This is really common for a multiple-patch series. And this is not something you can currently track in Git at all. Git doesn't track something orthogonal to the rest of the series, and it certainly doesn't version it for you. The other is what you base your patches on top of. So earlier I showed git format-patch -3: I have three patches. Version two, six patches. But I had to actually know how many patches I had in my series. And that might not be off the top of my head if I'm working on a complex series, rebasing it, reorganizing it.
I don't keep that count in my head and update it every time I merge or split a patch. The same thing goes for rebase -i. Every time you run it, you need to find the commit right before the commit that you want to start adjusting. So in practice, this workflow for me usually starts with git log. Go find the spot just before where I want to rebase. Copy and paste a SHA-1 into the git rebase -i command line. So that's a bit of a pain. It'd be nice to know what the base is that I'm working from. And finally, this workflow just doesn't allow collaboration at all. There's a very clear statement, right around the time every Git tutorial talks about rebase -i or commit --amend or all sorts of ways to fix your mistakes as though they never happened: never rewrite published history. And this is a very sensible thing to say. Otherwise other people would have to force pull, and all sorts of ugliness happens. You don't want to do this. But if you can't rewrite published history, how do you collaborate on history that needs rewriting? Anybody ever tried to write a kernel patch series, like a new syscall or a new feature, together with other people and then rebase it? Anybody ever done that successfully when you weren't sitting right next to each other? I've done this, and it mostly involved emailing patch files back and forth. That was not fun. And there are all sorts of things you might want to collaborate on. You might want to collaborate on a patch series like I mentioned, or a backport of a feature. Let me take these 500 patches to this driver and move them from 4.7 to 3.18, because I hate myself for some reason. Or alternatively, you might want to make a distribution package of something. This is something where you're based off of version 1.3 upstream. You've got a stack of patches and some distribution metadata. You want to move to version 1.4. That's not exactly trivial to do with version control either.
And it is kind of a rebased stack of patches sitting on top of an upstream that you're not pushing into regularly. So, this is the set of problems that I built git-series to solve. git-series tracks the history of a patch series, how you've changed it through non-fast-forwarding changes. You can rewrite history and it will keep track of what the old history looked like, including a commit message telling you what you were doing. It tracks a cover letter so that you can version that over time. And it tracks the base that you started your series from, to make it easy to rebase. So, before I go into how this works in detail, I'd like to give a demo of how a workflow based on git-series would work for doing some development. Bear with me for one second. I'm going to switch back to mirror mode so that I can see the terminal that I'm going to be typing at. And let's take this terminal. That looks good. So, I'm in a Linux kernel repository, and I've got all the standard bits to work with. I want to work on some interesting new feature. For the sake of this, I'm not going to be doing non-trivial development on the kernel in front of the audience, so let's pretend I have an interesting feature in mind. I want to use git-series for this. And the first thing git series will conveniently tell me is, you don't have a series yet; start one. All right, so let me git series start feature. And now I have a detached HEAD at the top of my tree, because git-series is tracking where HEAD is. It doesn't also maintain a standard branch for it. So, I want to start this series on, let's say, the 4.7 kernel. Let me get that checked out. And I want to tell git-series that that's what I've done: I'm going to base this on 4.7. I could also say git series base HEAD, for example, or anything that references a commit. This is what I've started from. Now, I want to make some interesting change to the kernel.
So, in lieu of an interesting change, I'll say this is change A. And I'll commit that. And then let me make some second change to the kernel. I don't really want to deal with conflicts at the moment, so I'll make that change more than three lines away. And I'll call that change B. So, there is a two-patch series to the kernel. And there's what it looks like. Now I want to work on that with git-series. So, let me run git series status. And that'll tell me, okay, you're on the series feature. You haven't made any series commit yet. And you have a base and a series that you've set and not committed. So, this works just like git status. And you can add the same way. So, I can git series add series, for example. And git series status again. It'll tell me you've now added the series, and you could commit that. You have a change to the base, which you could also commit. I tried to make this as much like the standard Git workflow as possible, with files named base, series, and cover. So, I could git series add base. git series commit. And give it some sensible commit message: initial version of feature. So, I've committed that. And status will now tell me there's nothing to commit; I haven't changed the series. And log will tell me, hey, I have an initial version of the feature as the series commit that I just created. So, now I've got an initial version of history. It hasn't changed significantly yet. I might also want to add a cover letter to this series. So, let me add a cover letter. And series status will now tell me, well, I've added a cover letter. You don't always have to use series add, unless you want to incrementally add different pieces at a time. You'd usually just do a commit -a, just like with Git. So, let me commit: add cover letter. And git series log will show me, okay, I've made two changes to the series. This hasn't been that interesting so far, because I haven't actually rewritten history in any way.
So, let's say I get some amount of feedback. I can git series rebase -i. Now, notice I don't have to tell it what commit I'm basing this off of, because it already knows the base. It just shows me the set of changes. Let's say I want to reorder them and I need to reword something in change A. So, it'll go switch that, it'll rebase. Further details about change A. So, now I've rewritten history. And if I run git series status, it'll tell me you've now modified the series. The base is the same, because I didn't rebase the series on anything new; I've just rewritten it. So, at this point I can commit it again. Reorder patches and fix typo, or rather, add details to A. Okay, so now I've again got the history of how I've rewritten these changes, and this is one fast-forwarding history of a series. So, I could actually push this series in series form somewhere else, and somebody could pull that down and see that you had an old version, you added a cover letter, and then you reordered the patches and added some more details to a patch. So, that's helpful. And then, let's say I want to rebase that. I started this series on 4.7, and let's say I need to rebase it on 4.8-rc3, which was just released. Now, in the context of kernel development, I should mention that the standard advice is do not rebase right before you submit, because then you've just created something that hasn't been tested. So, rebase when there's a good reason for it, and test after you rebase; standard advice applies. Just providing that reminder. But I can rebase this on top of v4.8-rc3, and the same thing applies. It's going to go check out 4.8-rc3 and rebase the patch series on top of that. And I just booted up, so that wasn't in cache, unfortunately. So, now I've rebased the series, and again, status tells me I've changed the base, since I've moved it to 4.8-rc3, and also the series, because I had to rewrite history to put it on top of 4.8-rc3. So, let's commit that as well. Update to 4.8-rc3.
So, now I want to submit these patches, presumably after retesting them. So, let's format patches, and again, I don't have to tell it what I've based them on. That generated a cover letter saying, hey, here's the summary of why this is important, here's a shortlog, here's a diffstat. And then I can look at the first change in the series, and same thing: here's a patch with a commit message, and then here's the diff for that patch. One thing worth noticing is that format-patch normally needs a --thread argument if you want it to set up the headers to build a message thread. That seems like the sensible default to use most of the time. So, git series format actually defaults to that, so that when you go to send mail it'll all be nicely threaded. So, this gives you the patch series ready to send off somewhere. But again, you don't have to do this via format-patch and sending changes via email. You might instead do this with a pull request. Maybe you're saying please pull my maintainer tree to merge these changes into the kernel. So, let's do it that way instead. Let's create a tag. Normally I would do this with a signed tag. Awesome new feature. So, I have a tag, feature, and I'm going to push that tag to some far distant remote location in my home directory, because I do not want to count on the local network and pushing off to a separate repository in the middle of a presentation. So, I'm pushing the tag feature to this remote repository, and then let's do a pull request. I ask for a pull request from the remote location, Linux, and I want to pull the tag feature. That will immediately generate a git pull mail that says here's the summary of why this is important. I'd like you to pull changes since 4.8-rc3, which are available at this obviously widely available URL under refs/tags/feature. And here's the change on the top of that stack. Here's the shortlog, here's the tag message. So, a perfectly formatted please-pull mail, ready to send.
So, either way, hopefully I can get that feature into the kernel, with some actually interesting changes in it and not silly edits to the README. But let's say I want to work on some other feature at the same time while I'm waiting on that. I have git series telling me, much like git branch, here's the one series I have and I'm currently on it. So, let me switch to a new series. Let's git series start another feature. And I want to start that series at, say, 4.4, and set the base to, let's just say, HEAD. Set it at 4.4, git series add series. Let's say I'm halfway through making this, and here's the current state of setting up the series. And hang on, I need to go make some other change on the original feature I was working on. So, let me switch back to the other feature. Now, notice I had a bunch of things sitting there waiting to be done on that one. But I could check out this other feature. It's sitting here waiting for me to do work on it. And if you're used to git checkout and how it handles changes in your index and your working directory, then it probably looks like, oh, what happened to those changes? Did they get thrown away? Did I just lose what I was working on? You know, including the carefully written cover letter that I might have just added. git series checkout another feature again, when I'm done working on the first feature. So, it actually remembers all of your working changes and staged changes independently for every series. This doesn't change the core behavior of Git with respect to things you haven't committed, but anything that is a change of base, a rebase, or an addition of a cover letter will never get thrown away as you move between series. So, it's easy to change contexts if you're working on more than one project at once. So, that's a quick demo of the workflow.
You can create a series, you can rebase a series, you can format it and send it off, you can set up a pull request, you can switch between series and work on several things at once. So, let me switch back to the presentation and go through exactly how that works. There's my mouse and my part of the presentation. There we go. So, what do the internals of this look like? The first thing I want to mention is that this is a general overview of the high-level internals. There is full documentation for the storage format, and exactly how and why it's the way it is, in an internals doc in the repository, which you can also view online. But I want to go over the key details. First of all, I want to review what Git internals look like, because a lot of people may have worked with the interface to Git, but I don't want to assume that everybody has worked with the internal storage format of Git. So, as a quick review, Git has four major types of object. It has a blob, which you could think of like a file. It has a tree, which you could think of as a directory, storing trees and blobs by name. It has a commit, which has the tree of that version, some message and other metadata like who committed it and when, and some number of parent commits. Then you have a tag, which may or may not have a signature, has a message, and says this commit, at this time, by this person, on this date; here's an important message; here's a GPG signature. And then you have refs, which are not an object type. They're something stored in the .git directory that says, okay, refs/heads/master, which is your branch named master, refers to this commit, and refs/tags/feature is this tag. So, that's all of the Git internals that you need to know to work through the upcoming explanation of how git-series works. One other detail, though, is that trees in Git can actually refer to commits, not just to blobs and other trees. This is called a gitlink.
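As an editorial aside, the object model just reviewed can be sketched in a few lines of Python. This is only an illustration of how Git names objects by hashing a type-and-length header plus the content, and of how a tree entry carries a mode that distinguishes a blob from a gitlink. The file names `README` and `vendored` and the pinned commit ID are made-up placeholders, and the entry sort is simplified.

```python
import hashlib

BLOB_MODE = "100644"     # regular file entry in a tree
TREE_MODE = "40000"      # subdirectory entry in a tree
GITLINK_MODE = "160000"  # entry that points at a commit (a gitlink)

def git_object_id(obj_type, body):
    """Object ID = SHA-1 over '<type> <length>' + NUL + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """Serialize (mode, name, hex_id) entries: '<mode> <name>' NUL <20 raw bytes>.

    Simplification: entries are sorted by plain byte-wise name; real Git
    compares directory-like entries as if their names ended in '/'.
    """
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1].encode()):
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return out

# A blob, and a tree holding that blob plus a gitlink to some commit.
readme_id = git_object_id("blob", b"hello\n")
pinned_commit = "0123456789abcdef0123456789abcdef01234567"  # placeholder ID
entries = [
    (BLOB_MODE, "README", readme_id),
    (GITLINK_MODE, "vendored", pinned_commit),  # hypothetical entry name
]
tree_id = git_object_id("tree", tree_body(entries))
print("blob:", readme_id)
print("tree:", tree_id)
```

The only thing that marks the second entry as a commit reference rather than a file is its mode, which is why the tree can name a commit without that commit being stored alongside it.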
This is something that git submodule uses internally, but it's not inherently part of git submodule. It's part of the storage format, and you can use it anytime to record that this version of a commit is referenced from this tree object. So, git-series uses this to track the series and the base. I also want to go over some of the requirements that git-series has to fulfill in order to do what it needs to do. First of all, every object has to be reachable by Git at all times, because if Git ever can't reach something, it may throw away that object the next time it repacks the repository. For that matter, you need to be able to push and pull the series, and everything that you want accessible on the remote end to view the history of the series has to be reachable from the ref that you push. Given that, I need to fit into Git's existing object model. I don't want to add a new object type. I don't want to version repositories so that you need a new version of Git to work with git-series. This should all just work transparently. So, let's talk about some non-trivial, or at least moderately complex, history here. I've got some upstream project with a couple of commits in it: commit X, and then commit Y that came after that. I've got a two-patch series, v1, that sits on top of upstream, and then I've got another two-patch series that's been rewritten and also moved to sit on top of a different upstream, upstream X. So, I know I'm going to have two versions of this series, and those are going to be commits in some way, and I know I'm going to have a parent relationship there, where series v2 is descended from series v1. I know I'm going to want to reference that from a head, so I'm going to call that git-series/feature, which means you can use Git commands like git branch or git push to work with it just by prefixing the name with git-series/.
So, first of all, series v1 and v2 will each have a tree object as part of their commit, and I'm using the trees as effectively a key-value store: here's a name and here's an associated value. So, I'll have an entry called series that is a gitlink pointing to v1a, the top of the series, and another one called base that points to where that series is based. I can do the same thing for v2, pointing series at the top and base at the bottom, and I can even have a blob for the cover letter. So, that's a normal set of Git objects, and everything is reachable. Now, there is one interesting caveat to this setup, and that's that Git by default will not follow a gitlink for reachability or for push or pull. The reason for this is a historical oddity of why gitlinks came into existence. They started with submodules, which are expected to reference a different repository, so the expectation is that a gitlink may refer to a commit you don't have. There may be information saying which repository URL to get it from and similar, but Git can't assume that the commit is in the repository you have, so it doesn't try to follow that link. To fix that problem, each series commit, in addition to series v2 descending from series v1, also includes an extra parent commit, as though it were a fake merge commit. That extra parent points to the series itself, and Git will follow that link. Then git-series just ignores that parent. So when I do a git series log, it'll look at the graph from git-series/feature. It'll say, well, that points to series v2. Series v2 has two parents: series v1, and the thing that's also referenced as series by a gitlink. It'll say, well, that's referenced as a gitlink inside the tree, so I'll ignore it as just a reachability helper and I'll just follow the other history.
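As an editorial aside, the layout just described can be modeled in memory to show why the extra parent matters. This is a toy model, not the git-series implementation: commit IDs such as `up1`, `v1a`, and `s2` are made-up labels, upstream commits are given hypothetical names, and a commit's tree is reduced to a dictionary of gitlink entries.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Commit:
    message: str = ""
    parents: list[str] = field(default_factory=list)
    gitlinks: dict[str, str] = field(default_factory=dict)  # tree entries: name -> commit id

def reachable(store, start):
    """Commits reachable by following parent links only; gitlinks are not followed."""
    seen, todo = set(), [start]
    while todo:
        cid = todo.pop()
        if cid not in seen:
            seen.add(cid)
            todo.extend(store[cid].parents)
    return seen

def series_log(store, head):
    """Walk series history, ignoring parents the tree also names as a gitlink.

    Simplification: follows only the first remaining parent, so series
    merge commits are not handled here.
    """
    messages, cid = [], head
    while cid is not None:
        commit = store[cid]
        messages.append(commit.message)
        real = [p for p in commit.parents if p not in commit.gitlinks.values()]
        cid = real[0] if real else None
    return messages

def build(with_extra_parent):
    store = {
        "up1": Commit("upstream 1"),
        "up2": Commit("upstream 2", ["up1"]),
        "v1b": Commit("patch B, v1", ["up1"]),
        "v1a": Commit("patch A, v1", ["v1b"]),
        "v2b": Commit("patch B, v2", ["up2"]),
        "v2a": Commit("patch A, v2", ["v2b"]),
    }
    extra1 = ["v1a"] if with_extra_parent else []
    extra2 = ["v2a"] if with_extra_parent else []
    store["s1"] = Commit("series v1", extra1, {"series": "v1a", "base": "up1"})
    store["s2"] = Commit("series v2", ["s1"] + extra2, {"series": "v2a", "base": "up2"})
    return store

print("v2a reachable without extra parent:", "v2a" in reachable(build(False), "s2"))
print("v2a reachable with extra parent:   ", "v2a" in reachable(build(True), "s2"))
print("series log:", series_log(build(True), "s2"))
```

In this model the patches only survive a parent-only walk when the series commit lists the series tip as an extra parent, and the log stays clean because any parent that also appears as a gitlink in the tree is skipped.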
You can actually have series merge commits that have two parents that are other series commits, as well as these additional fake parents that are used for reachability, and those fake parents can all be trivially ignored when reading the storage format. So, there are a couple of other minor details of the storage mechanism that I want to go through. One is that I was always operating on the current series. I had some notion of, I'm working on feature, and I'm working on other feature. Git does this for branches as well: your current branch is referred to by HEAD, which is a symbolic reference. So, I use the same mechanism. I have refs/SHEAD for the series head. The reason this is under refs and HEAD isn't is that HEAD is the only thing outside of refs that Git looks at for reachability. Everything else has to be inside of refs. Apart from that, I also showed a working and a staged version that status kept showing. Here's the red version you haven't added yet. Here's the green version you've staged for a commit. So, where do those live? Those live in a couple of additional refs for any given series. There's a working and a staged version, and those just point to a temporarily generated commit. Each time you add something, it will generate a commit and reference it here, and then it'll pull the data out of that and build a real series commit as you go. Again, the main reason for this is to keep all those objects reachable. If I were to do a git series cover and write a non-trivial cover letter, then that's reachable from my current series, and then I go switch off to some other series. I want that cover to still be reachable so it doesn't get thrown away when the repository gets repacked. And it's not yet referenced from one of my actual series commits, because I haven't done a series commit, so it has to be referenced from a ref.
The fact that I don't put this inside of refs/heads or refs/tags, and instead just put it directly in refs, means that it won't show up as noise in places like git branch showing you all your active branches. Other tools like bisect or stash also use the same technique to store temporary refs. If you've ever run git stash, it creates refs/stash. So, same kind of idea. This is an internal detail. Apart from that, I want to mention an interesting thing I ran into when developing git-series. I found several times that there were ways I could avoid having particular classes of errors to deal with. On multiple occasions I went off and started writing a long and complex error message about a state the user could get into, where I wanted to say, well, if you go try to do this, some problem is going to happen, and you need to deal with that. And what I found every time was that those long and complex error messages suggested I might have a design flaw of some kind. So I looked for some way I could redesign the architecture so that the error just can't happen. I don't have to explain it; I can just avert it. Here are a couple of examples of that. One is, what happens if I detach from a series, or check out a new series, while I have uncommitted changes, like I showed at the end of the demo? What I had started to do originally was, much like Git has one index for a repository, I had one staged and one working version for a repository. So then I went, okay, I want to check out a new series, but you've changed this one, so I need a warning message, and maybe I need a series checkout --force to say I want to throw those changes away. And before I went down that route, I realized I could just give every series its own independent working and staged versions, and the error can't happen anymore. I don't have to deal with it, I don't need a force option, I just avoid throwing away information.
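As an editorial aside, the design choice of giving every series its own working and staged state can be sketched with a small in-memory model. This is an assumption-laden toy, not the git-series code: the class `SeriesRepo` and its method names are invented for illustration, and real git-series keeps this state in Git refs rather than Python dictionaries.

```python
class SeriesRepo:
    """Toy model: each series owns its working and staged snapshots."""

    def __init__(self):
        self.current = None
        self.series = {}  # name -> {"working": {}, "staged": {}, "commits": []}

    def start(self, name):
        # A new series gets working and staged state immediately,
        # even though it has no series commits yet.
        self.series[name] = {"working": {}, "staged": {}, "commits": []}
        self.current = name

    def checkout(self, name):
        # Nothing is shared between series, so switching can never
        # discard changes and needs no force option.
        if name not in self.series:
            raise KeyError(f"no such series: {name}")
        self.current = name

    def set(self, key, value):
        """Change one of the tracked entries, e.g. base, series, or cover."""
        self.series[self.current]["working"][key] = value

    def add(self, key):
        state = self.series[self.current]
        state["staged"][key] = state["working"][key]

    def commit(self, message):
        state = self.series[self.current]
        state["commits"].append((message, dict(state["staged"])))

repo = SeriesRepo()
repo.start("feature")
repo.set("cover", "Why this feature matters")  # uncommitted cover letter
repo.start("another-feature")
repo.set("base", "v4.4")
repo.add("base")                               # staged, not committed
repo.checkout("feature")
print(repo.series["feature"]["working"])       # cover letter is still there
repo.checkout("another-feature")
print(repo.series["another-feature"]["staged"])
```

Because `checkout` only moves the `current` pointer, there is no code path in which uncommitted state could be lost, so there is no error message to write for it.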
The same thing goes for what happens if I create a series, and I haven't made any commits on it, and then I want to check out a new series or detach from a series. In Git, if you create a brand new repository, HEAD will point to refs/heads/master, but master doesn't exist yet. So if you run git branch, it won't actually say you have a master branch. When you do your first commit, it will come into existence, but if you switch branches before that, master will not exist. Git calls this an unborn branch, and it involves several special cases. The case I was thinking of when I was trying to write an error message for this was, well, what happens if you've again made changes on that series but you haven't committed them yet? What happens to those? So rather than writing some non-trivial error, I ended up just saying, well, as soon as you start a series, you have working and staged versions of it, so I can create those internal refs. Then git series will say this series is new, it doesn't have any commits yet, and git series checkout will actually work to switch to it and show you the current state. Then when you go to series commit, it will create the very first commit. This means that, unlike Git, which can only have the one unborn branch that HEAD points to at any given time, here you can create several series that are all new and haven't had committed data on them yet, and switch back and forth between them, and git series will actually show you, you haven't started this one yet, you haven't started this one yet either, this one you've started and done some work on. So, same idea: if you have a long and complex error message, you might have a design flaw, and you might be able to redesign to make the error impossible. Not in every case, but it's worth thinking about the next time your error message starts to word wrap in your code and you're trying to figure out how to phrase it.
So apart from that, Git series rebase was quite interesting to implement, because libgit2, which was essential for implementing this project, doesn't actually have support for doing a rebase -i. So I wanted to implement this, it was one of the more critical features of Git series to prove how valuable it was, but libgit2 didn't know how to do it. Well, it turns out any time you do a rebase it might potentially fail in the middle or stop partway through, so Git saves all that state in the .git directory, and then if you have to resolve something and go on, you run git rebase --continue. So the approach I took, as a temporary measure while working on support in libgit2 for doing rebases, was to write out all the state information that Git would for a rebase and then exec git rebase --continue. This causes Git to resume the rebase that it never started and go do all the hard work for me, and all I had to do was write out the to-do file that says pick this, pick this, reword that, pick this. So that turned out to make this quite workable. There are still reasons I wanna do it internally at some point, having to do with wanting to see what the history is as I rewrite it, so that I can match up commits and provide more helpful information and notes and that kind of thing, but as a temporary measure this worked remarkably well. So the last thing I wanna cover is the tools I used to build this. I built this in Rust using libgit2 and the Rust bindings to libgit2, and as I mentioned already, first of all, libgit2 was absolutely essential for building this.
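The to-do file mentioned above is plain text with one instruction per line: an action, a commit, and the subject. Here is a minimal sketch of generating that text. The function name and the input shape are assumptions for illustration, the commit IDs in any usage are placeholders, and this does not show the other state files that Git keeps for an in-progress rebase, which the talk does not enumerate.

```python
from typing import List, Tuple

# A subset of the actions accepted in an interactive rebase to-do list.
VALID_ACTIONS = {"pick", "reword", "edit", "squash", "fixup", "drop"}


def render_todo(steps: List[Tuple[str, str, str]]) -> str:
    """Render (action, commit_id, subject) triples as to-do file text."""
    lines = []
    for action, commit_id, subject in steps:
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown rebase action: {action}")
        lines.append(f"{action} {commit_id} {subject}")
    # One instruction per line, with a trailing newline.
    return "\n".join(lines) + "\n"
```

In the approach described in the talk, text like this would be written into the rebase state that Git expects before running git rebase --continue, so that Git itself carries out the picks and rewords.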
I would never have built this if I had to build it entirely in the Git code base, mostly because it would probably never make it upstream, since it's a particular structured policy for how you should build a series and transmit it. And apart from that, I would not wanna write this as a shell script wrapped around a pile of Git commands, because it's doing a lot of interesting state manipulation and that would not be trivial at all. Libgit2 makes it really interesting and easy to experiment with Git repositories and I'd highly recommend it. The other is that I had a very positive experience writing this with Rust. This was actually the project that I used to learn how to write non-trivial Rust programs. I ended up with something like 1500 lines of Rust. It's still a very young language. It's working very well and can be used in production, but I certainly had to submit various patches to the git2 bindings and a few other libraries to make them do things that I needed to build this software. It was still a really fun experience of not having to deal with memory management. Libgit2 has a pile of interesting objects like commits and tags and refs and all sorts of other pieces of the repository that normally get reference counted, and you need to carefully manage them, and I didn't have to think about that at all. I just create an object, and when I stop looking at it, it disappears. It's a lot like working with garbage collection. I don't wanna spend a bunch of time evangelizing Rust. There's all sorts of other presentations about that. There's a presentation later today about doing systems programming in Rust, as well as a whole conference coming up on it next month, but I just wanted to mention I had a very interesting experience building Git series in Rust and I would highly recommend it. So with that, I want to wrap things up by saying that Git series is available on GitHub.
There's instructions available for how to get started and how to install it, and with that, let me open things up for questions. Thank you. Yes. So the question was, do I have any recommendations for tracking which people you should send the patch series to as reviewers? That is something I've had a problem with in the past: I have 20 different people, but this patch series needs to go to some combination of running get_maintainer.pl from the kernel, finding out here's the set of people that might care about this, and CC'ing them consistently on every mail. There are scripts to help automate that for a given patch series, but that is something that's kind of local metadata. It's not the kind of thing I would necessarily wanna commit as part of the history of a series, but at the same time it is useful information that I'd really like to track, to make it easier to say, hey, let me generate a formatted series that automatically knows who all it should go to. I think that'd be really useful information. There's several things I am looking to add to ease the management of a patch series. Another is that format, when you're generating the cover letter, should really generate a patch changelog that says v2, I changed this; v3, I fixed this typo in the commit message; v4, I added these benchmarks here. And that needs a little bit of extra information, because you don't wanna call every commit to the series a new version, because you haven't mailed every version out. So you kind of need a Git series tag to say these are the versions I've sent out, but I am looking at building a UI for that as well. So To and Cc, version tracking, notes on commits: I would like to track those and make that easier. If you have any suggestions for how that should work, please do contact me and let's work out what the UI should look like. Other questions? Yes? I see.
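To make the versioning point concrete, here is a small sketch of how a cover-letter changelog could be derived when only some series commits are marked as sent. This is purely hypothetical: the talk describes the feature as planned, and the function name, the was_sent flag, and the output format are assumptions for illustration.

```python
from typing import List, Tuple


def changelog(history: List[Tuple[str, bool]]) -> List[str]:
    """Build changelog lines from (summary, was_sent) series commits.

    The history is in chronological order. Only commits marked as sent
    get a version number. Summaries of unsent commits roll into the
    next sent version. The first sent version gets no changelog line,
    and changes after the last sent version are not listed.
    """
    lines = []
    version = 0
    pending: List[str] = []
    for summary, was_sent in history:
        pending.append(summary)
        if was_sent:
            version += 1
            if version > 1:
                lines.append(f"v{version}: " + "; ".join(pending))
            pending = []
    return lines
```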
Sure, so the question was, how does a Git series relate to a standard Git branch, and in what circumstances are you changing a branch versus changing a series? When you're working on a series, normally you'll work with a detached HEAD in Git. So you won't be changing any particular branch. You're not working on master or a branch named feature or similar, you're just working on the series, and then you have the series tracking what the current version of that branch is. You can potentially push that version of the series to some branch remotely, or some tag remotely, so that you could say please pull from here, and you could absolutely mirror that to a local branch if you need to work with it with standard Git tools without using a series, for example. I actually have a conversation going with a kernel maintainer who is wanting a mechanism to automatically handle that, to make it easier to push and mirror branches to a remote repository. So I'm looking at having each Git series automatically keep a particular branch up to date with the latest version of that series, so that it's easy to have what looks like a non-fast-forwarding branch that tracks at least the current version. That's another case where implementing it would be trivial, but I wanna nail down exactly what the UI should feel like. Normally, though, when you're working on a series you're not working on a particular branch. I didn't want to set up a scenario where you have a series being worked on and you also have a branch that HEAD is pointing to, because then when you rewrite history you're not just rewriting the history in the series, you're also rewriting the branch, which might not be what you would expect. So I wanna avoid any surprises from rewriting a branch that you might have expected to not get rewritten, but I do think it would make sense, for example, for git-series/feature to have a corresponding branch named feature. Yes.
That's sort of related to that. Have you thought about how you would actually do that? Suppose you've got one person in, say, Korea and one person in the United States, and it's 5 p.m., so you push your series to a repository and then the other person works on it, and hopefully there's no overlap. But there are interesting scenarios where at 11 p.m. you have this insight, you start making a change, and there are conflicts. Absolutely. It is an interesting problem, I'm wondering. It is an interesting problem, yes, and I've put a great deal of thought into that. Git series in its current state has some of the preliminary support for that, so it will track the history of a series, and you can push that history elsewhere and pull it back down and check it out and work on it in another repository. If you're working on it entirely at disjoint times and don't make conflicting changes, then that should work perfectly. There's one minor UI item I need to enhance, which is that when you've pulled down, say, origin/git-series/feature, and you need to check that out as your local version of git-series/feature, I need the automatic logic that git checkout has for saying origin/master exists, so I need to check out a local tracking branch of that. There's a little bit of extra work there, but in general it should just work as Git series checkout feature, and it works the same way as git checkout. In terms of collaboration, if you aren't simultaneously changing the series and don't have diverging changes, then that's enough to very easily collaborate, at least a lot easier than mailing patches around, which has the same problem otherwise. In terms of what happens if you both diverge and change the series at the same time, that gets rather interesting. There is full support for what the history would look like in Git series, so if you have a series that diverges and then re-merges and diverges again and so on, Git series log will navigate that just fine. There's no issue there.
The format is completely capable of supporting that, and I have actually tested that by manually creating a series merge commit that resolves a non-trivial merge, and then looking at the log and saying, okay, that seems reasonable, that's what it did. That said, in terms of helping you deal with that, the current plan is a lot like what the earliest versions of Git did for merging, which is to say: here's their version, here's your version, let me know when you're done. And that would still be an improvement over what's available today, where the same kind of thing happens if you do a non-fast-forwarding merge and you have to go resolve it. You could say, here's the upstream version, here's what you've done, figure out what's changed and resolve those. My current plan is to start with that and then give you the tools to say, okay, I'm done, now merge the two, and here's a commit message. But then on top of that I'd like to have some tools for dealing with increasingly non-trivial cases. If you've just changed a commit message in one branch and the other branch has rewritten a patch, you should be able to resolve those without conflict. If I've reordered two patches but haven't otherwise changed them, and then another change modifies one of those patches, in many cases it should be possible to figure that out and update it. Git itself doesn't really have much mechanism for figuring that out either. There are some mechanisms like git patch-id which help you figure out what the identity of a patch is when it gets reordered, but patch-id is fairly fragile: if you change anything in the patch, the ID will change. There's some work being done, if you saw the presentation yesterday on doing token identification in Git and figuring out authorship. That same group of folks is working on an interesting way to identify a patch when it mostly matches. I'd like to look at some of that and see if it might help.
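The "changed only the commit message on one side, rewrote the patch on the other" case is a standard three-way merge applied field by field. Here is a minimal sketch, assuming a patch is modeled as just a message and a diff. The Patch type and function names are illustrative and not part of Git series, which per the talk does not yet automate this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Patch:
    message: str
    diff: str


def merge_field(base: str, ours: str, theirs: str) -> Optional[str]:
    """Three-way merge of one field; None signals a conflict."""
    if ours == theirs:
        return ours    # both sides agree (or neither changed)
    if ours == base:
        return theirs  # only their side changed
    if theirs == base:
        return ours    # only our side changed
    return None        # both changed, in different ways


def merge_patch(base: Patch, ours: Patch, theirs: Patch) -> Optional[Patch]:
    """Merge two rewrites of the same patch, or return None on conflict."""
    message = merge_field(base.message, ours.message, theirs.message)
    diff = merge_field(base.diff, ours.diff, theirs.diff)
    if message is None or diff is None:
        return None
    return Patch(message, diff)
```

This only works once you know which patches on each side correspond to each other, which is the identification problem discussed above.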
This is a case where I'd like to start with the simplest thing that could possibly work and incrementally add more things to make it easier. And hopefully it'll become something that you can use fairly easily for most common cases without, for example, having to reinvent the patch calculus that the Darcs author created for adjusting patches and doing renames and similar. I don't want to have to go that far, and I'd much rather have something much simpler than that, with incremental tools as I go. So you could actually name a change, so that even when the patch ID changes you can fall back to a change ID, like Gerrit does? Absolutely, that would be handy to have. I do want to avoid putting it in the commit message of the patch, because that tends to draw fire when you send it upstream, because it's noise. One thing I'd love to do is work with the Git developers to find out if patches could natively have a change ID in their metadata, for example. There's also other non-trivial things that come up there: if I take a patch and split it into two patches, which one gets the change ID and which one gets a new ID, or do they both get new IDs that both say here's what the old ID was? There's one other item that I plan to do some research on, which is that Mercurial has a whole logic for doing patch evolution and saying this patch supersedes that patch, this pair of patches supersedes this patch. It's an interesting mechanism that's incredibly powerful, but unfortunately it relies on metadata that is exclusive to Mercurial and not available in Git. So because of that, I don't think it would be trivial to port directly, but I'd like to have all of those same features if possible, and there might be a few bits of Git metadata that could be added. I know that Git objects like commits have managed to change in a backward-compatible way in the past. For example, when signed commits were added, that was possible without breaking old versions of Git too badly.
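The fallback suggested in that exchange can be sketched as follows: match patches between two versions of a series by a content-based ID first, and fall back to a stable change ID when the content has changed. Everything here is an assumption for illustration. The change_id field is hypothetical, and content_id is a plain hash of the diff text standing in for a patch ID, without the normalization that git patch-id performs.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Patch:
    change_id: str  # hypothetical stable identifier carried with the patch
    diff: str


def content_id(diff: str) -> str:
    """Stand-in for a patch ID: any edit to the diff changes it."""
    return hashlib.sha1(diff.encode("utf-8")).hexdigest()


def match_patches(old: List[Patch],
                  new: List[Patch]) -> Dict[int, Optional[int]]:
    """Map each index in `new` to the matching index in `old`, or None."""
    by_content = {content_id(p.diff): i for i, p in enumerate(old)}
    by_change = {p.change_id: i for i, p in enumerate(old)}
    result: Dict[int, Optional[int]] = {}
    for j, patch in enumerate(new):
        # Content match survives reordering; change ID survives edits.
        i = by_content.get(content_id(patch.diff))
        if i is None:
            i = by_change.get(patch.change_id)
        result[j] = i
    return result
```

The split-patch question raised in the talk shows up here directly: if one old patch becomes two new ones, this sketch would map both to the same old index only if both kept the old change ID.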
So I wonder if it would be possible to add a change ID without breaking Git. I would love to try that. Yeah, you'd have to know. Exactly, and come up with, here's a random 128-bit number, let's try that, or something sensible. Any other questions? Yes. So the question was, since I'm using a gitlink in a tree to reference a commit but not using a submodule, so I don't have a .gitmodules file and it's not a Git submodule, would that break assumptions in tools that a gitlink always refers to a submodule? So yes, it would break that assumption. It's not abusing the format, in that there's kind of a mechanism versus policy split here, where the only thing native to Git is the gitlink, and then .gitmodules and the metadata in .git/config are metadata for git submodule, the tool. It's not required that a gitlink always refer to a submodule. It's possible there may be other tools out there that make that assumption, but I have seen non-trivial software other than Git series that uses gitlinks and doesn't use submodules for them. So in general that's not really a safe assumption. I think the easiest thing you may want to look for is, if there is a .gitmodules file that references it, it's a submodule. And on the flip side, a submodule will not typically have the commit referenced from the gitlink available in your local repository. It will be in a separate repository, and there will be submodule metadata in the config and in .gitmodules. So the two cases are distinguishable, but yes, a tool that was assuming that every gitlink is a submodule would trip on these particular commits. I think this will be the last question, there's about a minute left. How well has this been received? So this was released maybe a few weeks ago. So far the response has been fairly good.
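The check suggested in that answer can be sketched as follows. A gitlink is a tree entry with mode 160000, and the sketch treats it as a submodule only if its path is also listed in .gitmodules. The function name, the input shape (mode and path pairs plus a set of submodule paths that the caller has already read from .gitmodules), and the example path "series" in the test are assumptions for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

GITLINK_MODE = 0o160000  # tree entry mode Git uses for a gitlink


def classify_gitlinks(entries: Iterable[Tuple[int, str]],
                      submodule_paths: Set[str]) -> Dict[str, str]:
    """Label each gitlink entry as a submodule or a plain gitlink.

    `entries` are (mode, path) pairs from a tree. `submodule_paths` is
    the set of paths listed in .gitmodules, gathered by the caller.
    Non-gitlink entries (files, directories) are skipped.
    """
    result: Dict[str, str] = {}
    for mode, path in entries:
        if mode != GITLINK_MODE:
            continue
        if path in submodule_paths:
            result[path] = "submodule"
        else:
            result[path] = "plain gitlink"
    return result
```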
I've had several developers, including one quite excited kernel maintainer, talk about how they're using this and what they're interested in doing with it, and ask for some additional workflow changes. Somebody's already building a big stack of non-trivial local scripts written around it to manage their stack of interdependent patch branches, and that thrilled me no end. The feedback on the fact that it's written in Rust has been a little bit mixed. A bunch of people who enjoy Rust already are excited about it. A couple of core Git developers said, well, this isn't written in C so I'll never touch it. I wrote it in Rust knowing there would be people who would have that reaction, and I don't really care. I would like there to be less C code in the world. So I'm fine with that. Apart from that, lots of good discussion, lots of feedback, the vast majority of it positive, and then just some discussion about how this interacts with other tools and what other format choices could have been made. So I'm happy to chat with people afterward, have any other conversations about this, or follow up via email or similar. Thank you very much.