 My name is Michael Haggerty. I'd like to talk about Git iMerge, incremental merging for Git. This was originally planned for a longer time slot, so I think I'm going to have to go pretty quickly. The two common ways of bringing two branches together would be to use Git merge or Git rebase. They both have big problems, which I'm sure you're aware of. I'll go through quickly. Git merge. You have one big commit that you have to resolve all in one burst. Resolving big conflicts is hard. Merging is all or nothing. Once you've started a Git merge, you can't really do anything else with your repository until you've either resolved the conflict or given up and gone back to your old version. You can't record your progress. You can't switch to another branch while you're in progress. If you make a mistake, you can't go back. It's all or nothing. There's no way to test a partially merged tree. Once you have those merge conflict markers in your code, you can't build your code. You can't test it. You have to resolve the whole conflict, and then you can test. It's hard to collaborate on a merge. There's no way to push a conflicting merge to your repository. So Git rebase, on the other hand, is a little better in some ways and worse in other ways. For Git rebase, you have to resolve each of your branch commits with all of the commits you've made on master. I'll talk about branch and master. They could be generic branches, of course. Which is better than Git merge. Merge, you have to bring all of your changes together at once. Rebase at least breaks it up a little bit, and I think you've noticed that that's usually easier. But rebasing is also all or nothing. Once you're in the middle of a rebase, you can't use a repository for anything else. It's hard to store a rebase that's in progress. It's hard to share the work with somebody else. You can't collaborate on a merge. And once you're done, rebasing discards history. Is this on now? OK, rebasing discards the final history. 
And so what you're left with is a repository that doesn't include all of the things that have happened to it. And that has a lot of side effects. Well, the main one is that it's not recommended to rebase work that's already been published. Because if somebody else has based their work on your original branch, they're going to be screwed when they try to rebase or merge their branch. So rebasing is hostile to collaboration. So I'm going to skip ahead a little bit. And then I'll come back and summarize the advantages of incremental merge. But first I just want to give you a sketch of the basic idea.
So suppose you want to merge a branch into master. And I've just drawn this in the traditional way. You have some commits on master. You have some commits on branch. And you want to bring them together. The first step is just to draw the diagram a little bit differently. I've just drawn the branch commits down the left side. And now one of the main problems with merge is you've got so many changes on master, so many changes on branch, and you have to bring them together all at once. So let's do the smallest thing possible to try to make progress towards the full merge. The smallest thing is we just merge the very first commit on master and the very first commit on branch. This is a lot easier, of course. There are fewer changes. You've got a log message for each of those two changes. You can look at the log messages, see what those commits were trying to do, bring them together. In fact, it's pretty likely there's not even going to be a conflict. And Git can do that for you by itself. And now, yeah, so this is important. At that point, you can store that merge commit into your repository with two parents. You can tell Git, yes, I've done this merge between commit one and commit A. And here's the result. And Git can remember that, can store it in the repository with two parents, one and A.
And then Git knows what you've done and has a true picture of what's been accomplished so far. Then you just continue. So this next commit is maybe not quite so obvious, but you've got commit B1. It's the merge of the merge commit that you just made, A1, plus the branch commit B. So what this B1 commit includes is the change that was made on master in commit one. And it includes the changes made in commits A and B on branch. But the change from commit A is already merged in. That's already in A1. So you're not going to need to worry about conflicts that were already resolved in that first merge. What you really need to do is, I think I put something here, you need to add the changes that were made in commit B on the branch to the state that already exists in commit A1. Or equivalently, you need to take the changes that were originally made in branch, I'm sorry, in master commit one, and apply them to the state that's visible in commit B. So the key point is that this commit should be no harder than commit A1, because you're bringing one change from one side, one change from the other side, doing a pairwise merge, pairwise conflict resolution, and then you're done. And when you're done, you commit it as a merge commit with two parents, B and A1.
Now, it should be pretty obvious what you do. You just keep going. Let me just take a typical commit here in the middle, because it's a little bit less obvious, but it's really the same principle. This commit C2 is taking the change that was made in master commit number two and adding it to the state that's already recorded in merge commit C1. Or equivalently, it takes the changes that were made in branch commit C and adds them to the state that's already recorded in your merge commit B2. Is it clear? Are there any questions? In all of these commits, you're just taking one commit from each side. You're doing a pairwise merge of those two.
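To make the grid concrete, here is a minimal Python sketch of the naming and parent rule described above. It is only a model of the diagram, not code from the tool: the function names `label` and `parents`, and the name "base" for the common ancestor, are assumptions made for illustration.

```python
from string import ascii_uppercase


def label(row: int, col: int) -> str:
    """Name a grid cell as in the talk.

    row counts commits on branch (A, B, C, ...), col counts commits on
    master (1, 2, 3, ...). Row 0 and column 0 hold the original commits.
    """
    if row == 0 and col == 0:
        return "base"  # common ancestor; this name is an assumption
    if row == 0:
        return str(col)  # original master commit
    letter = ascii_uppercase[row - 1]
    if col == 0:
        return letter  # original branch commit
    return f"{letter}{col}"  # incremental merge commit


def parents(row: int, col: int) -> tuple[str, str]:
    """Return the two parents of the merge commit at (row, col).

    Each merge combines the cell to its left (same branch commit, one
    master commit fewer) with the cell above it (one branch commit fewer).
    """
    if row == 0 or col == 0:
        raise ValueError("row 0 and column 0 are original commits, not merges")
    return (label(row, col - 1), label(row - 1, col))


print(parents(1, 1))  # A1 is the merge of A and 1
print(parents(2, 1))  # B1 is the merge of B and A1
print(parents(3, 2))  # C2 is the merge of C1 and B2
```

Every merge cell has exactly two parents, one step back on each side, which is why each step stays a pairwise merge no matter how large the grid gets.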
You have both of the original commit messages to see the intent of those two sides. You bring them together, and you record the result. So you, oh, this is an important take-home message. I shouldn't skip this, I guess. Oh, I didn't explain that already. OK, so here we go.
Now we have a problem. Not all of these commits are going to be conflict-free. So the tool, Git iMerge, can do all of the conflict-free merges, all of the merges that have no conflicts, automatically. At some point, the user is going to have to get involved. There will be some pair of commits that don't merge, that have a conflict, and then you're asked to resolve it, similar to Git rebase. Git rebase goes da, da, da, da. As long as it's happy, it just keeps going. When it gets to a conflict, it presents you with a merge conflict, and you resolve it. You commit it, and you check it in. Now you have F2. And this is what you do. You continue through the whole diagram.
Now what can you do once you've got one of these diagrams? You've done all of these incremental merges. I'll tell you later how to make them easier and automated. The fact is that one of these completed diagrams contains all of the information you might ever want to know about bringing these two branches together. And I'm going to show you concretely. In fact, it knows too much information. You're probably going to want to throw some of the information away to avoid cluttering up your repository with it permanently. So the first question, where is the simple merge of branch and master? And the answer is, this is commit G11. G11, as you can see, contains all of the incremental merges going across to master. All of the commits going down to branch. They're all together in G11. So the contents, the tree of G11, if you were to check it out, is exactly the tree that you would hope to get if you merged master and branch together directly. But the picture looks different, obviously.
The reason is that we have too much information. But what you can do is you can discard the information of all of these internal commits. Take G11 and rewrite it to have only two parents, namely 11 and G. Once you've rewritten it, the other commits are no longer reachable. They just get garbage collected sometime. And what you have is a merge commit. It might look more familiar like this. But topologically, it's the same thing. And the contents are exactly the same. You have a merge commit.
Here's the next question. Where is the rebase of branch onto master? Where is the result of a rebase? Where do we find that in this diagram? The answer is, it's the rightmost column. Those commits in the rightmost column are essentially the commits A through G rebased over to the current tip of master. And again, you can have the tool rewrite your history to look like this. And now it looks more like a rebase. It looks as if you had done these changes A through G on the tip of master rather than where you actually did them in the past. Similarly, you can find the rebase of master onto branch. That's just the bottommost row of commits. And by rewriting the commits, you can transform the repository to look like this. Once you've done that, you can push the results. Nobody will even know you've just Git merged. Git iMerged, I'm sorry.
If you have already published your original branch, that is the A through G version of the branch, and you still want to do a sort of a rebase, but you know you're going to screw over your colleagues if you do that and commit it, there's another alternative. You can rewrite the history to look like this. These are the same rebased commits that you would get if you did a normal traditional rebase, but each of the rebased commits has as a second parent the original commit from the old version of the branch, the branch before it was rebased. This is something I call rebase with history. The tool can do this for you.
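The three ways of simplifying a finished grid can be sketched as parent maps. The following Python is an illustrative model under stated assumptions, not the tool's implementation: the function names are hypothetical, commits are named as in the diagram (branch commits A, B, ... and master commits 1, 2, ...), and each result maps a commit to the parents it would have after rewriting.

```python
from string import ascii_uppercase


def simple_merge(n_branch: int, n_master: int) -> dict[str, tuple[str, ...]]:
    """Keep only the bottom-right commit, with the two branch tips as parents."""
    last = ascii_uppercase[n_branch - 1]
    return {f"{last}{n_master}": (str(n_master), last)}


def rebase(n_branch: int, n_master: int,
           with_history: bool = False) -> dict[str, tuple[str, ...]]:
    """Keep the rightmost column as a chain starting at the tip of master.

    With with_history=True, each rebased commit also keeps the original
    branch commit as a second parent ("rebase with history" in the talk).
    """
    result: dict[str, tuple[str, ...]] = {}
    previous = str(n_master)  # the chain starts at the tip of master
    for letter in ascii_uppercase[:n_branch]:
        commit = f"{letter}{n_master}"
        result[commit] = (previous, letter) if with_history else (previous,)
        previous = commit
    return result


print(simple_merge(7, 11))
print(rebase(7, 11))
print(rebase(7, 11, with_history=True))
```

In all three cases the tree of the final commit G11 is the same; only the recorded parents differ, which is what decides whether the result looks like a merge, a rebase, or a rebase that still reaches the old branch.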
The nice quality of it is that it keeps both the old and the new versions of the branch in your permanent history. Anybody who's based their work on the old version of the branch can just rebase or merge like they would normally want to do. Therefore, it makes for a version of rebase that's friendly to publishing, that you can publish.
Now, just a quick word about efficiency. The tool doesn't actually have to do all of the incremental merges between master and branch. So this would be a typical scenario. You have three conflicts marked with X's here. The tool has figured out where these conflicts are, but there's this big block of merges that are all done. It turns out the tool doesn't need to do all of the intermediate merges. There's an algorithm based on bisection that can find where the conflicts are via a heuristic. And once it knows where the conflicts are, it just fills in the boundary of the rectangle that goes up to the conflict. So this rectangle E through E6 through 6, this rectangle can all be done very quickly. So it's not nearly as processor intensive or, I don't know, resource intensive as it might be.
Where are we here? A quick demo. So there is a tool I'm working on. It's still very much beta, but it's up on GitHub. It's called git-imerge, with a dash in between. And I just wanted to show very quickly how it works. You start it very much like a Git merge. You check out the branch you want to merge to. You do Git iMerge start, and then here on the right side is the branch that you want to merge into your starting branch. And you have to give the merge a name, this name merge branch. I'll tell you why in a moment. Then the tool just works away. It tries the automatic merges, doing a bisect to find where the conflicts are. Once it does, it uses these autofilling commits to outline the rectangles of the mergeable areas. And all of these things get recorded to your repository.
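The bisection idea mentioned above can be illustrated with a small Python sketch. It is not the tool's actual algorithm. It assumes a hypothetical predicate `merges_cleanly(k)` that reports whether the first k commits of one side merge without conflict, and it relies on the heuristic assumption that once a merge conflicts at k, it also conflicts for every larger k.

```python
from typing import Callable, Optional


def first_conflict(n: int, merges_cleanly: Callable[[int], bool]) -> Optional[int]:
    """Find the smallest k in 1..n for which merges_cleanly(k) is False.

    Returns None if all n commits merge cleanly. Uses about log2(n)
    trial merges instead of n, under the monotonicity assumption.
    """
    if merges_cleanly(n):
        return None  # the whole range merges; nothing to bisect
    low, high = 0, n  # invariant: low merges cleanly (0 = nothing merged), high conflicts
    while high - low > 1:
        mid = (low + high) // 2
        if merges_cleanly(mid):
            low = mid
        else:
            high = mid
    return high


# Hypothetical example: 11 commits, where the first conflict is at commit 6.
attempts = []


def probe(k: int) -> bool:
    attempts.append(k)  # record each trial merge
    return k < 6


print(first_conflict(11, probe), attempts)
```

The heuristic can be wrong when a later commit happens to undo a conflicting change, which is one reason the talk describes the search as heuristic rather than exact.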
Once it's found the border of what it can do automatically for you, it switches over to this conflict situation where it first shows the first and second commits that conflicted, but the versions that were originally from master and from branch. So the commit messages that came with these commits should explain what those commits are trying to do. And then it asks you to fix the commit and type continue. There's a very crude diagram functionality in here. It shows the state of your merge. These question marks are merges that could be skipped. The rectangles here are the mergeable areas. And this hash mark shows where the conflict is, and there can be more than one conflict. You fix the change, Git add, Git commit, and then type Git iMerge continue and da-da-da-da. It goes on and on. It might find more conflicts, because as it's going, it covers more of the diagram. When it's done, you can look at the diagram again. You see by the fact that the bottom right corner is filled in that the merge is finished. And then you can Git iMerge finish and you can say what your goal is. If you want a simple merge, it'll convert it to a simple merge for you. If you want a rebase, it'll convert it to a rebase, or a rebase with history. And I'm going to add a couple of other alternatives. That's not so interesting.
So let me just summarize what I see as the advantages of this style of merging. You only ever have to resolve conflicts that involve one commit on branch, one commit on master. And small conflicts are much easier to resolve than big conflicts. In fact, big conflicts can be completely intractable. I've had branches that I had to abandon because I couldn't merge them. This would be a hope of rescuing some of those old branches. You can see the individual commits, their commit messages and so on, and that helps you figure out how to resolve the conflict. Git iMerge records all of the intermediate states.
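The demo steps described above (start, diagram, continue, finish) can be written down as a command sequence. The Python below only builds the argument lists and does not run anything. The helper name `imerge_commands` and the example names are hypothetical, and the `--name` and `--goal` spellings are given as an assumption about the tool's command line, so check `git imerge --help` for the installed version before relying on them.

```python
# Goals named in the talk; the tool may offer others.
GOALS = {"merge", "rebase", "rebase-with-history"}


def imerge_commands(name: str, branch: str, goal: str = "merge") -> list[list[str]]:
    """Return the command lines for one incremental merge, in order.

    Run them from the branch you want to merge into. After each conflict
    the talk describes fixing the files, then `git add` and `git commit`,
    before continuing; those manual steps are not listed here.
    """
    if goal not in GOALS:
        raise ValueError(f"unknown goal: {goal!r}")
    return [
        ["git", "imerge", "start", f"--name={name}", branch],  # begin, try automatic merges
        ["git", "imerge", "diagram"],                          # show the crude state diagram
        ["git", "imerge", "continue"],                         # resume after resolving a conflict
        ["git", "imerge", "finish", f"--goal={goal}"],         # simplify to the chosen result
    ]


for command in imerge_commands("merge-branch", "branch", goal="rebase-with-history"):
    print(" ".join(command))
```

To execute a step for real, a list like these could be passed to `subprocess.run(command, check=True)` inside a repository; the `continue` step repeats once per conflict.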
Every time you resolve a little conflict, you store it to your repository. It's there, it's permanent. In fact, you could push it, you could pull it, you could share it with a colleague, you could ask the colleague to fix a merge conflict in his part of the code. This isn't all implemented, but it's pretty trivial because everything's stored in the repository with the correct history. It never shows the same conflict twice. Once you've resolved the conflict, Git knows how that conflict resolution can be applied to further merges. You can test every intermediate state. Once you've solved one of these mini conflicts, you can run your test suite on it and see. You could even tell Git iMerge to run the test suite as part of its automatic search for conflicts. If you're not only interested in the textual conflicts that Git finds for you automatically, you could run your test suite. And if the test suite fails, you call that a conflict too, one that needs human attention. If there's a problem and you don't discover it right away, you can use Git bisect to find which merge in that grid was done incorrectly. And it's automated through this script. It's surprisingly fast, thanks to Git being so fast and also because of this bisection algorithm that makes it possible to do it quickly. And the final result can be simplified to be saved in the permanent record as a normal merge, a normal rebase, or a rebase with history.
So I think that's all I wanted to say. If there are any questions, I'd be glad to... These are very early results still, and so I'd love to have feedback, testing, and people contributing. Thank you.