Hey, folks. As we've seen in recent episodes of Code Club, Git and GitHub together make a tremendously powerful tool. Even if you don't buy into the idea of using Git for version control, for keeping track of the different versions in the history of our project, we've certainly seen that we can exploit the niceness of the folks at GitHub to put up a website and to use their servers to run a reproducible pipeline, as we did in the last episode. As nice as all these features are on GitHub, what's really most important about Git is version control. In today's episode, I'm going to talk to you about a tool within Git called rebase that can be used to rewrite or repackage old commits to make it easier for you to navigate and see what changes were made over the course of your project. Over here on GitHub, I'm at github.com, riffomonas slash drought index, all those s's and slashes. Anyway, we are in our project's repository here on GitHub. And if we go ahead and look at the upper right corner of this box, we'll see a clock going counterclockwise. And at least on mine, I've got 35, indicating that there have been 35 commits over the history of this project. If we click on that and scroll all the way down, we'll see, back on September 7th, the initial commit, when this was just a dream. But if we scroll back up to more recent history, we see that it's been three days since I put up the workflow that generates the website with the drought visualization that we created as part of this series. I've noticed two things in here. First, on October 18th, a week or so ago, I created the index page, the web page that we see when you go to riffomonas.org slash drought index. We created that page in two steps. We created a hello world, just to prove to ourselves that this was working, that we could render something from GitHub onto a web page. 
And then we made it pretty using R Markdown. Second, if we go forward to the next time I was working on this, you'll see that there are quite a few commits that I made on October 25th, where I created the pipeline. They run from creating the run pipeline YAML file all the way through deleting the index.html file, so we could recreate everything by rerunning the Snakemake file and then recommitting everything. Anyway, there's a whole bunch of steps in here that could be squashed down into a single step that we might call create and run workflow. So that's exactly what I'm going to do in today's episode of Code Club. We'll start with the simple case of these two commits and squash them down into one. And then we'll do the more complicated case, which is truly only complicated because there are more commits, and squash all of those down into a single commit. On GitHub, one of the nice things is that we can track the changes to the repository and see what happened at each commit. Where we have these two angled icons, the less than and greater than signs, we can see what the repository looked like at that point in the history. We could then see index.html for hello world. All it said was hello world, right? But if we come back to our history and look at what the repository looked like after the next commit, we come down and we see index.html, which is now the fully fleshed out HTML version of the file, right? So I don't know that I really need to keep track of when it said hello world, right? If we come back to our history, we can also click on the individual commits to see what changes were made at each step. So for this first commit, the hello world commit, we see that we added hello world. If we come to the next commit, we see that we made a couple of changes. We added the line .vscode to the .gitignore. I must have been seeing some of the files from VS Code that I didn't want to commit, so I ignored them. 
And then also in the Snakefile, I indicated that we need to build the drought.png file here, as well as the index.html file. This line is marked with a minus because we removed the line without a comma and added a line with the comma, right? Then we also added a rule here. And then in our environment file, we added R Markdown, right? We can see all the individual changes that happened along the way, in addition to the creation of the index.Rmd file and the index.html file. So this is a split view, where we can see the differences side by side. We can also use the unified view, where we see all of the changes together in a single column, right? So for example, in the Snakefile, we can see that we removed this line and added this line, as well as index.html, to the rule targets section of that Snakefile, right? So again, this can get a little bit cumbersome as we start thinking about longer and longer series of commits, like we had up here where we were making the pipeline. I could look at this first commit and see that I added in my demo. And for each commit in the whole series on October 25th, we could come in here and say, well, this is how things look different. But that isn't really showing me the full change over that series of commits on that day. What I'd like to do, to make it easier to navigate all these different commits, is to squash them together into basically a single commit, so I can look at the before and after of what the project looked like instead of looking at each step of the commit history. To do all this, we'll come over to Visual Studio Code. I don't know that I need to be in an environment, but it's always good to be in one. So I'll start out by doing conda activate drought, and I'll do a git status. And if I do a git log, we see that the last commit that I made was generate pretty index page. 
And if I come back to my history, that was before I implemented the workflow. So I need to go ahead and do a git pull to pull down the additional changes that were made to bring about the workflow. Now if I do git status and git log, I see that I've got the new day's rendering for yesterday afternoon, as well as for Wednesday and Tuesday, as well as that section of commits where I built out the workflow. To squash together different commits, we're going to use a command within Git called git rebase. To use git rebase, I need to know what commit I want to go back to. We've seen that git log shows all these different commits, and I could certainly also go back to GitHub. Another, more convenient way is to run git log with the --oneline argument. This puts each commit on a separate line, right? The first things that I want to combine are these two lines, generate pretty index page and hello world. So I'm going to go ahead and grab the commit hash just before those two, right? I'll copy that. And then I'll do git rebase -i, where the -i is for interactive. I've only ever seen git rebase used in interactive mode. So although there are many different arguments and options for rebase, what I'm showing you today is the most commonly used. I'll go ahead and do git rebase -i, and then I'll put in the hash there. This then opens up for me here in nano. It's a bit odd that I'm in an editor like VS Code and I'm using nano. I suspect that if I knew how to use the tool better, it would open up in the VS Code text editor, but whatever. What we see here are the last 16 commits, listed from oldest to newest. And again, for now, I want to squash together these first two commits. So what I will do is come down to the second one, because I want to squash this second one together with the first one. 
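For anyone following along at a terminal, here is a minimal sketch of the "find the hash, then rebase" step in a throwaway repository. The repository, file contents, and user name are made up for illustration; only the commit messages echo the episode. The last line shows one way to get Git to open VS Code instead of nano, assuming the `code` command is on your PATH.

```shell
# Throwaway repository so nothing real is touched (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Three small commits standing in for the project's history.
for msg in "initial commit" "hello world" "generate pretty index page"; do
    echo "$msg" >> index.html
    git add index.html
    git commit -q -m "$msg"
done

# One commit per line, newest first: abbreviated hash, then the subject.
git log --oneline

# The hash handed to `git rebase -i` is the commit just BEFORE the oldest
# commit you want to edit. Here that is the parent of "hello world".
base=$(git rev-parse --short HEAD~2)
echo "next step would be: git rebase -i $base"

# Optional: use VS Code for the rebase list and commit messages in this repo.
# Setting the option does not launch the editor.
git config core.editor "code --wait"
```

Running `git rebase -i` with that hash lists every commit after it, oldest at the top, which is why the two commits to combine show up as the first two lines.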
And so instead of pick, I'll put in squash. This will squash the 16th and 15th most recent commits together. I'll go ahead and save this with Ctrl-O and then exit out with Ctrl-X. This then opens up an interactive way of defining a commit message. It says this is a combination of two commits. This is the first commit message, hello world. This is commit message number two, generate pretty index page. I actually want to use generate pretty index page as the message for my commit. So I'll go ahead and do Ctrl-K to cut that and then Ctrl-U to uncut it. Then I'll go ahead and delete these lines. And I'll again save with Ctrl-O and then exit with Ctrl-X. Now, if I do git log --oneline, I see that I have squashed those two commits together into generate pretty index page, and I no longer have that hello world. What I can do now is push this up to GitHub. To do that, I can do git push. However, this causes a problem. And the problem is that we are rewriting history that we have already made public to the world. So one of the challenges of using git rebase is that you're rewriting history. And yes, we're rewriting history, but we're not really deleting any changes, we're kind of repackaging them. But if you've made this public and someone needed that hello world commit, or they were depending on that hello world commit existing, or they'd already pulled down a copy of my repo, they wouldn't have the same commits that I have. Actually, they'd have more commits than I have, right? There'd be that inconsistency. So there's a trade-off between rewriting history that you have perhaps made public and making your history a little bit more concise. Some people love git rebase. Some people absolutely hate git rebase and think it's antithetical to the whole idea of version control. Anyway, I'm not going to take a side. 
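The squash itself can be tried safely with a sketch like the following. It is not what I typed in the episode: a `sed` command stands in for the hand edit in nano (changing the second `pick` to `squash`), `GIT_EDITOR=true` accepts the combined commit message without opening an editor, and an amend sets the final message afterwards. The repository and its contents are made up.

```shell
# Throwaway repository (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

for msg in "initial commit" "hello world" "generate pretty index page"; do
    echo "$msg" >> index.html
    git add index.html
    git commit -q -m "$msg"
done

# The list that `git rebase -i HEAD~2` opens starts with (oldest first):
#   pick <hash> hello world
#   pick <hash> generate pretty index page
# Changing the second `pick` to `squash` folds that commit into the first.
# GIT_SEQUENCE_EDITOR edits that list; GIT_EDITOR handles the commit message.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i HEAD~2

# Keep only the second message, as in the episode.
git commit -q --amend -m "generate pretty index page"

git log --oneline
```

The file still contains everything both commits added. Only the packaging of the history changed, from three commits to two.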
I don't know that anyone has pulled down my version of this repository. Maybe you have. But what's more important to me is showing you how we can use git rebase to make our repository a bit simpler. So now what we can do is git push --force. Using git push --force will force the pushing of the updated repository up to GitHub. And now if I come back to my history and do a refresh, I see that basically the whole commit history has been moved up to October 28th, which is today. Maybe I don't care so much about the date, but generate pretty index page is all squashed together as a single commit, whereas before it was two separate commits. And again, if I look at the changes that happened in this commit, I see that .vscode line, I see the addition of the index.html file, I see all these modifications to the Snakemake file, the addition of R Markdown to the environment file, and then the creation of index.Rmd as well as the presence now of index.html, and all of that is new. There's no legacy of that hello world index file. Now what I'd like to do is another iteration of git rebase to squash together all of the commits that went into making the workflow and running the workflow. So let's come back, and again we'll do git log --oneline. The commit I want to go back to, of course, is generate pretty index page. So we'll come back to here and I'll copy that. And then I'll do git rebase -i and the first seven or so characters of that hash. Again, this brings up our interactive git rebase interface. And again I'm going to leave the first one as pick. That means we want to keep this commit, and then everything else below it I'm going to squash into it. So I'll go ahead and remove this pick and put in squash, and the same for this one. Again, if you have a fancy text editor, that might make a lot of this editing a lot easier. But nano's okay. And squash. Right. 
And then this new day's rendering, I'm going to go ahead and squash that in, as well as specify time. That was when we changed the workflow from running on a push to running on a schedule. And then I'll also do this one, where we deleted the PNG and HTML files to make sure that when we updated the workflow there was actually something that had changed to then commit. And I will leave those new day's renderings for the last three days in here. So now again, I can go ahead and do Ctrl-O and Ctrl-X to save and exit. Again, this brings me to my commit message. I'm going to go ahead and use Ctrl-K to delete all of these lines in here. And I will write a new commit message. So I'll say create GitHub Actions workflow to run Snakemake pipeline. I'll save that and then exit out. And so now here's the test. I do git log --oneline. I now see that I've got create GitHub Actions workflow to run Snakemake pipeline, and then generate pretty index page. What we saw before was maybe a dozen different commits that were tweaking one thing at a time as we were building out that YAML file that ran the workflow, or that changed the Snakemake file to add in the conda environments, and things like that. Right. And so on the whole, all those changes are now condensed down into a single commit, which makes it a lot easier to go back and see what changed at each stage. I'll go ahead and hit Q to get out of that view. Then I do git push --force, again because we did rewrite some history that we'd already made public, so we need that --force to force pushing it up to GitHub. Coming back to my commit history here on GitHub, I'll go ahead and refresh the page. And now what I see is that I still have these three new day's renderings, but create GitHub Actions workflow to run Snakemake pipeline is the commit after generate pretty index page. 
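To see why the plain push is refused and what the forced push does, here is a minimal sketch that uses a local bare repository as a stand-in for GitHub. The directory names and contents are hypothetical, and `sed` again stands in for the hand edit of the rebase list.

```shell
# A local bare repository plays the role of the GitHub remote (hypothetical).
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

for msg in "initial commit" "hello world" "generate pretty index page"; do
    echo "$msg" >> index.html
    git add index.html
    git commit -q -m "$msg"
done
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"        # publish the original three commits

# Rewrite the published history by squashing the last two commits.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i HEAD~2

# An ordinary push is refused: the remote has commits the local branch lacks.
if git push -q origin "$branch" 2>/dev/null; then
    echo "push accepted"
else
    echo "push rejected: the remote still has the old history"
fi

# Forcing the push replaces the remote branch with the rewritten history.
# (git push also offers --force-with-lease, which refuses to overwrite
# commits on the remote that you have not fetched yet.)
git push -q --force origin "$branch"
```

After the forced push, anyone who cloned the old history has commits that no longer exist on the remote, which is the inconsistency to weigh before rewriting anything public.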
And if I click on this commit, again, I see the creation of the run pipeline YAML file. For my purposes, I don't need to see the iteration and development of this file. I suppose if you had a version that was working and you liked it, and then you came back the next day and made some changes to update it, I probably wouldn't squash those two commits together, because maybe you changed something in between that breaks the workflow. Then you wouldn't really be able to go back to before the change, right? So you have to be thoughtful about what you're squashing together, because you can't very easily go back into a commit that you've already squashed, right? You have to be confident that the things belong together in a single commit and that things didn't break after that subsequent commit. So then we have these changes to the Snakefile again, where we added things. And I think this all looks pretty good. Again, there's the addition to the environment file and then the change to the index page. So obviously, I also deleted the index.html and world drought PNG files, and it'd probably be good to have those back in there. And you know what, maybe I will come back and add those in. So again, I could do another git rebase on that. I think I'll squash the first new day's rendering in with the pick. And I will then save and exit out. And I will go ahead and use create GitHub Actions workflow as the commit message. So again, I'll save that, do git push --force, and then refresh. And so now we see that we have that create GitHub Actions workflow commit. And if we scroll down to the end, we see that we have the deleted and added versions of the files, going from the one from the 17th to the one from the 25th. I think that's enough said about git rebase. Again, know that there is a bit of controversy over whether or not we should be rewriting or reworking the history. 
Once it's been made public, one downside of using git rebase is that if someone else has taken a copy of your repository, your histories might not align with each other. The other downside of git rebase is this: say you make a commit, but you haven't fully tested your code. If you squash that new commit together with the old commit, then it won't be possible, at least not very easily, to go back to before the last commit that you made to figure out what changed to break your results. If you're confident, though, that things work after you make those subsequent commits, go ahead and squash them together. You can think of that commit as representing a single feature that you have added to your repository. And that's exactly what we did. We added the single feature of running a Snakemake workflow with GitHub Actions up on GitHub. This is obviously a bit more of an advanced concept in using Git and version control. But I think it's also useful for helping us think about how we organize the version history of our repository. And that's really valuable, because ultimately Git is about version control and being able to look back through the history to see how things have changed, right? If our version history is just a big dumpster fire, then that's not very helpful. I know people find this very useful when they're making subtle tweaks trying to get something to work and they make a commit after each change. They end up with a version history that says trying this, trying something else, or they start inserting f-bombs or all sorts of other crazy things. Git rebase will help them squash all of those together into a single commit. And I think that is a very useful application of git rebase. All right, give this a shot with some of your projects. Let me know how it goes. And we'll see you next time for another episode of Code Club.
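One way to soften the "can't easily go back" downside is to park a branch on the old history before squashing. This wasn't part of the episode, so treat it as a hedged sketch: the branch name `backup-before-squash`, the file name, and the commit messages are all made up, and `sed` stands in for editing the rebase list by hand.

```shell
# Throwaway repository with a run of "trying this" style commits (hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

for msg in "initial commit" "add workflow" "trying this" "trying something else"; do
    echo "$msg" >> workflow.txt
    git add workflow.txt
    git commit -q -m "$msg"
done

# Keep a pointer to the un-squashed history before rewriting anything.
git branch backup-before-squash

# Squash the last three commits into one: lines 2 and 3 of the rebase list
# change from `pick` to `squash`, line 1 stays `pick`.
GIT_SEQUENCE_EDITOR="sed -i.bak '2,3s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i HEAD~3
git commit -q --amend -m "create GitHub Actions workflow"

git log --oneline                       # two commits on the current branch
git log --oneline backup-before-squash  # the original four are still here

# No output and exit status 0 means the files are identical on both branches.
git diff --quiet backup-before-squash HEAD && echo "same files, tidier history"
```

If one of the squashed steps turns out to have broken something, the backup branch still has each step as its own commit, and it can be deleted once you're confident the squashed commit is good.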