Hey folks, I'm Pat Schloss and this is the 181st episode of Code Club. No, that's not some amazing milestone, except man, I've made a lot of these, haven't I? So you might not know this, but every episode that I prepare here on YouTube has a chunk of code that I make available through a website called GitHub. We also have the code stored on my blog for Code Club, but all of that is also up on GitHub. And oftentimes I will say, if you want to get the code that I'm working with, check out the link down below in the description. Well, today I'm finally going to show you how you can go to that link and get caught up, so that you can get the code that I'm starting with and follow along. In addition, I'll also show you how you can see how the code has changed from the beginning of an episode to the end of an episode. We will do all of this on GitHub without having to know any Git, any version control, yourself. So even if you don't know version control, I hope to prove to you how nice it is to have code up on GitHub, how easy it is to get code and data out of GitHub for yourself and others to use, and in the end how that enhances the reproducibility of your work. And as we'll see here, it can also enhance your knowledge of R and get you caught up with the projects that I'm working on. Now, of course, if you come in midstream to any of these projects and you're trying to get caught up, that might mean that you're not following along. So please be sure that you subscribe and click that bell icon so you know when the next episodes are released.
So I've been rewatching episode 173, where I talked about taking a very repetitious set of code and replacing all that repetition with a for loop. We talked about this being DRY, where you don't repeat yourself. Well, anyway, I'm not going to talk about for loops.
But what I want to do is show you how we can go back to the code at the beginning of this episode, as well as the code at the end of this episode. So let's go ahead and see how we'll do that. This is the description down below. If you click on Show more, that opens up all sorts of other stuff in here. And so you can find my blog post about this episode at this link. So if you go ahead and click on that link, that opens up the blog post. Again, it's usually very short and compact. All it has is the opening paragraph of the description, a link to the video, as well as a section about the code. And so it says you can browse the state of the repository at the beginning of the episode and at the end of the episode.
So let's go ahead and look at the beginning of the episode. So this is the GitHub page for the distances project that I have as part of the overall riffomonas account here on GitHub. One thing you'll notice is that in this rectangular window here, it says 966c744d50. You don't really need to know all that. What that is is a commit SHA. It's technically a hexadecimal number that acts as kind of a signpost of where I was in the development of the code before filming the episode on creating for loops. This is what the project looked like. This is what the overall project root directory looked like. I had no directories at that point in the project, and I had seven files in one directory. Let's contrast that with where it is as of the date I'm filming this, January 20th of '22. You'll see that I now have some directory structure and some other files in here, right? And so the project has evolved a bit since that initial commit. Again, if I go back, this is what the project looked like before filming that episode. Let's pretend that we want to go ahead and get the full distance matrix, the mice Bray-Curtis dist distance matrix. I can click on that link.
This then opens up a web page that has the full distance matrix. I think there's like 340 samples in here, 348 samples in here. And it's a 348 by 348 distance matrix. This would be a bear to copy and paste, right? So I could go ahead and try to copy that and kind of pull down, and it just gets kind of kludgy, right? The better approach would be to go ahead and use Raw. And so if you click the Raw button, this opens up a text output of the distance matrix. So I could do something like Command-A to highlight everything, and then I could copy and paste it. Alternatively, I could take this and do Save Page As. I could then save it to my desktop as mice Bray-Curtis dist and save that. Yes, I want to use dist. And then I could open this file in my text editor and see that it's properly saved as text. If, instead of clicking on that Raw button, I had gone here and done Save, this would save it as an HTML file. And so it'll save all the HTML text, which I don't want. I only want the raw distance matrix. Okay. So again, if you want the raw distance matrix, click Raw, then you get the text version of that distance matrix. And this is a really good way to get one file that you want out of the overall repository.
Again, you could come back to the project root, and you could look at any of these files, like read LT matrix. And you could see what read LT matrix dot R looked like at the beginning of the episode. So again, this is a good approach for getting individual files out of a repository. This might be useful if you, say, want those distance matrix files, but you don't want to have to worry about all the other stuff going on in the project. Or if there's an individual file that somehow got screwed up on your version, and you want to get a refreshed version, this is a really good approach.
A second approach that you can take to getting caught up is, again, we're looking at the project root directory at that commit. We could then click on this green Code button.
And then there's Clone, there's Open with GitHub Desktop, but then there's also Download ZIP. And so what I would encourage you to do, if you don't want to mess around with Git, is Download ZIP. This will then download the zipped version of the project into my downloads directory, or wherever you have your browser set to download files to. Then I can double click it, and it will decompress the zip file. And you'll see that it's now distances hyphen 966c. Huh, what is that 966c? Again, that is the commit at that point in the project. So I can then open it. And voila, doesn't this look a lot like what I had over here in GitHub? Again, this is what the project looks like within its own project root directory at this specific commit. And again, this has a distances.Rproj file. I could double click on that, and this will then open up my project in the correct working directory with the code as it was at that point in the project.
So let's say we wanted to see what the project looked like at the end of the episode. Well, let's go back to the blog post. And again, we can go to end of the episode. This will take us to a different commit, again, at the end of the episode, right? And so you'll see what the project looks like here. And you could always go in and look at read LT matrix dot R. And so you could see that this looks a lot shorter than it did with that previous commit. Again, we replaced that with this for loop. But say you wanted to get a better sense of what happened over the course of that episode. Well, you can click on this 2905455. Again, that's the commit SHA for this particular point in the project. So I can go ahead and click on that. This will open up another GitHub window, where you can see all of the changes that were made to any of the files at this point in the project. And so what you'll notice is that I went from mice simple, so it's got a minus sign. So it removed that line and added in another line where it had mice.
So mice simple only had 10 samples. Mice Bray-Curtis had 348, right? And so then you can see, well, all this in red was removed. Again, that's a lot of repeated code that only worked for 10 samples and not for the full matrix. And we then replaced it with what's in green here, right? So again, this is a really nice way of looking at what has changed. This is the unified view. Alternatively, you could look at the split view. And in the split view, you get a line-by-line comparison between the two versions, with the old version on the left and the newer version on the right. Again, so we see that, you know, on line one, we basically took mice simple and turned that into mice. We then removed a bunch of lines and replaced them with what's in green here in that newer commit.
So now I'm back at the project root directory. I'm on the main branch, which means I'm at the current time in the project, not where I was back in December, but where I am today in January. And I can now look at the current state of the project. But I could also click on this 15 commits. If I click on that 15 commits, I can see each of the 15 different commits. And again, the one we started with was "replace repeated code with a for loop", right? And so the previous one was that 966c, and then 2905455, right? But I could come back and look at any of these points in the history to see what had changed, right? So I could say "read in square matrices as well", and I could click on the A1B5F0. And this will give me the split view of my code, my read LT matrix dot R. Alternatively, I could go back, and I could then click on the angle brackets, and I can now browse what the repository looked like at this point in history. And again, if I wanted to download that version of the project, I could do Download ZIP, and I could continue on like I've already shown you. Again, GitHub is a really powerful tool for making code public, as I'm doing here to get code out to you as you watch these videos.
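If you want a feel for what that unified view is doing under the hood, here is a minimal sketch in Python (not R, and not something shown in the episode) using the standard library's `difflib`. The "before" and "after" lines are hypothetical stand-ins that only imitate the shape of the change, repeated lines replaced by a loop; they are not the actual contents of the repository.

```python
import difflib

# Hypothetical "before" and "after" versions of a script. These lines are
# illustrative stand-ins, not the real code from the distances repository.
before = [
    'distances <- scan("mice_simple.dist")',
    "row_1 <- parse_row(distances, 1)",
    "row_2 <- parse_row(distances, 2)",
    "row_3 <- parse_row(distances, 3)",
]
after = [
    'distances <- scan("mice.dist")',
    "for (i in seq_along(rows)) {",
    "  rows[[i]] <- parse_row(distances, i)",
    "}",
]


def unified_view(old, new, old_label="before", new_label="after"):
    """Return a unified diff as a list of lines.

    Lines starting with "-" were removed (shown in red on GitHub) and
    lines starting with "+" were added (shown in green).
    """
    return list(
        difflib.unified_diff(
            old, new, fromfile=old_label, tofile=new_label, lineterm=""
        )
    )


diff_lines = unified_view(before, after)
print("\n".join(diff_lines))
```

Because none of the old lines survive unchanged in this made-up example, every old line shows up with a minus sign and every new line with a plus sign, which is roughly the pattern of red and green described above.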
But more important than watching these videos and using GitHub to get code that I'm trying to share with you, imagine what you could do with GitHub with your own code for your own project. No more do you need to write in your paper, "code available upon request by emailing the authors." You could instead put a link to a repository on your own lab's account that would then show people who are reading your paper all of your code, and they could theoretically go back through the history of your project and see how your project evolved over time. That is a whole other level of transparency and fostering reproducibility that I strongly encourage you to pursue. If you go look at my own lab's GitHub account, you will find links there for all of the papers that we have published over the past five or six years, where you can go in and look at how those projects have evolved over time, see the mistakes we made, see comments on the decisions we've made, and really get a better sense of how we do data science and how our thinking evolves on how to answer biological questions over the course of a project. So I hope this little, you know, very functional episode on how you get caught up with Code Club can be kind of a seed that will grow in your mind as you think about how you can make your research more reproducible and more accessible to other people who want to see all the great stuff that you are doing.
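As a recap of the two ways of getting caught up, the Raw button and Download ZIP, here is a minimal sketch in Python of the web addresses that sit behind them. This is an illustration under assumptions, not something shown in the episode: the function names are made up for this example, the file path in the usage lines is hypothetical, and the commit value is the abbreviated one read out earlier (passing the full commit SHA is the safer choice).

```python
from urllib.parse import quote
from urllib.request import urlretrieve


def raw_file_url(owner, repo, commit, path):
    """Address of the plain-text (Raw) version of one file at one commit."""
    # quote() escapes spaces and similar characters but keeps "/" intact.
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{commit}/{quote(path)}"


def zip_archive_url(owner, repo, commit):
    """Address of the zipped snapshot of the whole project at one commit."""
    return f"https://github.com/{owner}/{repo}/archive/{commit}.zip"


def download(url, destination):
    """Save whatever is at `url` to a local file (needs network access)."""
    urlretrieve(url, destination)
    return destination


# Usage: build the addresses only; nothing is downloaded here.
# "example.dist" is a hypothetical file name used for illustration.
print(raw_file_url("riffomonas", "distances", "966c744d50", "example.dist"))
print(zip_archive_url("riffomonas", "distances", "966c744d50"))
```

Pinning the address to a commit rather than to the main branch is the point: the same link keeps returning the same version of the file or project, even as the repository moves on.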