Okay, welcome back. So now we have this short section that's a Git crash course. There's already a question in HackMD: is this Git basics or something more advanced? Really, it's pretty basic. In fact, we might even say more than basic. We've noticed that many people, when they start using advanced computing, have almost no experience with Git or any version control, and this is quite a disadvantage. So what we'll do is go over how it's used and give some basic examples, but most of what you need to do is self-study anyway. So if you're good at Git, then you might just leave this running and go get your lunch, do whatever, and then come back later, or not leave it running. Our goal is to start the next segment at the expected time, so at three o'clock Helsinki time. But you might want to come back a bit early, can't hurt.
Okay, so we have a poll in HackMD here. Let's see, let me scroll down. Yes, so have you used Git before? Our goal here is to explain more about why you use Git, how we personally use it and how it relates to our research, rather than to go through the details. So even if you use Git but wonder how you should use it in practice, you might get something out of this. I like the third option here: yes, I use it, but I have no idea what I'm doing. I can certainly relate to that a lot.
Okay, so should we begin? Why do we use Git? What does it do for you? So Git is basically, the term is, version control. It allows you to save the state of your software project or whatever project, anything with files or text. And it allows you to not just save the current state but to control the different versions. So go back and forth, see what changes have happened between versions, and pick up old versions of different files, for example, and use them. So what does it mean, saving the state? What state is this that's being saved? Basically I would go with a metaphor of saving a file. 
I mean, you edit the file in the text editor, you can save the file and the file goes to the disk and then later you can open it, but that's only one version. So one way you can think of Git is that you take a snapshot of the file once you have saved it and save it even more, make it even safer, by putting it somewhere with a version number, and Git will take care of those version numbers for you.
A great example of this, and I see it far too often: someone's like, okay, I got these paper comments back and now I have to make some small changes, and you do something and then suddenly everything breaks and you have no idea what you did and you spend days trying to fix it. And there's really no reason for this to happen. With Git you can go back to any point in time. Yeah. And you can find where the thing broke, or even better, you can see the changes that broke it. If something in your code is broken, you can go back to different versions, rerun things, rerun a test, for example, and see which version broke it. And that makes debugging much easier, much faster. Yeah.
So I think there are two major strategies when people get started. One is you start a new project for yourself, and the other is you have some existing project that someone else has started and you pick up on that, so you don't have to set up everything from scratch. So which will we be doing today? Okay, so what we'll be doing is cloning someone else's, okay, there's a couple of words here, cloning someone else's repository. A repository is a folder that contains some files that you control through Git. So it's basically one coding project, for example. And by cloning, I mean we'll be taking it from a place on the internet and making a copy on our laptop. Then we'll make a change and publish the change online. Yeah. Okay, sounds good. 
So what's the first setup? To install Git and do the basic setup, how difficult is this? I found a website, and okay, I'm on Ubuntu, that's what I should start with. What I would do to install it on Ubuntu is very familiar to people who have used it: use apt, and the package name for one good option is git-all. But if you're not on Ubuntu, there is Git Bash for Windows. And on macOS, you almost certainly already have it. So let's test that we have Git: if we just run git --version and you see something, you have it. Okay, great. But yeah, Git Bash is great on Windows, great option. Yeah. And on all of our clusters and stuff like that, it's basically already installed. Yes. Yeah. Okay.
And after it's installed, is there some initial setup? Yes. So you do need to run some configuration. Git is nice in that it always gives you some guidance on what you need to do. So I never remember the commands to change the configuration properly. Yeah, I know that feeling. But Git will tell me. Unfortunately it will only tell me once we get to the commit stage. So maybe I should not try to figure it out on the fly, but rather we should go forward and do the configuration when we get to it. But basically it needs to see your email and your name to tell everyone who made the changes. That's actually a good style, because that's what happens to most people, including me: you forget to do it and then you do it. But that's fine.
Okay, so now we have Git ready. Do we clone our repository or? Yes, let's do that. Just to mention, sometimes you're starting your own project, and it is also possible to create a completely empty repository. I mean, that's how every project starts. What we'll be doing is cloning a work in progress. But maybe I should first run this command. So git is the program and init means that it initializes a new repository. Okay. So this is what you do. 
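The install check and the name and email setup mentioned here might look like the following sketch. The name and address are placeholders, and the settings are stored only in a scratch repository so that nothing on your machine changes; adding --global to the two config commands would apply them to every repository instead.

```shell
set -eu

# Check that Git is installed; this prints a line starting with "git version".
git --version

# Scratch repository so the sketch does not touch any real project.
scratch=$(mktemp -d)
git init -q "$scratch"
cd "$scratch"

# Tell Git who you are (placeholder values, not real ones).
git config user.name "Alex Example"
git config user.email "alex@example.org"

# Read the settings back to confirm them.
git config user.name
git config user.email
```

If you skip this step, Git reminds you about it at your first commit, which is what happens later in this session.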
So what do you have? Starting a new project. Yeah. And it happens in a folder. So it will control a folder, which is why I'm creating a new one. And now I can run git init safely, and it will control basically this new project folder. Okay. It tells me: Initialized empty Git repository. And from there on, we do basically the same thing as we are going to do in this cloning example. Okay. So just to mention that command, because that is something people might need. Okay.
But what I'm actually going to do is clone a specific repository from the website GitHub. Again, it's a folder with files in it. Here you already see what files this repository contains. And if you have everything set up, it is worth running this step with me, because you will need this repository later, I think sometime tomorrow. Yeah. During this course. Okay.
So here's an important point. It asks you to choose between HTTPS and SSH. You can use SSH if you have set up keys so that Git can access the files on GitHub using your username and your account on GitHub. HTTPS doesn't need that. But if you are then making changes, you of course need to somehow give a password or something, and with HTTPS it will ask for a password. This will work for a while, and sometime this year it will either stop working or require some additional setup. So actually the SSH option will probably become the simplest one. Right now though, you can just take this HTTPS version and use that.
So the command for cloning, creating a copy. Whoops. Okay. git clone. Should people clone from the upstream one or from yours? It does not make a difference as far as I know. Okay. This is the original one. So if somebody makes changes to it before you need it tomorrow, then it's probably better for you to have this one. Yeah. So this is AaltoSciComp slash hpc-examples. Can you paste that to the chat, to the HackMD? It's done. All right. 
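A minimal sketch of the two ways to start, git init and git clone. So that it runs without a network connection, a small local repository named upstream stands in for the one on GitHub; that stand-in, its single file and the placeholder identity are assumptions for illustration. The GitHub addresses in the comments are inferred from the repository name mentioned in the session and should be checked against the course material.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Starting from scratch: make a folder and let Git control it.
mkdir new-project
git -C new-project init -q

# Stand-in for a repository that already exists online (hypothetical content).
git init -q upstream
echo "Example scripts" > upstream/README.md
git -C upstream add README.md
git -C upstream -c user.name="Alex Example" -c user.email="alex@example.org" \
    commit -q -m "Add README"

# Cloning copies the files and the full history into a new folder.
# With GitHub the source would be an address instead of a local path, e.g.
#   git clone https://github.com/AaltoSciComp/hpc-examples.git   (HTTPS)
#   git clone git@github.com:AaltoSciComp/hpc-examples.git       (SSH, needs keys)
git clone -q upstream my-copy

# List the copy, including hidden entries such as the .git folder.
ls -a my-copy
```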
And I will take my own version, because then I can also change it. Okay, great. I will also actually use the SSH version here. Yeah. I mean, this way I don't have to type in the password. Yeah.
We should point out there are these two repositories. One is derived from the other, and the term here is forked, which makes it a sort of community project. So if someone else has code, you can make your fork and then work on it and then send changes back. This is something that happens on GitHub, or you can do the same workflow on basically any other website that uses Git. So GitLab, or you might have a local GitLab, for example, Aalto's is version.aalto.fi. Yeah. You can fork a repository and that makes a copy of it for you. And then you can make changes to that repository, that code. Whereas I would not be able to make changes directly to this one, because it's not mine. When I do the fork by clicking here, then it's mine. Yeah. So changes, of course, don't go here. I need to do something to get them there. That is an additional step. Yeah.
And well, when you're using this by yourself, these forks and things don't matter, but when you make something that becomes popular and many people are contributing to it, then it becomes quite a useful thing. When it's a big project, and especially if it's a project that somebody else is running and you are not really involved but you do want to suggest a change to it, then this workflow makes a lot of sense.
Okay. So let's go on. What's next? We want to make a change. What we're doing here is directly changing my repository, my version of this code. Or actually, what we're doing now is making the copy on my laptop. Okay. And now I have a copy. So I list folders. Now I have the new project, which I just created, and I also have this hpc-examples. Let's go in there to see what we have. 
We have the same folders as on the website. So here you already saw what files it should contain. Right. And we have all of them, which is nice. Okay. You also have the .gitignore; if you add -a, you will see it. That's .gitignore there. Okay.
And well, what do we want to do? We want to make a change. Should we add a file? Should we change one of the files? How about we add some new lines to the README? Maybe that's the most obvious of the things. You can use any text editor if you are following along, but I will use vim. Okay. So here we've opened the file. This is quite short. Try to add examples. And we'll do. We'll see try to cluster. So we'll do a lot more file editing tomorrow. We'll get into that later. Okay. So I'm trying to think of something clever, but I'll just add something random. So: something random. Yeah, sounds good.
Okay. So now I have changed that file, and if I wasn't talking, I would probably have already typed git status, which will show you the status. So right now there is one modified file here. It shows modified files in red. And it already tells me what to do. So I can use git restore to throw away the changes I made and get back to the last saved version. So right now I have saved this file with the changes I made to it. And if I was working only with one file, one copy, without Git, that would be it. I mean, I would have changed the file and saved it, and there would be no way of going back except to go and edit it by hand. With Git, I can get back to the last saved version of the file using git restore. So that's great. Okay. The other option is git add and the file name.
Would you like to show git diff to see what changed? Okay, git diff is also a useful command. If you just run git diff without any other parameters, it will show you all the changes made to the files in this folder. All of the files that Git is actually following. 
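The status, diff and restore steps described here can be tried in a throwaway repository. This is a sketch with made-up file content and a placeholder identity; git restore needs Git 2.23 or newer.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alex Example"        # placeholder identity
git config user.email "alex@example.org"

# One committed file to start from (made-up content).
printf 'HPC examples\n' > README.md
git add README.md
git commit -q -m "Add README"

# Edit the file, as you would in a text editor.
echo "something random" >> README.md

git status --short   # " M README.md": modified, not yet added
git diff             # the new line is shown with a leading +

# Throw the edit away and go back to the last saved version.
git restore README.md
git status --short   # prints nothing: no changes left
```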
So if I create a completely new file, it will actually say that there is a new file in the status, but it will not show anything in the diff. Okay, but anyway, it shows that there is a new line here and also a new line here that's empty. Okay, so there are some changes. And this is so useful. Whenever you go and make some edits and then something breaks and your code doesn't run or gives wrong answers, you can go look at the diff and see what went wrong. And that can save you days. I mean, I've had times when I've spent hours or days trying to undo something I did, just because I got lazy and wasn't making new commits as often as I should. Yeah, so of course, if you are not saving things often enough, this diff will get very large and then it will get harder to find the problem. But it's still easier than trying to find it in the entire code base, because you are only looking at the changes since the last saved state.
Okay, so if we want to put these changes online, the next thing we need to do is to type git add. git add will add changes that you have made, either a completely new file or an existing file. So what happened now? It didn't tell me what happened, but if I type git status, I will see that there is now a modified file that has been added. I can still restore it, but now if I run just git restore and the file name, it will actually take the one I just added. So let's try. Let's add something I don't want to add, and let's assume that that breaks my code. Okay, git status shows that I have modified and added, and then I have modified again. And we can look at git diff. That will actually only show the one I have just modified. Okay, so it is comparing to the one I have added. Adding is basically a second layer of saving. The file is already in the file system, but I have also saved it into Git, right? So you can think of it as saving a file. So it's like some sort of multi-stages. 
It has the past before you commit. Yes, yeah, right. So there are kind of multiple layers of saving the file in here.
Okay, so now if I type git restore README.md, what happens? Should I use diff? Yeah. git diff shows nothing, because I have restored it, so there are no changes. Let's use vim to look at the file. So the first thing is still there. The second thing is gone. So this I have added, and then restore gets it back. So that's kind of useful. If you keep adding things and then modifying, and you break something, you can always restore, which is nice. And the first thing we added is still there and it's waiting for us to do something. Yes, the first thing we added is still there.
Is this permanently saved yet? It is only saved in the file system and in Git's added state. So it's only on my laptop. And if I add something else, it will be overwritten. So if I make another modification to README.md and then add it using git add, that will overwrite the previous one. And that is something we quite often want to avoid. So yeah, the next step will help with that. There is also this --staged option you can give to restore. That takes the added change back out, and together with a plain git restore it gets you back to the very original README file, getting rid of the first edit as well. Okay.
So the staged thing, do we commit it? Do we keep it permanently now? Let's say we're doing the simplest thing. We made the change, we added it, and now we want it online and we want to keep it safe. So let's say this is all we want to do today. To save it permanently, we type git commit. And now we need to add a commit message. This is actually where it will complain about me not having added my email address and name. So we'll add a message. Let's make it a bit more descriptive. Okay, edit a line. Okay, so now I have to... Yeah, that's nice. Based on your... Oh, it has done something. It has taken the username and host name and assumed that that's my email address, which it is not. So let's try this one. 
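The layers of saving described here (working file, added copy, commit) can be seen with a short sketch in a throwaway repository. The file content, commit messages and identity are made up for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alex Example"        # placeholder identity
git config user.email "alex@example.org"
printf 'HPC examples\n' > README.md
git add README.md
git commit -q -m "Add README"

echo "first edit" >> README.md
git add README.md                 # the first edit is now added (staged)
echo "second edit" >> README.md   # the second edit is only in the working file

git diff            # shows only "second edit": working file vs. added copy
git diff --staged   # shows only "first edit": added copy vs. last commit

# Drop the second edit; the added first edit survives.
git restore README.md
# ("git restore --staged README.md" would instead take the first edit
#  back out of the added state without touching the working file.)

# Make the added change a permanent part of the history.
git commit -q -m "Add a line to the README"
git log --oneline
```

After the commit, git status reports nothing left to add or commit.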
Actually, this git config --global --edit is nicer. It also did do the commit. Yeah. That's okay. Let's look at what it assumed my email is. This is new. I was expecting it to just tell me to run these two commands and then try again. So git log. git log will list all of your previous commits. And I'm just now checking what it... I mean, it probably assumed that this is my email address. So this is the message I added. And yeah, this is not my email address. Yeah. So it sort of inferred something based on the computer. So that is usually not what you want.
Okay. So here we go. We can change it here. This was the first command that Git listed. So it still is very helpful. It gives me the commands I need to run, in order. Yeah. Okay. So that is now correct. Do you need to uncomment the line? There was... That config line. Right. Okay. So I made a mistake. It tells me that this is the file now, but this command doesn't work anymore. Okay. So what did I... Oh, sorry. Yes, I uncommented this line, even though I shouldn't have. Okay. So now this command works.
All right. So what it told me to do was first edit the configuration and then run this command. What this will do is change my commit to reset the author. It does what it says. Okay. So now I can change the message again, and I don't want to change it. So there it is. Okay. Now, after all of this hassle, sorry, I need to tell you what git commit does. This was all about getting the email and the name correct, which I can check. So now it is correct. Okay. Yeah.
Where does this email go? Git will basically remember it as a part of the change I made. And when I publish the changes online, people can see it by looking at the change. Yeah. So anyone that clones the repository gets this information? Yes. And if you don't want this, then you... 
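The fix-up described here, setting the identity and then rewriting the author of the latest commit, might be sketched as follows. The guessed identity is simulated with made-up values, and the corrected name and email are placeholders. In the session the settings were edited with git config --global --edit; this sketch sets them for the scratch repository only.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# A commit made with a guessed identity (simulated with made-up values).
echo "HPC examples" > README.md
git add README.md
git -c user.name="alex" -c user.email="alex@laptop.localdomain" \
    commit -q -m "Add README"
git log -1 --format='%an <%ae>'

# Set the real identity (placeholders here), for this repository only.
git config user.name "Alex Example"
git config user.email "alex@example.org"

# Rewrite the latest commit so it carries the corrected author;
# --no-edit keeps the commit message as it was.
git commit -q --amend --reset-author --no-edit
git log -1 --format='%an <%ae>'
```

Amending replaces the latest commit with a new one, so it is best done before the commit has been published.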
Well, there are some options for having a GitHub private email address that can be used for linking and so on, but you can read more about that later. We don't need to do that now. I don't know about that, but yeah. Okay. I mean, in principle, you can put anything there, but it is good practice to actually use an email that people can reach you at. Yeah. Okay.
All right. So what have we done now? If we type git status, we see that there are no changes. Yeah. Good. There are no added files and no changes that haven't been added. Yeah. So what we did is take the changes we just saved and basically copy them into a different place. So make it permanent. And so, yeah, how should I say this? I mean, I guess that's all. We've made it permanent. Yeah. So now this change is permanently a part of the history of this folder.
So what this git log command does is basically show me the history of this folder, all of the changes that have been made to it. It only shows the message I added though. It doesn't actually show all of the changes to the files, right? So it's good to have a very descriptive message. Yeah. You can also use git diff, for example, to view the changes. Or git show with the hash. git show. Yes, I can show a specific commit. Let's take this one. So the changes made in this update to the repository look like this. Yeah. So this is the same format as git diff. And the newest one, the one I made, is this one. So we added something random there. Yeah. So you can always see the changes that were made, and you can see when they were made and by whom. Okay.
And these hashes are kind of complicated. I mean, they look weird, but they are basically just a unique identifier that is calculated from the changes you made. But you can think of it just as a unique identifier. Yeah. Okay. And then what? Then, okay. 
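Looking at the history with git log and at a single commit with git show can be sketched like this, again in a throwaway repository with made-up content, messages and identity.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alex Example"        # placeholder identity
git config user.email "alex@example.org"

# Two commits so there is some history to look at.
printf 'HPC examples\n' > README.md
git add README.md
git commit -q -m "Add README"
echo "something random" >> README.md
git add README.md
git commit -q -m "Add something random"

# One line per commit: short hash and message, newest first.
git log --oneline

# Show one specific commit by its hash: author, date, message and diff.
latest=$(git rev-parse --short HEAD)
git show "$latest"
```

Any hash printed by git log can be given to git show; a unique prefix of the hash is enough.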
Well, we could continue to make changes and try to get the repository, the code, to the condition where we want it to be. If this is all we want to do and we are fine with the changes we made, then we can push them back to the online repository. And this is pushing to your personal copy? Yeah, this is pushing to my copy on GitHub. Okay. So I will show the website again. Okay. Here we are. Just reloading, although there are no new files. But yes, it's actually always showing the README.md at the bottom of the repository page on the website. And we see that the change we made has appeared there. Okay. So in order to get it to appear on GitHub, you do need to commit it. You need to make it permanent. As long as it's only added, it's only on your own laptop, on that local machine. But otherwise, committing is basically just taking those added changes and making them a permanent little chunk of changes. Yeah. Okay.
Maybe I'll demonstrate some things on the website. Well, we don't have much time. The only one I was really thinking of is basically the same as git log. Oh, yeah, that's a good thing. There is this icon here. So here you can see the changes made in each of these commits, just like in the git log and git show commands. Yeah. I mean, with the GitHub web interface, you can do or view most things you could from the command line. Yeah. And usually I'll use both, depending on what I'm doing and what I need. Yeah. I usually use this one mostly when I have a cloned repository and I'm not sure if I need to. Yeah. Okay.
So, yeah. What are some common problems people have? Or should we go on? Yeah, then maybe there are some questions in HackMD. I mean, one thing I didn't learn until way too late is how to actually undo a change. So if we wanted to, let's say, okay, I always just start thinking and I type git status because, okay. But let's say we want to get rid of this weird something random line, because why is that there? Yeah. 
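Publishing a commit with git push might be sketched as follows. A local bare repository named online.git stands in for the copy on GitHub so that the example runs offline; that stand-in, the file content and the identity are assumptions for illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for your repository on GitHub: a local bare repository.
git init -q --bare online.git

# Working copy on the "laptop" with one committed change.
git init -q laptop
cd laptop
git config user.name "Alex Example"        # placeholder identity
git config user.email "alex@example.org"
echo "something random" > README.md
git add README.md
git commit -q -m "Add a line to the README"

# Connect the working copy to the online copy and publish the commit.
# A clone from GitHub already has this connection, so there a plain
# "git push" is enough.
git remote add origin "$work/online.git"
branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"

# The online copy now has the same latest commit as the laptop.
git -C "$work/online.git" log --oneline "$branch"
```

Only committed changes are pushed; anything that is merely added or edited stays on the local machine.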
That's a good example. So there is some weird new line here that somebody called Randahari added for no reason. So I can run git diff, for example, to find where the change happened. Maybe git log first and then git diff, and I will start with the latest change. So I mean, this is the newest one. Actually, git show is better. I should learn to use git show more often. Yeah. Let's just git show. Okay. Or what about git log -p to show all the changes? Okay. Inline. So now we see the line. Oh, this is nice. So now we can see that there is actually the change that I was wondering about. Somebody added an extra line there, and we don't really need that line. Yeah. P means print? I think patch or something. It shows us the patch of each one. It's a good question. Yeah. The patch format is basically this plus sign here. And you can also have deleted lines with a minus sign. Yes. There is a deleted line with a minus sign and an added line with a plus sign. So the effect is removing this white space here. Yeah. Okay.
But in any case, we want to get rid of this. And one way to do that in this case, because it is the latest commit, is to go take the file from the version before it. So this is, of course, not always... Well, I mean, okay. It works in this case. And this is a useful command. It will allow you to pick a file from any given version of the repository, of your code. And the command is checkout. So I don't remember the order correctly. Is it first the commit and then the file name? First commit, then file name. Yeah. And README. Okay. So it updated one path, one file. So from that state, it updated README. Okay. So now if we look at what README.md has, it doesn't have the extra line. We can also do git diff. Oh, it's not showing anything. That's interesting. Well, what does git status show? I need a t in git. Oh, it has already added it. Interesting. I did not expect that. So yeah, okay. 
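The undo shown here, taking one file from an older commit with git checkout, can be sketched in a throwaway repository. The content, messages and identity are made up, and the final commit is included to complete the undo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alex Example"        # placeholder identity
git config user.email "alex@example.org"

printf 'HPC examples\n' > README.md
git add README.md
git commit -q -m "Add README"
echo "something random" >> README.md
git add README.md
git commit -q -m "Add something random"

# Show every commit together with its changes (the patch).
git log -p

# Take README.md as it was one commit before the latest one.
# Order: first the commit, then the file name.
good=$(git rev-parse HEAD~1)
git checkout "$good" -- README.md

git status --short   # "M  README.md": changed and already added
git diff             # prints nothing, because the change is already added

# Record the undo as a new commit; the old commits stay in the history.
git commit -q -m "Remove weird line"
```

HEAD~1 means the commit just before the latest one; any hash from git log works in its place.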
This is always kind of, this is very useful, but sometimes you haven't done something in a couple of years and, you know, it can surprise you. Okay. So it has changed the file and then it has added it. How did that change get there? Well, it basically saved it again, right? Okay. So if we want to get rid of that line, we can commit again. Do we want to do that? Why not? Yeah. So this time it will go much more smoothly. So: remove weird line. Okay. And now git diff will show nothing, and git log -p will show that there is a new commit by me, and what it does is remove two lines. Yeah. Okay.
So that's a nice thing that you can do. And maybe that makes a bit more sense of why we have this commit, why we don't just save the latest good version but actually save the entire history bit by bit. It means that you can go back and basically cherry-pick a file from a given version. Yeah. So that's very useful. You can also see the diff between the current version and the old version and then pick and choose which changes you want to undo and which ones you don't want to undo. Yeah. So that's probably the most useful method for undoing things. Yeah. I'm going to push again, because otherwise I feel like something is incomplete, okay?
So we're coming close to the end of our time. We've got, well, maybe two minutes. Where do we go from here? I think that we've probably brought up more questions than we have answered right now, but that's sort of the point. There was no way that in 45 minutes we could show you both the complexities and how to do something useful. So yeah, you have to continue to study Git. Actually, a few weeks ago there was a massive course called CodeRefinery, where half of it was going through Git and the other half was other programming tools. Actually, we'll mention this after the break, so maybe there's no need to talk more about it now. 
What I think is maybe the most important thing is to realize that version control of some sort is absolutely essential to whatever work you may be doing. Would you agree with that? Yeah. Actually, more and more of the things that I'm doing I want to put under version control, to sort of coerce things to go under version control, like developing courses, for example. Just because it's so useful and I want to use it to manage everything. And it's easy to get started and do simple things. When you start getting more advanced, things can get weird. And I would say the most important thing is, when you get there, ask people. Ask for help. Like we said at the beginning, continue learning from each other, from your colleagues, from support staff, whoever it may be. And you can pick up all these things and it will not be too hard. And once you start using it well, then your code is no longer basically a throwaway thing for one certain project, but you can start accumulating it into some sort of projects which you can use more and more. And it can become something that someone else will use, which will really help your future career. Yeah. And in any job you might take, if it in any way involves doing something with computers and programming or whatever, people will expect you to know some version control, so it's better to start now.
Let's see. There's a question: differences between GitHub, GitLab and other Git applications. So Git itself is a command line program. It's also the repository format. And then GitHub and GitLab are web services that can speak to the Git command line program and serve the repositories. There is also Bitbucket, which is quite commonly used. And then, yeah, many institutions have their own internal GitLab instances. Yes. And the differences are not big. They used to be maybe a bit further apart, but I think GitHub, GitLab and Bitbucket have basically arrived at the same place recently. 
I mean, really, it doesn't matter exactly what you use as long as you use something. Is there anything else very important to say? Otherwise, I guess we can go and keep answering via HackMD, and we'll return about two or three minutes after three. Yeah, there are a lot of useful tools built around Git. Git itself is a relatively simple program. I think we've demonstrated most of the most important workflows, the most important commands. So you can get started, and try not to get confused if people are using some complicated-looking tools. It's probably just running multiple Git commands in a row. Yeah.
Okay, well, I hope that you enjoyed this little intro, and more than anything, I hope that it inspired you to want to learn more, slowly, over your career. You don't have to be an expert tomorrow, but you should be in five years or something. Okay, well, with that said, let's go to a break and we'll see you three minutes after the hour. Thanks a lot.