Let's talk about Git. My name's Josh Freeman. I'm a software developer here in San Antonio for GROC Interactive, where primarily we turn tacos and caffeine into software. And we have a pretty good time with it.
So I want to talk about some stories. Stories have power. I love a good story: the narrative that the author tries to tell, the context in which these stories are told, the motivations of the characters involved. This talk is not going to be too technical or in depth. There's going to be some how-to, but mostly we're going to be talking about the communication process that we undergo in software.
And when we get into that, we're going to start a little bit with compression. Video, music, pictures: there's a certain amount of compression in the data that's fine. We can get away with that. We're good. But when you're trying to communicate something, especially textually, compression is not such a good idea. I'm going to prove this point with some headlines from a story that you should know, but that has been highly compressed. Headline: "Local farming family murdered by military police; nephew, 21, sole survivor." Headline: "Nephew of murdered family leaves home to join separatist force." Headline: "Farmer-turned-pilot destroys key military facility." At this point, who knows what story I'm talking about? Star Wars. So compression is good in a lot of respects, but if you compress too much, you lose a lot of the benefit of the information and the narrative you're trying to tell. And that's what I want to talk about avoiding today when we're working with Git.
So if you like XKCD, you've certainly seen this comic. I'm going to assume that you understand the basics of Git. This is going to be more of an intermediate talk to level up your commit messages. But as a quick refresher, why do we use Git? Well, first of all, using FTP and email to transmit files is dumb.
Secondly, versioning of your software is something that we find we need over and over and over again. Especially if you work with designers, right? Designers always have composition.psd, composition.final.psd, composition.final-2.psd, composition.final-oh-really-this-time.psd. Even designers know that there's this need for iterative improvement. They're never quite done. And software very much is that process, where it's a living thing that grows as your understanding of the business or the domain grows as well.
So is Git a communication tool? I posit, yes. When I first started getting into software four or five years ago, I was at a point in my life where I really did not want to talk or work with people. So software was very attractive, right? You just sit at a keyboard all day long behind a screen. Minimal interactions with people, great. Well, I quickly found out that to be even remotely good at my job, you can't live that way and you can't work that way. And I think, as attending a conference demonstrates, you have to be able to engage with people, whether it's your teammates or your clients, in order to get something productive and useful done.
Git, I think, is the most important tool that we have at our disposal. And I think it's also one of the tools that we sometimes use least effectively. When we use Git, it forces us to answer the who, the what, the when, and the where, but it never requires us to answer why.
So software has three audiences. It has the author, ourselves writing it. It has the team that we're working with. And it has the future maintainers, who may be the same people or completely different people. And communication through both your code and your version control needs to be empathetic to all of these groups. Software is the result of thousands of decisions that we make over months, even years.
We add this feature, we clarify this method, we change this behavior, and we should be able to have an answer for why these changes were made at each point.
So there are two hard problems in computer science: naming things, cache invalidation, and off-by-one errors. And what we're gonna focus on today is the naming things. I firmly fall into the camp of Bob Martin that comments are an apology. If you have to write a comment in order to explain what's going on, typically that means you have not chosen a correct or a more effective name for the functions or the methods or the variables that you're using. I won't say that you should never use a comment, but if you find yourself writing lots of comments, I would suggest to you that you need to find better names to indicate what you're trying to accomplish. The only source of truth is what the code says it is. And if there's ever a divergence between your comments and your code, obviously the code is going to be the correct version. And we've all had comments that just drift and get out of sync. We've all done it, we've all seen it.
So I wanna briefly show something that I did about a year and a half ago. Don't worry about the details. This is some jQuery going on here. But the thing to focus on is the fact that the comment block is two, two and a half times longer than the code it represents. It doesn't really clarify what's going on. In fact, if you know any of the jQuery API, it's pretty obvious what's going on. So does this method name even describe what's happening? I don't think so. In fact, we need a comment to describe what this thing is supposed to be doing. In fact, we're not even removing anything. So I picked a bad name, and all this information about what it was doing reflects that. The only stuff in this comment that I would say should be there is probably the parameters. And that would be for automatic generation of documentation.
That would make sense. But the rest of this stuff is really information that I was using to learn and explain what was going on, which would be better served thrown in a commit message.
So if you end up writing stuff like this, where you have huge chunks of comments to explain small bits of code, your messages probably start to look like this. Or you never get it right the first time, or you're just really frustrating your team. And especially when we start working with Git, we find ourselves really enthusiastic in the beginning. As we get further and further along, our comments and messages become more useless. We start with good intentions, but we quickly go awry. And this is very true when you're working by yourself and you don't ever expect this code to become public, or at least be seen publicly.
So what do we do to fix this? First of all, we need to stop doing that. If you are doing git commit -m, that's wrong. It gives the wrong idea of what you're trying to do. When you type that on your command line, it gives this idea that everything about the patch that you staged has to fit in this one line or this one group. And if you give a whole lot of information, it becomes unwieldy and unreadable in your terminal. So stop doing git commit -m and instead just run git commit and drop into your text editor of choice. Myself, it's gonna be vim for the messages. Use whatever makes you happy.
So how do we do this? Well, first of all, summarize. The next two parts, about the summary and the message, come from Tim Pope's article where he talks about a more useful Git message. We write the summary imperatively in 50 characters or fewer. And the reason we start with an imperative command is that we want to match the way that git revert and git merge word the automatic messages that Git provides us. We just want to match their convention.
So write this, add that, remove this, whatever it is: start with an imperative command and then keep it to 50 characters or fewer. A lot of this has to do with legacy Git clients that render things differently. So to work best across all groups, just kind of stick with this format.
So then the message. In the message, provide details about what's going on and wrap the lines at 72 characters. If someone lives exclusively in the terminal and only looks at their Git messages through that, the 72-character limit is really going to help them out. Again, this depends upon your intended audience and the team, but for the most portability, keep it at 72 characters.
So in this, you need to answer the following questions. Why is this change necessary? Are we fixing a bug? Are we adding a new feature? Is this some client request? How does this address the issue? What am I doing specifically that makes me think this is the right way to solve this particular problem? Does it have any side effects? Is this going to impact another group or team? And then reference your resources. We all spend a ton of time researching, right? I mean, I don't know of anyone that could do their job as well as they do now without Stack Overflow. Stack Overflow has saved all of our butts countless times. Well, we spend so much time doing this research, and then as soon as we're done, we don't keep track of what we did. We don't keep track of the things we found to help solve this problem. Put that stuff inside your commit message. So the next time you're having to solve that same problem, you can go back and say, okay, this is where that information came from.
So how do we help enforce this convention? Well, the first thing I do is create a Git commit template, just a standard text file that has the following format. And for the commit messages, if you have a commented block like this, it will be ignored and won't show up.
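
As a rough sketch of the commit template idea described above: the file name `.gitmessage` and the exact wording of the prompts are assumptions, not the speaker's actual template. The example works in a throwaway repository and uses `--local`; for everyday use you would more likely keep the file in your home directory and set it with `--global`.

```shell
set -eu

# Throwaway repository so the example does not touch any real configuration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical template. The first line is left blank for the summary.
# Lines starting with "#" are stripped from the message when you commit
# through your editor, so they act as prompts only.
template="$repo/.gitmessage"
cat > "$template" <<'EOF'

# Summary above: imperative mood, 50 characters or fewer.
# Body below: wrap at 72 characters.
#
# Why is this change necessary?
#
# How does it address the issue?
#
# What side effects does this change have?
#
# References (full URLs to tickets, docs, Stack Overflow answers):
EOF

# Tell Git to preload the template whenever `git commit` opens the editor.
git config --local commit.template "$template"
git config --get commit.template
```

With a template configured, Git aborts the commit if you save the message without changing anything, which is a small extra nudge to actually write something.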
So then you set it: you create your Git commit template, and then you go to your Git config and you add this template setting and point it to the file that you're using. What this boils down to is that when you run git commit, it will pull up whatever text editor you're using, and you'll have this template to provide a skeleton of the things you need to be talking about.
So now that you've done all this work and you're writing useful commit messages, when do they become useful? Well, first of all, it's when you start playing the blame game. If you don't know, git blame is a tool that Git provides to look at a specific file, where you can look at each line and see who made that change and what commit they used to effect that change. So, git blame. It's an unfortunate name. I don't really like the name blame because it always carries an accusatory tone. If someone screwed up, okay, blame them. But if you're just trying to figure out why they did something and you're blaming them, that gives you the wrong mental idea of what's going on and what you're trying to accomplish. So I've created an alias: git thank actually runs git blame. And so now whenever I find teammates that have broken things, I can thank them for their contributions.
So one of my teammates thanked me last week. He asked me, why are we using an older version of the AWS SDK? And I remembered doing it, but I didn't remember why I did it. It took me about 15 to 30 minutes of thinking about it off and on to figure out why. Then I remembered: it was because the version of Paperclip we were using required an older version of that gem. And I remember that at the time, I spent a lot of time debugging why this particular dependency, why Paperclip, wasn't working for us. And all that information, all that time, all that searching that I spent was gone, because I didn't put that information in the commit. And my coworker had to ask me why.
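
The git thank alias mentioned above can be sketched as follows. The repository, the file contents, and the commit message are all invented for illustration; the message is what the AWS SDK commit could have said, not what it actually said. The alias is set with `--local` only to keep the example self-contained.

```shell
set -eu

# Throwaway repository with a local identity so the commit succeeds.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical change: the kind of commit whose "why" is easy to forget.
printf 'gem "aws-sdk" # pinned to an older release\n' > Gemfile
git add Gemfile
# -F - reads the message from stdin; interactively you would run plain
# `git commit` and write this in your editor.
git commit -q -F - <<'EOF'
Pin aws-sdk to an older release

Why: the Paperclip version this project uses requires an older
release of the aws-sdk gem and does not work with the newer one.
EOF

# The alias: `git thank` now runs `git blame`.
git config --local alias.thank blame
git thank Gemfile

# Follow the blamed line back to the commit and read the reasoning.
commit="$(git thank --porcelain Gemfile | head -n 1 | cut -d ' ' -f 1)"
why="$(git log -1 --format=%B "$commit")"
printf '%s\n' "$why"
```

The last two commands are the payoff: a teammate can go from a puzzling line to the explanation without having to ask anyone.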
I could have provided him the answer. He could have looked the answer up himself. But neither one of us knew, because I didn't put in the information.
So here's an example. We're using GPG2 and we need to execute some system commands. Initially we were just executing them directly and using interpolation to pass in our arguments. Well, with interpolation, you're going to have escaping issues. It opens up vulnerabilities. Shellwords, which is part of the Ruby standard library, has a mechanism that allows us to fix some of those issues. I didn't know about Shellwords. I didn't even know it existed a week or two ago. So all I did was find the documentation for it, put in a little snippet about what it does right there in the commit message, and then add a link to where it comes from. So if anyone on my team, or anyone in the future, asks what this Shellwords thing is even used for, they can look at the specific commit message and know exactly where it came from and why it's there.
One of my clients let us use this next example. This particular one was an email to me. Well, we all get emails, right? What happens when you receive an email telling you to do something? That information is not available to the rest of your team. So you put the snippet of the email into your commit message. And in this case, we also had a ticket that was running back and forth. Don't use ticket numbers. Use the full URL that links to whatever it is that you're referring to. Because if you ever change ticketing systems, well, which system were you using? It becomes harder to find out. With links and added information, the cost is effectively free.
So now that we've put all this information in here, we need to search it. And I find that I use this about once every six months, because favicon generation is a cumbersome process, right? And so now I know that I can just go back to an old project where I did it.
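
Searching the history for that kind of note can be sketched like this. The commit contents are invented, and the URL is a placeholder standing in for whatever service was actually recorded; `git log --grep` matches against the commit message, and `-i` makes the match case-insensitive.

```shell
set -eu

# Throwaway repository with a local identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# -m is used here only because this script is non-interactive.
echo "baseline" > README.md
git add README.md
git commit -q -m "Add project baseline"

# Hypothetical commit that records where the favicons came from.
echo "icon placeholder" > favicon.txt
git add favicon.txt
git commit -q -F - <<'EOF'
Add favicons for all target devices

Generated with the service at https://example.com/favicon-generator
(placeholder URL) so the same tool can be found again next time.
EOF

# Six months later: find the commit by a word from its message.
matches="$(git log -i --grep='favicon' --format='%h %s')"
printf '%s\n' "$matches"
```

From the matching hash, `git show` on that commit brings back the full message along with the change itself.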
I search the log for the favicons, and I find, oh, this is where I go to do it. I like this particular service. So I go ahead and use that. It saves me the time of having to go search for the right service, because if they've updated their UI, I'm not gonna be able to recognize it six months later.
So let's do this from scratch. The first thing we're gonna do is create a gem. And this is bundle gem and tenfold git. And we're gonna change into that directory. Typically when we're starting new projects, whether you're running rails new or bundle gem or whatever it is, you're gonna have a lot of boilerplate that gets added. Make your first commit there. That's a good place to start with a baseline and build from there.
So now we're going to build something that's highly redundant and useless. And it's intentionally incomplete and incorrect. What we're building is an abstraction on basic math functions. We've got some tests and we've got some functionality that we've added. Before I built this example, I didn't even know that addend and subtrahend were words. So putting that information in there might explain it. Now granted, I get that this is a contrived example, but if you have details like this, I don't really think that they need to be comments. If people have questions about it, why clutter up your text editor with information that just describes what the method does? Add this information into your commit message instead.
So now we're gonna change several things. In fact, we're gonna change three things. First, we're going to change add so that no matter how many parameters we pass in, we can add them all up together. The next thing we're gonna do is fix multiply. In the previous example, I misspelled multiply; I used an A instead of an I. So we're gonna fix that.
And then finally, we're gonna add some division. That again is horribly incomplete.
So now we're going to go to the most important tool that I want you to take away from today. And that's adding in patches. What adding in patches does is solve the problem of having lots of changes that you made at once. How do you commit the different chunks separately? How do you avoid adding the entire thing together and committing it, without having to delete stuff and then go back and add it again? Patch solves that.
So what I've done is run git add --patch, and it pulls up this chunk of text and asks, hey, do you want to add all of this? Well, that's more than I want to add. So where it says at the bottom, stage this hunk, first of all I put in a question mark, which provides a menu of all the available options. What we want to do is split that current hunk into smaller hunks. So I enter s for split, and once I do, it breaks it up into smaller hunks. Now I'm looking at this particular part and it's asking, do I want to add this? And yeah, I do. So I say yes. The rest of it, I don't want added. So I say no to the remaining hunks. Then when I get to the test file, I do the same thing. I say yes, I want to add the test for this particular hunk. But the tests for the changes to the multiplication and division methods, I don't want staged yet.
So now that we're done with that, we want to ask, okay, did I stage the right thing? Git provides a mechanism for that. If you use git diff, you can find the things that are different from your current HEAD. If you add the --cached flag instead, you can look at just what you have staged for commit. And so when we run git diff --cached, we see that if I run commit, this is what's going to be added for us, just these two bits. So now I'm going to add a commit message that has some problems.
First of all, "allow Math.add". That's not typically the convention in Ruby for methods. It's really Math, I want to say hashtag, but octothorpe or pound sign or whatever you want to call it, add. It really should be that. And then that last line, I'm being kind of funny and cheeky. Well, you can be funny in your commit messages. However, if you're not explaining why you're being funny, you're doing your team and your future selves a disservice. We're going to fix that in a bit.
So now we go to round two, same thing. We're going to split this up. I want this part added, I do not want this added. So now we come to this particular hunk and I try to split it, but Git doesn't know what to do because too much of the text is together. So Git provides a way for us to edit this hunk manually. To do that, we pass in an e for edit. When we do that, we hop into edit mode. And we see here on lines 14 and 15 that it tells us how to edit that hunk. All we want to do is remove lines seven through 10. So we do that, and then that's what's left. So yeah, I'm going to stage that, and so I'm going to write out. We're going to run git diff --cached again. So now we look at it and we see, okay, only the multiplication corrections are going to be committed. Great. So I'm going to commit.
Now here's where I break one of the rules. Sometimes you don't need a full message. I don't need an explanation of why I'm correcting a spelling mistake. That's fine. If your team really needs to understand why it's a good idea to correct spelling mistakes, okay, add it in. But sometimes a well-worded summary is sufficient.
So now we're done. All we have to do is git add and we finish the rest out. In this final commit, we notice some issues and we note them. We've got some issues with this division. First of all, it's only going to be effective.
It's basic integer division, and what happens is very dependent upon the types of numbers that we pass into that method. And I think it's okay to note, hey, there might be problems here and we might need to look at this in the future.
So we're done. We've committed all of our changes. But remember that commit message from before? We didn't really like how that was worded. So we're gonna go back and we're gonna do some rebasing. Don't be afraid of rebasing. Pull out some test branches, play with it and learn how it works. In this case, I like what we've added. What we've added and committed is fine; it's the message that was wrong. So what we're gonna do is reword it. We specify that particular command, reword, and now inside the rebase, this is our previous commit message. And we're gonna fix some of our issues. Again, we're gonna fix some of the naming conventions. And then at the bottom, we say, hey, look, this addition method, its signature is different from all the rest of the methods. Multiplication, division, they all take two arguments, but this one does something a little bit different. Also, because add is taking multiple arguments, it's probably not addition anymore; it's really summation. Is this really an addition anymore? Maybe not. We may wanna look at this one as well and fix it.
So you're done, right? You've made your changes and you've pushed them up to GitHub or Bitbucket, whatever you'd like. So now we're gonna talk a little bit about our pull request model and our branching model. And we're gonna start with how I got into Git. This was three years ago, when we had a team of four people and I was the most junior developer. And it was my job on Friday nights to merge all of our stuff into master. Master was production; develop was where we all worked and lived.
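
Going back to the reword step for a moment, here is a sketch of how it can be reproduced. Interactively you would run `git rebase -i`, change `pick` to `reword` on the line for the commit, and then fix the message in your editor. To make the example runnable unattended, two small stand-in "editors" do those steps; the commit contents and the improved message are invented for illustration.

```shell
set -eu

# Throwaway repository with a local identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Three commits; the middle one has the message we regret.
# -m is used here only because this script is non-interactive.
echo "baseline" > changes.txt
git add changes.txt
git commit -q -m "Add project baseline"
echo "add takes many arguments" >> changes.txt
git commit -q -am "allow Math.add to take lots of args lol"
echo "multiply spelled correctly" >> changes.txt
git commit -q -am "Correct spelling of multiply"

# The replacement message, kept outside the repository.
new_message="$(mktemp)"
cat > "$new_message" <<'EOF'
Allow Math#add to accept any number of addends

Math#add now sums every argument it is given. Its signature differs
from the other methods, which each take exactly two arguments, so
"add" may really be summation and could deserve a new name.
EOF

# Stand-ins for the interactive steps: the sequence editor marks the
# first listed commit as "reword", and the message editor swaps in the
# new text when Git asks for it.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/reword/'" \
GIT_EDITOR="cp '$new_message'" \
git rebase -q -i HEAD~2

git log --format='%h %s'
```

The content of both commits is untouched; only the message of the older one changes. Since rewording creates new commit hashes, this is best done before the branch has been shared.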
And so on Friday nights, we would merge everything from develop into master, and that would then be our canonical production branch. So every Friday, well, really once every couple of months, we would merge develop into master. And it was a terrifying experience, because we were merging thousands of lines at a time. We would wait weeks or even months to make these merges. And there was no assurance or confidence in any of the changes that we were making.
So at GROC, we follow Vincent Driessen's model, "A successful Git branching model." We make a few modifications, but by and large we follow it pretty closely. Master is our production. As soon as anything hits master, CI kicks off and deploys to production. Develop is where we do most of our reviewing. We do all of our development in feature branches. So anytime a new feature or a bug fix, or whatever it is that needs to be changed, needs to be deployed, we create a feature branch off of develop, we make our changes, and then we submit a pull request back into develop. Once it's approved in develop, typically there's a short turnaround from merging develop back into master to deploying to production. And the develop-to-master step is really business dependent. What does the stakeholder want? How quickly do they want things deployed to production?
So, pull requests. Let's talk about those for a little bit. Keep them small and keep them focused. I did this not too long ago. There's a nonlinear relationship between the time spent reviewing a pull request and the number of lines in that pull request, especially if you're reviewing it thoroughly. 43 files changed is not manageable. If you're reviewing it thoroughly, it will take hours to do. An apology is appropriate here. I'm sorry, Jason.
So if a picture is worth a thousand words, a good GIF is worth about a billable hour.
I have found that the rate at which other people would pull down my code or ask me to demo it in front of them dropped drastically as soon as I started creating images and GIFs of the functionality I was trying to demonstrate. LICEcap is your friend. All you need to do is take a screenshot or a short video of the things that you're doing and then pull that into GitHub, and you're great. We'll let this finish, all right. All you gotta do is take your GIF, drag it and drop it into GitHub, and you're golden. And that really helps with the rate at which people ask you to demo stuff. If you have a process where you have two, three, four people reviewing your code, that adds up really, really quickly if you're having to demo it for multiple people. Save yourselves time, save yourselves money, save your clients money. Do a little bit of work up front and, in fact, prove to yourself that it works.
So finally, we're gonna talk about code reviews. And this is where it all comes together. This is the most important part, I think, of Git and what I'm trying to accomplish with Git. We spend all this time and all this effort writing useful messages and committing logical chunks of code in a meaningful and useful way. Why do we do this? It goes back to the beginning, where we're trying to tell a story and we're trying to explain what's happening, what's changing, and why it's changing. We're trying to communicate that to ourselves. We're trying to communicate it to our team. So when you have a pull request, that is the perfect opportunity to explain, this is what I'm doing and this is why I am doing it.
So in the beginning, when I first started really working with Ruby, when I submitted pull requests, everything was textual. Everything lived in the GitHub interface. And all the communication always happened there.
And this was a problem, because I had established a good rapport and a good shorthand with a previous team when I was doing most of my PHP work. Now that I had moved to Ruby, I was with a different group of individuals with whom I had not established that rapport or built up that shorthand. And there was a lot of friction in the beginning. It was because we lived exclusively in the GitHub UI instead of going, hey man, can we talk about this real quick and figure out what's going on? It took a while to realize where this friction was coming from, until we started having these conversations in person, where you could hear the tone of someone's voice when they're saying, yeah, I don't really think that works right, and I think you could do this better, or this is where I would go.
And code is an intensely personal thing, right? As much as we try to espouse that we are not our code, we spend an awful lot of time putting our heart, soul, effort and energy into our code. And so whenever someone finds things wrong with it or finds ways to improve it, as humans it is difficult to remain objective. It is difficult to not take some of those things quite so personally. And it's really easy to become a keyboard warrior and fight back and forth when you don't have to see someone's face. But if you sit down and talk with someone, or if you're remote, pull up Screenhero or Skype or pick up the phone and talk to them and say, hey, let's go over this code, you're gonna reduce a lot of the potential for tension.
But also you're gonna share knowledge. And that, I think, is the most crucial bit. At GROC we work really hard to try to understand the core business domain problems of our clients. We're not code monkeys that just try to get something up and working. We try to figure out, what is it that you want? What are you trying to accomplish? What is the core problem you want solved?
And if you have a good code review process, you can validate your ideas through other people. You can show that you understand the problem, and if you actually don't, someone else can say, I don't think this is quite correct. The code review process is a great opportunity to ask, hey, do we understand the problem we're trying to solve correctly? And it's a great way to say, yeah, we got it, or to pull the brakes and say, I think we need to go back to the client and figure out what's going on, or do some more research, or whatever it is. Do them in person or do them over the phone, and just be nice. And if you're doing it just through text, be really nice.
So the takeaway, the two things I want you to do the most. Stop using git add . or git add -A. There are times for it when the change is really simple, but by and large, start adding in patches. That will greatly encourage you to group things in logical chunks. Also, stop committing with the -m flag. It gives you the wrong idea, the wrong connotation of what you're trying to accomplish. And as we saw with the Star Wars example, sometimes too much compression is not a good thing and the whole point gets lost. And then in the code review process, be excellent to each other. Thank you for your commitment.
So the question is, for my previous example of a large, 43-file pull request, how do you avoid doing that once you get to that point? Really, you have to take a proactive approach, I think. It's kind of hard to avoid once you've gotten to that point. You can do some trickery to kind of fake it, but I think the best thing to do is this: we have a motto at GROC that if it's not on GitHub, it doesn't exist. And if you push to GitHub frequently, you're gonna be able to see how many files you have changed. You're gonna have an idea of what the scope of your changes is.
Another thing that we do is break out our features and our bugs so that they each have their own branch, a very short-lived branch. So we create a branch, we do our work, and then we merge it back in. And when you do that with small enough features, you typically avoid that problem. Thank you all very much.
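
As a closing sketch of the short-lived feature branch flow from that last answer: the branch name and file are invented, and the local merge at the end stands in for approving a pull request on GitHub. Counting the files changed relative to develop is one simple way to keep an eye on the scope of a branch before it grows to 43 files.

```shell
set -eu

# Throwaway repository with a local identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# -m is used below only because this script is non-interactive.
echo "baseline" > README.md
git add README.md
git commit -q -m "Add project baseline"
git checkout -q -b develop

# One short-lived branch per feature or bug (the name is hypothetical).
git checkout -q -b feature/add-division
echo "division notes" > division.txt
git add division.txt
git commit -q -m "Add integer division"

# Gauge the scope of the branch before opening a pull request.
changed_files="$(git diff --name-only develop...HEAD | wc -l | tr -d ' ')"
echo "Files changed relative to develop: $changed_files"

# Locally, merging the approved pull request corresponds to roughly this.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/add-division into develop" feature/add-division
git branch -q -d feature/add-division
```

The three-dot form `develop...HEAD` compares against the point where the branch left develop, so changes that landed on develop in the meantime do not inflate the count.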