Morning again. So last week, two weeks ago, this happened. There was some guy in Pioneer Square who was wearing, like, a mesh tank top thingy and jean shorts and mismatched Chucks, and I was like, this is atrocious. So Gary decides, you know, he's gonna wear a tie, class it up a bit, and I don't know, I didn't feel like he could really do much better than I could. So then yesterday, I actually forgot about this. Josh, are you in here? Yeah. So he wasn't the only one wearing a tie yesterday. So I figured I'd better follow through with what I said, and so, yeah. Gary, you're ready? Gary Bernhardt is gonna talk to you about Unix.
See, I'm sadly not wearing a tie. But sure enough I was gonna wear it with that totally soy sauce three other days. Sad times.
Right, so: chainsaws and Unix. Chainsaws are awesome if you need to kill trees, but sadly chainsaws are also very good at killing people accidentally. So people fear chainsaws, as they should, because dying is not such a great thing. People fear powerful tools like Unix as well, probably due to some sort of evolutionary response to powerful things, but Unix cannot kill you. And this has been, yes? Emacs might kill you. One of my primary goals in the last couple of years is to convince people to just get over the fear of powerful things and learn them, because the worst that's gonna happen is you'll give up.
So, our agenda today, expressed in BASIC: three times we will see an example of Unix-y stuff, and then I will philosophize after each of them. And then after all of that I will give you some unsolicited advice.
So, example one of three. Suppose you have rebased some commits or done some other destructive operation to the history in your version control. When you wrote those commits you kept the tests passing at all times, of course, but the question is: now that you've changed them, would the tests have passed if that was the way you originally wrote the commits? Because you've changed the content of all those commits.
So this is a question that's actually quite easy to answer at the Unix shell. First of all, there is a command called git rev-list that will list the revisions between two refs. And we can use that to iterate over them and do whatever we want, such as, for example, just echoing them out, which doesn't do much of anything. But instead of that we can iterate over them and for each of them do a git checkout. And if I whack return here, then git complains that we're going into detached HEAD, which I'm sure everyone has seen before. And at the bottom there you can see "HEAD is now", "HEAD is now", "HEAD is now". So we just checked out three revisions programmatically. And after checking each of them out, we can execute whatever command we have to run our tests. You see the word Python there; don't freak out, it's okay. If I whack return in here, then you will see the tests run three times. And you can tell they're running over three different revisions because the number of tests goes 20, 20, 21. So this solves the problem of "did I break any code with my rebase?", because you can automatically run tests over the history.
There's one more thing we should do, which is to do set -e before that whole thing. That says stop if anything fails, so that way if the tests fail we'll get dropped into that revision and we'll see the failure as the last thing in the output. So if we run that, then we get the same output as before, but if something fails it will stop. So that is how to solve that problem.
I like showing examples like this because they show off the power of Unix for people who don't really use it to its full capacity. But there's something special about doing this kind of programming, and it is programming. And the special thing is that it's half-assed, but it's the right half of the ass. We don't need to fully solve this problem.
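The history-checking loop described here can be sketched roughly as follows. The slides are not part of this transcript, so the exact command typed in the talk may differ; the function name run_over_history and its arguments are assumptions made for illustration. The talk wraps the loop in parentheses with set -e. This sketch keeps the subshell but spells the stop-on-failure out with explicit exits, so that it behaves the same when called from an if or a || list, where the shell ignores set -e.

```shell
# Check out every commit in the range FROM..TO, oldest first, and run a
# command at each one. Stops at the first commit where the command fails.
# Usage: run_over_history FROM TO COMMAND [ARGS...]
run_over_history() {
    from=$1
    to=$2
    shift 2
    (
        # git rev-list prints one commit id per line; --reverse gives oldest first.
        for rev in $(git rev-list --reverse "$from..$to"); do
            git checkout --quiet "$rev" || exit 1   # detached HEAD at this commit
            "$@" || exit 1                          # the test command; stop on failure
        done
    )
}
```

For example, `run_over_history origin/master HEAD python -m unittest` would run the tests at each commit after origin/master (both the ref and the test command are placeholders). If a run fails, the working tree is left on the failing commit, which matches the behavior described in the talk.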
We just need to see that the tests run for each commit and see the output, and that's good enough. So this will be a recurring theme in this talk: half-assed is okay when you only need half of an ass. And of course ass is not a binary state. There are many levels of ass. Half and full are good approximations.
Okay, so now I get to philosophize for a moment. I have a quote here from Structure and Interpretation of Computer Programs: "The language used at each level of a stratified design has primitives, means of combination, and means of abstraction appropriate to that level of detail." And I'm going to steal these three terms and use them throughout this talk to talk about Unix as a programmable system: primitives, means of combination, means of abstraction. And briefly, what these mean: in a language like Ruby, primitives are things like integers, lists, excuse me, arrays. That was a Python-ism. Integers, arrays, strings. The means of combination primarily is the method call. That's how you take two ideas and put them together. And the means of abstraction is a module, a class, a method: a way to take primitives and other abstractions, combine them with the means of combination, and then put a name on them to create a new abstraction. And these are sort of the three things you need for a truly programmable system in which you can build new ideas. And I'm using these terms loosely. Abelson and Sussman would probably not like my use of them so much, but I think it's a wonderful way to talk about stuff like this.
So in Unix, we have all the primitives of a normal language: we have numbers, we have strings, we have arrays. In the Unix shell, that is. I'm going to say Unix throughout; I mean the shell. But we also have files as primitives. You can't change the way a file works, right? So it's a primitive part of the system. And a binary, like ls, is a primitive part of the system. You can't change it.
It's there for your use, to be combined to make other things. Now, if this is how you use Unix, if you deal with files and you execute commands, then you are using it like DOS. And that's okay, you can get stuff done: you can cd, cp and stuff, mv, maybe rm, gem install, bundle, rails server, rake. You can do all these things, but you're not using any of the power of Unix; you're just using it like DOS.
Okay, enough philosophizing. Example two of three. This is a quote from a paper called Power Law Distributions in Class Relationships. And it says: "A power law implies that small values are extremely common, whereas large values are extremely rare." And it's a mathematical distribution, and it looks like this when you plot it on linear-linear axes. So at the left you have many things with a small number of something, and on the right you have a few things with a large number of something. And the paper I mentioned is about object-oriented systems and this distribution. So here is a log-log plot, where a power law shows up as a straight line. And this is the number of methods in a class on X, and the number of classes having that many methods on Y. So basically what you're seeing is, at the top left, there are huge numbers of classes with almost no methods. On the bottom right, there's a very small number of classes with many, many methods. And this distribution shows up all over the systems that we build, whether we want it or not. It's just how things work, including the web, but also object-oriented systems. Here's another one: number of fields on a class. This was for Java. Same thing, power law distribution. Most of the system has few, and there are these outliers that have a huge number. And one more: number of constructors per class. Same distribution. Like, almost everything you can think of is distributed in this way. It's really weird. This paper looked at 1.7 million lines of code, so it's not just a tiny example: 7,000 classes, roughly, in Java.
And one thing they didn't mention is the number of references to a class. This is also power law distributed, both in terms of the number of times a class's name is mentioned in the code directly and in terms of the number of times a method is called on objects of a given class. So we're getting to Unix in just a second. Why do we care about how many times a class is referenced in the system? Because the more times you reference a class, the more risk it presents in the face of change. If that highly referenced class changes, then everyone who's coupled to it is screwed. So we would like to find out what our most referenced classes are, just so we understand the risk in the system.
Okay, so step one: find a list of classes. That's obviously what we need to do first, before we can find out how many references there are. So I'm going to define a function called grepRuby that finds all the Ruby files and then just greps them with some arguments, so I can grep the Ruby code for anything without having to repeat myself. We're going to start off by writing a really nasty Unix regex. It has a terrible regex format. It's truly atrocious. We're going to grep -h, which means print out only the matching content, no filenames. And we're grepping for any line that begins with zero or more spaces, then the word class or the word module, and then a word boundary. So these are all the lines that are defining classes and modules. We will remove all the leading space on all of those. So now we have a bunch of lines of code that say module something or class something, smack up against the left. And then we will use cut to take the second field, whitespace-delimited. So in the line "module Foo", we will take Foo. And if we do this, we get an output like this. And this is the list of classes and modules in the Destroy All Software Rails app. Not exhaustive, of course, because it's scrolled off. So now we've got a list of classes, and we just need to count the number of references for each of them.
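A minimal sketch of the class-listing step described above. The function name grepRuby follows the spelling in this transcript; list_classes, the exact regex, and the sed expression are assumptions, since the slides are not reproduced here. The talk's regex ends with a word boundary; this sketch requires whitespace after the keyword instead, which is a close stand-in that works across grep implementations.

```shell
# Search every Ruby file under the current directory.
# Any arguments are passed straight through to grep.
grepRuby() {
    find . -name '*.rb' -type f -exec grep "$@" {} +
}

# Print the name of each class and module defined in the Ruby files.
list_classes() {
    grepRuby -Eh '^[[:space:]]*(class|module)[[:space:]]' |   # -h: matching lines only, no file names
        sed 's/^[[:space:]]*//' |                              # strip the leading indentation
        cut -d ' ' -f 2                                        # second space-delimited field: the name
}
```

Run from the root of a project, `list_classes` prints one name per line. It is deliberately rough: a line such as `class Foo < Bar` yields `Foo`, while anything that merely looks like a definition (inside a heredoc, say) is counted too, which fits the talk's point about only needing an approximate answer.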
So we will take that thing we had before and pipe it into a loop, reading the class name each time. For each class name, we will print out the result of grepping the Ruby code, counting the number of files. So grepRuby -l is showing me only the names of the files that match the regex: word boundary, class name, word boundary. So that's every reference. And then pipe that to wc -l to count the number. And then at the end of that string, we put the name. And then we pipe the loop into sort to put that thing together. And you get this. It's a small app, so the numbers are small. But this is the actual number of times each class and module in Destroy All Software is referenced. And if you plot this on linear-linear axes, you get this. Even though it's a tiny sample, it's actually as good a fit for a power law as those ones from the paper, which is kind of surprising to me. And I may have gotten lucky, because ApplicationController is such an outlier. And if you plot it on log-log axes, you get a roughly straight line.
So this is actually another thing that's very useful, because you want to know where the dangers in your system are. I consulted on a team last year that had a system of 20,000 lines, brand new system; 6,000 of those 20,000 lines referenced a single global object. Now, in that case, we kind of knew that that problem existed. I didn't do it, so. But you want to know, right? You want to know where these risky parts are. And I bet that you cannot name the top five most referenced classes in your system off the top of your head. So it's a useful thing to do.
Now, I wrote this at the shell, but I showed it to you in sort of a less interactive form, because doing it live just takes too long, and I only have 30 minutes. But when I do this stuff directly, it looks much more ugly. So once again, half-assed is okay when you only need half of an ass. And a power law distribution is especially amenable to half-assed work.
Because the outliers are so far out there that if you are 50% wrong in your calculations, you're going to get pretty much the same outliers anyway. So half-assed answers work so well, so much of the time, with so little work. It's kind of amazing.
And now I'm going to philosophize. The first thing to philosophize about is pipes, man. Did you see all those pipes? All that data flowing through there? I wrote that thing in like a minute and a half. It's amazing how fast you can compose these data migrations with pipes. And so here's some random stuff you can do with pipes. You can curl a script and execute it directly. This is the recommended way to install RVM and Homebrew. I'm not sure it's such a great idea, but it shows you how powerful the thing is. Likewise, you can curl Google directly into Vim. If you :w, it's probably not going to do anything. You can't write to Google over the internet. But you can still at least get it in there. You can generate a diff from Git and pipe it directly into Mercurial. And I did verify that this works, just to be sure. Mercurial knows Git's diff format. Git doesn't know Mercurial's. Git's kind of a bit of a hipster. But this does work pretty well. And of course, you can cat directly into cat. And you can do that actually as many times as you want. You can do it enough times in a way that happens.
Pipes are a special means of combination in Unix that is not present in normal languages. You also have means of combination like loops. And loops are special in Unix because they have standard in and standard out. You can shove data into them, and then they do whatever they want, printing stuff out, and then you can get that data flowing out the other end. Very different from normal loops. And subshells as well. In the first example, I put some stuff in parens with the set -e. And that tells the shell to fork a new process and do that stuff in there.
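As a rough illustration of a loop sitting in the middle of a pipeline, here is one way the reference-counting loop from example two might look. The function name count_references is an assumption, and `grep -lw` (list matching files, whole-word match) stands in for the word-boundary regex mentioned in the talk. Names arrive on the loop's standard input, and whatever the loop prints flows out of its standard output to the next command.

```shell
# Read one name per line on standard input. For each name, print the number
# of Ruby files under the current directory that mention it as a whole word,
# followed by the name itself.
count_references() {
    while read -r name; do
        count=$(find . -name '*.rb' -type f -exec grep -lw -e "$name" {} + |
            wc -l | tr -d ' ')     # tr strips the padding some wc versions add
        echo "$count $name"
    done
}
```

Something like `printf 'Cart\nBase\n' | count_references | sort -n` then lists the names from least to most referenced. The loop never needs to know where its input comes from or where its output goes, which is what makes it composable.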
So if you set options, if you set variables, if you cd, it's all isolated within there. So very useful for combining things with different states.
Now, if you take the primitives in Unix and then you start adding this stuff in, you have sort of a composable shell system. You can put pieces together, but you can't build a new thing and put a name on it yet. And this is, I think, where a lot of people are with Unix, where they're good at putting stuff together, but they're not building abstractions yet.
So I'll do another example. This is a piece of the man page for ls. Exactly, right? Yeah, no idea. It is amazing how many options it has. I think I know like four of these. I came across this during my preparations for this talk. And I saw this, and I looked at the lowercase letters, and I thought, what letters are not ls options? And I could just read through it, but that would be boring. So what I did instead: you can pipe man. So if you pipe it to cat, you get the man page, unsurprisingly. But there's a magic thing going on here where man pages actually contain backspaces, which is unfortunate, because they break programmatic interoperability. So if you ever want to do this, if you want to manipulate a man page, pipe it to col -b. Don't worry about what that does. Just pipe it in there and everything will be fine. So we can do that, and it comes out the same. And we can grep that for ls to try to find that usage line. And we get a ton of stuff out. But we want that second line there: ls, open bracket, bunch of options. So let's grep for zero or more spaces, ls, open bracket. And if we do that, we get just that line. Now, by doing this, we are robust against any unforeseen formatting changes to the man page. Very important stuff. Now we can use awk to split on square brackets and take the second field. So that's everything between the first square bracket and the second, which is the options.
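The grep and awk steps just described could be sketched like this. The function name ls_usage_options is an assumption, and the patterns are a reconstruction from the spoken description rather than the code shown on the slides. The function reads man page text on standard input, so the `man ls | col -b` part stays outside it.

```shell
# Read a plain-text man page for ls on standard input and print the option
# string from its usage line, e.g. "ls [-abc] [file ...]" gives "-abc".
# Typical use: man ls | col -b | ls_usage_options
ls_usage_options() {
    grep '^ *ls \[' |                     # usage line: optional spaces, "ls", an open bracket
        awk -F '[][]' '{ print $2 }'      # split on "[" or "]" and keep the second field
}
```

The awk field separator `[][]` is a bracket expression that matches either square bracket, so the second field is the text between the first `[` and the first `]`.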
And now I'm going to wrap that in a function so I can use it later. And I'm going to say the bad word again, so don't freak out. We use Python to print, I know, print the set of all lowercase letters minus the set from that bash function, the bash function call there. And we get y, j, z and v. So those are the lowercase letters that are not ls options. Very important result. Now, of course, I could have directly inlined that blob of bash code straight into the Python with the backticks and gotten the same result.
Now, the first thing to note here: not half-assed enough. Far, far too much ass. It's going to roll off the end of my zsh history in about three months and never be seen again. There was no reason to do that other than it's fun.
So now I'm going to philosophize again. Talking about ass is really not philosophizing. Here, what I did by adding a function is I used one of the shell's means of abstraction. And like I said earlier, functions in the shell have a standard in and a standard out; data flows right through them. They're more special than a function in a language like Ruby. Because in Ruby, you get arguments and you return a fixed value. It is not lazy. In the shell, you take arguments, you also get a streaming input and you get a streaming output, that's standard in and standard out, and streaming standard error as well. That's fairly unusual, and useful for crazy one-liners. So you have functions, and also scripts, as a means of abstraction, a way to put pieces together and put a new name on it. And if you start using this, you now have a fully programmable shell. You are using the shell in sort of its full capacity as a programming language as well as an interactive interface to Unix.
But also, in that last example, I was metaprogramming. I was using bash code to generate a thing that went into the Python code. And you can generate any language from any other language you want.
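To make the idea of a function as a named pipeline stage concrete, here is a small sketch of the "which lowercase letters are unused" step. The talk does this with a Python set subtraction fed by a bash function; this version swaps in a plain shell loop so that everything stays in one language. The function name unused_lowercase is hypothetical.

```shell
# Read a string of option characters on standard input and print every
# lowercase letter that does not appear in it, one letter per line.
unused_lowercase() {
    used=$(cat)                           # the whole input, e.g. "-ABCabc"
    for letter in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
        case $used in
            *"$letter"*) ;;               # letter is present: print nothing
            *) echo "$letter" ;;          # letter is absent: report it
        esac
    done
}
```

Because the function reads standard input and writes standard output, it can follow any command that prints an option string, for example `echo '-abc' | unused_lowercase`, and its output can in turn be piped anywhere else.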
The most common ones, of course, being your shell language, which is probably bash or zsh, and Python, Ruby, and Perl, not in that order. So you are free to combine programming languages. And this is, I guess, sort of the level above the three levels of programmability. Once you've got these abstractions down, you start generating Perl, but then you just kind of go crazy. So I'm not sure that this is such a good idea in the general case. That is the end of my third example and my third piece of philosophizing.
So now I will give you some unsolicited advice that you probably don't want. First of all, two things to avoid in your use of any powerful tool: Unix, Vim, Emacs, whatever big scary tool. First, there's this tar pit of immediacy that should be avoided. People learn how to do something. They find a solution to their immediate problem. And then they do that for 40 years. And if your immediate solution is 10 seconds slower than a more refined solution, and you do it for 40 years, you are losing. So it's important to reevaluate the way you're using your tools, and especially to just interact with people at work, like pair program, maybe, with other people, and just see how they use their tools. And you will absorb the better methods so fast.
The second thing to avoid is proficiency fatalism, where you look at a master user of the Unix shell or Vim or Emacs, and you say, that person is so good at that that I can never do that. There's something special about them that's not special about me. Or, it will take me so long to get good at it, I will suck for so long, that it'll be a huge net loss. And especially that last one is just a complete lie. If you start using Vim at 9 a.m., by 5 p.m. you will be reasonably competent at editing text. You will be about as fast as you are in Notepad or a text editor or something like that.
A month in, you will be faster than whatever editor you came from, unless that editor is something truly powerful like Emacs, if you were really good at it. Not yet, not yet. I'm so sorry. The third part of that statement is: a year in, you will be faster than whatever you came from, even if it's Emacs. I'm allowed to say that because I used Emacs for years before Vim, so I know the dark side. Okay, so those are two things not to do: the tar pit of immediacy and proficiency fatalism.
Two things that you probably should do, two recommendations I will give you without you asking. Number one: use more pipes and functions. Do stupid stuff like the ls thing I did, just to learn how to do it. I probably wouldn't do that if I were doing billable work, because that's not really the most useful thing for my clients. But on my own time I'll do crazy stuff like that all the time, just to learn new things and to ingrain it in my brain. And especially the first example I gave you, about running tests over git commits: I actually retype that fairly regularly from scratch, because it keeps it in my fingers, I'm not going to forget it, and it only takes like 30 seconds anyway.
Second recommendation: pay attention to how much ass you need. There is this whole spectrum, and we have modern movements on both ends, actually. You have the craftsmanship movement, which is very concerned with quality, and that is a very high-ass vocation. I'm involved with the craftsmanship movement, so I'm not insulting you by using the word ass. On the other hand, you have the lean startup, which is not about low quality, but it's all about sort of titrating that ass one drop at a time, and just getting the minimum of quality you need to answer the question. Right? You guys remember what titrating is? You can actually operate at a high ass level.
You have to achieve the full range, but that is sort of one of the things that can make someone truly deserve to be called a master software developer: they can operate from the tiniest drop of ass to the biggest ass. As I say that, I do not consider myself a master software developer. I just want to say that because I don't want to sound even more arrogant than I actually am. Now, I want to make it clear that I've used the word ass throughout this talk because if I use the word quality or business value or something, that brings all kinds of baggage, but ass just makes you laugh and it doesn't carry a bunch of preconceptions. So really what I'm talking about is quality, I guess, but that word has all kinds of weird stuff associated with it. So that is all I have to tell you about Unix.
The thing that I do is a company called Destroy All Software. You saw a list of its class names earlier. Destroy All Software makes screencasts for serious developers. Topics like Unix, all of this stuff, but also dynamic languages, mostly Ruby; Git, distributed version control; fast tests, one millisecond per test, you saw me do a lightning talk about that yesterday; test-driven development; OO design; using Vim effectively; all this kind of stuff. If you're interested, it's destroyallsoftware.com, and actually, as of this conference, this is now my full-time job, so now you have to clap for me again. They're randomly generating Unix commands, they're right here. All of those, all of them are user-friendly.
Yeah, the number one way, like I said, is sit next to somebody. Oh, sorry. The question is: how do I end up getting more comfortable with all of this Unix stuff that I showed? And the number one way is absolutely to pair with someone. Just find somebody who knows it and sit next to them. That's how I learned most of this stuff. But aside from that answer, there's not a good answer to this right now.
And this is one of the reasons that I've created Destroy All Software, because I want people to know these things. And it would also be nice to make a living doing it. But screencasting is the best venue I know of if you can't sit next to someone who knows it. So it doesn't have to be Destroy All Software. PeepCode has stuff about these topics, and there are plenty of screencasts to be found around the web. So you can try that avenue.
Yeah. If you're in the Seattle area, you can come to Seattle.rb every Tuesday night from 7 to 9, and we're happy to pair with you. Wonderful point. So if you're in Seattle, go to Seattle.rb, sit next to Ryan Davis, and you can learn all kinds of stuff like this from us, from Tim, and all the other smart people there. Seattle.rb is at Vivace on Broadway. And it is, let's see if I can remember this, Tuesdays at 7? Yep. Yep, Tuesday at 7, every Tuesday at 7, at Vivace on Broadway.
Can you give an example of how you'd use that stuff on a Windows box? Well, step one is install VMware, and then step two is Linux time. Actually, I don't know. Sorry. So there's this open source Unix Utils. What's that? Unix Utils brings in some of the Unix utilities, like ls, right? And there's also Cygwin, right? So you can install Cygwin, and that does give you a bash prompt, or you can get zsh. It's kind of slow, but that doesn't really matter.
Yeah. So you're going to turn it in close by talk with Wilton Hock out. In Pascal? Yeah. Well, the burnout book was written in 1976, and it's a good, impactful program that you might be about. I've got a book you see, that's on the Unix, and I've got this book. Basically, the first chapter is a five-line program in Pascal called Catwalk. And it tells you you're going to read a character to the character.
And he's very close to the chapter he deals with another school, but he shows you the other programs that are a little worse than head, tail, grep, and he explains why he feels why he's building a unit where he did a little bit of development. He always told those men that he can only get their heads up like a dude. And it's like, one program in five lines is what has the end of hell. Seven chapters you take, you put these together, and you've got something that can do. But it's basically what you're trying to look at on that. Right. And what's the name of the book? Software? Software Tools in Pascal, by Kernighan and Plauger. Yeah, I'll check that out. Yeah, we've got all of that. And the other book is the online operating system. Right. But what are you talking about with all the rules and ask how they use the second three chapters on the side? Yeah. Yeah, I'm going to be comfortable with that. Right. Yeah, it's sort of hard to learn, to really understand the philosophy of Unix from outside. You really have to be immersed in it, or have it explained to you in a way like that. But once you get it, man, it's so good. We go obviously, right? All right. I think I'm out of time, so thank you guys again.