Okay, so I'm Ryan Tomayko and this is The Shell Hater's Handbook. Josh told me we were going to have some sick AV here today, so I figured I'd take advantage of it and do the monster with 20-foot ASCII art. I think it turned out pretty good.

I'm here today to talk about the Unix shell, which may seem kind of strange at a Ruby conference, but there was no shell conference. When I think about the technologies and languages that I get the most value out of, not the ones that are most interesting to me or the ones I want to learn, but the ones I actually use the most to produce, the Unix shell ranks strangely high. In fact, after Ruby and maybe JavaScript, I probably use shell more than any other language. Thank you. Or, Shell Haters, remember? It's a boo, actually.

That's a little bit disturbing for me, but also really interesting. You rarely see people talking about shell programming, but it's a very big part of my programming toolchain, so I thought it would be interesting to talk about. I'm not a sysadmin. I'm an application developer: I work on Sinatra and Rack and some other Ruby projects that Josh mentioned, and I work at GitHub, where I do product work and front-end things. I don't deliver products in shell that often, but I couldn't imagine delivering them at all, or as quickly, without shell. I think that makes it worth talking about, because it is very much a misunderstood programming language.

Just to home in on specifically what I want to talk about today: the shell has two modes of operating. One is the interactive command line, which I'm sure you're all familiar with.
You put in a command or some stuff, the command runs, and some stuff comes out. If you're lucky it's like a cow saying words. If you're not lucky it's like removing all your files or shutting down the internet or whatever. In the Ruby world at least, this is pretty well known and used; you almost can't do Ruby development without some use of the interactive shell. That's not what I want to talk about today. The shell is also a programming language, and this is where I think it's used much less. I actually think it's a really interesting programming language, but it's hard to approach, and there are some fundamental learning challenges in coming up to speed on shell and really understanding it. That's the rationale for this talk.

Has anybody heard of this project? Yeah? RVM came on the scene maybe a year ago now. I don't even know if it's been that long. And it's really just lit the Ruby world on fire. It's pretty much assumed that you use RVM now. Everybody's doing it. If you haven't checked it out, please do. A lot of the reason RVM is able to do the things it does is that it embraces shell and uses the shell to its fullest. You just can't get RVM's features without shell programming, and a pretty large chunk of RVM is shell. So if you want to work on it, if you want to hack on it, it's good to know these kinds of things.

Another project that uses shell a lot: many people don't know this, but Git originally was mostly a series of shell scripts. Some of the core utilities, the plumbing commands, were written in C, but a lot of it was written in shell, and later those would move to C. So it was this classically designed Unix program where you start off with a single command and then you add more commands. And the shell really defined the way that Git would evolve.
And the reason that it has so many commands is largely due to the shell and the way it forces you to program. So that's another piece of software that's near and dear to my heart, as far as GitHub goes.

Oh, this is like my one funny slide and I just messed it up. What I was going to say is that a lot of people, when Git first came out, heard about this new, interesting distributed version control system, went and looked at it, and saw that it was a bunch of shell code. And they thought it was a joke. They were like, oh, you can't actually use this to get anything done. That's just silly. And then, of course, haters are going to hate. We're going to do a little bit of that today too, and I think it's important to do that with shell, because it does have some issues.

I think one of the keys to really getting your head around shell is to understand that it's not a general purpose language. If you look at it as a general purpose language like Ruby or Perl or Python or C, and you evaluate it that way, you're going to get a bad result. It's actually a special purpose language that's designed to assemble commands. I want to illustrate this through an example.

So here we have a very simple Hello World shell script with a twist. Basically, it's a hello program. It takes a single argument and echoes back Hello, plus whatever name you put in. So if you say hello Josh, it'll say Hello Josh. The twist is that if you say hello world, it'll tell you how cliche you are. So it's a very simple bit of shell code. But I think we should just take a moment and hate on this a little bit, right? Syntactically, in Ruby we're used to a very aesthetically pleasing language, a very good language. And so I look at this and I see some things that I immediately don't like.
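The slide itself is not part of the transcript. As a minimal sketch of the program being described, it might look something like the following. The exact wording of the messages is an assumption, and it is written here as a function named hello rather than a script file so that it can be pasted straight into a terminal.

```shell
# hello: greet the name given as the first argument,
# unless that name is "world".
hello() {
    if [ "$1" = "world" ]
    then
        echo "How cliche."
    else
        echo "Hello, $1"
    fi
}

hello Josh     # prints: Hello, Josh
hello world    # prints: How cliche.
```

The square brackets, the then on its own line, and the closing fi are the three things the talk goes on to complain about.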
One of them is the square brackets for the conditional part. That just seems strange to me. And then there's the superfluous then. We all know you shouldn't really need that. There are ways around that, and it's just taking up lines. Also, the end of the if is fi. There are a lot of these things that are just annoying and kind of heartbreaking. You just don't feel right programming this, right? It feels dirty and gross. But I think if you can set the superficial syntax differences aside, you can see that it's actually not that much different. You're still able to express the kinds of things you would express in a language like Ruby. You're expressing a basic conditional. That's interesting. It's important that programming languages have that.

So now we're going to make it a little more interesting. Let's say the boss comes in and he's poking around, and he sees this hello program, and he's a clever guy, so the first thing he does is run hello world. And of course it tells him he's cliche. He's got kind of thin skin and a big ego, so he's not happy about this. He's steaming mad. Basically it's, I want that removed from the script. You don't really want to remove it, because you like it, so instead you add to this conditional. You say: if the name is world and the current user is not the boss, then do the cliche joke. Otherwise, just let him say hello world all day. Who cares?

Here we see another little bit of syntax: we're able to do ands in conditionals. But again, it's dash a. It's just kind of ugly. Why would you do it like that? You're one character away from "and", or you could do a double ampersand or something sensible, right? These things just grate on you.
But still, I think it's again a superficial syntax difference. You're able to express the things that you need to express.

So now let's go a step further. You're at lunch and you're talking, and you're like, you know what I hate? I hate people that name their computer Mordor. Why is it that, like, half the people I know name their computers that? How many people here have a computer named Mordor? Oh, great. So nobody. Beautiful. So this would do nothing, okay. But anyway, we want to add another thing: if your computer is named Mordor, then always call that person cliche, no matter what they put into this program.

And here is where I start to get flaming mad about this language. Because now we have an or that's totally different from the and. The and is a dash a, and our or is this double bar thing. And not only that, but we have another new set of brackets. If you learn languages the way that I do, which is you go and look at a bunch of code and try to use what you know of existing programming languages to reason about this language's syntax, it's extremely aggravating.

Let's keep going with this, though. We notice there's a bug in this program. If you just say hello with no name, it doesn't give you a nice message or tell you that you're using it wrong. It just says hello. That's not right; we want it to give an error message. So we add an else branch. Here again we see another weird bit of syntax, the dash z, which means zero length. We're testing whether the string is zero length, and if it is, we print a usage message and exit non-zero. We see more brackets. It's mostly consistent with the stuff up there, so not that bad. But then again, another then. Like 10% of our code at this point is thens. Whatever. Let's keep going.
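A sketch of where the script stands after these three changes, again reconstructed from the description rather than taken from the slide. The user name boss, the host name mordor, the usage message, and the use of uname -n to read the host name are all assumptions made for illustration. Because this is a function, it uses return where a script file would use exit.

```shell
# hello: greet the user named by $1, with the extra rules bolted on.
hello() {
    # "and" inside the brackets is -a; "or" between bracket groups is ||
    if [ "$1" = "world" -a "$USER" != "boss" ] || [ "$(uname -n)" = "mordor" ]
    then
        echo "How cliche."
    elif [ -z "$1" ]    # -z is true when the string has zero length
    then
        echo "usage: hello <name>" >&2
        return 1
    else
        echo "Hello, $1"
    fi
}

hello Josh
hello || echo "hello with no name failed with status $?"
```

Note that -a inside the brackets is marked obsolescent in current POSIX; it is kept here only because it is the form the talk is complaining about.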
So now let's say the boss comes in again. And he's like, you know, I really don't like this whole thing where you can say hello to just anybody. I'd rather it be that if they don't have an account on the machine, you shouldn't be able to say hello to them. And you're like, okay, whatever, and go ahead and add that. So you check the /etc/passwd file to make sure whoever you're saying hello to actually is a person on this box. And if not, you fail.

Now this is where I get just insanely mad at this language. Because now we have another and, but it's different from the one up there, right? Up there it's dash a, and down here it's two ampersands. And not only that, but there's just a grep, with no brackets around it. So we've got stuff in brackets and stuff outside of brackets. There just doesn't seem to be any rhyme or reason to how this language works. This can be extremely frustrating.

At this point I think you really have two options. You can throw your hands up and say, screw this language, I don't even care, I'm going to rewrite this in Ruby or something sensible. Or you can go try to look in the documentation. I'll admit that for a very long time I took the first option. I figured I could rewrite this faster than I could learn how it works.

But let's say you do go and look at the documentation. So, say, man if. You do this on Linux and it's like, no manual entry for if, which actually isn't that weird. It's part of the built-in shell language, so you should probably be looking under the shell interpreter. On Mac, if you do man if, you get something slightly better, or maybe worse, I don't know. It just spews out all these commands and stuff about builtins. So you really don't learn anything there, except I guess you learn that it's built in or something.
So then let's say you happen to know that Bash has help built in. For anything that's part of the base language, you can say help and then that language element, and it'll give you a short synopsis. And you'll get this: if COMMANDS; then COMMANDS; elif COMMANDS; then COMMANDS; else COMMANDS; fi. At first it doesn't seem too helpful. You read in the description that the if COMMANDS list is executed, and if its exit status is zero, the then COMMANDS list is executed. It doesn't really seem like this is helping you that much. But the thing to notice is that the word COMMANDS is the same in both places. It's if, COMMANDS, and then, COMMANDS. That seems to indicate that they're the same thing. The thing that comes after the if and the thing that's in the body of the if are both commands.

That just seems bizarre. I mean, the brackets and things like that look like language syntax. So let's say you do man [ out of sheer anger. If you do this on your machine right now, you'll actually get a manual page. You can do man [ because [ is a program. It's the test program. You can see here under NAME you have test, [. So it's like shell has this alternative syntax built in, a special way of saying test. Or does it? If you say which [, you'll see that it's actually a program on your file system. There's an executable file under /bin or maybe /usr/bin, depending on what kind of Unix you're on. If you ls it, you can see that it's just a normal executable file. If you diff it with test, you can see that they're identical.

So it's actually not a special syntax at all. You're running the test program, and the whole if part is commands. We can run this on the command line and get a feel for how test works: test 1 = 1, echo something; test 1 = 0.
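The command-line experiment described here can be sketched as follows. The echoed strings are placeholders chosen for this example; the point is only that test and [ take an expression as arguments and report the answer through their exit status, which && and || then act on.

```shell
# test and [ evaluate the expression given as arguments and
# report the result through the exit status (0 means true).
test 1 = 1 && echo "test: equal"              # prints: test: equal
test 1 = 0 || echo "test: not equal"          # prints: test: not equal
[ 1 = 1 ]  && echo "bracket: equal"           # prints: bracket: equal
[ 1 = 0 ]  || echo "bracket: exit status $?"  # prints: bracket: exit status 1
```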
So the actual evaluation of an expression in shell isn't even built into shell. It's a separate program. That's because shell really doesn't do anything at all except assemble programs in different ways and do things based on the exit status of those programs. And the bottom one I added just for fun: run the bracket, and see that it does the same thing.

So if we rewrite the final example using test instead of brackets, it may not be as aesthetically nice, although it's horrible no matter what you do. But this way I think it's a lot more evident what's going on, how shell actually works behind the scenes. You have if, and then you're running the test program with those arguments. And test is what's processing them and determining whether that expression is true or false. The or is shell syntax, and then you have more tests. So it makes more sense now. It seems like the language is almost consistent with itself.

Question? Yeah, the closing bracket is bullshit. I actually had a slide in here where I left the closing bracket off, but you can't do that. It'll actually validate that the closing bracket is there, even though it doesn't do anything at all. It's completely worthless. It's just there to make the language look like a general purpose language, which I really think is a shame, because it's not a general purpose language. It's a special purpose language meant for assembling commands in different ways.

Just to highlight, here we have the shell grammar. The if and the then and the ampersands and the or there, those are all parts of the shell. Everything else is commands. The shell doesn't really know what they're doing. It just knows whether they're exiting zero or non-zero. And this is used throughout shell.
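As a rough sketch of that rewrite, here is the whole example with every bracket replaced by an explicit call to test, so that only if, then, elif, else, fi, && and || are shell grammar and everything else is visibly a command. It is a reconstruction under assumptions: the messages, the boss and mordor names, and uname -n are illustrative, and PASSWD_FILE is a hypothetical variable added here so the account check can be pointed at a file other than /etc/passwd.

```shell
# PASSWD_FILE is a hypothetical override; it defaults to the real file.
PASSWD_FILE="${PASSWD_FILE:-/etc/passwd}"

hello() {
    if test "$1" = "world" -a "$USER" != "boss" || test "$(uname -n)" = "mordor"
    then
        echo "How cliche."
    elif test -z "$1"
    then
        echo "usage: hello <name>" >&2
        return 1
    # grep is a command like any other: its exit status is the condition.
    # $1 is used as a regular expression here, which is fine for a sketch
    # but would need escaping for names containing special characters.
    elif test -n "$1" && grep -q "^$1:" "$PASSWD_FILE"
    then
        echo "Hello, $1"
    else
        echo "hello: no such user: $1" >&2
        return 1
    fi
}
```

Seen this way, the grep with no brackets stops being an inconsistency: it sits in exactly the same position as each test.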
You'll see while true. In a sane programming language, or in almost every programming language, true is some kind of primitive value; it means something. In shell, it's a command. It does one thing: it just exits with zero. False is the same way: it just exits with one. You can use these commands as part of language elements. You could run a while loop like this. And you can use any command there. You can write your own commands. You could write true in Ruby. It doesn't really matter. The shell doesn't care. It's just executing commands and doing things based on the result. Here we see a more commonly used example with test. Again, that's not a special syntax. It's really running a command. There it is the other way.

So again, I think it's kind of mind blowing that shell looks like a general purpose language, but really it's a special purpose language where the fundamental abstraction is the command. Almost everything that the shell knows how to do is about executing commands, assembling them into pipelines, assembling them into and/or lists, things like that. That's really all that it does. It doesn't know anything else. I think this is insanely powerful considering that shell was developed almost 40 years ago. At least, the first shell was developed in the 70s.

There are a couple of different kinds of commands. You have builtin commands, which are implemented in the shell. Things like test and true and false are today actually built into the shell, so it doesn't actually execute those commands from disk, for performance reasons, to avoid having to fork another process. But they work exactly the same as external programs. So all the shell programs that were written back when the shell did execute these still work great now that they're builtins.

Shell also has functions. They work a little bit differently than functions as we know them.
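To make the point about commands as conditions concrete, here is a small sketch. The function has_vowel is hypothetical, invented for this example; it stands in for "any command you write yourself", and the countdown loop simply shows test acting as the condition of a while.

```shell
# true and false are commands whose only job is their exit status.
true  && echo "true exited with $?"     # prints: true exited with 0
false || echo "false exited with $?"    # prints: false exited with 1

# A user-defined function can be a condition too. Its exit status
# is that of its last command, here grep.
has_vowel() {
    printf '%s\n' "$1" | grep -q '[aeiou]'
}

if has_vowel "shell"
then
    echo "found a vowel"
fi

# while keeps running its body for as long as the condition
# command exits with zero.
n=3
while test "$n" -gt 0
do
    echo "countdown: $n"
    n=$((n - 1))
done
```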
They're basically little mini programs. They take a set of arguments and act exactly like commands. In fact, they have the exact same semantics as commands. You can do anything in a function that you can do in a shell script, so you can have if blocks and while blocks and pipelines or whatever you want. And then you can use that function inside a pipeline, or you can use it as the conditional part of an if statement. The semantics are all the same. Later you could take that function, move it into a file, make it executable, remove the function, and all your programs will continue to work. If you then ported that shell script to Ruby, all your programs would continue to work. So there's an insane amount of decoupling between the shell language and the commands and how they're implemented. I think that's really powerful and interesting. And lastly, aliases also work exactly the same way. If you have a shell alias, you can assemble it into a pipeline and do all these same things with it. And they can change any way you want.

So the title of the talk is The Shell Hater's Handbook, and I really just wanted to mostly hate on shell. But I didn't want to make the conference overly negative, so I figured we'd do at least one fun thing with shell. I think everybody really likes pipelines, right? That's one part of shell that most people really appreciate. I just wanted to go through building a pipeline and the process of developing one.

We have a simple problem here. Let's say we have an ebook. It's Jonathan Swift's A Modest Proposal. I'm assigning the URL to a variable there because it was too long and it was messing up my slide. So any time you see the url variable, it refers to that. Usually what I do when I'm working on this stuff is, well, I should state the problem first. What we want is a word frequency count: which words are most used in this work.
Because, I mean, I want to know that all the time. So we might start by just grabbing the URL with curl. Now we have that text on standard output, and we just want to slowly move it toward that frequency count. We'd start by adding a pipe, and maybe we'd use the tr command. There are a lot of ways you could do this: you could pipe that to Ruby, you could pipe it to Perl, whatever. I use tr on purpose because it's a standard Unix utility and it's something Ruby also has, right? Ruby has a tr method on String, and the tr command functions roughly the same way. It's actually something Perl borrowed from shell, or from the standard Unix utilities, and that Ruby borrowed from Perl.

Here what we're saying is: anything that's not an alphabetical character, replace it with a newline. And pretty much that's it. The -c means not; it's the complement argument. It means anything not in that first set. So we do that, and we can see that our text is now one word per line. We can also see that we have some empty lines. We don't want those in there, so maybe we'd pipe that over to grep -v. We're just taking it one step at a time and moving the stream closer and closer to our final destination.

What happens here is that it's starting to get a little harder to maintain, because this command line is getting a little long. There's a really cool trick if you use Bash with the standard readline bindings, and I think zsh works the same way. You can hit Ctrl-X Ctrl-E and that will open your current command line in your editor. In my case it's Vim. You can see by the sidebar on the left, I tried to make that look like an editor with line numbers.
So now I'm in an editor and I have a little more control, where I can iterate on this program a bit more. The first thing I might do is just break it into multiple lines. There are a couple of reasons to do this. One is that it's a little more readable, and two is that it's easier to insert things into the pipeline at any point. It's just a matter of adding a line. I've also decided to sort the output. That's probably a good start; we can take a look at what it looks like sorted. If I run this now, I see that I have each word on a separate line. We can see all the a's there, and then actual and agree, so it's going well.

If I do Ctrl-X Ctrl-E again, I'm back in here. Now I might pipe that over to uniq, which takes sorted input and condenses it down to one line per unique line in the input. There's also a -c argument that says show the frequency of that word: how many of them were there? So now we're getting really close. We can see that there are nine a's and one actual. We're almost there. We just need it sorted the right way. So we can add this sort -rn and, boom, we're done. There are our top lines.

This is extremely useful in all kinds of ways. Log files and database dumps: you can rip through files pretty quickly and easily at the command line like this, and these things evolve into programs of their own. The next thing I might do is move this into its own file, make it executable, give it a name, and allow it to take an argument. That's the way you get from the command line into shell programs, and these get more and more complex as you go.

And so, enough of the lovey-dovey pipeline stuff. I want to get back to the hate a little bit. The other thing that's really, really rough when you're starting out with shell is that there's a big documentation problem. Basically, it's mostly garbage.
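The word-frequency pipeline built up over the last few steps can be sketched as below. It is wrapped in a hypothetical function named word_freq that reads standard input, which is an assumption made so the example runs without network access; in the talk the input came from curl, and that URL is not reproduced here.

```shell
# word_freq: read text on stdin, print each word with its count,
# most frequent first.
word_freq() {
    tr -c 'a-zA-Z' '\n' |   # replace every non-letter with a newline
    grep -v '^$' |          # drop the empty lines that leaves behind
    sort |                  # uniq needs identical lines to be adjacent
    uniq -c |               # collapse duplicates, prefixing each with a count
    sort -rn                # order by count, largest first
}

printf 'a modest proposal, a proposal\n' | word_freq

# With the ebook, the invocation would be along the lines of:
#   curl -s "$url" | word_freq | head
```

Putting each stage on its own line, as the talk suggests, is what makes it easy to insert or remove a step while iterating.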
There are a couple of reasons for this. One, as we saw earlier, the man and help commands don't always work when you'd like them to. So the reference system that's built into your computer probably doesn't work the way you'd like it to.

Another problem is that there are just a lot of shells, and I want to run through these real quick to give an idea. You have the early shells. These moved pretty much in a line. The Thompson shell was the original Unix shell, written by Ken Thompson himself. Then that was improved on by the Mashey shell, and everybody moved to the Mashey shell. Then that was improved on by Steve Bourne, and everybody moved to the Bourne shell. Now, the Bourne shell isn't Bash. That's an important thing to distinguish. We'll get to Bash a little later. But the Bourne shell was the first shell where you could really write fairly robust programs, and it actually set the style of Unix programming of the time, because you didn't really have general purpose languages then like we do now. You had C, which was general purpose but lower level. The kind of thing you did in C was build special purpose languages that hooked into the shell to do something.

Just quickly, if we could go back to here, look at all the special purpose languages in here. You have the shell, which is one, and then you have tr. That's really a special purpose language, right? It's a program that processes a language that describes a set of characters and the replacements to make on them. grep is definitely a special purpose language, and then you have other languages like awk and sed, and all of those were special purpose. That was the way you wrote a lot of programs at the time.

Then there was another round of shells, and this is where people started to innovate quite a bit. You had csh, which broke with the basic Bourne shell syntax and added some interactive features, looking more like C.
The Korn shell really took the Bourne shell to the next level, and then you had some other shells that were focused on performance or were part of other operating systems, things like that.

That brings us to the modern shells, the shells that are most typically used today. Probably the most popular shell by far is Bash. It's the default shell on Mac. It's the default shell on most Linux systems, although not Debian, where dash is. But zsh is also popular. How many people here use zsh? It's a lot. Okay, that's a good number. zsh is really popular in the Ruby community. It's a great shell. Its interactive capabilities are definitely a lot better than anything else out there.

A lot of what these shells did is add those types of features. These shells focus almost entirely on making the command line aspects of the shell better. Readline and command completion and all these kinds of things were really developed during this era of shells. The problem is that in order to do that, they had to add a lot of new stuff to the shell language. Bash especially has an insane number of new features that aren't in the earlier shells, for things like arrays and string functions, all things that before Bash were punted off to individual special purpose programs. Now those are in Bash.

In my opinion, this creates a problem when you're trying to learn shell, because it takes what is really a fairly simple thing and makes it complex. Here we have the Bash man page, which is roughly 40,000 words. And the dash man page by comparison, for a much smaller, much lighter shell that doesn't have a lot of the interactive features and is more of a pure shell implementation, is only 10,000. So Bash's is four times larger than dash's. And that creates a problem.
When I was trying to learn shell, I'd go into the Bash man page and it was just a big mess of stuff that I wasn't really interested in. It seems like it's important because it's there, but it's almost annoying. And this is from the Bash man page. They even know this: it's too big and too slow. Personally, I don't like Bash. I like it as an interactive shell, but I hate it for these reasons; I feel like it's made it a lot harder to learn shell and approach shell. People say it's too big and too slow, and there are some resource issues there, but I don't really mind it as a runtime. I use it to run my scripts. I use it as my interactive shell too. But I don't use it for documentation at all.

What I do use is POSIX shell. I want to get to this quickly, since I'm running over. POSIX shell is kind of weird. It's not an actual shell. You can't go and download POSIX shell or install it, but all of the shells that I listed on the modern shells slide are POSIX compliant. They all know POSIX. What POSIX was is an effort to standardize Unix, because there were like 10 of these things, they were all proprietary, and all the different Unixes were going in different directions. So POSIX came along, with the IEEE and the Open Group and like 10 other groups, and it's a really complicated history, and standardized all these things. One of the things they standardized is the shell.

They did a really good job of it, because they took almost a top-down approach, which hadn't been done before. All these things had just kind of evolved. They looked at the shell and the standard utilities and asked: what is the minimal amount of stuff we need to have a working operating system, that's also the most common and portable between operating systems? At the time it was all about portability, but today I don't really care about that.
I mean, I'd like my shell programs to run on Mac, and I'd like them to run on Linux, but I don't really care about portability the way they did. What I care about is that I have good, minimal documentation that I can go into to understand and discover how to work in this programming language. That's what's great about POSIX shell.

The problem with it, and this is the thing that I hate again about shell programming, is that you can't find the thing. It's impossible. It's one of these weird standards where you go here and there's a bookstore, or I'm sorry, a shopping cart and all that stuff, and it's just a mess. But there's a free HTML version, so you go to that, and then you get one of these forms, and it's like, what is going on? I just want to see some documentation. In the meantime, if you go to Google and search for shell, there are all these Bash tutorials talking about string functions and so on, and that kind of information is really easy to get at. So that's basically another point where I'd just give up. You know what I mean? Screw it.

So I'll save you the trouble. That URL right there is the POSIX specification. I'm not going to open it. It's nice. It almost looks like Ruby's documentation. You have the shell language and the standard shell utilities. There are only about 40 of them. That's another big problem: you have thousands of commands on your machine. Which ones are the core ones used for shell stuff? The table of contents is a little messed up, so I threw together a page at this URL, shellhaters.heroku.com/posix. These link to the POSIX documentation pages, and I tried to make it almost like a cheat sheet or an intro. These are the shell builtin commands, and then all your standard POSIX shell commands.
I can't recommend this document enough if you're doing shell programming or you're interested in shell programming. Use this documentation, because it'll save you a lot of time, and it introduces you to the things that are really important about these programs instead of being a big mess of stuff that people have added to Bash over the past decade. And so thank you guys for coming to the talk. I'm Ryan Tomayko.

Thank you very much, Ryan. We can do one question.

Yeah. There's actually a lot of things. I don't know if there are any sysadmins or shell programmers here. Do you have a good question? What's that? Repeat the question. Oh, I'm sorry. He said, how can you hate on Bash without talking about string escaping? There are actually a lot of things I could have hated on that I had to cut out. I ran a little long. The first time I put this together, the first time I rehearsed it, it ran for, I think, an hour and 45 minutes or something. So I tried to cut it down, and some of what I cut was pretty good. I had some string escaping. I didn't even talk about set -e, which is a major, major thing in shell programming. The first line in your script should always be set -e. It's basically the "act sane" switch. Look that up in the POSIX documentation. It just shows how many things I didn't get to, because that's a critically important part of it. So I'm sorry. There were a lot of things like that that I didn't get to.

Thank you very much. That's it. You can talk to Ryan more in the break.