called DevStructure, and we work on configuration management software, primarily with the package called Puppet. And we built a tool called Blueprint that figures out what you did to your server so that we can generate Puppet code for you. I'm here to do a little bit of a survey over a lot of sysadmin-like tools and try to understand why they're written in Ruby and what patterns we can extract from that. So we're going to start with Puppet, the thing that I know the most about, and tell a little story about Luke, the sysadmin, who was very dissatisfied writing CFEngine installations all day, every day. So he set out to write a new configuration management system that was data-driven and relied less on diffing files and applying patches and more on managing whole resources. And Ruby ended up winning out as the implementation language because Perl was already kind of over the hill and Luke couldn't figure out how to get the auto-loading behavior he wanted out of Python. The auto-loading behavior in Puppet is very, very pervasive and works roughly as follows: when you ask for some type or something else that's not loaded, it globs over source files and figures out where it is. And each of those source files, as it's executed, uses an internal DSL to register the code blocks or the classes or what have you that it's creating. It might look a little bit something like this. This is how the package type is introduced to Puppet, and, without too much stating the obvious, you call newtype and you give it a block, which is evaluated in the class scope so that it can define methods like insync?, and there are many others that a type might need to define to do its work. Now, that internal DSL exists basically because using const_missing wasn't a viable option, in Ruby 1.8 at least, for getting the auto-loading behavior where, if a type hadn't been seen yet, Puppet would go find it. 
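The registration pattern being described can be sketched in a few lines of plain Ruby. This is a simplified stand-in, not Puppet's real API: `Registry` is hypothetical, and real Puppet uses `Puppet::Type.newtype` with a far richer block interface.

```ruby
# A minimal sketch of the internal-DSL registration pattern.
# Registry is a hypothetical stand-in for Puppet's type system.
module Registry
  TYPES = {}

  def self.newtype(name, &block)
    # The block is evaluated in the scope of a freshly created class,
    # so `def` inside it defines instance methods on that class.
    klass = Class.new
    klass.class_eval(&block)
    TYPES[name] = klass
    klass
  end
end

Registry.newtype(:package) do
  # Because class_eval runs this block in class scope, this defines an
  # instance method. Nothing at the call site tells you that, which is
  # exactly the drawback discussed next.
  def insync?(current, desired)
    current == desired
  end
end

pkg = Registry::TYPES[:package].new
puts pkg.insync?("1.8.7", "1.8.7")
```

The key move is `class_eval(&block)`: it is what makes bare `def`s inside the block land on the new class, and also what hides the scope from the reader.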
But by using that DSL and gaining the obvious context that we're introducing a type, we're missing out on the context of exactly what scope we're executing in. So it isn't very clear that this block, with nothing else in it, is someplace where I can define methods. I think this is kind of a drawback and one of the things that makes Puppet's internals very hard to deal with. Specifically, I shouldn't have to ask someone or consult documentation as to whether I can define a method someplace. I shouldn't have to think very hard about what the name of a constant I define is: if I call Puppet::Type.newtype(:package), I shouldn't have to wonder what the resulting constant is called. And an extra problem that comes up a few times is that, because blocks are being defined at the top level, you can't always return from them, and that can be very frustrating for a programmer who likes to short-circuit out of methods. So here are a couple of Puppet resources in Puppet syntax. Puppet is not a Ruby-implemented DSL; it is an actual language with a grammar and a parser. And these two resources, a package that's ensuring Ruby is installed and a file that's setting up /etc/sysctl.conf, are orthogonal resources. So when Puppet executes them, by way of auto-loading the appropriate types and providers and figuring out what the current state of the system is and how to bring it in line with the desired state, these orthogonal resources don't have any relationship to each other. So Puppet can make a best effort and work through failures. And the problem that comes up when you do that, when one code path doesn't cause the things that come after it to fail, is that it's very difficult to identify those failures. So Puppet uses a very extensive logging system that's accessed by a number of command-line options. If you start to see problems, ask for verbose output; if you really see problems, you can get all the way to stack traces inline with the output. 
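That graduated, syslog-style verbosity can be illustrated with Ruby's stdlib Logger. The messages here are invented, and real Puppet has its own logging framework with more levels and color coding; this just shows the dial-it-up, dial-it-down idea.

```ruby
require "logger"
require "stringio"

out = StringIO.new
log = Logger.new(out)
log.level = Logger::WARN   # dial verbosity down to warnings and errors

log.debug("globbing source files for the package type")  # suppressed
log.info("applying Package[ruby]")                       # suppressed
log.warn("Package[ruby] already at desired version")     # shown
log.error("one resource failed; continuing with others") # shown

puts out.string
```

Raising the level filters the noise; dropping it to `Logger::DEBUG` surfaces everything, which is the same progression Puppet's command-line flags walk through.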
Not that they stop the execution, but at least you can see what's going on. And all the output is tagged using syslog terminology, so you can gradually build up from info, to only warnings, all the way through debug, or turn it all the way down to just seeing errors. And everything gets color-coded if you're running interactively. So this is, I think, kind of the Cadillac of logging: if you need to know what's going on in Puppet, it's easy to pick out the pink error line and work from there. Now I'd like to switch up and talk a little bit about Chef, which maybe more people in the room have experience with. Chef is a very similar system to Puppet, and there's a lot of religion on both sides of it. I'm not here to start that religious war again; I am here to make a couple of comparisons. They're functionally very similar packages. Both of them are about systems administration tasks expressed as code, but they have very different philosophies about how to implement that. Chef, in contrast to Puppet, uses an external Ruby DSL, so the resources like the Puppet ones I showed before are actually implemented in Ruby. And inside, rather than using a DSL, it's implemented in what I'm going to call today "Ruby Ruby"; that is, it uses real classes and real methods. So here are a couple of Chef resources, in fact the exact same resources, that install the Ruby package and set up sysctl. And obviously they are Ruby code: each takes a block that's instance_eval'd so that the method calls inside can affect the whole resource. And here's the inside code, which is nice, plain, clean Ruby. Now, the disadvantage here is that the auto-loading behavior that's so awesome about Puppet isn't available, because of how constant loading works. And if anyone can tell me how that works, I'll give you a drink ticket. 
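Here's a sketch of how a Chef-style resource block can be built in plain Ruby with instance_eval. `Resource` and the `package` helper are simplified stand-ins, not Chef's real classes; they only show the mechanism by which bare calls inside the block configure the resource.

```ruby
# A hypothetical, minimal Chef-flavored resource DSL.
class Resource
  attr_reader :name, :attributes

  def initialize(name, &block)
    @name = name
    @attributes = {}
    # instance_eval runs the block with `self` set to this resource,
    # so bare calls like `action :install` land in method_missing.
    instance_eval(&block) if block
  end

  # Any bare one-argument call inside the block becomes an attribute.
  def method_missing(attr, value)
    @attributes[attr] = value
  end
end

def package(name, &block)
  Resource.new(name, &block)
end

r = package "ruby" do
  action :install
end
puts r.attributes[:action]
```

Real Chef declares its attributes explicitly rather than via method_missing, which is part of why its internals read as plain Ruby; the catch-all here is only to keep the sketch short.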
But it's really nice to be able to know, with just a glance at this file, that I'm in a class scope that's nested in the Chef resource namespace, so I can define methods and constants, and I know what those constants are going to be called when I need to reference them later. So already, early in, I want to make a call about a couple of things. The external DSL, whether it has a grammar and a parser or is implemented in pure Ruby: I think the point is to do what's right for you, and it's not inherently better or worse one way or the other. The internal DSL, however, I would like to think you should shy away from, and stick with the Ruby Ruby implementation. It's not worth my time to reverse-engineer someone else's thought process from their internal DSL to figure out how to contribute code to their package. I still think Puppet's better, because I like the external interface to it, but there are clearly pros and cons to both of them. And this is my plug: I'd love to talk more about this with anyone who's interested, or anyone who's an expert and has different opinions, because I'm always open to changing mine. Now I'd like to get back to it. Both of these packages, despite pros and cons, are about codifying systems administration. And this is something that in the old times people used Perl to do, when shell wasn't good enough. And of course everyone's heard that Ruby is a better Perl, and things like having first-class regular expressions, backtick operators, and a relaxed policy on parentheses make that a pretty good case. It's a modern take on the idea of a language that you can do most anything in. But then there are features that really make it a better Perl, like blocks, which are clearly so much better than backslashes and subroutine references, and real, honest objects with syntax, not the way Perl objects always felt to me, which is a lot like trying to implement object-oriented code in C. 
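The glance-and-know property looks like this in plain, namespaced Ruby. The `Cookbook::Resource` names are hypothetical, chosen only to echo Chef's layout, not taken from Chef itself.

```ruby
module Cookbook
  module Resource
    # One glance tells you the scope: we're inside a class nested in
    # Cookbook::Resource, so `def` defines instance methods, and the
    # resulting constant is unambiguously Cookbook::Resource::Package.
    class Package
      attr_reader :package_name, :action

      def initialize(package_name, action: :install)
        @package_name = package_name
        @action = action
      end
    end
  end
end

pkg = Cookbook::Resource::Package.new("ruby")
puts pkg.action
```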
And of course RubyGems is a lot like CPAN, with all of its benefits and drawbacks. I'd like to dwell on the subject of release management, because there isn't a central feature-freeze date for the next iteration of RubyGems the way there is for the Debian archive. And that means that we as package maintainers need to take a little more initiative on that front. So to keep RubyGems useful and stable, and indeed even usable for people who are on call, the release managers of gems really need to think about regression testing and, hand in hand with that, versioning, so that it's very clear when breaking changes are being introduced to APIs, and it's also very clear when no breaking changes are supposed to be introduced. That makes it very much easier to test compatibility with existing code bases and gate the upgrading of gems on whether our own applications' tests pass. And a slow and steady release cycle aids in that as well. Rather than explain that, I'll just tell a story. Once I wrote a Flickr uploader and found a bug in it; on the day of the release we found this bug, fixed it, and released again, only to find out that evening that we'd released an even worse, crash-level bug. So a slow and steady release cycle, along with healthy regression testing, is going to make your gems much easier to maintain and make use of by people who are on call. And that's good. Now, most people build gems somehow through Rake, and Rake, being a build system, is a very common tool used by systems administrators, and it exhibits a couple of patterns that I'd like to talk about. It's an external Ruby DSL coupled to an internal Ruby Ruby implementation, much like Chef, which is to say I like the pattern that its interface and internals follow. 
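The versioning discipline described here is mechanically checkable with RubyGems' own classes. The pessimistic constraint `~> 1.4.0` means "at least 1.4.0, but below 1.5.0": patch releases are allowed through, while the minor bump where breaking changes may appear is blocked.

```ruby
require "rubygems"

# A dependency pinned with the pessimistic operator.
req = Gem::Requirement.new("~> 1.4.0")

puts req.satisfied_by?(Gem::Version.new("1.4.9"))  # patch bump: allowed
puts req.satisfied_by?(Gem::Version.new("1.5.0"))  # minor bump: blocked
```

This only pays off, of course, if gem authors actually reserve minor and major bumps for breaking changes, which is the initiative the talk is asking maintainers to take.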
And it came at a time when Ruby was finally the language of choice that was able to implement a GNU Make-style interface that looks something like this, where you have named tasks that may have dependencies, and Rake's job is to resolve those dependencies and work through them in order. And this dependency programming is a really powerful construct for systems administration, because it reduces to convention the ordering and the failure handling when you're dealing with a complex system. You give it a goal, say file.o, and it works out that the prerequisite is file.c. And this dependency tree can be walked recursively, so that if anything fails, we immediately stop and can deal with it. And this works really, really well for interactive, fail-fast, build-process types of jobs. And because this is something that's usually used interactively, we don't need the same kind of really aggressive logging that you might see out of Puppet. But then we take the ideas that, well, didn't start in Rake but were introduced to Ruby in Rake, and carry them into Capistrano, where we have the same sort of dependency model, but it's being executed on multiple hosts at the same time, and a more interesting and subtle failure case comes up. And this is where we kind of resort back to the Puppet style of logging, where we make it very clear what could have gone wrong and let the other hosts continue on in parallel. So this dumb little output just shows how easy it is to pick out standard error output from standard output, and standard input to the commands being executed. So the expectation, at least in my experience, is that when you're running Capistrano jobs across a whole cluster, it's very easy to tell an anomalous host that's failing for no particular reason from a catastrophic failure of every single host, where everybody's printing the same problem. 
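The resolve-prerequisites-then-run behavior can be sketched with a toy resolver. Real Rake's `task`/`file` DSL and `Rake::Task` class are much richer; this only shows the depth-first walk and the fail-fast property (an exception in any action propagates and stops the run).

```ruby
# A toy version of Rake-style dependency programming.
TASKS = {}
Task = Struct.new(:name, :deps, :action)

def task(name, deps: [], &action)
  TASKS[name] = Task.new(name, deps, action)
end

def invoke(name, done = {})
  return if done[name]
  t = TASKS.fetch(name)                 # unknown task: fail fast
  t.deps.each { |d| invoke(d, done) }   # prerequisites first, recursively
  t.action.call if t.action
  done[name] = true
end

ORDER = []
task(:"file.c")                     { ORDER << :"file.c" }
task(:"file.o", deps: [:"file.c"])  { ORDER << :"file.o" }
task(:binary,   deps: [:"file.o"])  { ORDER << :binary }

invoke(:binary)
p ORDER  # => [:"file.c", :"file.o", :binary]
```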
And this is an aid in debugging, and again is great for use in interactive systems administration tasks. Then the Marionette Collective, or MCollective, is a package that picks up where Capistrano left off and has taken interactive sysadmin work to a new level by publishing the work to be done over a STOMP server. So all of your agents, each running a Ruby daemon, are listening for jobs to do and publishing their results back into the queue. So this is kind of a more scalable implementation of the same idea, and it allows the same sort of anomaly detection as Capistrano does. The pattern here that I'd like to point out is that the author of this tool, R.I. Pienaar, described it initially as a framework, a middleware for systems administration, not a tool in and of itself. And the important thing to note is that this informs its design: it's not one monolithic tool, and it's not heavily integrated with any particular development workflow or production deployment strategy. It's a bunch of small tools, like mc-ping and mc-find-hosts, that do one thing well. mc-ping, probably obviously, sends an echo message over that bus to all of your servers and reports back on what they said. mc-find-hosts can plug into Puppet and select a subset of servers from your infrastructure, all based on who replies to the messages on the bus. And these little programs are very useful to compose into larger programs, where you grab a list of hosts from mc-find-hosts, ping them to make sure they're up, and then do something totally different and unrelated to MCollective. And this is a wonderful pattern, and something that is inevitable when all of your programs start out as scripts: they start out as small, perhaps even interactive commands, where you string five commands together in a pipe, and it slowly grows into something more generic and useful to you. 
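The compose-small-tools idea reads like this in code. `find_hosts` and `ping` are hypothetical stand-ins for mc-find-hosts and mc-ping, with an in-memory inventory instead of MCollective's real message bus; the point is only that each does one thing and the pipeline strings them together.

```ruby
# Hypothetical small tools, composed into a larger pipeline.
def find_hosts(inventory, role:)
  # One job: select hosts by role.
  inventory.select { |h| h[:role] == role }.map { |h| h[:name] }
end

def ping(hosts, reachable)
  # One job: keep only the hosts that answered.
  hosts.select { |h| reachable.include?(h) }
end

INVENTORY = [
  { name: "web1", role: :web },
  { name: "web2", role: :web },
  { name: "db1",  role: :db  },
]

hosts = find_hosts(INVENTORY, role: :web)
alive = ping(hosts, %w[web1 db1])
p alive  # => ["web1"]
```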
And MCollective started the same way, as a simple solution for a problem that SSH in a for loop was not scalable enough to solve for R.I. So what do I mean by scripts? They're those small command lines, things that you don't even think about, that are so trivial you don't need to test them. And when they turn into real programs, there are a few things we can do to make them easier to maintain and more robust. The first is to think heavily about idempotence. Any command that you can run again without anything bad happening is a good command. Likewise, anything that you can run, notice a failure, and run again without argument changes or code changes is a good thing. For a web analogy, this is thinking about things in terms of PUTs instead of POSTs. And in the configuration management world, this is like thinking of managing entire files at once rather than patches and diffs against existing files. And that's a particular example of a good API design principle: the principle of least surprise. An additional consideration that I think is important when building tools for systems administration is proper namespacing. You saw that MCollective named all of its tools mc-whatever, so that it's very clear by the suffix what's happening and by the prefix how it's happening. And the fact, again, that it uses a message bus to form the substrate that all of its other actions work on top of is a perfect case of a generic solution that's just generic enough to solve the problem and allows further work to come on top of it without being predetermined. So we've been talking for the past couple of slides about shell programs and shell interfaces and command lines. Does this matter to us in Ruby? I would argue it does. And at this point, this is a slight diversion, but I think we'll get back to it. In Ruby, we have, say, two high-level structures of programs: data-structures programs and Unix programs. And the spoiler is that Ruby can do both of these well. 
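The whole-files-not-diffs flavor of idempotence, using the stdlib: the second run changes nothing and fails nothing. The paths and contents here are throwaway examples in a temp directory.

```ruby
require "fileutils"
require "tmpdir"

dir = File.join(Dir.mktmpdir, "app", "conf")

# Running this loop body twice is harmless: mkdir_p succeeds even if
# the directory already exists (unlike Dir.mkdir, which raises), and
# writing the whole file replaces it wholesale instead of patching
# whatever happens to be there.
2.times do
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, "sysctl.conf"), "vm.swappiness = 0\n")
end

puts File.read(File.join(dir, "sysctl.conf"))
```

The on-call payoff: a sysadmin who hits a failure halfway through can simply run the tool again, with no fear and no argument changes.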
Unix programs I like to think of as being heavy on file system manipulation and on making system calls, either via Ruby, via C or direct kernel code, or via the section-one manual programs, like chmod, that basically perform one specific system call from shell code. These kinds of programs work really well in shell and are very easy to implement in C. The data-structures programs are much more difficult to implement in shell, because generally you need more expressive power than the scalar variables you get out of standard POSIX shell. You probably need more than the single level of hierarchy you get out of Bash's one-dimensional arrays and associative arrays. And you probably want to take advantage of modern code constructs like classes and methods. You might want to do network communication. You probably want some notion of concurrency and parallelism. And so you might need these concepts that don't exist in a shell program. Ruby can, of course, do both of these things, and if you're going to write a Unix program in Ruby, you probably should read "Unicorn is Unix," and you should probably read the documentation for File, Dir, and FileUtils, which provide a lot of the file system manipulation tools you might want. And then there are Process and Etc, which provide interfaces into user IDs and permissions, as well as /etc/passwd and /etc/group entries, so that you can fully take advantage of all of the intricacies of Unix users, safety, and whatnot. Now, Ruby editorializes a lot of libc and changes, for example, setuid into the uid= setter on the Process class. So there's a little bit of digging you need to do to map from a manual page to Ruby, but once you do, it starts to be very, very useful. 
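A taste of those stdlib interfaces, with real calls and a trivial example: Etc maps passwd-database entries to structs, and Process exposes the current process's identity.

```ruby
require "etc"

# Look up the current user's passwd entry by uid.
me = Etc.getpwuid(Process.uid)

puts me.name     # login name
puts me.dir      # home directory
puts Process.pid # this process's pid, straight from the kernel
```

This is the "editorializing" mentioned above: the manual page says getpwuid(3) and getpid(2), and Ruby wraps them in `Etc.getpwuid` and `Process.pid` with structs instead of C pointers.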
In data-structures land, I think it's fairly obvious to any Ruby programmer that hashes and arrays can turn into anything you like, and being able to build classes from these allows you to describe your data types in excruciating detail and build them up in whatever level of hierarchy you like. And all of this code that we've built in Ruby, be it the Unix work or the data-structures work, can be composable, just like those mc-find-hosts programs can be in shell land. I think this is a crucial pattern: to be able to start out solving one problem and then be able to reuse the core of that to solve another problem later. And there's one thing I'd like to specifically harp on with regard to this: within Ruby code, I really think you should declare all inter-file dependencies, so that it's very trivial to pull out one file, one particular piece of functionality, from any old file. Here is an actual example, and I apologize for it, from a real, live, running web app. All I wanted to do was use Active Record, and, specifically because this was a Rails app that used Devise, I needed devise/orm/active_record, almost at the bottom there. All of this other stuff is because Devise uses a single entry point and expects everything else to kind of flow from there. This is not a dig at Devise; there are many, many other examples, but this was the most accessible and was the perfect size for the slide. All I wanted to do was this. And I would encourage everybody to go look at every gem they maintain and make sure that I could do this. Even if it's an implementation detail, it's okay if I screw myself out of being able to upgrade, but the code should stand on its own. That is to say, explicit is better than implicit. I apologize for bringing Python into this, but I think this is a principle that would be worth stealing. 
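One way to see the payoff of explicit requires, sketched with a throwaway file in a temp directory rather than a real gem layout: because the library file declares its own dependencies at the top, it can be loaded in isolation, with no entry-point loader setting up the world first.

```ruby
require "tmpdir"

libdir = Dir.mktmpdir
File.write(File.join(libdir, "greeting.rb"), <<~RUBY)
  # greeting.rb declares its own dependency instead of assuming a
  # loader already set the world up for it.
  require "time"

  def greeting
    "hello at \#{Time.now.iso8601}"
  end
RUBY

$LOAD_PATH.unshift(libdir)
require "greeting"   # just this one file; no omniscient entry point

puts greeting
```

If `greeting.rb` had instead leaned on some top-level loader to require "time" for it, pulling the file out on its own would fail, and you wouldn't see the coupling until runtime.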
Because when you have a file that depends on some other omniscient loader to make sure the world is in the right state for it to come into being, then you're, in nerdy CS terms, said to be coupled to it, and the coupling is particularly bad in cases where you can't see it unless you execute the code and see the failure. So the more require statements we can put in place to be very explicit about what it takes to set up the environment for a particular file to run, the better, I say. This is a form of magic, and magic is a little bit like implicit, just a bit more sparkly. The summary of all of this, of magic and of making coupling explicit, is that operable, maintainable code is easy to look at and understand, and this is kind of a form of simplicity. It's really, ultimately, okay that your code looks like code and has require statements; that's what we're all here to write, so it's all good. So I've gone faster than I thought I would, but I have a few more interesting notes about the patterns and anti-patterns here, and then clearly there will be time for questions. The external DSLs are, I think, a great pattern, going all the way back to Puppet and Chef, where Chef uses an external DSL that's written in Ruby. It's great because of the brevity it gives you as a user of the package, and it can be implemented, as we've seen a couple of times today, in a very effective way, so that it's clear what the Ruby syntax does and it's clear what the syntax does within the semantics of the program that's using it. However, the internal DSL I can't speak so favorably of. It tends to obscure what's going on in the Ruby code, and it makes it very difficult to jump in quickly and maintain without a lot of time spent reading the entire code base. 
Dependency programming, like we see in Rake and Capistrano, is, I think, a great pattern for very convention-oriented failure handling that works especially well in straight-line execution of a linear chain or a tree of dependencies, and it works very naturally for an interactive task like a build tool, or something that, if you squint just right, could look like a build tool. Idempotence is a great way to think about API design in an environment where sysadmins might be up late at night trying to fix something. If they don't have to fear running your code over and over again, with external changes, while trying to make things work, it's going to be a better time for them. And it helps you, I think, to work within that framework, because it forces you to write smaller, more clearly defined tools. And of course, I'd love to say that magic is a bit of an anti-pattern: the more intimate knowledge you need to maintain a system, the less likely I, and anyone else with a kind of operationally focused mind, am going to be to use it. The philosophy that explicit is better than implicit is all about reducing magic and coupling; it's all about promoting the reuse of small pieces where appropriate. And as Rubyists who have the wealth of RubyGems at our disposal, I think there's even more we could use from within those gems if they were easy to tease little behaviors out of. And last up, the sysadmin always looks for the generic solution. Maybe late at night you look for the solution that can get you back to sleep, but in general, when looking for stability and long-term capability, generic solutions without lots of special cases are how you're going to get there. So, like I said, I went a little bit fast, so we've got plenty of time. Any questions? 
A comment and a question. You were talking about making your library more usable. I'd say if you depend on a giant loader to set up all your little pieces, maybe some of those pieces shouldn't be part of your main gem; they should be split out into a separate library so that they're usable where appropriate. That is an excellent point, and I will repeat it because it's great: if you have a giant loader and a large gem with lots of complicated functionality, odds are some of it is at least sort of unrelated, so you can take it apart and package it as separate gems, with all the functionality available separately. Right? He makes a good point. Anybody else? Everybody just wants to hear Tenderlove some more.