about scaling Puppet. Hi, I'm Stephen. I'm a systems administrator with Anchor, along with apparently 18 other people from Anchor this year. And I'm going to talk about using inotify to help. For anyone who doesn't use Puppet or isn't familiar with it, it's a centralized config management system. By centralized, I mean that there are a number of nodes which run a puppet agent and connect to a puppet master, which tells them what their configuration should be, and then they go and apply that. To give some background as to why we made the decisions we did: we use a single large production environment in Puppet with close to 1,000 nodes. In addition to those nodes, we also have some global virtual resources for things like monitoring on nodes where we don't run Puppet, which are then collected by monitoring servers. And we tend to use Puppet such that we make very small changes, often specific to one node, and we want to see each change take effect straight away. That's a very slow workflow with Puppet, and I'll explain why in a moment. Ultimately, our goal in all this is that when we make a change in Puppet, we want it to apply immediately when we roll out. Until a few months ago, we achieved that by restarting the puppet master as part of our rollout process, so that it just reparses everything. The reason this is slow is that the puppet master takes a long time to parse manifests into code internally. The time taken is negligible with a few manifests; it's somewhere on the order of a second for every 20 manifests. But when you start getting close to 1,000 nodes, that starts to become significant. When you combine our nodes with the global virtual resources I mentioned, we have 1,300 manifests that get parsed on every startup, and that takes just over a minute. And to reiterate, that's a minute on every Puppet rollout, because we're restarting the puppet master on every rollout.
The reason that's a problem is that the puppet master can't do anything else while it's parsing those manifests. When you do a rollout and then an agent run, that agent run just hangs for over a minute while the puppet master is doing nothing but reparsing code. Puppet itself has very, very coarse internal caching. Puppet has the concept of environments, and with upstream Puppet you can only expire code on a per-environment basis. Since we use a single large production environment, that is just as bad as doing a full restart, because when you expire the environment you have to go and reparse all that code anyway. There is the option of breaking the nodes up into separate environments with, say, 10 or 20 nodes per environment. But we used to do something similar with multiple puppet masters back in the days of Puppet 2, and we found that it didn't work well with our workflow, primarily because we have a very large set of standard manifests and it works best for us to keep them all in the one tree. So, given that Puppet's caching doesn't suit our use case, we looked into what solutions we could implement ourselves. We could have Puppet just poll the file system to see when files have changed. Another idea that was suggested, because we manage all our Puppet manifests in Git, was that when we do a rollout we could have Git tell Puppet, "these files have changed, please reload them". And there was the option of having the puppet master itself listen for file changes using inotify. If you're not familiar with inotify, it's a subsystem in the Linux kernel where you can say "please tell me when this file is modified or deleted or renamed", and the kernel will then put notifications into a queue, which you can process once you're ready.
The thing with all three of these options is that they require Puppet to be able to expire code on a per-file basis, which, as I said, is something it can only do on a per-environment basis. So naturally the first step was to make it able to expire code per file. This diagram is a subset of what things look like internally in Puppet. You have a bunch of agents that connect to the puppet master in an environment, and an environment contains a type collection. The name "type", internally to Puppet, can refer to a node, a class or a defined type externally to Puppet, and each type contains a code block, which contains individual bits of code that might be function calls or strings or even other code blocks. If you have an if statement, for example, the if statement will typically contain another code block with more bits of code in it. That's the structure we were dealing with, and we needed to be able to expire code at two levels. First, if a type was defined in only one file, we needed to be able to remove it entirely. Second, a type can be defined in multiple files, which typically only happens with the main class. If you're not familiar with Puppet internals, the main class is basically everything that's in top scope: if you don't put something inside a class, it's in a class whose name is the empty string, which is called the main class. So we also needed to be able to expire individual bits of code. We implemented a general-purpose file expiration mechanism that does just that. There's a bit of code in the type to remove bits of code, and there's more code in the type collection to remove entire types. It actually worked out that we could do this without changing too many of Puppet's internal APIs, because each bit of code already has file and line information associated with it. We could just use that information to associate each bit of code and each type with a file, and expire those as necessary.
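A toy sketch of the type-collection side of that idea, using the file information attached to each type. The class and method names here are invented for illustration, not Puppet's real internals.

```ruby
# Hypothetical stand-in for a Puppet type: each type remembers which
# file defined it, as Puppet's real AST objects already do.
PuppetType = Struct.new(:name, :file)

class TypeCollection
  def initialize
    @types = {}
  end

  def add(type)
    @types[type.name] = type
  end

  def lookup(name)
    @types[name]
  end

  # Expire every type that was defined in the changed file; the
  # autoloader will re-parse the file the next time something
  # references one of those names.
  def expire_file(path)
    @types.delete_if { |_name, type| type.file == path }
  end
end

tc = TypeCollection.new
tc.add(PuppetType.new('webserver', 'manifests/webserver.pp'))
tc.add(PuppetType.new('database', 'manifests/database.pp'))
tc.expire_file('manifests/webserver.pp')
```

Only types from the changed file are dropped; everything else stays cached, which is the whole point of per-file rather than per-environment expiry.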
Because of the generic nature of the expiration API, this actually has nothing to do with inotify yet. It could be hooked into any external source of file changes, or even a file polling API. So the first option we had to consider for what to hook into that API was file system polling. That's by far the most portable, but it's quite slow: when you have to poll thousands of files to figure out which ones have changed, that's not particularly efficient. It could be implemented as a fallback mechanism, but we wanted to take advantage of the fact that our environment is all Linux, so we didn't implement this, although it probably wouldn't be too hard. Asking Git for changes sounds very clean and efficient, in the sense that Git is the definitive history of what's changed. It ties us to a Git-based deployment model, which is not that big a deal because we're not likely to stop using Git any time soon. But it requires us to queue changes ourselves. To expand on that: currently our rollout process doesn't have any locking in it. If two people roll out at the same time, it has the same effect, because the rollout just pulls from the current head of the Git repo regardless of who was rolling out at the time. If we wanted to queue changes ourselves, we'd run into questions like: if two people roll out simultaneously, how do you know which changes have already been loaded, when the second person's rollout is based on a commit from before the first person's? It gets complicated, and because of those issues it's very easy to introduce bugs. The last thing we wanted was to accidentally introduce bugs where Puppet thinks changes have been loaded when they haven't, and to end up with manifests that aren't actually being applied, with people staring at the puppet manifests wondering why their code isn't taking effect. The third option, inotify, ties us to Linux puppet masters, which isn't a big deal either.
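The polling fallback mentioned above (which the patch doesn't actually implement) could look roughly like this: remember each file's mtime and report the ones that have moved. Names are illustrative.

```ruby
require 'tmpdir'

# A simple mtime-based poller: portable anywhere Ruby runs, but it has
# to stat every watched file on every check, which is the efficiency
# problem discussed above.
class ManifestPoller
  def initialize(files)
    @mtimes = files.to_h { |f| [f, File.mtime(f)] }
  end

  # Return the watched files whose modification times have changed
  # since the last call, updating the recorded mtimes as we go.
  def changed_files
    @mtimes.keys.select do |f|
      current = File.mtime(f)
      changed = current != @mtimes[f]
      @mtimes[f] = current
      changed
    end
  end
end

dir = Dir.mktmpdir
manifest = File.join(dir, 'site.pp')
File.write(manifest, "node default {}\n")
poller = ManifestPoller.new([manifest])
File.utime(Time.now, Time.now + 60, manifest) # simulate a later edit
changed = poller.changed_files
```

With 1,300 manifests, every poll is 1,300 `stat` calls whether anything changed or not, which is why inotify's push model was preferred.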
It doesn't tie us to Git, either. The main reason this option was appealing is that the inotify code in the kernel has been around for many years, has had a lot of testing done on it, and everyone pretty much knows it works, so we didn't have to worry about doing any queuing ourselves. We evaluated this as having the least risk of introducing bugs, so we went with the inotify option. There are two places inside Puppet where manifests get loaded. The initial importer historically imported a single file, but you can now point it at a directory, and it does an initial import of files that should always be loaded. So we have those files reparsed when they change, because they're not going to get loaded any other way. The autoloader is the easy bit. If you're not familiar with the autoloader: when you use a type or class name that Puppet doesn't know about, it will go and try to find that name in a module based on file naming rules. That's easy, because the autoloader will just load the file again later, so we don't need to worry about reparsing it ourselves; we just expire the code. And this all happens at the start of a catalog compilation. The puppet master sits there idling, waiting for an agent to connect, while files are changing underneath it. An agent connects; the master processes the inotify queue, loads in all the changes, and then compiles and serves a catalog to the agent. Now, the difficulties we encountered. The import function in Puppet, which is used to import other manifests, has been deprecated for a while, and it makes things more complicated because it's a third way files can get imported. To track that properly you either need to reparse everything that changes, or track the complete graph of all imports, so that at any one time you know whether a manifest has been imported or loaded. That's just annoying, so we didn't bother with it.
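The flow just described can be sketched like this: events queue up while the master idles, and the queue is drained at the start of each compilation. This is an invented `LazyMaster` class for illustration, not Puppet's API.

```ruby
# Sketch of the compile-time hook: change events accumulate while the
# master is idle, and are applied lazily when an agent connects.
class LazyMaster
  def initialize
    @pending = Queue.new  # in the real patch, fed by inotify
    @loaded  = {}         # manifest path => parsed code (stand-in)
  end

  def load_manifest(path)
    @loaded[path] = "parsed #{path}"
  end

  def file_changed(path)  # called when a change event arrives
    @pending << path
  end

  def loaded?(path)
    @loaded.key?(path)
  end

  # Expire changed files, then compile; expired manifests would be
  # re-parsed by the autoloader when next referenced.
  def compile(node)
    @loaded.delete(@pending.pop) until @pending.empty?
    "catalog for #{node}"
  end
end

master = LazyMaster.new
master.load_manifest('manifests/web.pp')
master.file_changed('manifests/web.pp') # rollout happens while idle
catalog = master.compile('web01')
```

Doing the expiry at compile time means a burst of rollouts costs nothing until the next agent actually connects.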
So we wrote a patch that implemented that behaviour and rolled it out, initially to a staging environment, and we found a pretty consistent 70-second speedup. As I said, it takes Puppet just over a minute to parse those 1,300 manifests at startup, so by rolling this out and not restarting the puppet master, we immediately saved 70 seconds on every agent run after a rollout. When we rolled it out to production, it had the additional effect of reducing the overall load on the puppet master, so it could generate catalogs more quickly. I don't have any graphs of this, because by the time I decided I wanted to do a talk on it, it was already more than 30 days after we rolled it out, and the Puppet Dashboard we have only keeps data for 30 days. But if anyone else wants to use this code and has any interesting graphs to send me, please do. Anecdotally, the best speedups we saw were five minutes, on nodes with complex catalogs: they were taking about eight minutes to run Puppet, and that got down to two or three, which is quite a drastic speedup. Of course, it doesn't help in cases where the slowness is on the agent side, for instance when using types that are particularly slow to apply, because this is purely a master-side speedup. Now, the limitations. Our initial implementation doesn't support the future parser. There's no technical reason for that, aside from the fact that it wasn't a use case we needed to consider, and it wouldn't be hard to implement. Reopening a class in a different file is not supported. We had a bit of a discussion about this, and we weren't sure if it was ever intended to be supported or if it worked by accident. We only support it for the main class, because by definition everything in top scope can be in any file; but in upstream Puppet you can define a class in one manifest and again in another manifest, and your resultant class will be the merger of the two.
Supporting that is complicated, because we would need to track whether or not a class has been loaded from multiple files: if it hasn't, expire the whole class; otherwise, expire code from within the class. It was just easier not to support it, because we don't use it, and I'm doubtful as to whether anyone does. As I said, the use of import is not supported, and that's something that wouldn't be worthwhile supporting, because it's being removed in Puppet 4 anyway. Native Ruby code doesn't get reloaded either, and that's another thing that wouldn't really be possible to support, because of the way native Ruby code gets loaded: Puppet just says to the Ruby interpreter, "here you go, do whatever you want with the Ruby code in this file", which means we'd have to hack the Ruby interpreter to be able to expire Ruby things, and because Ruby's not declarative like Puppet is, that's a lot more difficult. The way we handle it is just: if any native Ruby code has changed on rollout, restart the puppet master anyway to pick up the new changes. You can get the source on GitHub. There are also the OpenDocument sources for these slides, which you can use under the WTFPL if you want, and there's a link there to the Ruby inotify bindings we used. The source we have there depends on an as-yet-unreleased version of the rb-inotify bindings. If you try to use it with the current version of rb-inotify, it will seem to work until you have an inotify queue overflow, at which point the rb-inotify library will raise an exception; and by "an exception" I mean the Ruby class called Exception, which is not easy to catch, because then you end up catching every possible exception that could ever get raised. We've submitted a patch for that, which has been merged into master but hasn't yet made it into a release. So if you're going to use this code, get the master branch of the rb-inotify gem and build your own gem from that, so you have the fix. Okay, that's it for me.
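For anyone unfamiliar with why raising `Exception` directly is so awkward to handle: a bare `rescue` in Ruby only catches `StandardError` and its subclasses, so `Exception` sails straight past ordinary error handling. A small self-contained demonstration:

```ruby
# A bare `rescue` is shorthand for `rescue StandardError`, so it
# catches normal errors but not the Exception class itself.
def attempt
  yield
  :ok
rescue => e # same as `rescue StandardError => e`
  :caught
end

plain = attempt { raise StandardError, 'inotify queue overflow' }

escaped = false
begin
  attempt { raise Exception, 'inotify queue overflow' }
rescue Exception
  escaped = true # Exception went straight past the bare rescue
end
```

Catching it properly requires `rescue Exception`, which also swallows things like `SignalException` and `NoMemoryError`, and that's exactly why libraries are expected to raise `StandardError` subclasses instead.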
Does anyone have any questions? We have about five minutes for questions before the break. Any questions? I'm familiar with inotify, and I think it's a most wonderful facility; thanks to all the kernel hackers who made it work so well. But I'm not familiar enough with the way the puppet master loads the information from these files. Basically, what you're doing is that you can tell the puppet master to reload only those things that have changed, without restarting the whole thing. Is that what's going on? Otherwise the puppet master has to read the whole set of manifests, and you just want it to load only the things that have changed. Is that sort of like with BIND or DNS, saying "just reload this zone and leave everything else as it was"? Almost, but not precisely. What we're doing is expiring the entire contents of a changed manifest and then reloading that whole manifest, rather than reloading all 1,300 manifests. So we're not doing partial expiry of code within a manifest. Does that answer what you're... I'm still not familiar enough with Puppet manifests to be 100% clear on that. But what you mean is that there are lots of manifests the puppet master would otherwise need to reload, and what you're doing is figuring out which manifests to reload? Okay, I think I'm with it now. Thank you. Anyone else? Okay. I don't have a question; I just wanted to see you run up the stairs. No. I have one question and possibly a follow-on, depending on your answer. Is this a patch against the core Puppet code, or does it end up being a module that you effectively load? It's currently a patch against Puppet 3.7.3. We haven't tested it against master yet, and it might require a bit of work, because they've just completely removed the current parser from that code. So my follow-on question is: have you considered, or are you considering, pushing that change back to Puppet Labs, and if so, have they been amenable to it?
Probably not the code we currently have, because of the caveats I mentioned. Once we've revised it so that it works properly with the Puppet 4 parser and whatnot, we'll be looking to see if we can at least get the per-file expiration API accepted, and maybe a generic file system polling API, because I know the Puppet Labs guys want to support more than Linux. Thank you. I look forward to hopefully seeing it in the future. Okay, it's afternoon tea time now. Please join me in thanking Stephen.