Welcome everybody. My name is Malone Boer. I work as a system engineer for the social networking site Hyves. Before I start, I was wondering how many people actually know the website Hyves. Okay, quite a lot. That's a good thing. First I want to explain what Hyves is from a technical point of view. Hyves is a social network much like MySpace or Facebook. Currently we use three data centers around Amsterdam, all connected to the AMS-IX internet exchange. The department consists of 12 full-time employees at the moment. That's just the system engineers; we have other departments as well that do the front-end programming. Currently we're doing about 25 terabytes of external traffic daily. That's just the external traffic; internally we do a lot more. We have about 200 million page views daily. That's quite a lot. And at peak moments we do about 80 million page views a year. To be able to serve that many page views we have about 2,500 servers at the moment, and managing those with only 12 people is a problem if you don't have something like Puppet installed. So what is Puppet? Puppet is a system configuration framework. You describe what you would like to have in Puppet manifests. For example, you can say I want to have Apache started, or I want to manage my /etc/passwd file or my hosts file. It's written entirely in Ruby, so for some people that could be a problem: you have to install Ruby to be able to use Puppet. It was created by Luke Kanies. He was fed up with Cfengine and the other tools because they lacked an abstraction layer: you have to write everything for the specific environment you're in, so you can't easily use the same code on BSD as well as on Linux. With Puppet that's much easier, because it detects which Unix system you are on and you can use almost the same code. Much of the deeper info is available at Reductive Labs.
I don't have the time to go into depth, so I just want to show you how Puppet works. It's a client-server model. You have a central server where all the Puppet code you write lives, and all the clients connect to it. The server compiles your manifests and makes sure the result is unique for your host. Then the client receives it, executes it and makes sure it gets applied properly. Some people are concerned about security, or perhaps even want to run Puppet over the public internet. Puppet uses SSL certificates, so you can make sure you're connecting to the right server. This is an example of the site.pp file. This is where in Puppet you declare all the hosts you want to manage. In this example I've called the host some node. Normally this will reflect your DNS host name. You're allowed to use any kind of variables inside your node declaration. In this example I've used the operating system Solaris, and I want to manage SSH on this specific host, so I'm importing the Puppet manifest for SSH. In the actual SSH manifest you can see we're managing the sshd_config file. And here you can see the benefit of being more abstract. You can check which operating system you are on, so for Solaris you can have a different path than for the default environment, which could be a Linux or BSD machine. In this example I want the file to always be owned by root, with the group root, and it should have the mode 644. When a user changes this on a system and Puppet runs again (you can set the interval at which Puppet runs), it will correct the change. So if somebody changed the owner and group to nobody, nobody, the next time Puppet runs it will change them back to root. So it's also a tool to detect errors users might make. The last part is the service. In this example, the service sshd subscribes to the file from above.
So if anything changes in the sshd_config file, Puppet will restart your sshd daemon with the new settings, and you don't have to do it manually. Now some of the Puppet features. There are more, but for our environment these are the most important ones. Puppet is able to run on every Unix architecture that can compile Puppet, sorry, Ruby. So in theory you could even run Puppet on your jailbroken iPhone. What is really handy in our environment is that we can use templates. This reduces the number of files we have to maintain. For example, if you have different locations and you want different firewalls or different resolver entries, you can just put in variables like you saw in site.pp, and the template can find out which location you are in. With Ruby you can even put in for loops or other logic to generate the content of that file automatically, without having to keep 20 files in a directory and include the right one. Another thing Puppet has is the Facter library. It shows you, for example, the number of cores a machine has, how much free memory, how many interfaces it has and which IP addresses are on those interfaces. There are many more facts, but mainly we use those to spawn the right number of Apache instances or to generate our firewall for specific IP addresses. Puppet also supports types and functions. If I go back one slide: the types in this example are the file type and the service type. There are a lot more. You can use the cron type, for example, or the exec type. The cron type manages cron entries on every Unix host that supports it. As for functions, we use a couple of custom functions we wrote ourselves. We can ask at any given time which versions of software are running, or which IP addresses we can bind on specific interfaces, and we wrote custom functions for that. The only problem with functions is that they only get executed on the compile host, so that's the main server.
If you want to use it on clients you have to do something with types, because types are executed on the client side. Another feature of Puppet we use is the database backend support. All the facts that are available can be written into a database backend. We use MySQL for that, and that way we can run all kinds of statistics. We can see which servers are on which kernel version, how many cores they have or how much memory is in them. So it's very quick to get an overview if you have a large server park. Another thing Puppet supports is external node definitions. If I go back to site.pp: if you have a really large number of hosts and you already have another system for the asset management of your servers, you can write your own script to generate some of this content without having to type in the rest of the server park. Now I want to get to how we use Puppet at Hyves. We started out with the SSH for loop, as most system engineers do. We found out that doesn't scale, so we looked for something better and came across Puppet. Mainly what we do when we install servers: we rip open the box, press F12 to do a PXE install, it runs our quick start the first time and sets the hostname, we put the right node definition in Puppet, and Puppet does the rest. That way, for our main web servers or front-end servers for example, we rip open the box, put it into a rack, push the F12 button and we're up and running within 7 minutes. I think that's a pretty good time to get a server operational. What do we actually manage with Puppet in our environment? Like I said before, things like the DNS resolvers, the NTP servers and the firewalls are all location aware. If we had to do that by hand it would be a hell of a job to keep up.
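As an aside on the external node definitions mentioned in the talk: a minimal sketch of such a script, assuming Puppet calls it with the node name as its only argument and expects a YAML document with `classes` and `parameters` on standard output. The `ASSETS` dictionary, the host names, the class names and the `location` parameter are all hypothetical stand-ins for an existing asset-management database; this is not the script Hyves used.

```python
import sys

# Hypothetical asset inventory. In practice this would be a query against
# an existing asset-management database.
ASSETS = {
    "web001.example.com": {"classes": ["ssh", "apache"], "location": "dc1"},
    "db001.example.com": {"classes": ["ssh", "mysql"], "location": "dc2"},
}


def classify(hostname):
    """Return a YAML document describing the node, or None if it is unknown."""
    asset = ASSETS.get(hostname)
    if asset is None:
        return None
    lines = ["---", "classes:"]
    lines += ["  - {}".format(name) for name in asset["classes"]]
    lines += ["parameters:", "  location: {}".format(asset["location"])]
    return "\n".join(lines) + "\n"


def run(hostname):
    """Print the node definition; return a non-zero status for unknown nodes."""
    document = classify(hostname)
    if document is None:
        return 1
    sys.stdout.write(document)
    return 0


if __name__ == "__main__" and len(sys.argv) == 2:
    # Assumption: the node name arrives as the only command-line argument.
    sys.exit(run(sys.argv[1]))
```

Run by hand as `python classifier.py web001.example.com` (the file name is arbitrary), it prints the classes and parameters for that host. On the master, a script like this is typically wired in through the `external_nodes` and `node_terminus` settings; check the documentation for the Puppet version in use before relying on that.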
We use Gentoo on our servers and we don't want to compile all the packages on the hosts themselves, so we have a centralized package system, and Puppet can fetch packages from there and install or update them if needed. Puppet also starts the required services. For example, the main web class makes sure Apache is started, SSH is started and the monitoring services are started. Another class we use a lot is the database backend, and of course that starts MySQL. The other thing we use it for is to push updates. That can be packages or config files. If we have to roll out security updates we can use Puppet for that, and Puppet will restart our services to pick up the updated config. A problem we came across when using Puppet is that it doesn't scale well beyond about 800 servers in the WEBrick configuration, which I think most people still use. The problem with WEBrick is that it's single-threaded, so even if you have more cores in your machine you can't use them. So what we did is use Mongrel, the Ruby HTTP server implementation, and split off the SSL part to an SSL-capable proxy. In our environment we use Nginx, but you can use Pound or Apache as well. This way you can load balance across several spawned Mongrel instances, and that way you can use multiple cores in one machine. If you get to the point where one machine doesn't scale anymore, you can build your own trusted SSL chain so Puppet clients can connect to multiple servers instead of one. Another thing we do is use passive clients. Normally the Puppet clients run at a defined interval and just connect to the server, but if you have a lot of servers you will create your own DoS attack in your own network. So all our clients are passive, and we push all the updates, serialized, through our puppetmaster to keep the load low. How do we do that? There is a program called puppetrun. We have a script around it that calls puppetrun in batches of 10 servers.
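The batching wrapper described here could look roughly like the following sketch. It is not the Hyves script: the function names are made up for illustration, and it assumes that `puppetrun` accepts a repeated `--host` flag and that waiting for one invocation to return before starting the next is enough to serialize the load.

```python
import subprocess


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_command(hosts):
    """Build one puppetrun invocation covering every host in the batch."""
    command = ["puppetrun"]
    for host in hosts:
        # Assumption: puppetrun takes one --host flag per target host.
        command += ["--host", host]
    return command


def push(hosts, batch_size=10, runner=subprocess.call):
    """Trigger the hosts batch by batch, one invocation at a time.

    Returns a list of (batch, exit_status) pairs so failed batches
    can be retried or reported.
    """
    results = []
    for batch in batches(hosts, batch_size):
        results.append((batch, runner(build_command(batch))))
    return results
```

Calling `push(all_hosts)` with a list of host names would then walk the whole server park ten machines at a time. The `runner` parameter exists only so the batching logic can be exercised without a Puppet installation.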
The puppetrun program on the master will connect to your client to say you can run now, and then the client will connect to your server. That way you can easily reduce the load and you don't have to think about the interval at which the Puppet clients will start. Okay, if there are any questions you can ask now. Okay, the question was whether we use the DB backend to feed Nagios or any other monitoring tool. No, we have another database of our own, which we actually started with for asset management, and we use that database to create our Nagios configs, but we're planning to merge them or do something else with it. Okay, sorry, time's up. If you want me, please come outside.