Thanks for coming, and thanks for getting up after yesterday's party. Welcome to my talk, keeping track of state for infrastructure: an overview of the infrastructure at InnoGames and the tools we built to run it.

Depending on who you ask, I'm space or I'm Patrick. I'm a sysadmin at InnoGames. I used to be the primary system administrator for the game The West, and now I mainly do infrastructure software development for InnoGames. InnoGames, if you don't know it, is a mobile and browser-based game company from Hamburg in Germany with more than 200 million registered players. At InnoGames we host our games using thousands of VMs, with only a dozen or so of them not hosted on our own infrastructure.

In this talk I want to talk about how we do that without AWS or OpenStack or any of the tools you probably use for that. The primary tool I want to talk about is Serveradmin, which is a configuration management database we have written ourselves to manage all of it. The reason why we wrote Serveradmin is mainly historical: back when we wrote it, there just was no OpenStack and there was no NetBox, so we came up with something ourselves.
I checked our configuration management database on Wednesday, and on that day we had 1.7 million API requests just reading, and 37,000 API requests writing to this configuration management database. And that's just one day. We also have 158 types of attributes and 42 types of objects. Those are different concepts, like a virtual machine or a hypervisor or a load balancer. And we had 21,000 objects, which can be either real machines or some kind of configuration information. With this talk and with these numbers I want to challenge your expectation of what a CMDB should be.

These are some of the systems that all hook into Serveradmin for us. We have integrations from our own custom virtual machine orchestration software and from Puppet. Our networking, backups, DNS: everything is integrated. In this talk I particularly want to talk about some of those integrations: our virtual machine orchestration, our configuration management and how we integrate Puppet into Serveradmin, how we load balance traffic which comes from the internet to all our servers, and how we do DNS, internally at least.

Let's start with the virtual machine orchestration example. Imagine building your own virtual machine orchestration software, something like OpenStack maybe, using just NetBox or RackTables as your database backend, where you just store all the state you have. I think you will have a hard time, and my reasoning behind this is that you will mostly have problems with the very rigid schemas that software provides. NetBox comes with an idea of what a data center is, what a rack is, what a hypervisor is and so on. But when that schema doesn't fit your reality, you have a hard time changing it. For example, in my last job I subscribed to some GitHub issues which just recently got resolved, which I'm very happy about because I really like NetBox. For example, this one is about tracking
different types of cabling in NetBox, which took, how long, 2.5 years to resolve. They came up with a really cool solution where you have all kinds of cables which can end in different types of things, really complicated stuff, and it can cover mostly anything. But if you need a solution now, you don't want to wait 2.5 years. And one example which is closer to what I want to talk about: they didn't have virtual machines when I used it. The basic solution was always, yeah, just take something like a blade center and make them blades or whatever. And that's not really good.

Serveradmin is more like a normal database, like an SQL database, which doesn't come with a schema itself, at least not with a lot of schema. It requires you to create one yourself. The first type of thing you have to care about are attributes. You can imagine attributes in Serveradmin like columns in an SQL table, or like class attributes in object-oriented programming. So you first go ahead and create some attributes. For example, I want to create an attribute CPU model, which makes sense for a hypervisor but not for a virtual machine, and I want to create an attribute called game market, which makes sense for a virtual machine but not for a hypervisor, because we don't couple those.

When you create an attribute, you first have to give it a name, which has to be unique, of course. You have to give it a regular expression to check the validity of the values this attribute can have. And you have to give it a type. This is also a really cool thing in Serveradmin: all attributes are type checked. There are certain basic data types like strings, numbers, booleans and dates, which are fairly straightforward and do exactly what you imagine. But there are also some specific data types which are particularly helpful for a configuration management database. We have inet, which can hold IPv4 and IPv6 host addresses and networks. We have MAC address,
which can hold MAC addresses. We have relations and reverse relations, which I will show you later. And we have some even more special data types which are relational, like supernet, so you can say something like: okay, this host has an IP, I want to get the next bigger network that includes this IP. Those are the main things you have to care about for an attribute. Other than that, there is also a read-only flag you can set, and there is a flag you can set if an attribute can have multiple values.

The next thing you will care about are server types, or object types. For example, here I've created the server type hypervisor and the server type VM. You can think of them like tables in SQL or classes in object-oriented programming. Again, when you create a server type you first have to give it a name, and you have to attach a list of the previously created attributes to it. So one server type has a certain set of attributes, and for each attribute you add you can set some more flags, like: is this attribute required for this server type? What is the default value of this attribute for this server type? There are some more special ones.
I will not go into those deeply. Also, you can tell the server type what its IP address type is, so you can say either this is a host, or this is a network, or something else where IP addresses just do not apply. Of course, one attribute can also be attached to multiple server types.

I mentioned relations before. You can have an attribute which is of type relation and has a certain destination server type. For example, the attribute hypervisor will always refer to an object of server type hypervisor. And you can also reverse relations you have previously created. For example, you can have an attribute called VMs on a hypervisor which points to this relation from the VM. So you only go to your VMs and define the hypervisor they are running on, and the hypervisor gets the reverse information for free, so you can ask which VMs run on this hypervisor. This of course means that reverse relations are always one-to-many relations, basically.

Using the schema we have just come up with, we could have some objects in our database. So we create a hypervisor object. This hypervisor object has a server type, which is itself an attribute, it has a hostname, the CPU model that we previously defined, and it has VMs, which is actually not writable on this side because it is a reverse relation. It is only writable on the right side, on the VM object. The VM object then has a server type, a hostname, the game market we have previously seen, and it can set the hypervisor.

If you have ever worked with EAV, or entity-attribute-value, databases before, that should look familiar, because this is what Serveradmin is at its core. The full name would be entity-attribute-value with classes and relations. It's good to note, though, that our naming is kind of off.
We call attributes attributes, values values and relations relations. But the entities we call either servers or objects, and the classes we call either server types or object types.

Given this schema and data, you now need to be able to actually create, query and delete stuff in Serveradmin, and for this we have three easy ways plus an API, basically. First, we provide a Python library, which is the most straightforward thing you can use. Then we have a command-line utility built on this Python library. We have a web interface for manual mass editing of different kinds of objects. And you can use the API directly, of course.

So this is an easy example of creating an object in Serveradmin. Whenever we interact with Serveradmin, we basically first import this Query class. In this example we want to create a new object, so we call the new object function and tell it the server type. This will go to the server, check the schema information there, and get back a dictionary of what such a server would look like: which attributes does the server have and what default values do they have. We save this in an object in Python here. Then we can update the information and override things simply by passing another dictionary. But only when we call commit on it is it actually saved to the database.

So we have create; now we do reading. Reading is straightforward. You also take this Query class, which usually takes three arguments in its constructor. The first one is a filter. The filter is a dictionary, with the key being the name of the attribute you want to filter on and the value being the value you want to match. If you just give it a direct value, this will be an exact match. If you give it an extra filter, like I've done here with Any, you get certain extra matching behavior.
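The schema and create flow described so far (attributes with a type and a regular expression, server types that attach attributes with flags and defaults, and an object that is only stored on commit) can be sketched in plain Python. This is a toy in-memory model under stated assumptions, not the Serveradmin code or its Python library: the names `Attribute`, `ServerType`, `ToyServeradmin`, `attach` and `add_servertype` are hypothetical, and `new_object` and `commit` only borrow the wording used in the talk.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    value_type: type          # plain Python type standing in for the attribute type
    regexp: str = r".*"       # validity check applied to every value
    multi: bool = False       # True if the attribute can hold several values


@dataclass
class ServerType:
    name: str
    # attribute name -> (Attribute, required flag, default value)
    attributes: dict = field(default_factory=dict)

    def attach(self, attribute, required=False, default=None):
        self.attributes[attribute.name] = (attribute, required, default)


class ToyServeradmin:
    """Hypothetical in-memory stand-in for the database."""

    def __init__(self):
        self.servertypes = {}
        self.objects = {}
        self._next_id = 1

    def add_servertype(self, servertype):
        self.servertypes[servertype.name] = servertype

    def new_object(self, servertype_name):
        # Build a dictionary with every attribute of the server type,
        # pre-filled with its default value. Nothing is stored yet.
        servertype = self.servertypes[servertype_name]
        obj = {"servertype": servertype_name}
        for name, (attribute, _required, default) in servertype.attributes.items():
            obj[name] = set() if attribute.multi else default
        return obj

    def commit(self, obj):
        servertype = self.servertypes[obj["servertype"]]
        # Attributes that are not attached to this server type are rejected.
        unknown = set(obj) - set(servertype.attributes) - {"servertype"}
        if unknown:
            raise KeyError(f"not valid for {servertype.name}: {sorted(unknown)}")
        for name, (attribute, required, _default) in servertype.attributes.items():
            value = obj.get(name)
            if value is None:
                if required:
                    raise ValueError(f"{name} is required")
                continue
            for item in (value if attribute.multi else [value]):
                if not isinstance(item, attribute.value_type):
                    raise TypeError(f"{name} must be {attribute.value_type.__name__}")
                if not re.fullmatch(attribute.regexp, str(item)):
                    raise ValueError(f"{name}: {item!r} fails {attribute.regexp}")
        # Only now is the object stored and given its primary key.
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = dict(obj, object_id=object_id)
        return object_id


hostname = Attribute("hostname", str, r"[a-z0-9.-]+")
cpu_model = Attribute("cpu_model", str)
game_market = Attribute("game_market", str, r"[a-z]{2}")

hypervisor = ServerType("hypervisor")
hypervisor.attach(hostname, required=True)
hypervisor.attach(cpu_model)

vm = ServerType("vm")
vm.attach(hostname, required=True)
vm.attach(game_market, default="en")

db = ToyServeradmin()
db.add_servertype(hypervisor)
db.add_servertype(vm)

new_vm = db.new_object("vm")                      # defaults filled in
new_vm.update({"hostname": "web1.example.com"})   # local change only
vm_id = db.commit(new_vm)                         # validated and stored
```

The point of the sketch is the split between schema and data: a VM object never gets a `cpu_model` key because that attribute is only attached to the hypervisor type, and validation happens at commit time rather than when the dictionary is edited.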
I will show some other filters down the road, but just know that there are different filters available. All these filters in the dictionary will be ANDed, so the servers that match this query have to match all of them, so both of these here. Further, this will only consider server types which even have these attributes. For example, a hypervisor will not have a game market, so it will never match this.

The second thing we pass to it is the list of attributes we want to get back from the servers that matched. Here we asked for hostname and hypervisor. The result could be something like what you see at the bottom: we have the hostname we asked for, we have the hypervisor we asked for, and it is only one object because only one matched this query. It's also good to know that we get the object ID back, which you will always get back when using this Python library. The object ID is basically the primary key of the entity table in the database behind it. So even when you change the hostname, you will always keep the same object ID.

Making this a little bit more difficult: we can do joins in our queries too. Check out how here we requested hostname and hypervisor, and here we still request hostname and hypervisor, but hypervisor is now in its own dictionary with a list of attributes behind it. So we basically join all matched servers on their hypervisor attribute, and from that hypervisor again get hostname and project. In the result you can see we still get the same server back, but the hypervisor attribute no longer maps to AWHP whatever; it maps to a new dataset object with its own attributes in there.

Now let's look at updating. When you want to update, you do the same thing: you take a query. Before, I was always directly passing it to list so we get the output of what is actually on the server. Here I'm saving the query in the q variable, and I can iterate over this q variable and change all the objects that
were matching this query. All these changes will only exist locally in my Python process until I call commit on the query. Here it is good to note that before we were committing the server, and now we are committing the query. Both are possible. You can either say, I want to change something on one server and commit this right away, or you can say, I want to make a very complicated change on many objects and then commit the whole thing. It will be committed in one SQL transaction, so we make sure that it is consistent throughout.

Committing will of course also make sure that what you are committing makes sense, and it will make sure that you are allowed to change it. We currently don't have ACLs on reading, so if you are a user in the system you can read everything, but we do have ACLs on writing. So we will make sure that you are allowed to change this kind of server and that you are allowed to change this attribute on this kind of server.

There's one more safeguard here. There is some time between us retrieving the information, changing it, and then committing it. So the adminapi library will always send back the old and the new value it expects. When the old value sent with the commit no longer matches what Serveradmin has in the database, it will tell you: your data is stale, please recommit this, it's not safe.

The final example is deleting. Again, we make a query. Here I show you the get function you can call on a query. The get function basically makes sure there is only one server matching your query and gets back this server. Otherwise, if there are zero or more than one, it will raise an exception. Then we delete it. Again, deleting is not committed right away.
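The stale-data safeguard described above, where the client sends both the old value it read and the new value it wants, and the whole commit is applied or rejected as one unit, can be sketched as follows. This is a minimal illustration under stated assumptions: the `database` dictionary, the shape of `changes`, the `commit` function and the `StaleDataError` name are all hypothetical and stand in for what really happens inside one SQL transaction on the server.

```python
class StaleDataError(Exception):
    """Raised when the stored value changed after the client read it."""


def commit(database, changes):
    """Apply changes of the form {object_id: {attribute: (old, new)}}.

    Either every change is applied or none is, mimicking one transaction.
    """
    # First pass: compare the old value the client saw with what is stored.
    for object_id, attributes in changes.items():
        current = database[object_id]
        for name, (old, _new) in attributes.items():
            if current.get(name) != old:
                raise StaleDataError(
                    f"object {object_id}: {name} is {current.get(name)!r}, "
                    f"client expected {old!r}"
                )
    # Second pass: nothing was stale, so apply all new values.
    for object_id, attributes in changes.items():
        for name, (_old, new) in attributes.items():
            database[object_id][name] = new


database = {
    1: {"hostname": "web1", "state": "online"},
    2: {"hostname": "web2", "state": "online"},
}

# A client read state "online" for object 1 and wants to change it.
commit(database, {1: {"state": ("online", "maintenance")}})
```

Checking everything before writing anything is what makes a commit that touches many objects safe to retry: a client that gets the stale-data error can read again and resend without having left half of its change behind.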
We need to commit it afterwards. That was the first and, I would say, most used way of interacting with Serveradmin on our side.

The second one is the CLI client. You can do most of the same things. You say adminapi and give it some filter; here I filter for one exact hostname. Then you can give it -a with an attribute name for every attribute you want to get back; here I'm asking for the hypervisor. Having this on the command line is kind of neat, because you can for example directly pass this output to SSH, and then you have an SSH shell on the hypervisor where this VM is running.

Another neat tool I want to show you really quickly is polysh, which we haven't written ourselves, but which we maintain now. Polysh is kind of cool. Here I made a query where I said hypervisor equals some specific hypervisor. This is the textual representation of the query dictionary we have seen before: instead of having a dictionary, we now just say attribute name equals value, and you can also use filters in here, which you will see in a moment. We get back three servers. We can take this output and stick it directly into polysh, and we get one prompt where we can get information from all three servers at once.

Here I'm showing you what it looks like when you filter on multiple attributes. Of course, you now have to make it a string, because the attributes you're filtering on are space separated. You can see that I'm using the regexp filter again, and this is what it looks like if you want to use it. Also, in this example I'm getting back two attributes, hostname and hypervisor, and this is what the output would look like.

One more example, with all the complex things you can throw at it. You have a query, you have attributes you want to get back, hostname and state, and you have -o, which means order by, and you again give it an attribute you want to order on. In this example,
I'm using object ID, because object ID is the primary key and an auto-increment field, so this way it will basically order by creation date automatically. And the last thing I'm doing is -u for update, and I'm setting one attribute to a new state. You can even see in the output that it's now set. This is basically what the CLI can do.

The last thing I want to show you is Servershell, which is the web interface that comes with Serveradmin. You basically get two inputs here. The first one is search. In search you can put the same textual representation of a query we have previously seen in the command-line utility. Once you hit enter there, you immediately get back all the servers matching your query, and you get shown some attributes right away, for example hostname, intern IP and server type, some very basic things. One thing to note here is that this is called intern IP. This will be important later in the talk, so ignore the name: imagine it says primary IP or something. It is not internal at all.

The second thing you have is the command field. In this command field you can literally type different commands.
That's why it's called a shell. One of the commands, for example, is attr. You can say attr and an attribute name, and this will toggle the visibility of that attribute. The attribute I'm typing in, server type, is already visible, so when I hit return now, this attribute will go away.

The second thing I want to show: we are currently matching two servers here. In reality, it is totally normal that you are matching hundreds of servers you are working on, so you want to be able to select some of them. And you have commands in Servershell to edit them. There is a command to delete the value of an attribute, to set the value of an attribute, to add another value to a multi-attribute field, so an attribute which can have multiple values, and to delete things from a multi-attribute field. But if you type them, Servershell will just tell you: please select some servers first. Because as opposed to the command-line utility, this will not just change all servers matching; you actually have to select the servers you want to change. By default none is selected. When I type 1 here now and hit return, you see that the checkbox gets checked. You can either type the number, or you can say select all, that's also possible, or unselect all.

So now we have one server selected which we actually want to work on. Now I can use the delattr command with some attribute name, which will mark the subproject to be deleted. Again, it's not committed yet.
It is only marked for deletion. Another one of these commands is delete. Note that I toggled the selected server, so now I have selected the second server and not the first one anymore. When I hit enter now, the second server is marked for deletion completely. But only once I type commit and hit return is the server really gone and the attribute subproject really cleared, again in a single transaction.

Another cool command is history. We keep the history of all attributes of all servers which were changed. We have just changed the subproject attribute of this first server, so when we check the history now, we can see that this attribute was actually set to no value, at what time and by whom. That's really helpful to understand what happened here. Also, as I said, there are these multi-attribute fields, and just like we used delattr before, we can use multiadd to add a second value to this field.

The last command I want to show you is graph. Graph is part of our Graphite integration into Serveradmin. There are different ways to get Graphite data into Serveradmin, but the one I want to show here is that you can basically define a board of graphs you want to see for a server. This is just one of the dozens and dozens of things you will see, because I made it so big. But for example, when you select ten servers and say graph, you will get the load from ten servers immediately in the Servershell interface, which is also really neat.

So finally, let's move on to IGVM. IGVM is our VM orchestration tool. The way it works is that it goes ahead and queries for the server you give it, for example here a VM's hostname. It will also query the hypervisor if one is set. If you didn't set a hypervisor for your VM, it will just query all hypervisors available and then select a hypervisor which makes sense for this VM when you're building it, based on: is it loaded?
Is it even available in the network you want? And so on. It will go ahead and lock both the hypervisor and the VM you're working on. Again, it does the locking simply via an attribute called IGVM locked in Serveradmin, which is available on VMs and hypervisors, so it will just set it to true. Further, while it doesn't create objects in Serveradmin, so you have to create them first, it does update them. So when I use disk set and increase the disk size, it will increase the disk size on the hypervisor, but it will also increase the disk size noted in Serveradmin. Of course, this is a command-line utility I'm showing here, but again, this really is just a Python library which we have wrapped a command-line utility around, so you can use all of these things in Python, just like I showed with the Python adminapi example.

So let's kick off a demo in the meantime really quick, because it will take some minutes. This is not very bright. Anyway, I'm just running some scripts here. I will show you... The feedback is bad and my network is gone. Well, no demo, I guess. Yep, no demo. Sorry. Maybe I'll show it later. I wanted to show how I build something with this thing, actually, but it takes like two or three minutes, so I wanted to start it now. Then let's move along.
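The IGVM behavior described before the demo attempt (pick a hypervisor that fits, lock both objects through an attribute, change the real machine and the recorded state together) can be sketched roughly like this. It is a simplified illustration under stated assumptions, not IGVM's code: objects are plain dictionaries, `igvm_locked` follows the attribute named in the talk, while `network`, `networks`, `load`, `disk_size_gib` and the three function names are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def locked(*objects):
    """Lock all given objects via their lock attribute, unlock on exit."""
    for obj in objects:
        if obj["igvm_locked"]:
            raise RuntimeError(f"{obj['hostname']} is already locked")
    for obj in objects:
        obj["igvm_locked"] = True
    try:
        yield
    finally:
        for obj in objects:
            obj["igvm_locked"] = False


def pick_hypervisor(hypervisors, vm):
    """Choose the least loaded, unlocked hypervisor that serves the VM's network."""
    candidates = [
        hv for hv in hypervisors
        if vm["network"] in hv["networks"] and not hv["igvm_locked"]
    ]
    if not candidates:
        raise LookupError(f"no hypervisor available for {vm['hostname']}")
    return min(candidates, key=lambda hv: hv["load"])


def set_disk_size(vm, hypervisor, new_size_gib, resize_on_hypervisor):
    """Resize on the hypervisor first, then record the new size on the VM object."""
    with locked(vm, hypervisor):
        resize_on_hypervisor(hypervisor, vm, new_size_gib)
        vm["disk_size_gib"] = new_size_gib
```

A caller would pass in whatever actually talks to the hypervisor as `resize_on_hypervisor`. Keeping that step and the attribute update inside the same locked section is what keeps the recorded disk size from drifting away from the real one.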
The next thing I wanted to show is how we integrate Puppet into Serveradmin. I won't go too much into how we use Puppet in general; I just want to show the points where we retrieve information from and put information into Serveradmin in our Puppet setup.

The most straightforward way we do it is that we basically re-implemented a tiny version of adminapi in Ruby to use in here. For example, here you can see that we do a query one to get one attribute of one matching server. We can give it the same kind of attribute-based dictionary to filter on, and we get back the database address here. This is a very easy way to just say: for this game, for this game market, for this world, for this function, I want to get an IP, and then I can immediately put it into the configuration on my web server, for example.

In the same way, there's a normal query, a bit more complicated than query one, which behaves even more like the Python version: you give it multiple attributes and you get back a list of dictionaries, or hashes, or whatever they are called in Ruby. You can use it just like any other Puppet data object and call reduce on it. Here I basically flatten the list of available projects and project networks, and just get all the project networks in this project, with the IPv4 and IPv6 parts, in a flattened list. So this is very much the same as we had in Python.

One thing we have in Puppet specifically is that we built our own Hiera backend. Who here knows Hiera? Please raise your hands. Okay, I will explain it really quickly.
In Puppet you create these classes, and classes can have parameters. When you include classes, you can set these parameters if you want to. A lot of how you abstract Puppet to work on multiple machines is that you create these classes with these parameters, and then you create Hiera files, which are usually JSON or YAML files, basically by exact hostname or by some other level; maybe you say production and staging and then exact hostname. They're included in this order, and you can set these variables from there, which can then make the classes behave differently.

In addition to this opportunity to write YAML and JSON files, we have also invented our own Hiera backend for Serveradmin. So when Puppet runs on our servers, it first goes ahead and makes this really gigantic request to Serveradmin for a couple hundred attributes of the server it is running on. It's asking for its network configuration, its load balancing configuration, and this is just a very small subset of the things we actually query; it's much more. When the answer is returned, it is, like normal Hiera, attached to this class specifically. Then you can access all this information from Serveradmin during the run via calling $ig::server:: and then any of the names here, for example intern_ip, and then I can just use it.

The last example I want to show you is NRPE command. NRPE command is a custom resource type we have invented for ourselves, which we use to define Nagios checks. We have a lot of shared Puppet code between projects. For example, we have this gigantic set of code for setting up different things in Postgres and all over the place.
We are using this custom resource type to define checks, and since it is in the shared code, I can just include it. Like the name suggests, this basically is where the source for our Nagios checks comes from. It does two things. First, it changes the NRPE command config on the local machine we are on. But secondly, which is more interesting, it uses this change multi attribute function to add a value to the object. Here we see we have the IG server object ID, which comes from the Hiera backend I explained before. Then we have an attribute name, monitoring checks, which is the attribute we are changing, and then we are adding a check. For example, when I call this with the name CPU steal time and give it some command path, CPU steal time will automatically be added to the monitoring checks attribute of this server. This is really useful, because then I can just go ahead and run Puppet on our Nagios server again and the checks will immediately be there. And since we have these things all over the place in our shared code, I basically just write three lines of Puppet code including it, and then I get a Postgres running plus 20 checks for the Postgres already there. Here I just used adminapi to show that it will show up on this attribute.

The next thing I want to talk about is how we load balance incoming traffic to our servers. We have requests coming in from the internet, and they will always hit a load balancer. At InnoGames the traffic is routed to our hardware load balancers via BGP, so they announce certain networks to upstream routers. Which networks are there, and which
servers are behind those IP addresses, is again coming from Serveradmin. The incoming traffic is then load balanced to the available nodes via a pf firewall. We have this tool called testtool, which is running on every hardware load balancer. What it does is: it gets the configuration from Serveradmin and it does health checks on all the nodes. So, say I have a game web server IP with five nodes behind it; it will check whether every node is healthy, and if a node is healthy, it will insert rules into our pf firewall on FreeBSD to forward packets to this host.

The next thing to know is that all app servers serving a public IP have that IP attached to the loopback interface. This is really useful, because it means that the load balancers don't have to touch the IP packet at all. Basically, the load balancers get the IP packet in and just have to resend an Ethernet frame to one of the application servers, which feels responsible for this IP, and that makes it very cheap to do.

I would call this direct server return light, because that's how we do the way to the server. But usually you would do something like this, where you return directly from the application servers to the upstream router. So with load balancers you can make this neat setup where you have asymmetric routing, basically: you go in one way, but you go back from the application servers to the internet directly. This allows you to have a bigger return bandwidth; basically, you can send more replies because they don't have to go through the load balancers again. If you ran a CDN, you would probably run it like this. But we don't; we use outside CDNs. And we really like the opportunity to firewall outgoing internet traffic from the application servers, which is why we route our traffic back through the hardware load balancers. But this is just a configuration change, basically; you can change
this by just changing the default route on the application servers. So testtool is basically the direct server return tool, and we think this is useful for the kind of application servers we are running. Testtool and our FreeBSD patches to make this work are also completely available on our GitHub.

The configuration we might have for what I just showed you usually looks like this. We have this load balancer server type, which is a completely virtual concept. A load balancer has an intern IP, again read primary IP, and protocol ports, which say which ports you actually want to forward to the application nodes. The VM then has a relation, like it did with hypervisor, but now it has one with load balancer, and it can say: I want to run behind this public IP here. The final thing is that the load balancer has a relation to a health check, and health check is a server type which just defines how to check whether the server is healthy, which is different for a web server than for a database server.

Looking at some examples, this is the easiest one, for web servers. We have an object of server type load balancer. It has a public IP, it wants to forward port 443 and port 80 TCP to the nodes, and it has the health check defined. Looking at the health check on the right: via the attribute HC type on the health check server type, we can say this is a health check for HTTPS. HC port says the service we want to check is running on port 443, and the query we want to make is a HEAD request to this specific URL. The backend can then decide what it wants to do; it can check Redis, it can check the database connection, or whatever. And if it doesn't return the correct return code, which for example here would be 2 4 2, the node will be considered not healthy and will not receive traffic anymore.

The second example I want to show you is database load balancers, which is basically the same deal: you give it an IP here.
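The HTTPS health check from the web server example comes down to a simple rule: probe every node, and only nodes that answer with the expected status stay in the pool. A minimal sketch of that rule, not testtool's implementation, could look like the following. The function names are hypothetical, the probe is injected as a callable so that no real network access is needed, and the default expected status of 200 is an assumption made only for this sketch.

```python
def healthy_nodes(nodes, probe, expected_status=200):
    """Return the set of nodes whose probe returns the expected status.

    `probe` is any callable taking a node and returning a status code.
    A probe that raises OSError (connection refused, timeout, ...) counts
    as unhealthy.
    """
    healthy = set()
    for node in nodes:
        try:
            status = probe(node)
        except OSError:
            continue
        if status == expected_status:
            healthy.add(node)
    return healthy


def plan_pool_changes(current_pool, nodes, probe, expected_status=200):
    """Compare the current pool with fresh health results.

    Returns (to_add, to_remove): nodes that should start and stop
    receiving traffic.
    """
    healthy = healthy_nodes(nodes, probe, expected_status)
    return healthy - current_pool, current_pool - healthy
```

In a real setup the probe would issue the configured request against each node, and the two returned sets would be turned into forwarding rule changes on the load balancer. Injecting the probe keeps the decision logic identical whether the check behind it is an HTTP request or something else.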
This is an internal IP. You give it a port; for example, here I'm using 5432, the default port for Postgres. I'm defining a health check again, and this health check is interesting because HC type is now not HTTPS anymore, it is Postgres. This means it will really open a Postgres connection to port 5432, it will authenticate, and it will run some select there to see if this node is healthy. This particular select here is querying a stored function we have defined ourselves via Puppet on the databases. In the backend it basically checks some things: it checks the replication state. Do I have replication lag? Do I have nodes following me? Am I the replication master in our replication setup? You can put three databases behind it, and only the master will ever be considered healthy, and only this one out of three, for example, would then be in the load balancer. But the web servers don't have to care who is master. They will just connect to this IP and they will be fine; they will always reach the master.

The final example I want to show you is our PowerDNS setup. PowerDNS, when you set it up with its Postgres backend, usually comes with a schema kind of looking like this. It does many more things, but the primary table here is the records table, defining the information interesting for a DNS record. We thought: that looks interesting, we can kind of build something like this on top of the existing Postgres data, since Serveradmin uses a Postgres database in the backend. For example, here you can see that we have this table called server, this table has the attribute hostname and the attribute intern IP, and we check if intern IP is not null, and if all of this is true, we just
We then just use a SELECT to create a view, and PowerDNS can transparently use it. PowerDNS thinks it has its own database, but really it's just a view on top of Serveradmin. So when I commit an object to Serveradmin, it is immediately available in PowerDNS.

And since we already have this records view, we can put even more cool things in there. For example, we have this attribute called SSHFP, which is populated by IGVM when a VM is first built. When we build a VM, we note down the SSH fingerprint for this server and put it into a Serveradmin attribute via Puppet, or maybe even via the API directly, I'm not sure. Since this information is available, we can expose it here as an SSHFP record type in DNS. And since we also have DNSSEC, this allows us to check it transparently. When I connect to some new server I've never connected to before, in SSH you usually get this prompt: this is the fingerprint, do you accept, yes or no? We have just disabled this on our servers. It will no longer ask; it will get the fingerprint and check it against DNS. If it is fine, it will connect and not ask you anything. If it's not fine, it will not even ask you; it will just deny and say this is insecure, I cannot connect.

So I think that's all I have. Thank you.

I think you killed it. This one, this one. Yes. Please wait with moving out until we're finished with the questions. We have about five minutes or so for questions. Does anybody have any questions?

I have a question. I really like your stuff, and the question I had was about the workflow. So you have this Serveradmin as a tool, and when would you be using these things? Are you going to change the server in the CMDB and then do the changes on the server? Or does, like, Puppet automatically update the CMDB?
So certain things, like the monitoring checks I mentioned, would go from Puppet to Serveradmin. But usually the workflow is that Serveradmin describes the state you want to have. So as a first step I create a load balancer or create a VM in Serveradmin, and then I build the VM, and then it is just the state that is defined there. So it's how I want the state to be, usually.

Thank you.

Let me get some exercise.

Hello, I'm Thomas. I maintain OpenStack in Debian, and maybe I'm too much... sorry. Okay. So I have maintained OpenStack in Debian since it existed, and when I look at your presentation, it looks to me like you are reinventing it. All of what you have there seems to be already in OpenStack. So what made you start from scratch? And did you look into OpenStack, into Octavia for load balancing, into Designate for DNS as a service, and so on?

This stuff is older than OpenStack. We have reasons. Most of our tools have only been released in the last two years or so, they're all on GitHub now, but this all started either when the OpenStack tools were very young and not really mature at the point we needed them, or when they just didn't exist. And I don't think we have a big interest in moving away from our tools, because we have big in-house knowledge about how they work. Getting into OpenStack, really getting your feet wet with it, and not only getting a VM running but knowing what to do when crap hits the fan, that's really difficult, and with our tools we really know what to do and what we are doing. It's a very modular system we have built here.

Fair enough. Any other questions? I see zero hands. Okay. Thank you very much.