So, welcome everybody. My name is Thomas Kiergel. I'm working as a Linux consultant and developer for a company called B1 Systems. B1 Systems is located in Germany, was founded in 2004, and our focus is mainly on open source software. We do consulting, support, and development for open source software, and of course we also offer trainings, operations, and custom solutions for our customers, all based on open source software.

But let's come to the topic of this talk. In early 2012, I was a member of a little three-man team that was asked to build an OpenStack installation for a large German software company that wanted a public cloud for its instances. So we were asked to build it, and we worked really hard for a few months to get this project going. We started with the Essex release and later updated to Folsom in the process of building the cloud. Those are the circumstances this whole story took place in.

The setup the customer wanted from us was a little bit custom, in that they asked us to use libvirt/Xen as the backend, not libvirt/KVM as most people do. We had to patch some things in Nova, because the support for libvirt/Xen was not that good at the time, so we had a lot of work to do. The architecture of the integration of the public cloud was supposed to be a mapping from cloud users to real users in the company, and a mapping from departments in the company to tenants. We later found out that this wasn't a good idea, but first we finished our installation. The customer liked it very much, and I went on my long overdue vacation.

After about three weeks, I came back, and this is what I found: a nova list operation took over a minute to return, and in the time I was absent, they had deployed roughly 1,800 virtual machines in one tenant. So what happened? That was the question put to me, and I stood there and had to find out.

The first observations were obvious. nova list was extremely slow, and almost all operations on instances were affected, so it also took over a minute just to stop a machine or start it again, and so on. Horizon was unusably slow, so it just made no sense to use it: you clicked and waited for over two minutes. And the last observation was that the database service seemed to be under heavy load during a nova list operation, and the nova services too. So this was my starting point.

In this special case, the whole issue came down to this relation: if you deploy many instances in one tenant, then the relation between the number of deployed instances and the time nova list needs to return is not linear as expected; instead, it follows an exponential curve. So we hit quite a hard scaling limit back then. Remember, this is Folsom, and all instances are running in one single tenant. If you look at the same configuration but use multiple tenants, the relation between the time needed to list all instances and the instance count returns to a decent linear line.

And this is what the situation looks like today. This measurement was taken on an OpenStack Juno installation last week. As you can see, the time needed to list all instances increases up to 1,000 instances and then stays at that level for higher instance counts. So the situation I ran into back then is not a problem anymore today. But that's not the point of this talk.
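By the way, a measurement like this is easy to reproduce yourself. Here is a minimal sketch using python-novaclient; the credentials and the Keystone endpoint are placeholders, and the exact client signature depends on your release:

```python
# Minimal sketch: time the equivalent of a "nova list" call via
# python-novaclient. Credentials and auth URL are placeholders.
import time
from novaclient import client

nova = client.Client("2", "admin", "secret", "admin",
                     "http://keystone.example.com:5000/v2.0")

start = time.time()
servers = nova.servers.list()  # this is what "nova list" does
elapsed = time.time() - start

print("listed %d instances in %.2f seconds" % (len(servers), elapsed))
```

Run this against a growing number of instances and you get exactly the kind of curve shown on the slide.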
The point is to show you how we investigated the problem back then and what we learned from it. We developed some strategies we use today to avoid running into this kind of problem again, and we also use a little tool set that helps us achieve this.

OK. Our investigation strategy back then was something like this. We first observed the CPU load on the infrastructure during this situation. Then, of course, we turned the logging of every component up to the debug setting, so that we could see whether there were any traces or errors we were suffering from. But that led nowhere: no errors were found, and we were no smarter than before. Then we turned MySQL query logging on and watched the queries that hit the database at the time. After that, we started to analyze the nova code that processes the information coming back from the database. This led us to the monstrous joins that were created by SQLAlchemy. The database join itself was not the problem; the problem was that the result was parsed multiple times by nova, during a nova list operation, for example.

Let's talk about our solution back then. We could, of course, have asked our hardware guys to give us more powerful hardware for nova, but that's not a very realistic option, OK? So we started to rewrite the nova SQLAlchemy code that generates those big joins over the database and, more importantly, the code that parses the answers from the database. We got rid of all the redundant calls that were querying the same data from the database over and over again. The more obvious solution would have been to reorganize our structure of users and tenants, so that the instances would have been spread across more tenants than before.
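To make that pattern concrete, here is a schematic illustration of the kind of redundancy we removed. This is not the actual nova code; the models are invented for the example, and it only shows the general shape of the problem: one extra query per instance versus one joined query.

```python
# Schematic illustration (not nova code) of redundant per-row queries
# versus a single joined query. Models are invented for the example.
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, joinedload, relationship

Base = declarative_base()

class Instance(Base):
    __tablename__ = "instances"
    uuid = Column(String, primary_key=True)
    metadata_items = relationship("InstanceMetadata")

class InstanceMetadata(Base):
    __tablename__ = "instance_metadata"
    id = Column(Integer, primary_key=True)
    instance_uuid = Column(String, ForeignKey("instances.uuid"))
    key = Column(String)
    value = Column(String)

def list_slow(session):
    # the N+1 pattern: one query for the instances,
    # then one additional query for every single instance
    result = []
    for inst in session.query(Instance).all():
        meta = session.query(InstanceMetadata).filter_by(
            instance_uuid=inst.uuid).all()
        result.append((inst, meta))
    return result

def list_fast(session):
    # one joined query, and the result is processed a single time
    return session.query(Instance).options(
        joinedload(Instance.metadata_items)).all()
```

With thousands of instances in one tenant, the difference between these two shapes is exactly the difference between a minute-long and a sub-second listing.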
OK, this all led us to the prevention strategy we are using today. We determine the expected load prior to finishing our cloud setup. That includes numbers like: what is the maximum number of instances expected to run on this cloud? How many concurrent instance build-ups and destroys are expected within a given time frame? In other words: how much elasticity is needed? The next point is that we always design for horizontal scalability, using active-active HA setups. And the last point is that we build representative miniatures of our cloud setups, so that we have a handy development environment to take measurements and do our development work without affecting the real thing. That is done with Vagrant, which leads me to the next section: useful tools.

This is the toolkit we use to design and develop new clouds for customers. The first tool is Vagrant. Vagrant is a solution to deploy, for example, cloud setups automatically. You can deploy anything with it, but it is very useful for deploying this miniature cloud I was talking about. Vagrant is a way to generate reproducible and portable work environments. It has a gentle learning curve, so it is very easy to set up and use, and it is usable for scale testing as well as for development. To accomplish this, it uses so-called providers: a provider is simply the virtualization solution used to run the VMs that make up your miniature cloud. You also have the choice between many provisioners. The provisioners are used to configure the VMs running your miniature cloud: you need to install software on them and configure that software, and that is done via the provisioners. There is a broad range of provisioners available, like Ansible, Chef, or Puppet, basically everything you can find in the open source world.

This is the example Vagrant environment I use every day. The hardware consists of a host with eight cores and 32 gigabytes of RAM. More is always useful, but it should at least be capable of hosting all OpenStack controller hosts at full scale. What do I mean by controller hosts? I am simply excluding compute nodes with this term, so the controller nodes include the messaging queue, the database, the nova API and scheduler services, and so forth, as well as Cinder, Keystone, Glance, etc. Everything except the compute nodes should be put into this environment at full scale, as I said.

I prefer to use VirtualBox as the Vagrant provider because it is very easy to use and a lot of pre-built images are already available for this configuration. You can just go to HashiCorp's Atlas site and search for a template of a VM running CentOS, SLES, or openSUSE, whatever you need. You will find it there and can rely on the work already done to set up your solution. The provisioners I use most are shell, Ansible, and Puppet. Shell and Ansible are simply there to deploy and install specific software packages, and when it comes to configuration management, I prefer Puppet to inject my configurations into the miniature cloud.

So this is Vagrant freshly started. You simply go to the location of your Vagrant configuration files and do a vagrant up, and Vagrant does everything else: it creates the virtual machines on your provider, installs the needed software, and triggers your provisioner to configure that software. And this is a completed setup of a Vagrant environment called Packstack Vagrant. It was built by my colleague Christian Berendt, who is sitting over there, and I use it every day for testing and measurement purposes. In this case it consists of three compute nodes, one networking node, one storage node, and one controller node running all the nova and other central API services. If you want to know more about Vagrant, I want to use this opportunity to point you to the talk my colleague gave yesterday. You can find it behind this link, and there is a lot more info about Vagrant in it. To give you a small taste here, a minimal Vagrantfile sketch follows below.
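This is only a sketch of what a single node could look like with the VirtualBox provider and the shell and Puppet provisioners mentioned above. The box name, IP address, and provisioning paths are invented placeholders, not the actual Packstack Vagrant configuration:

```ruby
# Minimal Vagrantfile sketch for one controller node.
# Box name, IP, and provisioning paths are placeholders.
Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"                  # pre-built image from Atlas

  config.vm.provider "virtualbox" do |vb|
    vb.memory = 4096
    vb.cpus   = 2
  end

  config.vm.define "controller" do |node|
    node.vm.hostname = "controller"
    node.vm.network "private_network", ip: "192.168.50.10"

    # install packages with the shell provisioner ...
    node.vm.provision "shell", path: "scripts/install.sh"

    # ... and let Puppet inject the configuration
    node.vm.provision "puppet" do |puppet|
      puppet.manifests_path = "manifests"
      puppet.manifest_file  = "controller.pp"
    end
  end
end
```

A vagrant up in the directory containing this file brings the node up and provisioned; vagrant destroy throws it away again.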
But let's move on with our topic. For the measurements, that is, the benchmarks that have to be done, we use OpenStack Rally. Rally is a relatively new part of OpenStack. It is also easy to use and to set up, and many benchmark templates for the most common situations are already included. In most cases you are completely done with those, but if the scenario you want to measure is more complex, you can use Rally plugins, which enable easy creation of complex scenarios. And last but not least, a very useful point: it already gives you a nice HTML results file that you can show in meetings and the like. That also saves a lot of work.

This is the small test with the measurement that gave us the graph of today's relation between the time a nova list needs and the number of running instances. That's the results page, and you will notice that the graph I showed a few slides ago is this one. If you look closely, the instance count is 4,000. How did I run 4,000 instances on a test setup with only 32 gigabytes of RAM and 8 CPUs, by the way? That's where the fake drivers come into the game.

The fake drivers are a wonderful way to simulate large instance or volume counts, and they are completely transparent to the OpenStack infrastructure controllers. OpenStack itself won't notice the difference between an instance running on a true compute node and a fake instance simulated by the fake driver. That makes us independent of hardware requirements; as I said, you don't need to throw in as much hardware as a complete compute node needs in order to make useful measurements. The nova fake driver is easily configured: you just have to change one line in the nova.conf on each compute node involved. You can also run it side by side with real hypervisors, as the configuration is done on every compute node individually.
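As a sketch, that one line looks roughly like this; the short driver name should work on recent releases, while older ones may need the full class path:

```ini
# /etc/nova/nova.conf on each compute node that should only simulate instances
[DEFAULT]
# use the fake virt driver instead of a real hypervisor
# (older releases may need the full path nova.virt.fake.FakeDriver)
compute_driver = fake.FakeDriver
```

After restarting the nova-compute service, that node only simulates its instances.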
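And to tie this back to Rally: a task definition for a boot-and-list scenario, similar in spirit to the measurement above, could look roughly like this. It is a sketch, not the exact task we used, and the flavor and image names are placeholders:

```json
{
  "NovaServers.boot_and_list_server": [
    {
      "args": {
        "flavor": {"name": "m1.tiny"},
        "image": {"name": "cirros"},
        "detailed": true
      },
      "runner": {"type": "constant", "times": 10, "concurrency": 2},
      "context": {"users": {"tenants": 1, "users_per_tenant": 1}}
    }
  ]
}
```

rally task start runs such a file, and rally task report generates the HTML results page I mentioned.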
Let's come to a conclusion of all this. These points are meant for those of you who are facing the task of building your own OpenStack cloud, and they are intended to help you avoid running into a situation like ours back in 2012.

First, determine clear specifications. Ask how big the cloud should be at maximum: how many instances will be running on it? How many volumes will there be? How often will these volumes be created, deleted, or attached? The same goes for instances, and so on. Ask how many users there will be, how these users will be organized into tenants, and how the internal departments of the cloud builder will be mapped to tenants. That is really important, and you should spend a lot of thought on these points.

Also, use Rally to thoroughly test your setup within the specifications you determined above. At this point you should discover whether any scaling limits on the side of the OpenStack infrastructure are holding you back. You should perform at least one full-scale test when the build-up is finished, before going live, just to ensure that there is no bottleneck in the compute, networking, or storage backends that will become a problem later. And always use active-active HA setups, because they give you the option to scale out your OpenStack infrastructure horizontally: you just put another service there, load-balanced with the other services, and you can handle more load. That's the idea behind it.

Currently there are no fundamental scaling limits that I know of, like the ones we saw back in those days. Everything is open, and the OpenStack community did very good work on solving the issues we saw in the beginning. I hope this is helpful for everyone building their own cloud and gets you going. Are there any questions? Can you stand up and use the microphone, please?

[Audience] Sure. Do you have any numbers to share? The scaling numbers with Juno or Kilo?

The scaling numbers with Juno or Kilo: the measurement right in the middle was taken with Juno, this was a Juno installation. Putting that many instances into one tenant doesn't become a problem anymore; Folsom couldn't handle that.

[Audience] What type of images were these VMs spawned from? And secondly, what network were you using, Neutron or nova-network? If Neutron, how many subnets were these VMs spawned across?

In this special case, on the original installation, you are asking about the original cloud we built back then?

[Audience] Yeah.

We used Quantum, and the customer asked us to give each instance four networks, all provider networks. So we did not have to worry about floating IPs or anything like that; we just gave them provider networks. And this test on Juno is basically the same installation, just updated to Juno. Other questions? Okay. Thank you.