First of all, for those who don't know me: I'm Boris Pavlovich. I work at Mirantis and I'm the Rally project technical lead, so I'm leading work on the Rally project. Today we are going to talk about Rally certification and scale with my friend Jesse.

I'm Jesse Keating. I'm with Blue Box Cloud, and we are a heavy user of Rally in our day-to-day operations.

Okay, nice. So first of all, let's talk about today's agenda. We are going to talk about who is using Rally, why, and where; show a short Rally demo; and explain what we mean by cloud certification, because it's quite different from the work that is done by DefCore. We'll talk about our current progress and future efforts, and we'll have a section for questions. If you have an interesting question that cannot wait, just raise your hand and I'll try to answer it right away.

So first of all, this is how I was looking at OpenStack testing before Rally. It's actually quite a simple task, but for some reason it was implemented in quite a strange and hard way. Using Tempest could help us check that functionality works, but it didn't check that the cloud actually works under expected load, that there are no races and such, and that you can safely use your cloud for production.

When I was at Rackspace we had an entire team dedicated just to QA'ing the clouds before we would call it good. When I joined Blue Box we didn't have an entire team; it was just a couple of us, and there was no way we were going to be able to do that type of thing. So we needed a tool that was much, much easier to use.

Okay, so has anybody tried Rally before? Just a few? No? More? And has anybody heard about it before? Okay, nice. So should I explain what it is? Yeah, okay, probably. So there are two parts.
One is automation of Tempest, for those who want to use it or are forced to use it by their companies. It automates installation, configures Tempest properly, runs it, parses all the results, and stores them in a DB. Then you can work with the results: compare them, get 3D HTML graphs, and so on, whatever you would like to do.

The other part is separate. It's the Rally framework. It can be used for functional testing, but it can also be used for all the other kinds of testing you have: negative testing, stress testing, scale testing, load testing, volume testing, and blah blah blah, any-word testing. After that, all results are stored in the DB and you can work with them, for example to generate reports.

The common workflow for Rally takes five minutes. You need to pass the credentials of the cloud to Rally. After that you need to specify a Rally task, to tell Rally what tests and what workloads to run. Then you run a single command, and after it finishes you can run another simple command and generate reports that you can show to your boss. You'll get a happy boss quite fast, and probably a promotion.

The whole thing about Rally is simplicity. We would like to make everything as simple as possible, so every step should be clear, fast, and simple, without any bugs. To get Rally you can use apt-get in Debian, and soon in Ubuntu, or yum in CentOS. You can install it from source with a single command, like distec, and there are also Docker images of all releases and of master, which is always up to date. There is even a Murano application, if you know what Murano is. If you don't know what Murano is: it's quite popular now; Rally is more popular.
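The five-minute workflow described above (register the cloud, start a task, generate a report) can be sketched as three CLI invocations. This is a minimal sketch: the helper `rally_workflow`, the deployment name, the report file name, and the sample task path are illustrative assumptions, and the function only builds the command lines rather than running them.

```python
import shlex


def rally_workflow(task_file, deployment="testing", report="report.html"):
    """Return the three CLI invocations of the workflow as argument lists.

    The default names are placeholders, not values from the talk's slides.
    """
    return [
        # 1. Register the cloud using the OS_* variables of a sourced openrc file.
        ["rally", "deployment", "create", "--fromenv", "--name", deployment],
        # 2. Run the workloads described in the task file.
        ["rally", "task", "start", task_file],
        # 3. Render the results of the task as an HTML report.
        ["rally", "task", "report", "--out", report],
    ]


if __name__ == "__main__":
    # Hypothetical path to one of the sample tasks in the Rally repository.
    sample = "samples/tasks/scenarios/keystone/create-and-delete-user.yaml"
    for cmd in rally_workflow(sample):
        print(shlex.join(cmd))
```

To execute the steps instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where Rally is installed and an openrc file has been sourced.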
So, to provide credentials to Rally you can use the simple way for humans: source your openrc file and run rally deployment create with --fromenv. There is also a hardcore way, where you have to write JSON: you specify which deploy engine to use, and here it's ExistingCloud, which means that you are just passing credentials, and you specify admin credentials or a list of users that you would like to use for generating load. If you pass only admin, Rally will create temporary users for generating load and afterwards delete all of them. But the command is still simple, just one line.

After that you can run some sample from the Rally repository, like boot, bounce, and delete VM. Bounce means suspend, resume, stop, start, and so on. After that you can generate a report using just a single command, and it will generate HTML that you can show to somebody.

Okay, so this is the report. It is a kind of performance analysis: it can show you what it takes to boot a server, reboot a server, rescue a server, delete a server. So you know what to optimize first, and you don't need to optimize random stuff when you try to fix your cloud, if that is required.

Before running the demo I would like to explain the format of the task that we have. The first line is the workload to use. It's a plugin in Rally, and it's the actual actions that will be done by users. In this case we'll boot a VM and then delete the VM, and booting will use these arguments, so we specify flavor, image, and so on. The runner is another section; it specifies the load that we would like to generate. Constant load means that a fixed number of scenarios runs simultaneously, and times is the number of runs. Context is the environment that should be created for this workload: we create tenants and users, we can set quotas and roles, anything, probably upload some images or create some servers. And the last section is SLAs; it's the criteria of success.
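The task format just described (workload, arguments, runner, context, SLA) can be sketched as a Python dictionary dumped to JSON. This is a hedged sketch, not the slide from the talk: the key names follow the layout of Rally's sample tasks from that period as best recalled, and the flavor name, image name, and all numbers are placeholder assumptions.

```python
import json

# Workload name -> list of configurations for that workload.
task = {
    "NovaServers.boot_and_delete_server": [
        {
            # Arguments passed to the scenario (placeholder flavor and image).
            "args": {
                "flavor": {"name": "m1.tiny"},
                "image": {"name": "cirros"},
            },
            # Load profile: 10 iterations in total, 2 running at a time.
            "runner": {"type": "constant", "times": 10, "concurrency": 2},
            # Environment created before the workload runs.
            "context": {"users": {"tenants": 2, "users_per_tenant": 3}},
            # Success criteria: no failed iterations allowed.
            "sla": {"failure_rate": {"max": 0}},
        }
    ]
}

if __name__ == "__main__":
    # Print the task as JSON, ready to be saved to a file.
    print(json.dumps(task, indent=4))
```

Saved to a file, a definition of this shape is what would be handed to the task start command; the same structure can equally be written in YAML.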
So, for example, this one means that there are no failures: if there is at least one failure, it will say that this task failed. There can be others, based on duration or something else that you would like to have.

Let me show a short demo, just to explain that it's really simple. This is not a released Rally, it's just master. Let me make it bigger. Okay, can you see it? Yeah? Okay.

So here is Rally. I just purged the DB, and we will run rally deployment. It says too few arguments, so let's do create, and it helps you understand what you should do here, so there is some kind of tutorial. We will just source the openrc file and, because I'm too lazy to write JSON, do create --fromenv with the name testing. After that we can do rally deployment list. Rally can manage any number of clouds that you have, and it will link the results of tests to a specific cloud, which is usually useful in the future.

After that you just need to start the task. This is rally... okay, here it's rally report, task... a task, task, yeah, thank you. Okay, rally task start, and here is the directory rally/samples/tasks/scenarios, and we will run something like Keystone create and delete user, for example, the YAML.

Okay, so it provides some information. I ran it without verbose mode, but okay. We get some information here, like the average, minimum, and maximum duration of Keystone create user and Keystone delete user. Then we can do rally task list, since all results are stored in the Rally DB. After that we can work with them, for example rally task report --out some.html. Now I'll just download it. some.html, here, okay. Let me open some.html. Here we get an overview of all the tests that we ran, and here we ran only a single test. If we click, we'll see all these pretty graphs, which are actually interactive, and you can do a bunch of stuff with them. If there are errors, it will display them, and so on. Yeah, so, okay, I think this is
enough for the demo, so let's move on.

Okay, all right, so what is Blue Box doing? In our production usage we use Rally in a few different ways. The primary way is in our new cloud deployments. At Blue Box, every customer gets their own cloud, and we want to make sure that that cloud is operating at the level that we promise it'll operate. So we use Rally when we turn over a cloud to make sure that we've reached the service level that we say we will reach. This doesn't just test OpenStack itself; it's also testing the entire setup of OpenStack: all the underlying hardware, the underlying network, all the things that are outside of the OpenStack code's control. Rally is still able to certify that. And we're running Rally outside of the cloud, as if it were a customer, so we're really getting what the customer experience is when we run Rally against our cloud.

We also use it as a validation for new releases of our deployment tool. We want to make sure that as we make changes to how we deploy and upgrade our clouds, we're not introducing regressions in the functionality or the performance of our cloud. So we keep using it as we develop new versions of our releases.

And third, we want to make sure, when we're evaluating new features of OpenStack, or changes to the configuration, or new hardware specifications, that our clouds continue to work as they had before, or better. Does the new feature actually do what it says it does? Is the new feature impacting our performance in a negative or a positive way? That helps us drive decisions in our product offering as to what we want to do.

And so this was sort of leading to the cloud certification pyramid. We've got DefCore, which is all about: can you be an OpenStack cloud, can you use the logo? But that doesn't really help. In our case, sure, we have a cloud that passes all the basic functionality, but what happens if you've got two users using it, or ten
users using it, or 30 users using it? DefCore doesn't really say it's safe to use in a production world; it just says it's possible to use it in the production world. So we want to build on top of that. We want to get an established baseline performance so that we can say: this is what our cloud does when we turn it over. But then we also want to know, at our expected load, if we crank that up and really stress our cloud, where does it start to fall over? Because all clouds will fall over at some point. And then again at expected scale: if we add more controllers, can we do more things? And eventually we want to be able to test high availability. We want to be able to run these tests, take off part of our controller set, and see how the cloud operates in that fashion, so that we know what to expect, we can communicate to our customers what to expect in those scenarios, and we can validate going forward that, as we make improvements to our processes, it is a better thing rather than a worse thing.

And the HA thing is expected in the Liberty cycle in Rally. You'll be able to run multiple scenarios simultaneously: one scenario will be destructive, like restarting controllers, and another will be some kind of load, for example Keystone authentication, or some basic features like booting VMs, migrating VMs, and so on.

Before that, you can run Rally in the background and then do other operations in a different terminal. It's entirely possible, but it's not fully automated, and our goal is to automate this fully.

Okay, so how else can we make this simpler? The second step that was described in the first slide is having to define what the task is. Blue Box has spent some time figuring out what it is that we want to advertise, or what we want to test our clouds to do. So we've picked and chosen various bits and pieces of the sample set and created one large file that does all the things that we wanted to do, and we've specified what our quotas are, what our SLAs are, etc.,
etc. That took a little bit of time. That could go away if we could publish something as part of Rally that says: this is an acceptable base level of operation. That saves you a step, so it now becomes: register the cloud, run the certification, generate the reports, and you're done.

In order to do that we need to create a certification working group. We need operators like Blue Box and other people who are interested in using Rally to certify their clouds to come to an agreement on some base level of performance. DefCore is awesome for a base level of functionality; now we're talking about performance and operation. So: how many users, tenants, computes, what type of workloads, those types of things. And also defining ways to modify those on the fly, making them variable enough that, as you point at clouds of different sizes and capabilities, you can easily change it without going into the task itself.

Yes, so the tasks in Rally are parameterized already. The only thing that we should think about is what parameters it should accept: the number of users and tenants, the type of cloud, the size of the cloud, probably the quality of the hardware, and so on. Using this information we'll calculate all the performance data that we should expect, what tests to run, and so on. Yeah.

Yep, okay. So the other future steps, along with trying to form a certification working group, are to get more people involved with the Rally project. Rally is now an official OpenStack project, so it's a good way to get your feet wet in OpenStack development. We need more core reviewers; there are quite a lot of open reviews out there. We need more, yeah, lots. We need more operators who understand how their cloud works, have ideas of what they want to assure works in their cloud, and can help develop scenarios and test examples. And we need more companies that are willing to put their name up as: we use Rally to certify our cloud.

So you have free support from me if you would like to join the Rally community
and try to use it in your company. So please ping me at any time, or write me mail; I'll try to reply. Yeah.

Okay, and just about who is using Rally. Over all time there are about 30 companies, more than 140 developers, about 1,200 commits, and about two years of development. And we see that in Kilo it's still popular: 25 companies, 80 developers, and about 380 commits. So a lot of different people are working on it, and I will be happy if you work on it as well, if you need it.

Okay, this is the part for questions, so, do you have some questions? If we have more time we could also do more demos as well. Yeah, more demos or questions.

Okay, a question. We already implemented it: there is already a functional mixin, and you have self.assert something, and it already works.

We would ask, for the remainder of the session, to please use the microphone or repeat the question for the sake of the recording. Thank you.

The question is: is there a way to use standard unit test stuff like assertions and so on? And I say that we already implemented it, and there is a way to do self.assert something, equal or something else, and it will work. So, another question?

Okay. Is there also support for having a setup and teardown on the test?

So it's all about context. You can specify what contexts to use, and they run one by one, like setup and teardown. But you specify this at the level of the task: creating users and tenants, putting on roles. I can show a sample if you'd like. And one more question? Okay, sure.

How well is it integrated with the current OpenStack infra?

Yes, it's well integrated. Okay, let's open Cinder, for example. So, something here... okay, not this one, it's specs, I need Cinder. Okay, this one. Here is a Rally job, which is easy to add to any project, and you can open the HTML report. This is generated on each patch that you put up in Cinder, for example. And you have here as well... not here... okay, here are the Rally logs that you can analyze. It's all infra stuff. Okay, so here
are the Rally logs, and here are the logs of all the services that we have, so for example screen, Cinder scheduler. Okay, thank you. Okay, any questions? Yes.

So how OpenStack-specific is the testing, and how easy is it to extend it up and down the stack? For example, if I wanted to run IPMI checks against the hardware, or application-level checks against the stuff that is running on it.

So we have been doing hard work for the last year on splitting Rally from OpenStack: keeping it simple to use for OpenStack, but making it simple to use for anything else as well. For now you can actually run any Python code under load, but you need to have at least a fake OpenStack cloud, like just Keystone somewhere. That's because of the validation of tasks. But we are working on it; I think in Liberty we will finally finish this work, so you'll be able to use Rally as a common tool for anything.

Okay, thanks.

Hi, this is Alain Navarro from Midokura. We are testing and working in our company, and I checked out Rally a few months ago. I think the networking part is not as elaborate as other areas in Rally, and I wanted to ask you about the roadmap for network testing and how it relates to the Shaker project. I think Mirantis is also driving that project, which I think has more insight and gets more to the point of network testing.

So this is a very hard question, but I'll try to reply. Shaker was created because the Rally team failed to do network testing in time. But we are still working on it, and I hope someday we'll finish. We have a bunch of patches, about 10 patches now, that are the base for porting to Rally any existing benchmark tools that work on multiple servers or VMs, and for extending the reports to support any kind of data that they provide and display it in pretty ways. We are working on that. I'm not sure yet about integration with Shaker, but why not? We probably will integrate, but it's the same story: we need to finish this base to integrate even Shaker. Yeah, another question?

You mentioned, for this high availability, that you wanted
to tear down some services, or a service. How do you do that?

So, for example, a scenario can accept an SSH address and the user and password, and it will just go to the node and turn off some service, for example.

Yeah, thank you.

Hi, I would like to find out about Keystone v3 support in Rally.

So it's already supported.

And Keystone domain support, is that something available in Rally?

Keystone what?

Keystone domains, security domains.

But we can support it. You have a pretty easy way to ask us to do this; it's actually called feature requests. You can go here, in the documentation, and there are plenty of feature requests, like the ability to compare results between... So just provide a few lines of description of why you need this, and we'll implement it, or just publish patches that implement it.

Sure, thank you.

Okay, so it seems that this is a really cool tool for when you bring up a new cloud and you have admin. How does it work, for example, in a multi-tenant cloud where you do not have admin capabilities? The use case is, for example, a cloud where you have several tenants and there are certain times in the day when you see performance issues. So I'm wondering how I could use Rally to get nice graphs about API times, that kind of stuff. How do you scope that at the test level?

So you mean to fetch API...

Yes, that kind of stuff.

Not the full duration of the operation?

Yeah, duration, yeah.

So in that case you don't want to do it as an admin; you want to do it as a regular existing tenant. I believe that when you specify the deployment you can specify the users to use, and then you have non-admin-level users that will do these operations.

Okay. So I assume, if I understand correctly, you define the testing either in YAML or JSON, right? So do you put in there the set of capabilities that you need to run that test at all, or will it just fail if you don't have them?

It's a good question.

Let's say that you have, for example, a test that requires creating a tenant, and you're running it by sourcing your openrc
file, and you're not going to have that capability. Would that fail miserably, or does Rally check from the YAML file that you have that capability?

So when you're running rally deployment check, for now it will check only the admin user, but in the future it will check that the users passed in the users section are not admin, so it will check roles and so on.

All right, okay.

Okay, and here is the documentation. Here is the users section, so you can specify any users that you already have in the system, and Rally will use them to generate load.

Okay, great, thank you.

Hey, this is Yan from Time Warner Cable. I have a question about the constant type. When you run the test, in your YAML configuration file, you have a setting to configure the constant type, right? My question is: are there any other types? I don't want to start everything at the same time; I want to use something like a stepping-threads mode. Is there any way to do that?

Yeah, so with constant there's also a concurrency. Constant basically means: I'm going to do all these tests and then tear everything down. But the concurrency says how many of those tests you're going to do at once. So if you have a constant test that you're going to do a hundred times and you give it a concurrency of five, and we're doing boot and delete server, it's going to create five servers at a time. If the test is boot and delete, it'll create five servers, delete five servers, create five servers, delete those five servers. If it's just boot, then it will create five servers at a time until it reaches a hundred, and then delete them all. But there are other runners besides just constant.

Yeah, there is RPS. It doesn't wait for the previous iteration to finish; it just runs a new iteration each interval. And actually it's quite easy to implement your own runner that does exactly what you want, if you would like. They are pluggable, so they are simple to implement. In the future we will make a
stress runner that just rises slowly, and so on.

Okay, but you're still talking about starting everything at once. For example, I want to run a boot server test, right? So do we boot five servers at the same time, or can we boot one server first, then two servers second, something like that?

With the current constant runner it won't do this, but you can implement your own that can do this.

Yeah, that would be...

Yeah, it's not hard.

Thanks.

Okay. Is there a runner that lets you, if you have 10 or 15 scenarios that you want to sort of interleave, run them in parallel, so they can happen kind of in any order?

So I already talked about this. It's multi-scenario load generation, and it will be implemented, I hope, in Liberty. We have already accepted a new task format that allows us to specify any number of scenarios that can be run in parallel. I know it's very important; it will be implemented.

Okay, any other questions? Okay, thank you for coming.
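The difference between the constant load discussed in the questions (a fixed number of iterations in flight) and the stepped load that was asked for (one server, then two, then three) can be illustrated in plain Python. This is only a sketch of the idea and does not use Rally's runner plugin interface; `run_constant`, `stepped_schedule`, and `run_stepped` are hypothetical names, and the scenario is any callable that takes an iteration index.

```python
from concurrent.futures import ThreadPoolExecutor


def run_constant(scenario, times, concurrency):
    """Run `scenario` `times` times with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() returns results in iteration order.
        return list(pool.map(scenario, range(times)))


def stepped_schedule(times, start=1, step=1):
    """Split `times` iterations into waves of growing size: 1, 2, 3, ..."""
    if start < 1 or step < 0:
        raise ValueError("start must be >= 1 and step must be >= 0")
    waves, size, left = [], start, times
    while left > 0:
        wave = min(size, left)  # the last wave may be smaller
        waves.append(wave)
        left -= wave
        size += step
    return waves


def run_stepped(scenario, times, start=1, step=1):
    """Run iterations wave by wave; each wave runs fully in parallel."""
    results, first = [], 0
    for wave in stepped_schedule(times, start, step):
        with ThreadPoolExecutor(max_workers=wave) as pool:
            results.extend(pool.map(scenario, range(first, first + wave)))
        first += wave
    return results


if __name__ == "__main__":
    print(stepped_schedule(10))
    print(run_stepped(lambda i: f"booted server {i}", 6))
```

A real runner would additionally need to record per-iteration durations and errors; the sketch covers only the scheduling of iterations.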