Hello, everyone. I'm Mahito Ogura. I'm very honored to welcome all of you to our first sponsor session. In this session, we are going to talk about how we perform automated deployment and benchmarking for OpenStack. I hope you can learn something from our session.

Let me start by introducing ourselves. Again, I'm Mahito Ogura, a DevOps engineer at NTT Communications. I work on various cloud services, including NoSQL, DB-as-a-service, Hadoop-as-a-service, and so on, as well as OpenStack, and I mainly contribute to DevStack and Rally. Here's my co-worker, Yuki Nishiwaki.

Hi, everyone. I'm Yuki, a software engineer at NTT Communications, and I also contribute to DevStack and Rally.

Here is our agenda. Before we talk about our automation, we will give an overview of what our company is doing with OpenStack and why we are trying to automate deployment and benchmarking. As you know, automating OpenStack is really hard work, so we will introduce a kind of best practice using Chef, Cobbler, and Rally for OpenStack deployment automation.

Our company, NTT Communications, uses OpenStack in production. One deployment is a public cloud called Cloudn, and the other is called Enterprise Cloud. Enterprise Cloud is not released yet; hopefully we can deliver it to you by the end of the year, and the details will be introduced in our upcoming sponsor sessions. We also contribute to the OpenStack community as a Foundation corporate sponsor and through the Japan OpenStack User Group, and we are sponsoring this Tokyo Summit as well.

As I mentioned, we use OpenStack in production, and our first production service was released in 2013. Its key feature is that you can establish a VPN connection to a closed network while keeping global internet access. To make this happen, we chose OpenStack rather than CloudStack, because OpenStack gives much more control and flexibility over networking via standardized APIs.
Using OpenStack for our product has been successful, but we also learned that managing an OpenStack environment is not easy and is sometimes very painful. As you know, OpenStack consists of many components and modules, and each component has to be integrated properly with the others. The environment sometimes, or rather frequently, needs to be updated for bug fixes and new features. There is no magic script that guarantees the environment comes up and runs every time you deploy OpenStack. So we face the challenge of how to deploy OpenStack continuously while constantly maintaining service functionality and performance. This is a very critical issue for a service provider like NTT Communications that aims to provide cloud services to enterprise companies. This is why we are working on automating deployment and benchmarking.

Before automation, we performed the following steps by hand: deployment, functional test, benchmark, and evaluation of the results. If the results are not satisfactory, you need one more cycle. As you can see, there are many tasks, so it is hard to do this manually.

Let me explain our solution. Deployment, functional testing, and benchmarking are fully automated using open source software: Chef, Cobbler, Tempest, and Rally. Evaluating the results is the most challenging part and is tough to automate, so we developed a visualization system to make evaluation easier. Again, we used open source software: Fluentd, Grafana, and InfluxDB. I will walk through the whole process in this session.

Let's start with deployment. We define deployment as four steps: orchestration, configuration, bootstrapping, and networking. We can't automate all of these steps with only one tool. Besides, even for the configuration step alone there are many tools: Chef, Puppet, Ansible, and others. So I would like to explain the definition of each step and our best practice for mixing several tools. I'll explain from the bottom: networking.
Networking means configuring switches to change the network layout. We use the traditional automation tool Expect for this. Changing the cluster size sometimes requires changing the network switch configuration, and this Expect script handles that.

Bootstrapping is another challenge for us, because OpenStack has to be installed on physical servers, not virtual machines. For physical-server bootstrapping, we automated OS installation using Cobbler. Cobbler gives us a PXE boot environment very easily.

Configuration is one of the most popular fields now; we have a lot of good tools like Puppet, Chef, and Ansible. We use Chef for this step.

Orchestration is the most challenging field. It means forming a cluster or setting up a load balancer. We use Chef here as well. If you want to use Chef as an orchestration tool, you need to use it with caution, because orchestration scripts often break clusters. So you need to divide your Chef scripts into a configuration part and an orchestration part.

All the scripts and configuration explained on the last slide are written in text format: for example, the definition files for the network settings and the system architecture of OpenStack, and the Chef cookbook repository. So we can version-control our OpenStack structure under Git.

This is our deployment flow. The first step is cloning the code from Git, then updating the Chef server, configuring switches, installing the OS, applying Chef for configuration, and applying Chef for orchestration. The whole process is automated by Jenkins, which I will explain later. We will skip the git clone and Chef server update parts, since they are just a matter of typing commands, so I'll start from the networking step.

Networking automation is automatic VLAN setup. Our current approach requires two things: defining server port names and setting appropriate switch port descriptions. Aside from that, the only thing you need is to write a configuration file describing your ideal network layout. Let me explain how we automate it.
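To make the network layout file concrete, here is a minimal, hypothetical sketch in Python. The file name, keys, and values are all illustrative, not our actual schema; the point is that port descriptions map to VLANs, and a generator can turn the layout into switch commands.

```python
import json

# Illustrative network layout: switch ports mapped to VLANs.
# Keys and values are hypothetical, not our production schema.
layout = {
    "switches": [
        {
            "name": "tor-sw01",
            "ports": [
                # The switch port description must match the server port name.
                {"description": "compute001-eth0", "vlan": 101},  # management
                {"description": "compute001-eth1", "vlan": 201},  # tenant
            ],
        }
    ]
}

with open("network_layout.json", "w") as f:
    json.dump(layout, f, indent=2)

# From a file like this, a generator can emit switch CLI commands
# (the command syntax here is made up for illustration).
for sw in layout["switches"]:
    for port in sw["ports"]:
        print(f"set interface {port['description']} vlan {port['vlan']}")
```

In our flow, Jenkins plays the role of the generator: it merges this layout with the port descriptions it reads back from the switch.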
First, Jenkins picks up the configuration, including the VLAN information, from GitHub. Then Jenkins gets the port descriptions from the network switch. After that, Jenkins generates the switch configuration based on the network layout and the target server settings. Finally, Jenkins executes the commands, and that's it: automatic network setup is finished.

So we can move to bootstrapping. In bootstrapping, we need to prepare a PXE server for automatic OS installation. Again, we use Cobbler for that. The Cobbler configuration is managed by the Chef server, so configuration changes are applied automatically. The next step is to power the servers on and start the PXE boot. The last step is to check whether each server is ready or not, judging by the SSH connection. When we can log in to the server, bootstrapping is finished.

Let's move on to configuration and orchestration. The first step is to git clone the definition file of the OpenStack structure. This describes the roles of the servers and the order of server setup. Next, we apply the Chef scripts in the order given by the definition file. As you may know, installing Keystone requires a database and a message queue, and for production we also need HA clusters for the database and the message queue. So our Chef scripts start with the configuration of the backend servers, including the database and message queue. After that, the servers are orchestrated to join the HA clusters by the Chef orchestration process. When we finish applying the Chef scripts, we remove the orchestration role from the Chef server, because the orchestration scripts carry a risk of breaking the HA clusters. When that is done, we have an OpenStack cloud. That's all for the automated deployment flow.

Here is a summary of automated deployment. Our automated deployment system lets anyone create the same environment at any time. Imagine deploying manually: even an experienced engineer will make mistakes. But as I showed you, an automated deployment system requires several tools; no single tool covers full automation, so it requires a wide range of knowledge.
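The "is the server ready?" check at the end of bootstrapping can be sketched as a generic wait-for-port loop. This is an illustration, not our production script; a throwaway local listener stands in for sshd on the newly installed server.

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0):
    """Poll until a TCP port accepts connections (SSH, port 22, in our flow)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# Demo: a throwaway local listener stands in for sshd.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
host, port = listener.getsockname()

print("ready" if wait_for_port(host, port, timeout=5) else "timed out")
listener.close()
```

In production the loop would attempt a real SSH login rather than a bare TCP connect, since sshd can accept connections before the OS install has fully settled.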
Now I'd like to talk about the functional test in this cycle. We use Tempest for testing our OpenStack. Let me briefly explain Tempest. Tempest is the official integration test suite, and it can test any OpenStack: for example, an all-in-one OpenStack or a large multi-node OpenStack. It is also used in upstream development; if you have experience contributing, you are familiar with it. When you upload a patch to the community, Jenkins tests your patch with Tempest, and if your patch doesn't pass the tests, it will not be reviewed or merged.

Tempest supports four types of tests: API, scenario, third-party, and stress tests. We use the API and scenario tests. API tests are unit-style tests that verify each API works properly. Scenario tests run a series of API calls to simulate actual user actions.

Tempest is easy to use; we can execute it in only two steps. First, edit the Tempest configuration file: for example, register credentials and enable or disable features. Next, execute run_tempest. That's all.

Now I'll explain how we test by showing the flow from start to finish. First, we create a Tempest container from an image I prepared in advance. The reason we use a container is that we customize and package Tempest. Next, in order to generate a Tempest configuration file that fits our OpenStack, we apply a Chef recipe to the container. Then we run Tempest against OpenStack. Lastly, we collect the results and delete the container. That ends the functional test.

Let me review the test section. Our automated test system is integrated with the automated deployment, so the test configuration file is generated automatically. We also adopted containers for Tempest so that we can run functional tests simultaneously with differently customized Tempests. Unfortunately, Tempest does not cover every OpenStack system architecture, so we have proposed eleven patches to fix Tempest. Also, we cannot yet control test cases based on the API level.

Next, I'd like to talk about the benchmark in this cycle.
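Before moving on, here is a sketch of the kind of tempest.conf edits meant by "register credentials, enable or disable features". The option names follow the general tempest.conf layout of that era, but treat them as illustrative placeholders and check your Tempest version's sample config.

```python
import configparser

# Sketch of a tempest.conf fragment: credentials plus feature toggles.
# Endpoint and credentials are placeholders, not real values.
conf = configparser.ConfigParser()
conf["identity"] = {
    "uri": "http://keystone.example.com:5000/v2.0",  # placeholder endpoint
    "admin_username": "admin",
    "admin_password": "secret",          # placeholder credential
    "admin_tenant_name": "admin",
}
conf["service_available"] = {
    # Disable tests for services this cloud does not run.
    "neutron": "true",
    "swift": "false",
}

with open("tempest.conf", "w") as f:
    conf.write(f)

check = configparser.ConfigParser()
check.read("tempest.conf")
print(check.getboolean("service_available", "neutron"))  # True
```

In our flow, this file is not written by hand: the Chef recipe applied to the container renders it from the same definition files that drove the deployment.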
We use Rally as our benchmark tool. The current version is 0.1.1, released on 6 October 2015. Rally doesn't only focus on benchmarking; it also supports deployment and verification functions. At first we tried to use Rally for deployment, but it did not support the plugins we wanted to use.

Let me show you the procedure, from executing a benchmark to confirming the results. The first step is to create a deployment. In this step, we can choose to create a new OpenStack or register an existing one. On this slide, I use an existing OpenStack, and here is a configuration file which describes its credentials. When I finish creating a deployment, I check whether I can get the endpoints from Keystone by using the deployment. If you can't get the endpoints, you have to recheck the deployment. On this slide, I got the endpoints properly, so I move to the next step.

In order to run the benchmark, I create a task file which specifies a scenario plugin and its parameters. We can use Jinja2 to write the task file. This is a sample task file: it measures the duration of creating and deleting a VM. Rally's official repository has many task files, so you can use any task file from that repository.

When the task file is ready, we start the benchmark test. When the benchmark is finished, Rally provides a summary of the results: max time, median, and so on. After the benchmark, you can also generate an HTML report. This report provides a lot of information, and we can see the detailed results of the benchmark. But we can't compare multiple results in the HTML; I hope Rally implements that in the near future.

Now you know how to use Rally, so next let me introduce how we check the results. After running some benchmark scenarios, we always check the summary and compare it to past results. If we detect an unusual result, we check the details using the Rally task report and analyze the cause. But the current Rally doesn't have a function for comparison with past results.
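To recap the task-file step above, here is a sketch of a Rally task for the create-and-delete-VM measurement. The scenario name follows Rally's sample tasks of that era; the image and flavor names are placeholders for your own cloud, and real task files can also use Jinja2 templating on top of this structure.

```python
import json

# Sketch of a Rally task file measuring boot-and-delete duration.
# Image and flavor names are placeholders.
task = {
    "NovaServers.boot_and_delete_server": [
        {
            "args": {
                "image": {"name": "cirros"},      # placeholder image
                "flavor": {"name": "m1.tiny"},    # placeholder flavor
            },
            "runner": {
                "type": "constant",
                "times": 10,        # total iterations
                "concurrency": 2,   # simulated parallel users
            },
        }
    ]
}

with open("boot_and_delete.json", "w") as f:
    json.dump(task, f, indent=2)

# Sanity check: the file is valid JSON and names one scenario.
loaded = json.load(open("boot_and_delete.json"))
print(len(loaded))  # 1
```

The `concurrency` value under `runner` is the knob we vary later in the talk to see how the cloud behaves as the number of simultaneous users grows.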
So after each benchmark, I had to open the HTML file manually, check the target score, and judge whether the score was good or not. That's really hard work, isn't it? So we developed a benchmark comparison function outside Rally. We call it the benchmark dashboard. It is a website for checking the benchmark summary of each deployment. The benchmark dashboard has two functions: one is comparing the summary with past results, and the other is providing links to the HTML reports created by Rally.

This is what the benchmark dashboard looks like. The x-axis is the commit ID, and the y-axis is the duration. In this way, we check the summary of the results for each deployment.

Next, I will explain the process behind the benchmark dashboard. First, we create a Rally container from an image I prepared. Second, we register the OpenStack to Rally. Third, we start each benchmark against OpenStack. When each benchmark finishes, the summarized results are sent to InfluxDB and the HTML reports are sent to the benchmark dashboard. This way, we can view the results on the benchmark dashboard.

Let me review the benchmark section. Our benchmark system enables us to compare benchmark results with past results, and the benchmark process is fully automated. But currently we don't support many benchmark scenarios, and the best practice for benchmarking is not fixed yet. Those are our next challenges.

Up to here, we have automated everything from deployment to benchmarking. In this section, I'll introduce our way of identifying bottlenecks and the root causes of failures. In this cycle, when a benchmark score is unusual, we need to find the bottleneck or issue. But an OpenStack cloud consists of many components and outputs many different logs, which makes it difficult to find bottlenecks and issues. So we created an analysis system for data collection and monitoring. This is the system architecture: we use three open source tools and one free service to search, notify, and visualize logs and resource usage, namely Fluentd, InfluxDB, Grafana, and Slack. Explaining the details would take a long time.
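As a hedged sketch of the "summarized results are sent to InfluxDB" step: InfluxDB accepts a simple line protocol of measurement, tags, fields, and a timestamp. The measurement and tag names below are made up for illustration; only the general record shape is the point.

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    """Build one InfluxDB line-protocol record: measurement,tags fields ts."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

# Hypothetical benchmark summary keyed by git commit, matching the
# dashboard's axes (x: commit ID, y: duration).
line = to_line_protocol(
    "boot_and_delete",
    {"commit": "a1b2c3d"},
    {"median": 42.7, "max": 61.3},
    1445731200000000000,
)
print(line)
# In production this record would be POSTed to InfluxDB's write endpoint.
```

Tagging each point with the commit ID is what lets the dashboard plot score against code history and spot regressions at a glance.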
If you want to know more details, feel free to reach out to us; we are at the NTT Com booth. So far I have explained OpenStack deployment, functional testing, benchmarking, and evaluation. This is the whole of our system architecture for deploying with production quality.

Next, we will introduce one use case of this system. The scenario I used is the VM task scenario, which is included in Rally by default. The VM task scenario measures the duration from creating a VM to deleting it. More specifically, it measures Nova boot, running a command via SSH, and Nova delete. We prepared two task files with different concurrency settings, 13 and 15. Why did I use the VM task scenario? There are two reasons. First, as you may know, VM creation and deletion is the most common scenario in providing a cloud service. The other is that we needed to know which component would be affected by increasing the number of users.

Unfortunately, the benchmark on the first environment did not score a 100% success rate, so we tried to find the root cause. First, I checked the Rally report. We found that virtual machines could not get IP addresses, so I moved on to the Neutron logs. When I checked the Neutron log files, I found that the database connection pool was overflowing. So I reviewed the database section of the Neutron configuration and changed some items: max_retries, retry_interval, min_pool_size, and so on.

After changing the configuration file, I needed to check whether the new configuration was valid or not, so I used our automated deployment and benchmark system. All I did was git push, request Jenkins, and wait. When Jenkins finished the benchmark, we checked the result. In this case, the change solved the problem, and we could improve OpenStack performance based on objective benchmark results. I also found that increasing the number of users affects Neutron database connections. For other situations, we need to execute the same procedure to find other bottlenecks.
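The database tuning described above amounts to editing the [database] section of neutron.conf. The option names below are standard oslo.db connection-pool settings; the values are illustrative examples, not a recommendation for your cloud.

```python
import configparser

# Illustrative [database] tuning for neutron.conf. Option names are
# standard oslo.db settings; values are examples, not recommendations.
conf = configparser.ConfigParser()
conf["database"] = {
    "max_pool_size": "30",    # pooled connections before overflow kicks in
    "max_overflow": "60",     # extra connections allowed beyond the pool
    "min_pool_size": "10",    # keep some connections warm
    "max_retries": "20",      # retries on connection failure at startup
    "retry_interval": "5",    # seconds between retries
}

with open("neutron-database.conf", "w") as f:
    conf.write(f)

check = configparser.ConfigParser()
check.read("neutron-database.conf")
print(check.getint("database", "max_overflow"))  # 60
```

Because our deployment is Chef-driven, a change like this is a git push to the cookbook repository: Jenkins redeploys, reruns the benchmark, and the dashboard shows whether the tuning actually helped.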
In conclusion, we learned several things from this project. First, the default configuration of OpenStack does not cover every use case, so we need to investigate the best configuration for each case. Finding the best configuration requires us to follow the cycle of deployment, functional test, benchmark, and evaluation again and again. And while no single tool can automate the whole process, we showed that a combination of open source software can.

Thank you for listening. This was our first sponsored session; three more of our sessions remain, and I hope you enjoy them as well. We also have a booth, so if you have any questions, please feel free to come by. Thank you for your attention. That's all for our session. Thank you.